Closed Bug 1630050 Opened 5 years ago Closed 4 years ago

A race condition on startup can lead to the wrong engine being set as default

Categories

(Firefox :: Search, defect, P2)

defect
Points:
3

Tracking

()

VERIFIED FIXED
Firefox 79
Iteration:
79.1 - June 1 - June 14
Tracking Status
firefox78 --- verified
firefox79 --- verified

People

(Reporter: standard8, Assigned: standard8)

References

Details

Attachments

(1 file)

Whilst working on another bug, I found a case where a race condition would lead to the wrong engine being set as default on startup.

Practically speaking, I think this is rare and unlikely, though I have seen reports in the wild of related things happening. The STR I have are ones that wouldn't happen for a normal user.

Manufactured STR:

  1. In code in fetchRegion, arrange for the region to be overridden to set it to RU. The network call should still happen at this point.
  2. Set up a fresh Firefox profile, add the search bar to the toolbar.
  3. Shutdown Firefox.
  4. Remove the browser.search.region preference from the profile's prefs.js
  5. Remove search.json.mozlz4 from the profile.
  6. Start Firefox.

I can reproduce this on builds back to release, I haven't gone further as I suspect it isn't really a regression. On Mac I can reproduce that Amazon.com gets set as default 100% of the time with the above STR. If I avoid the network fetch, so that the region is set before list.json (or the modern config) is read, then it correctly sets the default engine.

This is unlikely to be reproduced by a user as we don't normally reset the browser.search.region preference, and new profiles don't have the toolbar on automatically.

The reason we don't have an issue if the network call doesn't happen, is that the configuration is read with the correct region, so we'll already have the default engine loaded.

When the region hasn't been set already, then we kick in a _maybeReloadEngines function after initialisation. During that, this sequence of events occur:

  • We null out the currently cached default engine (this._currentEngine).
  • We set the list of engines to the new value, and store a WebExtension id/locale reference for the new default (in this._searchDefault).
  • We start to load the new list of engines. In the RU case this starts with Yandex.
  • We load the extension, and create the new SearchEngine object.
  • We then call engine._initFromMetadata to initalise the engine's data.
  • This calls _setIcon which in turn notifies observers that an engine-changed.
  • engine-changed is monitored by ContentSearch which rebuilds it caches when it gets notifications.
  • As a result, ContentSearch calls back to the search service to get the default engine.
  • The search service doesn't have a cached engine, it can't find the new _searchDefault as it isn't inserted into the list of engines yet, and so it falls back to the first engine in the list of search engines.
  • In my tests this is either amazon.com or Google. amazon.com was set as the profile originally had useDBForOrder set to true. Google was chosen when I set it to false. In both cases, these are wrong as the default should be Yandex due to the RU region.

Although these STR are unlikely to happen normally, I suspect that slow disks, and other timing issues might just occasionally kick this in. I think a short term fix is to stop notifications happening during _maybeReloadEngines and use the existing reloaded engines notification to update displays where appropriate.

The longer term fix is to rewrite _maybeReloadEngines to only provide incremental updates (already covered by bug 1610293), and only change the defaults once the new engines have actually been loaded.

(In reply to Mark Banner (:standard8) from comment #1)

Although these STR are unlikely to happen normally, I suspect that slow disks, and other timing issues might just occasionally kick this in. I think a short term fix is to stop notifications happening during _maybeReloadEngines and use the existing reloaded engines notification to update displays where appropriate.

Kudos for the analysis, Mark! I'm in favour of this approach, TBH, as it seems like a solution to more possible issues. Let's hold off on notifying anything when we're rebuilding the world. I do suspect this'll break assumption baked into our tests ;-P

The longer term fix is to rewrite _maybeReloadEngines to only provide incremental updates (already covered by bug 1610293), and only change the defaults once the new engines have actually been loaded.

Yeah, that'd be nice... however, why not do both? One other possibility: buffer notifications whilst rebuilding and send them all when it's done in idle time?

(In reply to Mike de Boer [:mikedeboer] from comment #2)

The longer term fix is to rewrite _maybeReloadEngines to only provide incremental updates (already covered by bug 1610293), and only change the defaults once the new engines have actually been loaded.

Yeah, that'd be nice... however, why not do both? One other possibility: buffer notifications whilst rebuilding and send them all when it's done in idle time?

If we do our incremental updates right, then I think it won't matter if we batch notifications or not. However, I'm starting to think it might be worth re-looking at our notifications and subscribers to them and figure out what we really need. I'm sure we're sending out way more than necessary at the moment.

Iteration: 77.1 - Apr 6 - Apr 19 → 77.2 - Apr 20 - May 3
Iteration: 77.2 - Apr 20 - May 3 → 78.1 - May 4 - May 17

Dropping down priority as we fixed the things I was concerned about in a different way. We should still fix this soon, but at the moment it doesn't affect everyone as much as expected.

Severity: -- → S3
Priority: P1 → P2
Iteration: 78.1 - May 4 - May 17 → 78.2 - May 18 - May 31
Iteration: 78.2 - May 18 - May 31 → ---
Pushed by mbanner@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/63debcb0ed5f Don't notify about changes to search engine icons before the search engine has been initialised and added to the list. r=daleharvey
Backout by malexandru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5d427f1ae338 Backed out changeset 63debcb0ed5f for causing browser-chrome failures.

Eventually figured out the failures here. The first one, browser_ContentSearch.js was easy as I'd missed it was there, and it was counting notifications.

browser_oneOffContextMenu.js was harder - running on its own it would pass. However, running after browser_426329.js would fail. With help from Marco, we eventually tracked this down to an initialisation issue for the search-one-offs. Removing the engine-changed notification stopped triggering the initialisation early, and so it would start it when the engine-loaded/engine-added notifications happened.

browser_426329.js was structured so that the test would do searchbar.value = 'test' whilst still in the engine-added and engine-default observer notifications. The same event loops that were also separately triggering the search-one-off initialisations.

Somehow this was causing the search popup to be initialised twice, and that was just failing.

I've incorporated a couple of fixes:

  1. Don't set searchbar.value from within the observer notifications, this wouldn't be able to happen in real life, so it isn't necessary.
  2. In searchbar.js when handling onBeforeValueSet we will now only set the query if the one-off-buttons are already initialised. If they aren't initialised then there is no point. We don't need to worry about when they do get initialised, as all the query is actually used for is to let the search bar know to clear the settings button if it is selected. The code there looks more complicated than that, but it should get simplified at some stage - I've just filed bug 1645411 on this.
Flags: needinfo?(standard8)
Pushed by mbanner@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d9370f9279da Don't notify about changes to search engine icons before the search engine has been initialised and added to the list. r=daleharvey
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 79

Comment on attachment 9155211 [details]
Bug 1630050 - Don't notify about changes to search engine icons before the search engine has been initialised and added to the list. r?daleharvey!

Beta/Release Uplift Approval Request

  • User impact if declined: Users may be a higher risk of getting the wrong default search engine - most likely after first install, but potentially after upgrade.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Fundamentally, this is a race condition, but I think potentially starting a new profile with an en-US build whilst VPNed into Russia or region where Google isn't default might reproduce this.

Other testing would be to generally check that adding a search engine (WebExtension or OpenSearch) still works the same.

  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Small change to how notifications work to how they always should have worked. All the observers have been reviewed for possible issues.
  • String changes made/needed: None
Attachment #9155211 - Flags: approval-mozilla-beta?
Flags: qe-verify+
Iteration: --- → 79.1 - June 1 - June 14
Blocks: 606886

Comment on attachment 9155211 [details]
Bug 1630050 - Don't notify about changes to search engine icons before the search engine has been initialised and added to the list. r?daleharvey!

approved for 78.0b8

Attachment #9155211 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

Environment: Windows 10 64-bit

Reproduced the initial issue with Firefox 78 beta 1 en-US build - VPNed into Russia, on Firefox start-up. After restart the default search engine was correctly set.
The issue no longer occurs with Firefox 78RC build and latest Nightly 79.0a1 2020-06-24. Also, I saw no issues while adding search engines (WebExtension or OpenSearch), thus confirming the fix.

Status: RESOLVED → VERIFIED
QA Whiteboard: [qa-triaged]
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: