Closed Bug 1640884 Opened 5 years ago Closed 4 years ago

Preset staging main - collection sync issues

Categories

(Firefox :: Search, defect)

78 Branch

Tracking

RESOLVED INVALID
Tracking Status
firefox-esr68 --- unaffected
firefox76 --- unaffected
firefox77 --- unaffected
firefox78 --- wontfix

People

(Reporter: phorea, Unassigned)

Details

(Keywords: regression)

[Description:]

When using any localized build with an overridden region, the search-config collection is not synced with the staging main collection. The Remote Settings (RS) sync appears to succeed, but the successful poll does not result in search-config being updated.

With services.settings.load_dump set to false, everything works: the expected staging main collection is populated directly.

[Environments:]

Windows 10
Ubuntu 18.04
Mac 10.13.6

Reproducible using Nightly 78.0a1 2020-05-25
*Not reproducible* using Nightly 78.0a1 2020-05-11

[Pre-conditions:]

user.js containing:
user_pref("services.settings.server", "https://settings.stage.mozaws.net/v1");
user_pref("security.content.signature.root_hash", "DB:74:CE:58:E4:F9:D0:9E:E0:42:36:BE:6C:C5:C4:F6:6A:E7:74:7D:C0:21:42:7A:03:BC:2F:57:0C:8B:9B:90");
user_pref("services.settings.load_dump", true);
user_pref("services.settings.default_bucket", "main");
user_pref("browser.search.log", true);
+
user_pref("browser.search.region", "chosen_region");

locale/region used to reproduce
zh-CN/CN
en-US/US
wo/CN
ru/RU
es-ES/ES

[Steps:]
  1. Download the localized build.
  2. Create a new profile and add the specified user.js in the profile folder before start-up.
  3. Start-up Firefox.
  4. Check the browser console for the message "SearchEngineSelector: Search configuration updated remotely".
  5. Check the about:config values for services.settings.last_update_seconds and services.settings.last_etag.
  6. Restart, or leave idle for 5 minutes, after either step 4 or step 5 is completed.

[Actual Results:]

In step 4, the "Search configuration updated remotely" log message never appears.

services.settings.last_update_seconds: 1590480828 ~ 05/26/2020 @ 8:13am (UTC)
services.settings.last_etag : "1590451272056" ~ 05/26/2020 @ 12:01am (UTC)
browser console log: https://pastebin.com/BRBn8f9v
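
For reference, the two pref values above can be decoded as follows. This is a minimal sketch in Python, assuming (as the report's own annotations suggest) that last_update_seconds is in seconds since the Unix epoch and that last_etag is a quoted timestamp in milliseconds. The helper names are hypothetical and are not part of Firefox.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the report above.
last_update_seconds = 1590480828   # assumed: seconds since the Unix epoch
last_etag = '"1590451272056"'      # assumed: quoted milliseconds since the Unix epoch


def seconds_to_utc(seconds: int) -> datetime:
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def etag_to_utc(etag: str) -> datetime:
    """Strip the surrounding quotes and convert epoch milliseconds to UTC."""
    millis = int(etag.strip('"'))
    whole_seconds, remainder_ms = divmod(millis, 1000)
    return seconds_to_utc(whole_seconds) + timedelta(milliseconds=remainder_ms)


poll_time = seconds_to_utc(last_update_seconds)
etag_time = etag_to_utc(last_etag)

print("last poll:", poll_time.isoformat())
print("last etag:", etag_time.isoformat())
```

The poll happened about eight hours after the timestamp carried by the etag, which matches the annotations in the report.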

The loaded engines are the production main engines, not the staging main engines, and the search config update does not happen no matter how long the wait.

[Expected Result:]

Because the dump contains only the production main collection, it is expected that production main is used initially. At the RS sync, however, the collection should be updated: the staging server set in user.js should be used to fetch the staging main collection, and the engines should be updated to match it.

[Note:]

With load_dump set to false, the issue is not reproducible, since the staging main engines are acquired at start-up.
Although this looks more like an RS issue than a search-config issue, these update problems might also affect updates in the production environment, which QA cannot use to test and confirm the issue.

[Regression Range:]

The issue doesn’t reproduce with Nightly 78.0a1 2020-05-11 (tried with two locales: es-ES and en-US), so it appears to be a regression.
Unfortunately, because the preconditions are rather difficult to use with mozregression, I don’t think trying to get a regression range would be reliable.

Managed to use mozregression and to find the pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=90fc07d2d16275ac74f4d55bb3aa2ee20887c79c&tochange=dec39daedd043c2d82d9e8ebd593a62b38947571

Mozregression points to bug 1636495, but the pushlog contains several SM bugs:

 Found commit message:
 Backed out 3 changesets (bug 1636495) for SM bustages at huge-01.binjs. CLOSED TREE

 Backed out changeset 5be0a4315674 (bug 1636495)
 Backed out changeset 7ac33283a786 (bug 1636495)
 Backed out changeset 4b98c08423c9 (bug 1636495)

Has Regression Range: --- → yes
Has STR: --- → yes

Dale, is the severity for this correct? The determined regression range looks suspect.

Flags: needinfo?(dharvey)

I'm looking at this next. I'm not totally sure it is an issue; I think it is one of timing.

Flags: needinfo?(dharvey) → needinfo?(standard8)

Mathieu and I think the test case isn't right here. We're going to look at it tomorrow; for now, putting this back into triage.

Severity: S2 → --
Flags: needinfo?(standard8) → needinfo?(mathieu)

Mathieu and I have been looking at this and doing some testing. stage/main was updated yesterday, and the previously failing builds are now passing.

The fundamental issue is that the dump data was newer than what the server was publishing. As a result when checking for any changes from the server, none would be found.

So test requirements should ensure that the staging server is publishing data newer than the local dump.
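
The timing problem can be illustrated with a simplified model. This is only a sketch: it assumes the client asks the server for records modified after its local timestamp. The Record type, the function names, and the timestamps are all hypothetical, and this is not the actual Remote Settings client code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    id: str
    last_modified: int  # hypothetical timestamp, in milliseconds


def collection_timestamp(records: List[Record]) -> int:
    """Timestamp of a collection, taken here as its newest record."""
    return max((r.last_modified for r in records), default=0)


def changes_since(server_records: List[Record], local_timestamp: int) -> List[Record]:
    """Records the server would report as changed after local_timestamp."""
    return [r for r in server_records if r.last_modified > local_timestamp]


# Hypothetical data, chosen only to illustrate the ordering problem.
local_dump = [Record("engine-a", 2000), Record("engine-b", 2500)]
stale_server = [Record("engine-a", 1000), Record("engine-b", 1500)]
fresh_server = stale_server + [Record("engine-c", 3000)]

local_ts = collection_timestamp(local_dump)

# Server data older than the dump: no changes are found, so nothing updates.
stale_changes = changes_since(stale_server, local_ts)

# Server data newer than the dump: the change is picked up.
fresh_changes = changes_since(fresh_server, local_ts)

print(len(stale_changes), len(fresh_changes))
```

In this model the stale server yields no changes even though its content differs from the dump, which is the situation the failing builds were in.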

Since bug 1640023 is now fixed, there is also an additional protection: after a restart, we will obtain the metadata and changes, find that they do not match the local dump, and then fetch fresh data.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(mathieu)
Resolution: --- → INVALID