Closed Bug 989045 Opened 11 years ago Closed 11 years ago

[tps] Registering of observer topics happens too late so 'weave:engine:start-tracking' can be missed during startup

Categories

(Testing Graveyard :: TPS, defect)

defect
Not set
major

Tracking

(firefox29 wontfix, firefox30 fixed, firefox31 fixed)

RESOLVED FIXED
mozilla31
Tracking Status
firefox29 --- wontfix
firefox30 --- fixed
firefox31 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Keywords: hang)

Attachments

(2 files)

Executing the add-on tests some are failing after a restart in waiting for 'weave:engine:start-tracking'. Not sure yet, if that is a problem with the tests or the tps framework. I will investigate. Here the hanging lines: CROSSWEAVE INFO: Waiting for weave:service:ready... CROSSWEAVE INFO: weave:service:ready observed! CROSSWEAVE INFO: Starting phase 3/8 CROSSWEAVE INFO: setting client.name to profile1 CROSSWEAVE INFO: ----------event observed: sessionstore-windows-restored CROSSWEAVE INFO: starting action: EnsureTracking CROSSWEAVE INFO: Waiting for weave:engine:start-tracking...
Actually we hang in those phases when EnsureTracking() is called as first action. I'm not exactly sure yet what's the exact purpose of it. Jonathan, any idea?
Summary: [tps] Some add-on tests are hanging after a browser restart → [tps] Some add-on tests hang in EnsureTracking()
Ok, this method has been implemented along with bug 712715 to ensure that sync is definitely active before starting the steps in the phase. So it might be a regression in Sync 1.5. I will do some more testing.
fxaccounts: CROSSWEAVE INFO: Firefox Accounts enabled: true 1395953994883 addons.manager WARN Exception calling callback CROSSWEAVE INFO: Waiting for weave:service:ready... CROSSWEAVE INFO: weave:service:ready observed! CROSSWEAVE INFO: Starting phase 5/16 CROSSWEAVE INFO: setting client.name to profile1 CROSSWEAVE INFO: ----------event observed: sessionstore-windows-restored CROSSWEAVE INFO: starting action: EnsureTracking CROSSWEAVE INFO: Waiting for weave:engine:start-tracking... old: CROSSWEAVE INFO: Firefox Accounts enabled: false CROSSWEAVE INFO: Waiting for weave:service:ready... 1395953825539 addons.manager WARN Exception calling callback CROSSWEAVE INFO: weave:service:ready observed! CROSSWEAVE INFO: Starting phase 5/16 CROSSWEAVE INFO: setting client.name to profile1 CROSSWEAVE INFO: ----------event observed: sessionstore-windows-restored CROSSWEAVE INFO: starting action: EnsureTracking CROSSWEAVE INFO: Setting client credentials and login. CROSSWEAVE INFO: Logging in user: crossweaveservices@mozilla.com CROSSWEAVE INFO: ----------event observed: weave:service:setup-complete CROSSWEAVE INFO: ----------event observed: weave:engine:start-tracking So with fxaccounts we do not login anymore for that phase given that the user is logged in, while for the old sync auth was logged out? Edwin mind checking test_addon_nonrestartless_xpi.js? Do you see something suspicious? It might be that Sync 1.1 was broken in a way that we needed this method to explicitely login and to wait for tracking. For fxaccounts I can remove this call and the test works fine. So I think we can remove this call for fxaccounts but not for the old sync authentication.
Flags: needinfo?(edwong)
Summary: [tps] Some add-on tests hang in EnsureTracking() → [tps] Tests calling EnsureTracking() as first step in a phase will hang with fxaccount authentication
I can look at this tomorrow.
Blocks: 981706
No longer depends on: 981706
For reference these are the affected tests: > test_addon_restartless_xpi.js > test_addon_nonrestartless_xpi.js
So from the other bug 712715: A new action "EnsureTracking" is available and utilized by the addons tests. This action ensures Sync is logged in and that tracking is active. This works around the original issue in this bug. It is required by the addons engine only because the addons reconciler waits for start-tracking before becoming active. Before, we had a race condition between Sync and TPS and the reconciler wasn't fully initialized before TPS starting performing actions. Since it wasn't initialized, performed actions were getting lost. These were discovered on the 2nd sync (after start-tracking was called for the first time) and the engine regained a proper view on the world. So, I'm confident there was no bug inside the addons engine. So as it looks like we are missing the 'weave:engine:start-tracking' observer notification sent by Sync when you are logged in via Firefox Accounts.
Depends on: 989282
> So with fxaccounts we do not login anymore for that phase given that the > user is logged in, while for the old sync auth was logged out? For Sync 1.5, we auth, then we get a token 1 time with ttl of ~12hrs. We don't fetch token again until there's a browser restart or token expiry. > Edwin mind checking test_addon_nonrestartless_xpi.js? Do you see something > suspicious? It might be that Sync 1.1 was broken in a way that we needed > this method to explicitely login and to wait for tracking. Without digging too deep, it's likely sync doesn't start till we have a sync token so tracking isn't used anymore? Just guessing. > > For fxaccounts I can remove this call and the test works fine. So I think we > can remove this call for fxaccounts but not for the old sync authentication. this sounds good
Flags: needinfo?(edwong)
(In reply to Edwin Wong [:edwong] from comment #7) > > Edwin mind checking test_addon_nonrestartless_xpi.js? Do you see something > > suspicious? It might be that Sync 1.1 was broken in a way that we needed > > this method to explicitely login and to wait for tracking. > > Without digging too deep, it's likely sync doesn't start till we have a sync > token so tracking isn't used anymore? Just guessing. Tracking is still used. Not sure why it's not send here. Also given the above comment the addons module waits for it to start the reconciler. I imagine it is a problem with Sync. Whom can we ask for more detailed information here? > > For fxaccounts I can remove this call and the test works fine. So I think we > > can remove this call for fxaccounts but not for the old sync authentication. > > this sounds good Meanwhile I'm not hesitated to do that right now. As said above I assume there might be a problem with Sync 1.5 here.
With enabled Trace mode I can see the following: CROSSWEAVE INFO: starting action: EnsureTracking 1396042757354 Sync.Tracker.Tabs TRACE onTab event: pageshow CROSSWEAVE INFO: Waiting for weave:engine:start-tracking... 1396042757790 Sync.RESTResponse TRACE Processing response headers. 1396042757794 Sync.BrowserIDManager DEBUG Getting a token 1396042759032 Sync.RESTResponse TRACE Processing response headers. 1396042759033 Sync.RESTResponse TRACE Processing response headers. 1396042759033 Sync.BrowserIDManager DEBUG Successfully got a sync token 1396042759035 Sync.BulkKeyBundle INFO BulkKeyBundle being created for undefined 1396042759036 Sync.BrowserIDManager INFO Background fetch for key bundle done 1396042759036 Sync.Status DEBUG Status.login: success.login => success.login 1396042759036 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1396042759748 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok 1396042766513 Sync.Status DEBUG Status.service: success.status_ok => success.status_ok So the sync status is set to ok, but we actually never fire start-tracking. Brian do you have an idea or is that part outside of your expertise?
Flags: needinfo?(warner-bugzilla)
Blocks: 989249
Attached file tps.log (trace) (deleted) —
Trace log for the test_addon_nonrestartless_xpi.js test. I will dig through all that by tomorrow if I find the time.
I don't know how Sync's internals work.. my knowledge ends at the point where we fire the onlogin event and offer the keys+assertion to Sync. Sorry :).
Flags: needinfo?(warner-bugzilla)
So 'weave:engine:start-tracking' is currently getting fired at two places: 1. Very early during the start-up phase in onStartup(): http://mxr.mozilla.org/mozilla-central/source/services/sync/modules/service.js#356 Given that with FxAccounts we are logged in, we would have to get into this code and the notification fired. It might be that we are subscribing too late for that notification in TPS. 2. By the 'weave:service:setup-complete' observer: http://mxr.mozilla.org/mozilla-central/source/services/sync/modules/service.js#469 Given that we run through those steps only once after the fxaccount login, it will never be send again. We usually see this after login, so this should not be the code we are looking for. I will do some debugging in that module and check why we don't see this notification.
Ok, so I was right with my assumption that TPS is registering for those observer notifications too late. Moving the registering to the new _init() method makes it passing our tests, and no more hang occurs. I will do some more tests and also check with the old sync authentication before uploading the patch.
Component: Server: Sync → TPS
Product: Mozilla Services → Testing
Summary: [tps] Tests calling EnsureTracking() as first step in a phase will hang with fxaccount authentication → [tps] Registering of observer topics happens too late so 'weave:engine:start-tracking' can be missed during startup
Attachment #8399561 - Flags: review?(jgriffin)
Keywords: hang
Comment on attachment 8399561 [details] [diff] [review] Early registering of observer topics v1 Review of attachment 8399561 [details] [diff] [review]: ----------------------------------------------------------------- Good detective work!
Attachment #8399561 - Flags: review?(jgriffin) → review+
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment on attachment 8399561 [details] [diff] [review] Early registering of observer topics v1 [Approval Request Comment] Bug caused by (feature/regressing bug #): None User impact if declined: None, just testing code Testing completed (on m-c, etc.): m-c Risk to taking this patch (and alternatives if risky): tps tests can hang String or IDL/UUID changes made by this patch: None
Attachment #8399561 - Flags: approval-mozilla-aurora?
The following changeset is now in Firefox Nightly: > ecda58bb90a7 Bug 989045 - [tps] Registering of observer topics happens too late so 'weave:engine:start-tracking' can be missed during startup. r=jgriffin DONTBUILD Nightly Build Information: ID: 20140402030201 Changeset: 4941a2ac0786109b08856738019b016a6c5a66a6 Version: 31.0a1 TBPL: https://tbpl.mozilla.org/?rev=4941a2ac0786 URL: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central Download Links: > Linux x86: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central/firefox-31.0a1.en-US.linux-i686.tar.bz2 > Linux x86_64: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central/firefox-31.0a1.en-US.linux-x86_64.tar.bz2 > Linux x86_64 ASAN: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central/firefox-31.0a1.en-US.linux-x86_64-asan.tar.bz2 > Mac: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central/firefox-31.0a1.en-US.mac.dmg > Win32: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central/firefox-31.0a1.en-US.win32.installer.exe > Win64: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-02-03-02-01-mozilla-central/firefox-31.0a1.en-US.win64-x86_64.installer.exe Previous Nightly Build Information: ID: 20140401030203 Changeset: 1417d180a1d8665b1a91b897d1cc4cc31e7980d4 Version: 31.0a1 TBPL: https://tbpl.mozilla.org/?rev=1417d180a1d8 URL: https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014/04/2014-04-01-03-02-03-mozilla-central
Comment on attachment 8399561 [details] [diff] [review] Early registering of observer topics v1 Given the low risk here and the affected on Beta, please also uplift there if it can land cleanly otherwise put up a new branch-specific patch for approval.
Attachment #8399561 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: