Closed Bug 1017443 Opened 11 years ago Closed 9 years ago

Sync servers configured for hard-eol responses cause Fx to say there is a "server error" rather than displaying the EOL messaging.

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: markh, Unassigned)

Details

(Whiteboard: [qa+])

See bug 1014411 comment 6: Ryan set up a "hard-eol" server which returns a 513 on every request. Sadly, this causes the login process to fail (because info/collections sees a 513), and even though the EOL handling code does see the message etc., that login error effectively "trumps" the EOL messaging. The end result is an infobar saying there is a server error, and the EOL infobar isn't seen.
(A quick experiment with having info/collections return an empty JSON blob means the next failure has to do with meta/global getting a 513, and an empty JSON blob for that response will cause the client to get upset in various creative ways.)
This could probably be fixed client-side, but that still leaves earlier Fx versions screwed. rnewman, any ideas/thoughts?
Log below (with timestamps removed):
Sync.Service INFO Logging in user vusnj3h2crfvpzuam7mbtmdyyrx5rm5n
Sync.Service DEBUG Caching URLs under storage user base: https://sync-hard-eol.stage.mozaws.net/1.1/vusnj3h2crfvpzuam7mbtmdyyrx5rm5n/
Sync.Resource DEBUG mesg: GET fail 513 https://sync-hard-eol.stage.mozaws.net/1.1/vusnj3h2crfvpzuam7mbtmdyyrx5rm5n/info/collections
Sync.Resource DEBUG GET fail 513 https://sync-hard-eol.stage.mozaws.net/1.1/vusnj3h2crfvpzuam7mbtmdyyrx5rm5n/info/collections
Sync.Status DEBUG Status.login: success.login => error.login.reason.server
Sync.Status DEBUG Status.service: success.status_ok => error.login.failed
Sync.ErrorHandler ERROR X-Weave-Alert: hard-eol: SYNC HAS SUNK
Sync.Service TRACE Event: weave:service:login:error
Sync.SyncScheduler TRACE Handling weave:service:login:error
Sync.SyncScheduler DEBUG Clearing sync triggers and the global score.
Sync.SyncScheduler TRACE _checkSync returned "".
Sync.SyncScheduler DEBUG Next sync in 86400000 ms.
Sync.ErrorHandler TRACE Handling weave:service:login:error
Sync.ErrorHandler DEBUG Flushing file log.
Sync.ErrorHandler TRACE Beginning stream copy to error-1401345291948.txt: 1401345291949
Sync.Service DEBUG Exception: Login failed: error.login.reason.server
No traceback available
Sync.Service DEBUG Not syncing: login returned false.
Sync.ErrorHandler TRACE Notifying weave:ui:login:error. Status.login is error.login.reason.server. Status.sync is success.sync
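To illustrate the "trumping" described in this comment, here is a minimal sketch of the ordering a client-side fix might use: check for the hard-eol alert first, and only fall back to the generic login error when no such alert was seen. This is not the Firefox Sync code (which is JavaScript); the function name `pick_infobar` and its return values are hypothetical, and only the status string and the `hard-eol` code are taken from the log above.

```python
from typing import Optional

# Alert code as it appears in the log line "X-Weave-Alert: hard-eol: ...".
HARD_EOL = "hard-eol"


def pick_infobar(login_status: str, alert_code: Optional[str]) -> str:
    """Decide which infobar to show (hypothetical helper, not Sync's API).

    The EOL alert is checked before the login status, so a failed login
    can no longer hide the EOL message.
    """
    if alert_code == HARD_EOL:
        return "eol"
    if login_status.startswith("error.login"):
        return "server-error"
    return "none"


if __name__ == "__main__":
    # The situation from the log: login failed and a hard-eol alert was seen.
    print(pick_infobar("error.login.reason.server", "hard-eol"))
```

With the checks in the opposite order, the case from the log would yield the server-error infobar, which matches the behaviour reported here.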
Whiteboard: [qa+]
The server-side solution to this appears to be to send hard-eol only on some requests and not on others. We can do that, but it could mean leaving actual functioning (albeit read-only) sync nodes in place, which sounds expensive. Ideally, we'd like to redirect all sync-1.1 traffic to a simple static "EOLinator" service that just returns the appropriate error responses. So: can we return some fake static data in /info/collections, /storage/meta/global and so on that will help coerce the clients into the correct state? (I'd like to fix it server-side if it's simple enough, so that FF28 holdouts get good messaging.)
Flags: needinfo?(rnewman)
After some experimentation, the real trick here seems to be /crypto/keys. I can serve a fake /meta/global to the client, but as soon as it detects something is not quite right, it heads into its "wipe the server" routine and tries to upload a new set of keys. It won't proceed with the actual sync until it can verify that the keys are correctly uploaded, which precludes serving static data for this item. I successfully activated the "Your Firefox Sync service is no longer available" message by accepting writes to /meta/global and /crypto/keys, but having all the other URLs return the 513 error.
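The routing described in this comment could be sketched roughly as follows: writes to meta/global and crypto/keys are accepted and echoed back on later reads, and everything else gets a 513 with an EOL alert. This is a sketch under assumptions, not the service that was actually deployed: the `EolServer` class, the in-memory store, the 404 for reads before any write, the timestamp body returned on PUT, and the JSON shape of the `X-Weave-Alert` value are all assumptions made for illustration.

```python
import json
import time
from typing import Dict, Tuple

# Paths that accept writes; everything else is answered with 513.
WRITABLE_SUFFIXES = ("/storage/meta/global", "/storage/crypto/keys")

# Assumed alert payload; the code and message are taken from the log above.
EOL_ALERT = json.dumps({"code": "hard-eol", "message": "SYNC HAS SUNK"})

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


class EolServer:
    """Hypothetical in-memory model of the 'accept two writes, 513 the rest' idea."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def handle(self, method: str, path: str, body: str = "") -> Response:
        if path.endswith(WRITABLE_SUFFIXES):
            if method == "PUT":
                # Keep the upload so the client can verify it on a later GET.
                self._store[path] = body
                return 200, {}, json.dumps(round(time.time(), 2))
            if method == "GET":
                if path in self._store:
                    headers = {"Content-Type": "application/json"}
                    return 200, headers, self._store[path]
                # Assumption: nothing uploaded yet looks like an empty server.
                return 404, {}, ""
        # Any other URL or method: hard EOL.
        return 513, {"X-Weave-Alert": EOL_ALERT}, ""


if __name__ == "__main__":
    server = EolServer()
    base = "/1.1/someuser"
    print(server.handle("PUT", base + "/storage/crypto/keys", '{"payload": "x"}')[0])
    print(server.handle("GET", base + "/info/collections")[0])
```

The point of keeping the two records writable is the one made above: the client will not move on until it can read back the keys it just uploaded, so static data is not enough for those paths.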
Flags: needinfo?(rnewman)
So it sounds like services are able to help us work around this client problem. Fixing the problem on the client sounds worthwhile at face value, but in practice, we can't fix it for all affected versions - so having, say, Fx 32 and up handle this correctly when none of the earlier versions do doesn't solve anything worthwhile. Should we close this (or at least have it not block bug 1014406)?
This doesn't block the migration; instead, it blocks our strategy for decommissioning after the migration has been in place for a while.
Blocks: 1008066
No longer blocks: migratesync
Given how far we are into the migration, I'm betting this is WONTFIX
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox