805386 - history failed: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS

Reporter

Description

•

12 years ago

Attached file last log (deleted) — Details

I started a clean profile, and registered firefox with my sync account, and get the message saying that sync encountered an error ("in french: Sync a rencontré une erreur lors de la synchronisation : Erreur inconnue. Sync essaiera à nouveau cette action automatiquement") This errors occur each time a new sync occurs. in the logs, i see some NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS. Note, I have tested several times with a new clean profile, and it occurs every time.

CircleCode

Reporter

Comment 1

•

12 years ago

Attached file first log (deleted) — Details

in case you need intermediate logs, I can attach them too

CircleCode

Reporter

Updated

•

12 years ago

Component: General → Firefox Sync: Backend

Product: Firefox → Mozilla Services

Version: 18 Branch → unspecified

CircleCode

Reporter

Comment 2

•

12 years ago

By the way, I'm using aurora (18)

Richard Newman [:rnewman]

Comment 3

•

12 years ago

See also Bug 725785, Bug 798433, Bug 789109.

Status: UNCONFIRMED → NEW

Ever confirmed: true

Gregory Szorc [:gps]

Comment 4

•

12 years ago

Oh lovely - another history engine error. I'm a bit concerned by the number of people that have been reporting this in Firefox 18... What's interesting about this report is that it is reproducible on a fresh profile. I don't believe we've had such luck with the similar bugs. So, the Sync data should contain everything necessary to reproduce. The easiest way for us to look into this would be to get our hands on the actual Sync records. But, that would involve you turning over your private data to us. I'm not comfortable asking for that. rnewman: what do you think about writing up a quick script to mass download history records, decrypt, scramble the URIs, and rewrite to a file that can be uploaded to this bug? We could then inject those records in a fresh profile and try to repro.

Richard Newman [:rnewman]

Comment 5

•

12 years ago

(In reply to Gregory Szorc [:gps] from comment #4) > rnewman: what do you think about writing up a quick script to mass download > history records, decrypt, scramble the URIs, and rewrite to a file that can > be uploaded to this bug? We could then inject those records in a fresh > profile and try to repro. I'm going to try to repro with my own Sync account first. That might be quicker :)

Richard Newman [:rnewman]

Comment 6

•

12 years ago

Just synced 8,000 history records down to a new Aurora profile. No errors at all. The chain of failures at the end of the first log is weird.

Richard Newman [:rnewman]

Comment 7

•

12 years ago

CircleCode: as you can repro, it would be very helpful if you could do the following. 1. Create a fresh profile as normal. 2. Set preferences in about:config as follows: services.sync.log.appender.file.logOnSuccess = true services.sync.log.logger.engine.history = Trace [ capital letter is important ] services.sync.log.logger.engine.tabs = Trace services.sync.log.logger.network.resources = Trace services.sync.log.logger.status = Trace 3. Relaunch Aurora. 4. Set up Sync, wait for the sync to fail. 5. Open about:sync-log, find all of the error logs. 6. Zip them up and send them to me at my email address (visible in Bugzilla). I encourage you to sanitize or trim down these logs as much as you wish, because they will contain your history URLs, but please preserve the originals in case the sanitizing removes something important. I'm particularly looking for the period immediately before the first error. Any help you can provide would be much appreciated.

Flags: needinfo?(codronm+circlecode)

Richard Newman [:rnewman]

Updated

•

12 years ago

Summary: sync: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS → history failed: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS

Richard Newman [:rnewman]

Updated

•

12 years ago

Depends on: 805693

Richard Newman [:rnewman]

Comment 8

•

12 years ago

The failure is in our nsIStreamListener/nsIRequestObserver's onStopRequest. We process all of the incoming items successfully, then hit our completion handler, which logs: Sync.Engine.History TRACE No duplicate found for incoming item: RRRRRRRRRRRR Sync.Engine.History TRACE Applying incoming because local item does not exist and was not deleted. Sync.Collection TRACE Channel for GET https://phx-syncXXX.services.mozilla.com/1.1/username/storage/history?full=1: isSuccessCode(2153185313)? false isSuccessCode is being passed the channel status, which is 2153185313. That's 0x80570021, which <http://james-ross.co.uk/mozilla/misc/nserror> tells me is XPCONNECT/Failure/33. That value gets turned into an exception and passed to onComplete, which causes the sync to report as failed. The next sync fails in exactly the same way, but all of the logging indicates that items were successfully applied in the last sync, so we're just re-applying the same: Sync.Engine.History TRACE Reconciling RRRRRRRRRRRR. exists=true; modified=true; local age=XXX0000000.000; incoming age=XXX.0000000000000 Sync.Engine.History TRACE Ignoring incoming item because the local item is identical. Sync.Engine.History TRACE Skipping reconciled incoming item RRRRRRRRRRRR Sync.Collection TRACE Channel for GET https://phx-syncXXX.services.mozilla.com/1.1/username/storage/history?full=1: isSuccessCode(2153185313)? false Sync.Collection TRACE In _onComplete. Error is [Exception... "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS" nsresult: "0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS)" location: "JS frame :: resource://gre/modules/services-sync/resource.js :: Channel_onStopRequest :: line 502" data: no]. So Sync is working, right up until it comes to look at the channel, which says it failed horribly. The question is: why, in some OS/time/phase of the moon situations, our channel callback gets called with a serious failure status? And how do we stop it? :D Network folks, any idea? (CCing the experts in this section per hg blame…)

Severity: normal → major

Flags: needinfo?(codronm+circlecode) → needinfo?(cbiesinger)

Christian :Biesinger (don't email me, ping me on IRC)

Comment 9

•

12 years ago

Well, that means that one of your callbacks threw an exception. When onStartRequest or onDataAvailable throw an exception, necko cancels the channel with the status from the exception. So, this is coming from your code, somewhere...

Flags: needinfo?(cbiesinger)

Richard Newman [:rnewman]

Comment 10

•

12 years ago

I kinda nixed that avenue of attack, because we've been through this before -- we already wrap our inner onProgress handler to avoid exactly this situation: onDataAvailable: ... try { this._onProgress(); } catch (ex) { this._log.warn("Got exception calling onProgress handler during fetch of " + req.URI.spec); this._log.debug(Utils.exceptionStr(ex)); this._log.trace("Rethrowing; expect a failure code from the HTTP channel."); throw ex; } ... and we do not see those log items in our log. The only statements that are not wrapped in logging try..catches are: onDataAvailable: let siStream = Cc["@mozilla.org/scriptableinputstream;1"]. createInstance(Ci.nsIScriptableInputStream); siStream.init(stream); ... Utils.namedTimer(this.abortRequest, this._timeout, this, "abortTimer"); The timer code also runs in the constructor, so I see no reason why it would throw here. The entirety of onStartRequest, apart from one log statement, one constant assignment, and the same timer touch, is wrapped in logging try..catches. So the _only_ code that looks like it could throw an exception is that stream code. Is there somewhere else I should be looking, or does that stream code look dodgy?

Honza Bambas (:mayhemer)

Comment 11

•

12 years ago

CircleCode, could you please set network logging as described at https://developer.mozilla.org/en-US/docs/HTTP_Logging and submit the log here or send it to my bugzilla email address if you are concerned about any info leak? It might give a clue, thanks.

Christian :Biesinger (don't email me, ping me on IRC)

Comment 12

•

12 years ago

That paste looks good, afaict... I guess all I can suggest is wrapping the entirety of onStartRequest and onDataAvailable in a try..catch and logging that. One other idea: Perhaps you are setting an nsIChannelEventSink and the exception comes from onChannelRedirect...?

Richard Newman [:rnewman]

Comment 13

•

12 years ago

We do have a redirect handler: --- asyncOnChannelRedirect: function asyncOnChannelRedirect(oldChannel, newChannel, flags, callback) { this._log.debug("Channel redirect: " + oldChannel.URI.spec + ", " + newChannel.URI.spec + ", " + flags); // For internal redirects, copy the headers that our caller set. try { if ((flags & Ci.nsIChannelEventSink.REDIRECT_INTERNAL) && newChannel.URI.equals(oldChannel.URI)) { this._log.trace("Copying headers for safe internal redirect."); for (let header of this._headersToCopy) { let value = oldChannel.getRequestHeader(header); if (value) { newChannel.setRequestHeader(header, value); } } } } catch (ex) { this._log.error("Error copying headers: " + CommonUtils.exceptionStr(ex)); } // We let all redirects proceed. callback.onRedirectVerifyCallback(Cr.NS_OK); } --- but that looks like it'll only throw if either the old or new channel doesn't have a URI, right?

Honza Bambas (:mayhemer)

Comment 14

•

12 years ago

I've got a privately disclosed log from CircleCode where I can see that OnDataAvailable handler may throw this error. I can see that request https://phx-sync054.services.mozilla.com/1.1/<STRIPPED>/storage/history?full=1 fails (stops processing) after we processed some 786432 of 1799681 byes.

Richard Newman [:rnewman]

Comment 15

•

12 years ago

(In reply to Honza Bambas (:mayhemer) from comment #14) > I've got a privately disclosed log from CircleCode where I can see that > OnDataAvailable handler may throw this error. Do you mean OnDataAvailable, or our own implementation of the handler for that? See Comment 10 for why the latter would surprise me.

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 16

•

12 years ago

I can see in the log that OnDataAvailable handler of the channel (which ever implementation it is) is highly probable to throw the exception. I cannot say more at the moment, until I have an environment to reproduce.

Flags: needinfo?(honzab.moz) needinfo?(honzab.moz) → needinfo+

Richard Newman [:rnewman]

Updated

•

12 years ago

Depends on: 808750

Richard Newman [:rnewman]

Comment 17

•

12 years ago

Bug 808750 adds some ridiculously paranoid logging and catches — if we throw even creating the stream or logging, we'll get a log.

Nobody; OK to take it and work on it

Assignee

Updated

•

6 years ago

Component: Firefox Sync: Backend → Sync

Product: Cloud Services → Firefox

Hani Yacoub, Desktop QA

Comment 18

•

3 years ago

I couldn't manage to reproduce this issue. Since the bug was logged 9 years ago, most likely isn't reproducible anymore for the reporter also.
Closing this bug as resolved: Worksforme

Status: NEW → RESOLVED

Closed: 3 years ago

Resolution: --- → WORKSFORME

Jira Integration Bot

Updated

•

3 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/SYNC-2859

last log 12 years ago CircleCode (deleted), text/plain		Details
first log 12 years ago CircleCode (deleted), text/plain		Details