Closed Bug 805386 Opened 12 years ago Closed 3 years ago

history failed: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS

Categories

(Firefox :: Sync, defect)

x86
Linux
defect
Not set
major

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: m-firefox, Unassigned)

References

Details

Attachments

(2 files)

Attached file last log (deleted) —
I started a clean profile, and registered firefox with my sync account, and get the message saying that sync encountered an error ("in french: Sync a rencontré une erreur lors de la synchronisation : Erreur inconnue. Sync essaiera à nouveau cette action automatiquement") This errors occur each time a new sync occurs. in the logs, i see some NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS. Note, I have tested several times with a new clean profile, and it occurs every time.
Attached file first log (deleted) —
in case you need intermediate logs, I can attach them too
Component: General → Firefox Sync: Backend
Product: Firefox → Mozilla Services
Version: 18 Branch → unspecified
By the way, I'm using aurora (18)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Oh lovely - another history engine error. I'm a bit concerned by the number of people that have been reporting this in Firefox 18... What's interesting about this report is that it is reproducible on a fresh profile. I don't believe we've had such luck with the similar bugs. So, the Sync data should contain everything necessary to reproduce. The easiest way for us to look into this would be to get our hands on the actual Sync records. But, that would involve you turning over your private data to us. I'm not comfortable asking for that. rnewman: what do you think about writing up a quick script to mass download history records, decrypt, scramble the URIs, and rewrite to a file that can be uploaded to this bug? We could then inject those records in a fresh profile and try to repro.
(In reply to Gregory Szorc [:gps] from comment #4) > rnewman: what do you think about writing up a quick script to mass download > history records, decrypt, scramble the URIs, and rewrite to a file that can > be uploaded to this bug? We could then inject those records in a fresh > profile and try to repro. I'm going to try to repro with my own Sync account first. That might be quicker :)
Just synced 8,000 history records down to a new Aurora profile. No errors at all. The chain of failures at the end of the first log is weird.
CircleCode: as you can repro, it would be very helpful if you could do the following. 1. Create a fresh profile as normal. 2. Set preferences in about:config as follows: services.sync.log.appender.file.logOnSuccess = true services.sync.log.logger.engine.history = Trace [ capital letter is important ] services.sync.log.logger.engine.tabs = Trace services.sync.log.logger.network.resources = Trace services.sync.log.logger.status = Trace 3. Relaunch Aurora. 4. Set up Sync, wait for the sync to fail. 5. Open about:sync-log, find all of the error logs. 6. Zip them up and send them to me at my email address (visible in Bugzilla). I encourage you to sanitize or trim down these logs as much as you wish, because they will contain your history URLs, but please preserve the originals in case the sanitizing removes something important. I'm particularly looking for the period immediately before the first error. Any help you can provide would be much appreciated.
Flags: needinfo?(codronm+circlecode)
Summary: sync: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS → history failed: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS
Depends on: 805693
The failure is in our nsIStreamListener/nsIRequestObserver's onStopRequest. We process all of the incoming items successfully, then hit our completion handler, which logs: Sync.Engine.History TRACE No duplicate found for incoming item: RRRRRRRRRRRR Sync.Engine.History TRACE Applying incoming because local item does not exist and was not deleted. Sync.Collection TRACE Channel for GET https://phx-syncXXX.services.mozilla.com/1.1/username/storage/history?full=1: isSuccessCode(2153185313)? false isSuccessCode is being passed the channel status, which is 2153185313. That's 0x80570021, which <http://james-ross.co.uk/mozilla/misc/nserror> tells me is XPCONNECT/Failure/33. That value gets turned into an exception and passed to onComplete, which causes the sync to report as failed. The next sync fails in exactly the same way, but all of the logging indicates that items were successfully applied in the last sync, so we're just re-applying the same: Sync.Engine.History TRACE Reconciling RRRRRRRRRRRR. exists=true; modified=true; local age=XXX0000000.000; incoming age=XXX.0000000000000 Sync.Engine.History TRACE Ignoring incoming item because the local item is identical. Sync.Engine.History TRACE Skipping reconciled incoming item RRRRRRRRRRRR Sync.Collection TRACE Channel for GET https://phx-syncXXX.services.mozilla.com/1.1/username/storage/history?full=1: isSuccessCode(2153185313)? false Sync.Collection TRACE In _onComplete. Error is [Exception... "NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS" nsresult: "0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS)" location: "JS frame :: resource://gre/modules/services-sync/resource.js :: Channel_onStopRequest :: line 502" data: no]. So Sync is working, right up until it comes to look at the channel, which says it failed horribly. The question is: why, in some OS/time/phase of the moon situations, our channel callback gets called with a serious failure status? And how do we stop it? :D Network folks, any idea? (CCing the experts in this section per hg blame…)
Severity: normal → major
Flags: needinfo?(codronm+circlecode) → needinfo?(cbiesinger)
Well, that means that one of your callbacks threw an exception. When onStartRequest or onDataAvailable throw an exception, necko cancels the channel with the status from the exception. So, this is coming from your code, somewhere...
Flags: needinfo?(cbiesinger)
I kinda nixed that avenue of attack, because we've been through this before -- we already wrap our inner onProgress handler to avoid exactly this situation: onDataAvailable: ... try { this._onProgress(); } catch (ex) { this._log.warn("Got exception calling onProgress handler during fetch of " + req.URI.spec); this._log.debug(Utils.exceptionStr(ex)); this._log.trace("Rethrowing; expect a failure code from the HTTP channel."); throw ex; } ... and we do not see those log items in our log. The only statements that are not wrapped in logging try..catches are: onDataAvailable: let siStream = Cc["@mozilla.org/scriptableinputstream;1"]. createInstance(Ci.nsIScriptableInputStream); siStream.init(stream); ... Utils.namedTimer(this.abortRequest, this._timeout, this, "abortTimer"); The timer code also runs in the constructor, so I see no reason why it would throw here. The entirety of onStartRequest, apart from one log statement, one constant assignment, and the same timer touch, is wrapped in logging try..catches. So the _only_ code that looks like it could throw an exception is that stream code. Is there somewhere else I should be looking, or does that stream code look dodgy?
CircleCode, could you please set network logging as described at https://developer.mozilla.org/en-US/docs/HTTP_Logging and submit the log here or send it to my bugzilla email address if you are concerned about any info leak? It might give a clue, thanks.
That paste looks good, afaict... I guess all I can suggest is wrapping the entirety of onStartRequest and onDataAvailable in a try..catch and logging that. One other idea: Perhaps you are setting an nsIChannelEventSink and the exception comes from onChannelRedirect...?
We do have a redirect handler: --- asyncOnChannelRedirect: function asyncOnChannelRedirect(oldChannel, newChannel, flags, callback) { this._log.debug("Channel redirect: " + oldChannel.URI.spec + ", " + newChannel.URI.spec + ", " + flags); // For internal redirects, copy the headers that our caller set. try { if ((flags & Ci.nsIChannelEventSink.REDIRECT_INTERNAL) && newChannel.URI.equals(oldChannel.URI)) { this._log.trace("Copying headers for safe internal redirect."); for (let header of this._headersToCopy) { let value = oldChannel.getRequestHeader(header); if (value) { newChannel.setRequestHeader(header, value); } } } } catch (ex) { this._log.error("Error copying headers: " + CommonUtils.exceptionStr(ex)); } // We let all redirects proceed. callback.onRedirectVerifyCallback(Cr.NS_OK); } --- but that looks like it'll only throw if either the old or new channel doesn't have a URI, right?
I've got a privately disclosed log from CircleCode where I can see that OnDataAvailable handler may throw this error. I can see that request https://phx-sync054.services.mozilla.com/1.1/<STRIPPED>/storage/history?full=1 fails (stops processing) after we processed some 786432 of 1799681 byes.
(In reply to Honza Bambas (:mayhemer) from comment #14) > I've got a privately disclosed log from CircleCode where I can see that > OnDataAvailable handler may throw this error. Do you mean OnDataAvailable, or our own implementation of the handler for that? See Comment 10 for why the latter would surprise me.
Flags: needinfo?(honzab.moz)
I can see in the log that OnDataAvailable handler of the channel (which ever implementation it is) is highly probable to throw the exception. I cannot say more at the moment, until I have an environment to reproduce.
Flags: needinfo?(honzab.moz) needinfo?(honzab.moz) → needinfo+
Depends on: 808750
Bug 808750 adds some ridiculously paranoid logging and catches — if we throw even creating the stream or logging, we'll get a log.
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox

I couldn't manage to reproduce this issue. Since the bug was logged 9 years ago, most likely isn't reproducible anymore for the reporter also.
Closing this bug as resolved: Worksforme

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: