566049 - Intermittent failure in test_crashing2.html (window.onerror An error occurred - uncaught exception NS_ERROR_FAILURE)

Reporter

Description

•

15 years ago

This started failing today it seems: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273865513.1273865980.11501.gz Linux mozilla-central opt test mochitests-5/5, Started 15:31, finished 15:41, took 10mins http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273870984.1273872286.8663.gz WINNT 5.2 mozilla-central debug test mochitests-5/5, Started 17:03, finished 17:25, took 23mins http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273871540.1273872068.7785.gz WINNT 5.2 mozilla-central opt test mochitests-5/5, Started 17:12, finished 17:23, took 12mins http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273872842.1273873606.14988.gz Linux mozilla-central opt test mochitests-5/5, Started 17:34, finished 17:47, took 14mins http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273872773.1273873900.16213.gz WINNT 5.2 mozilla-central debug test mochitests-5/5, Started 17:32, finished 17:52, took 20mins

Dave Townsend [:mossop]

Reporter

Updated

•

15 years ago

Keywords: regression

Comment hidden (Legacy TBPL/Treeherder Robot)

David Baron :dbaron:

Comment 6

•

15 years ago

There were a bunch of these starred as bug 541707 before this was filed. The earliest push I see in which this happened was: http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=f7560e802592 However, I didn't look at any pushes more than one or two prior to that because tinderboxpushlog reports "Parsing tinderbox failed" I think the most likely cause is: http://hg.mozilla.org/mozilla-central/rev/6a9d0521a319 which was two pushes prior to that one.

Comment hidden (Legacy TBPL/Treeherder Robot)

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 8

•

15 years ago

Josh: can you look at this? Thanks!

Comment hidden (Legacy TBPL/Treeherder Robot)

Mike Beltzner [:beltzner, not reading bugmail]

Comment 11

•

15 years ago

This test was last modified on May 15th by bug 561596, Dave Yeo - can you please take a look?

Assignee: nobody → daveryeo

Reed Loden [:reed]

Comment 12

•

15 years ago

(In reply to comment #11) > This test was last modified on May 15th by bug 561596, Dave Yeo - can you > please take a look? http://hg.mozilla.org/mozilla-central/rev/6f0fe1119d4b doesn't modify any test. This bug was filed on the 14th, but the patch in bug 561596 didn't land until the 15th.

Mike Beltzner [:beltzner, not reading bugmail]

Comment 13

•

15 years ago

I took my information from: http://hg.mozilla.org/mozilla-central/annotate/6f0fe1119d4b/modules/plugin/test/mochitest/test_crashing2.html apologies if I misinterpreted, but that seems to indicate the bug I mentioned.

Mike Beltzner [:beltzner, not reading bugmail]

Comment 14

•

15 years ago

(That link is what "HG Blame" links to for the file, fwiw)

David Baron :dbaron:

Comment 15

•

15 years ago

(In reply to comment #13) > I took my information from: > > http://hg.mozilla.org/mozilla-central/annotate/6f0fe1119d4b/modules/plugin/test/mochitest/test_crashing2.html > > apologies if I misinterpreted, but that seems to indicate the bug I mentioned. That was just telling you what was the tipmost revision at the time you looked (independent of whether it touched that file). (I'm not saying the UI makes sense, I'm just saying that's what the information you were looking at was trying to tell you.)

David Baron :dbaron:

Comment 16

•

15 years ago

The test hasn't changed since January: http://hg.mozilla.org/mozilla-central/log/6f0fe1119d4b/modules/plugin/test/mochitest/test_crashing2.html I think this is most likely related to josh's removal of the default plugin. Josh, could you look into this?

Assignee: daveryeo → joshmoz

Comment hidden (Legacy TBPL/Treeherder Robot)

Josh Aas

Comment 20

•

15 years ago

I'm looking into it.

Josh Aas

Comment 21

•

15 years ago

I wasn't able to reproduce but this code in the test appears to be the problem: // The plugin will now crash when it is instantiated, but // that happens asynchronously. HACK: this should use an // event instead of nested pending runnables, but I don't // know of any DOM event that's fired when a plugin is // instantiated. SimpleTest.executeSoon(function() { SimpleTest.executeSoon(function() { ok(p.setColor === undefined, "Plugin should have crashed on creation"); window.frameLoaded = reloaded1; iframe.contentWindow.location.replace('crashing_subpage.html'); }) }); There is a race here, and I suspect the test fails when the race results in "p.setColor" throwing an exception rather than simply an undefined value. Prior to my patch both possible race conditions probably resulted in the undefined value. The difference is probably the fact that "nsObjectLoadingContent::GetTypeOfContent" can now return 'eType_Null' for a crashed plugin, instead of eType_Plugin. The "ok(p.setColor === undefined..." test appears to simply be verifying that the plugin crashed. If that is the case, it probably needs to be a better test. We could simply consider an exception to be an acceptable alternative to the undefined value, either one indicating a crashed plugin. It shouldn't be too hard to fix this problem one way or another. If we can't decide on a solution quickly we should just disable it and file a bug.

Comment hidden (Legacy TBPL/Treeherder Robot)

Benjamin Smedberg

Comment 23

•

15 years ago

.setColor should definitely be undefined eventually: the prototype object should be unhooked when the plugin crashes, and no matter whether it's eType_Plugin or eType_Null there's nothing that should throw an exception by the time we're in a stable state. What exactly is racing, in your scenario? I really believe that the nested .executeSoon, although ugly, should be deterministic.

Josh Aas

Comment 24

•

15 years ago

Attached file manual test 1.0 (deleted) — Details

I think this HTML test case shows what is going wrong on the tinderbox, but I can't be sure because I can't reproduce with the actual test. First it prints whether or not 'setFocus' is equal to 'undefined' for two plugins, the test plugin and an unknown plugin. It is always false for the test plugin, true for the unknown plugin. The it crashes the test plugin and immediately tests whether 'setFocus' is equal to 'undefined' again. This always throws an exception. WARNING: Failed to send message!: file /Volumes/MozillaBuilds/10_6/ff_193_debug_32/dom/plugins/PluginScriptableObjectParent.cpp, line 219 JavaScript error: file:///Users/josh/Desktop/test.html, line 9: Error calling method on NPObject! After that, you can run the test again by clicking the button on the page. When clicking the button there is no exception, the comparison comes out true every time. This difference has to do with timing or scope, I don't know which one. Strangely, if you navigate away from this test HTML after clicking the button a few times, the browser will crash after about 10 seconds, usually with a bad memory access. This seems like a different bug.

Josh Aas

Updated

•

15 years ago

Attachment #445748 - Attachment is patch: false

Attachment #445748 - Attachment mime type: text/plain → text/html

Josh Aas

Comment 25

•

15 years ago

My manual test case shows the same disparity (exception vs. undefined) in builds from before the default plugin patch landed.

Comment hidden (Legacy TBPL/Treeherder Robot)

Josh Aas

Comment 27

•

15 years ago

Test disabled for now. http://hg.mozilla.org/mozilla-central/rev/1a62eaf77754

Comment hidden (Legacy TBPL/Treeherder Robot)

Josh Aas

Comment 30

•

15 years ago

Attached patch fix v1.0 (obsolete) (deleted) — Details — Splinter Review

My manual test shows that there is a period after a crash during which the method will not be undefined, accessing it will throw an exception. We should allow either case to indicate a crashed plugin.

Josh Aas

Updated

•

15 years ago

Attachment #448683 - Flags: review?(benjamin)

Benjamin Smedberg

Comment 31

•

15 years ago

But that's not what we're testing. The test wants to test that the value eventually ends up being undefined (after the PluginCrashed event fully propagates). This just wallpapers over the problem instead of actually figuring out why the dual .executeSoon isn't sufficient.

Josh Aas

Comment 32

•

15 years ago

Attached patch (unlikely) possibility v2.0 (deleted) — Details — Splinter Review

I'm not sure this is really a fix, I can't reproduce the problem, but this patch reverts the only change in my default plugin patch that seems like a potential cause for this bug. If this isn't a fix then the double executeSoon is probably not deterministic - that is the only other potential cause I can identify. Benjamin - why exactly are you convinced that the double executeSoon is deterministic?

Josh Aas

Updated

•

15 years ago

Attachment #448683 - Flags: review?(benjamin)

Josh Aas

Updated

•

15 years ago

Attachment #448683 - Attachment is obsolete: true

Josh Aas

Updated

•

15 years ago

Attachment #449793 - Attachment description: fix v2.0 → (unlikely) possibility v2.0

Josh Aas

Comment 33

•

15 years ago

I don't see why a double executeSoon should be deterministic, and if it isn't then it is possible that the check will run before the PluginCrashed event is handled. That seems like the most likely cause for this bug. Most of the time the double executeSoon should delay the check long enough to avoid failure, which is in line with this failure being intermittent. I think we have 3 options here: 1. Find a deterministic way to make sure the PluginCrashed notification fires and that the method value ends up being undefined after processing. I'm not sure we can do this from content. 2. Get rid of the test on the grounds that it is inherently flawed. 3. Add another executeSoon level and make the bad race condition even less likely to occur. Unfortunate that the race would remain, but realistically this would probably work fine.

Assignee: joshmoz → nobody

Hardware: x86 → All

Benjamin Smedberg

Comment 34

•

14 years ago

Argh, I've written a reply twice and it hasn't submitted. It definitely should be deterministic: there aren't any other threads in play, really, and the main thread even order should be: * iframe.contentDocument.body.appendChild(p); ** posts plugininstantiate event * .executeSoon #1 * plugininstantiate event runs, calls NPP_New ** plugin crashes ** ActorDestroy is called ** posts PluginCrashed event * .executeSoon #1 runs, posts .executeSoon #2 * PluginCrashed runs ** calls into nsNPAPIPluginInstance::Stop ** calls into nsJSNPRuntime::OnPluginDestroy(NPP npp) ** unhooks the plugin prototype object * .executeSoon #2 runs, prototype should already be unhooked, .setColor should be undefined

Ed Morley [:emorley]

Comment 35

•

12 years ago

Mass marking whiteboard:[orange] bugs WFM (to clean up TBPL bug suggestions) that: * Haven't changed in > 6months * Whose whiteboard contains none of the strings: {disabled,marked,random,fuzzy,todo,fails,failing,annotated,leave open,time-bomb} * Passed a (quick) manual inspection of bug summary/whiteboard to ensure they weren't a false positive. I've also gone through and searched for cases where the whiteboard wasn't labelled correctly after test disabling, by using attachment description & basic comment searches. However if the test for which this bug was about has in fact been disabled/annotated/..., please accept my apologies & reopen/mark the whiteboard appropriately so this doesn't get re-closed in the future (and please ping me via IRC or email so I can try to tweak the saved searches to avoid more edge cases). Sorry for the spam! Filter on: #FFA500

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → WORKSFORME

Nobody; OK to take it and work on it

Assignee

Updated

•

12 years ago

Keywords: intermittent-failure

Nobody; OK to take it and work on it

Assignee

Updated

•

12 years ago

Whiteboard: [orange]

BMO Automation

Updated

•

2 years ago

Product: Core → Core Graveyard

manual test 1.0 15 years ago Josh Aas (deleted), text/html		Details
fix v1.0 15 years ago Josh Aas (deleted), patch		Details \| Diff \| Splinter Review
(unlikely) possibility v2.0 15 years ago Josh Aas (deleted), patch		Details \| Diff \| Splinter Review