Closed Bug 1055004 Opened 10 years ago Closed 10 years ago

crash in mozilla::a11y::DocAccessible::UpdateTree(mozilla::a11y::Accessible*, nsIContent*, bool)

Categories

(Core :: Disability Access APIs, defect)

All
Android
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla34
blocking-b2g 2.1+
Tracking Status
b2g-v2.1 --- verified

People

(Reporter: Bebe, Assigned: eeejay)

References

Details

(4 keywords, Whiteboard: [xfail])

Crash Data

Attachments

(2 files, 2 obsolete files)

This bug was filed from the Socorro interface and is report bp-f3fae058-15cc-4f53-aad3-1d5c52140818. ============================================================= Build ID: Gaia aa8aace12d65956dd9525da5dac66e0d3b28597f Gecko https://hg.mozilla.org/mozilla-central/rev/0aaa2d3d15cc BuildID 20140818040201 Version 34.0a1 ro.build.version.incremental=110 ro.build.date=Fri Jun 27 15:57:58 CST 2014 B1TC00011230 STR: 1. Add a smart collection to the home screen ex: "Around Me" 2. Open the smart collection Expected: 2. the smart collection should open Actual: 2. The phone crashes We can reproduce this manually and with automation test to run to reproduce this: https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/tests/functional/everythingme/test_everythingme_add_collection_save_bookmark.py
Component: Gaia::Everything.me → Disability Access APIs
Product: Firefox OS → Core
Version: unspecified → Trunk
Is this consistently reproducible (i.e. can you reproduce this with the same STR more than once)?
blocking-b2g: --- → backlog
Flags: needinfo?(florin.strugariu)
Keywords: qawanted, regression
I have been able to reproduce it every time while running that test locally.
Okay. Also that today's automation report shows this not happening on 2.0, so this is a regression from 2.0 --> 2.1. [Blocking Requested - why for this release]: QA blocking crash since it causes one of automated tests to permafail & a regression.
blocking-b2g: backlog → 2.1?
Flags: needinfo?(florin.strugariu)
Whiteboard: [xfail]
Keywords: smoketest
Keywords: smoketest
jsmith Yes I can reproduce the crash every time I launch that smart collection with both manual and automation. From what I can see in the B2G-i builds: Last good build: application_buildid: 20140815024915 application_changeset: f0fa964f10f5 application_display_name: B2G application_name: B2G application_repository: https://hg.mozilla.org/integration/b2g-inbound application_version: 34.0a1 build_changeset: 3aa6abd313f965a84aa86c6b213dc154e4875139 device_firmware_date: 1403855878 device_firmware_version_incremental: 110 device_firmware_version_release: 4.3 device_id: flame gaia_changeset: 40a5e2589b7947543ca57f40393b0c9af5e5fd74 gaia_date: 1408095144 platform_buildid: 20140815024915 platform_changeset: f0fa964f10f5 platform_repository: https://hg.mozilla.org/integration/b2g-inbound First bad build: application_buildid: 20140815041216 application_changeset: 192a8a9862a2 application_display_name: B2G application_name: B2G application_repository: https://hg.mozilla.org/integration/b2g-inbound application_version: 34.0a1 build_changeset: 3aa6abd313f965a84aa86c6b213dc154e4875139 device_firmware_date: 1406125546 device_firmware_version_incremental: eng.cltbld.20140723.102536 device_firmware_version_release: 4.3 device_id: flame gaia_changeset: b7fa47012abbb9fa3df8fa370ce8e9dfc36f3267 gaia_date: 1408098423 platform_buildid: 20140815041216 platform_changeset: 192a8a9862a2 platform_repository: https://hg.mozilla.org/integration/b2g-inbound
Hmm...my guess is that we made a change in Gaia that exposed a gecko crash. Maybe bug 1050404? Wilson - What do you think?
Flags: needinfo?(wilsonpage)
Trev, can this bea regression from the recent treeupdate/doc lifecycle / etc. changes you and Alex were working on?
Flags: needinfo?(trev.saunders)
Keywords: qaurgent, smoketest
Keywords: qaurgent
I tried walking through the steps manually from the integration test, but all seems to work as expected. Not experiencing any crashes. Can I get some manual STR?
Flags: needinfo?(wilsonpage)
(In reply to Wilson Page [:wilsonpage] from comment #9) > I tried walking through the steps manually from the integration test, but > all seems to work as expected. Not experiencing any crashes. Can I get some > manual STR? Have you tried testing on device?
(In reply to Jason Smith [:jsmith] from comment #10) > Have you tried testing on device? I am running through steps manually on a flame.
QA Wanted - Can we get the STR clarified here?
Keywords: qaurgent, qawanted
QA Contact: ckreinbring
Unable to repro on today's Flame 2.1. After adding a smart collection then selecting it, the collection loaded with no errors. I tried the following tests: One preset collection, one custom collection, multiple preset collections, multiple preset collections with a custom collection, plugged/unplugged, with(out) SIM in the device. BuildID: 20140818072913 Gaia: ba1992f2addc5a84afc2eab426f222a6bf2962ba Gecko: bf27e27c994d Platform Version: 34.0a1 Firmware Version: V123 User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Keywords: qaurgent, qawanted
Dropping flags since this actually is not manually reproducible. We need more details from Bebe on how to reproduce this bug.
blocking-b2g: 2.1? → ---
Flags: needinfo?(florin.strugariu)
Whiteboard: [xfail]
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell)
I also checked the Central Engineering build and couldn't reproduce the crash. I tried both a WiFi Connection and Data Connection to complete the STR's. Environmental Variables: Device: Flame Master Build ID: 20140818130516 Gaia: 778c39b5597ea424d6a75934221265423ab3c9e7 Gecko: cd7cbdacf9d8 Version: 34.0a1 (Master) Firmware Version: v123 User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
(In reply to Marco Zehe (:MarcoZ) from comment #8) > Trev, can this bea regression from the recent treeupdate/doc lifecycle / > etc. changes you and Alex were working on? the only change to this code in many months was Eitan's change to GetAccessibleOrContainer to use GetCurrentCrossShadowDoc().
Flags: needinfo?(trev.saunders)
(In reply to Trevor Saunders (:tbsaunde) from comment #16) > (In reply to Marco Zehe (:MarcoZ) from comment #8) > > Trev, can this bea regression from the recent treeupdate/doc lifecycle / > > etc. changes you and Alex were working on? > > the only change to this code in many months was Eitan's change to > GetAccessibleOrContainer to use GetCurrentCrossShadowDoc(). Cool thanks! Eitan, any ideas? I know it seems to be reproducible only in tests, but still...
Flags: needinfo?(eitan)
So, looking at the stack it seems pretty clear we're crashing because UpdateTree() is called with null aContainer. The call stack is junk, but it seems very likely it is DocAccessible::ContentRemoved calling UpdateTree. The only way for ContentRemoved to pass null as aContainer is for GetAccessibleOrContainer() to return null when it is passed a non null node. It looks like The first way that can happen is for the node to not be in any document, but I don't think ContentRemoved can be called with a node not in a document. So I believe the issue is when the node passed to GetAccessibleOrContainer is in a shadow subtree, and so GetParentNode will never come up into the hosting content.
(In reply to Marco Zehe (:MarcoZ) from comment #17) > (In reply to Trevor Saunders (:tbsaunde) from comment #16) > > (In reply to Marco Zehe (:MarcoZ) from comment #8) > > > Trev, can this bea regression from the recent treeupdate/doc lifecycle / > > > etc. changes you and Alex were working on? > > > > the only change to this code in many months was Eitan's change to > > GetAccessibleOrContainer to use GetCurrentCrossShadowDoc(). > > Cool thanks! Eitan, any ideas? I know it seems to be reproducible only in > tests, but still... I'm pretty sure bug 1027315 should have changed GetParentNode() in GetAccessibleOrContainer to GetFlatteneedTreeParent(), but I'll leave verify that and the patch to Eitan
The only way I can replicate this manually is after having run the automated test on the profile. Thus it's a dirty profile and the bug can be replicated repeatedly. On a fresh flash I can't replicate it manually. It screams of a race condition but I sprayed some sleeps around to slow the test down and it didn't seem to help.
Zac is correct here. I tried to replicate this on a fresh flash phone and I failed to reproduce the issue. But after running the test I can replicate it every time. Also this reproduces on the latest build: Gaia 778c39b5597ea424d6a75934221265423ab3c9e7 Gecko https://hg.mozilla.org/mozilla-central/rev/cd7cbdacf9d8 BuildID 20140818160207 Version 34.0a1 ro.build.version.incremental=110 ro.build.date=Fri Jun 27 15:57:58 CST 2014 B1TC00011230
Flags: needinfo?(florin.strugariu)
Attached file eme_crash.log (deleted) —
Here is a logcat of the automated test running and experiencing the crash. Notable lines: 1662: Long press on homescreen's gaia-grid to open context menu 1847: Around Me collection added 1977: Marionette taps the collection icon (the step immediately preceding the crash) 2022: Report crash error I can't really see anything there, I guess because this comes from higher up in Gecko? We could ask the homescreen team if some improvements were made to homescreen that might have introduced this.
[Blocking Requested - why for this release]: Okay - reflagging the nomination here since this is causing one of our tests to permafail, which makes this a qablocker.
blocking-b2g: --- → 2.1?
Whiteboard: [xfail]
Also flagging window wanted since we should be able to bisect this with the new information given.
(In reply to Jason Smith [:jsmith] from comment #24) > Also flagging window wanted since we should be able to bisect this with the > new information given. Specifically looking for info on: * A m-c tinderbox window * Gaia/Gecko swap * If the m-c tinderbox window does not point to b2g-inbound, then bisect on the other inbound branch
I could reproduce this with the test. I'll have a go at it, and have trevor review it, since he obviously has a much better understanding of what is going on..
Assignee: nobody → eitan
Flags: needinfo?(eitan)
We're getting similar failures on pending pull-requests related to bug 1005830: https://tbpl.mozilla.org/?rev=ecb205a539ad792df27797ead41e7c7a8b2bcc67&tree=Gaia-Try Is this the same issue?
Flags: needinfo?(zcampbell)
Trevor, do you have other ideas?
Attachment #8475429 - Flags: feedback?(trev.saunders)
Apparently this work, and fixes the crash. I have no idea if this is sane or not.
Attachment #8475429 - Attachment is obsolete: true
Attachment #8475429 - Flags: feedback?(trev.saunders)
Attachment #8475488 - Flags: review?(trev.saunders)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][lead-review+]
Posting Regression Window to confirm patch fix above. B2G Inbound Regression Window: Last Working Device: Flame Master BuildID: 20140815035215 Gaia: b7fa47012abbb9fa3df8fa370ce8e9dfc36f3267 Gecko: 70ce13bca890 Version: 34.0a1 (Master) Firmware Version: v123 User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0 First Broken Device: Flame Master Build ID: 20140815035215 Gaia: b7fa47012abbb9fa3df8fa370ce8e9dfc36f3267 Gecko: 70ce13bca890 Version: 34.0a1 (Master) Firmware Version: v123 User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0 Last Working Gaia and First Broken Gecko Issue DOES NOT occur here. Gaia: 40a5e2589b7947543ca57f40393b0c9af5e5fd74 Gecko: 70ce13bca890 Last Working Gecko and First Broken Gaia Issue DOES occur here. Gaia: b7fa47012abbb9fa3df8fa370ce8e9dfc36f3267 Gecko: f0fa964f10f5 Gaia Pushlog: https://github.com/mozilla-b2g/gaia/compare/40a5e2589b7947543ca57f40393b0c9af5e5fd74...b7fa47012abbb9fa3df8fa370ce8e9dfc36f3267 Possible Causes: Bug 1050404 - [Collections] Update to use gaia-header Bug 1015294 - [Homescreen] Update to use gaia-header
QA Whiteboard: [QAnalyst-Triage+][lead-review+] → [QAnalyst-Triage?][lead-review+]
Flags: needinfo?(jmitchell)
QA Whiteboard: [QAnalyst-Triage?][lead-review+] → [QAnalyst-Triage+][lead-review+]
Flags: needinfo?(jmitchell)
(In reply to Wilson Page [:wilsonpage] from comment #27) > We're getting similar failures on pending pull-requests related to bug > 1005830: > > https://tbpl.mozilla.org/ > ?rev=ecb205a539ad792df27797ead41e7c7a8b2bcc67&tree=Gaia-Try > > Is this the same issue? It looks very similar. I can see it on several other Gaia-Try jobs too, it seems this has spread now. That means it's more urgent to get an r? or a backout for this if we can!
Flags: needinfo?(zcampbell)
Blocks: 1056047
Comment on attachment 8475488 [details] [diff] [review] Fix failure of GetAccessibleOrContainer() to return a parent past a shadow root. I'm confused what the difference between these functions is, but in any case this needs a test.
Attachment #8475488 - Flags: review?(trev.saunders)
Blocks: 1011602
We need a fix for this landing today, it's blocking a load of patches.
Flags: needinfo?(trev.saunders)
Flags: needinfo?(eitan)
OK, this patch is better documented and understood. I have been trying for an entire day to reproduce this issue in a mochitest with no luck. When I add/remove content from a shadow dom I don't get ContentInserted/ContentRemoved. At least not for anything below the shadow root. This is very urgent for the FxOS folks, could we land this, and write tests later?
Attachment #8475488 - Attachment is obsolete: true
Attachment #8477434 - Flags: review?(trev.saunders)
Flags: needinfo?(eitan)
Comment on attachment 8477434 [details] [diff] [review] Fix failure of GetAccessibleOrContainer() to return a parent past a shadow root. Review of attachment 8477434 [details] [diff] [review]: ----------------------------------------------------------------- r=me. I managed to chat with Trevor (he is currently indisposed). Filing a follow up for tests should be fine. ::: accessible/generic/DocAccessible.cpp @@ +1313,5 @@ > + nsINode* parent = nullptr; > + > + // If this is a content node, try to get a flattened parent content node. > + if (currNode->IsContent()) > + parent = currNode->AsContent()->GetFlattenedTreeParent(); Note to self: ok should cross from shadow to host properly (where else might we need to make a change like this?) The comment doesn't tell us anything the code doesn't ;) Maybe say something about the case this fixes? @@ +1318,5 @@ > + > + // Fallback to just get parent node, in case there is no parent content > + // node. Or current node is not a content node. > + if (!parent) > + parent = currNode->GetParentNode(); Note to self: ... and this is what we had already.
Attachment #8477434 - Flags: review+
(In reply to David Bolter [:davidb] from comment #36) > Comment on attachment 8477434 [details] [diff] [review] > Fix failure of GetAccessibleOrContainer() to return a parent past a shadow > root. > > Review of attachment 8477434 [details] [diff] [review]: > ----------------------------------------------------------------- > > r=me. > > I managed to chat with Trevor (he is currently indisposed). Filing a follow > up for tests should be fine. > > ::: accessible/generic/DocAccessible.cpp > @@ +1313,5 @@ > > + nsINode* parent = nullptr; > > + > > + // If this is a content node, try to get a flattened parent content node. > > + if (currNode->IsContent()) > > + parent = currNode->AsContent()->GetFlattenedTreeParent(); > > Note to self: ok should cross from shadow to host properly (where else might > we need to make a change like this?) > I think we already made a bunch of changes, and I guess missed a few :P > The comment doesn't tell us anything the code doesn't ;) Maybe say something > about the case this fixes? > Fixed that. > @@ +1318,5 @@ > > + > > + // Fallback to just get parent node, in case there is no parent content > > + // node. Or current node is not a content node. > > + if (!parent) > > + parent = currNode->GetParentNode(); > > Note to self: ... and this is what we had already. Try looks green enough.. landing
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Attachment #8477434 - Flags: review?(trev.saunders)
Flags: needinfo?(trev.saunders)
> OK, this patch is better documented and understood. > > I have been trying for an entire day to reproduce this issue in a mochitest > with no luck. When I add/remove content from a shadow dom I don't get > ContentInserted/ContentRemoved. At least not for Well then your clearly doing something wrong, because this crash can only happen if it is called. anything below the shadow > root. This is very urgent for the FxOS folks, could we land this, and write > tests later? I'm dubious it couldn't wait, but what's done is done.
GetFlattenedParent() returns GetParent() by default. However you call GetParent() if GetFlattenedParent() failed. Are you trying to fix some broken bindings by that or what is it?
would be good to get comment #41 answered :) thanks
Flags: needinfo?(trev.saunders)
(In reply to alexander :surkov from comment #42) > would be good to get comment #41 answered :) thanks I have no clue its not my patch.
Flags: needinfo?(trev.saunders)
(In reply to alexander :surkov from comment #41) > GetFlattenedParent() returns GetParent() by default. However you call > GetParent() if GetFlattenedParent() failed. Are you trying to fix some > broken bindings by that or what is it? GetFlattenedTreeParent only returns nsIContent parents. If we reach the root content node of a document, we want to get the document node as well, which is not an nsIContent. If we first tried to GetParentNode() and then use GetFlattenedTreeParent() for shadow root fallback, we would end up on the document fragment node, which we have no interest in, and it is not a content node, so it does not have GetFlattenedTreeParent(), and thus we are stuck. bz said maybe a GetFlattenedTreeParentNode() method on nsINode makes sense. Possibly.
(In reply to Eitan Isaacson [:eeejay] from comment #44) > (In reply to alexander :surkov from comment #41) > > GetFlattenedParent() returns GetParent() by default. However you call > > GetParent() if GetFlattenedParent() failed. Are you trying to fix some > > broken bindings by that or what is it? > > GetFlattenedTreeParent only returns nsIContent parents. If we reach the root > content node of a document, we want to get the document node as well, which > is not an nsIContent. its seems to me we just want to use GetCrossShadowRootParentElement or whatever that thing is called.
(In reply to Trevor Saunders (:tbsaunde) from comment #43) > (In reply to alexander :surkov from comment #42) > > would be good to get comment #41 answered :) thanks > > I have no clue its not my patch. sorry, wrong ID :) (In reply to Eitan Isaacson [:eeejay] from comment #44) > (In reply to alexander :surkov from comment #41) > bz said maybe a GetFlattenedTreeParentNode() method on nsINode makes sense. > Possibly. that sounds good and less suspicious than current code, wanna go this way? (In reply to Trevor Saunders (:tbsaunde) from comment #45) > (In reply to Eitan Isaacson [:eeejay] from comment #44) > its seems to me we just want to use GetCrossShadowRootParentElement or > whatever that thing is called. can you point out exact name pls? I don't recall this method
(In reply to alexander :surkov from comment #46) > (In reply to Trevor Saunders (:tbsaunde) from comment #43) > > (In reply to alexander :surkov from comment #42) > > > would be good to get comment #41 answered :) thanks > > > > I have no clue its not my patch. > > sorry, wrong ID :) > > (In reply to Eitan Isaacson [:eeejay] from comment #44) > > (In reply to alexander :surkov from comment #41) > > bz said maybe a GetFlattenedTreeParentNode() method on nsINode makes sense. > > Possibly. > > that sounds good and less suspicious than current code, wanna go this way? > > (In reply to Trevor Saunders (:tbsaunde) from comment #45) > > (In reply to Eitan Isaacson [:eeejay] from comment #44) > > its seems to me we just want to use GetCrossShadowRootParentElement or > > whatever that thing is called. > > can you point out exact name pls? I don't recall this method nsINode::GetParentElementCrossingShadowRoot apparently because who likes consistancy anyway.
(In reply to Trevor Saunders (:tbsaunde) from comment #47) > nsINode::GetParentElementCrossingShadowRoot apparently because who likes > consistancy anyway. From listening in to a conversation you and smaug were having on IRC, I got the impression that method does not work for crossing from the light dom into the shadow dom. I might have totally misunderstood.
(In reply to Trevor Saunders (:tbsaunde) from comment #47) > nsINode::GetParentElementCrossingShadowRoot apparently because who likes > consistancy anyway. it seems it doesn't take into account XBL stuff but anyway, it has same issue as GetFlattenedParent(), i.e. it doesn't return document node.
(In reply to alexander :surkov from comment #49) > (In reply to Trevor Saunders (:tbsaunde) from comment #47) > > > nsINode::GetParentElementCrossingShadowRoot apparently because who likes > > consistancy anyway. > > it seems it doesn't take into account XBL stuff but anyway, it has same Well, neither does GetParentNode afaik > issue as GetFlattenedParent(), i.e. it doesn't return document node. I think that's actually fine so long as you don't ask for a direct child of the root doc which I don't think we ever will.
(In reply to Trevor Saunders (:tbsaunde) from comment #50) > (In reply to alexander :surkov from comment #49) > > (In reply to Trevor Saunders (:tbsaunde) from comment #47) > > > > > nsINode::GetParentElementCrossingShadowRoot apparently because who likes > > > consistancy anyway. > > > > it seems it doesn't take into account XBL stuff but anyway, it has same > > Well, neither does GetParentNode afaik it does > > issue as GetFlattenedParent(), i.e. it doesn't return document node. > > I think that's actually fine so long as you don't ask for a direct child of > the root doc which I don't think we ever will. yes but it's not what Eitan's comment #44 is about
Switching the 2.1?->2.1+, on these fixed bugs as these are regression. Nothing to land here, its just flag-cleanup of 2.1? list. Please Ni me if there is confusion/disagreement.
blocking-b2g: 2.1? → 2.1+
VERIFIED FIXED on v2.1. Environmental Variables: Device: Flame 2.1 KK (319 MB) BuildID: 20141011000201 (full flash) Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1 Gecko: d813d79d3eae Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf Version: 34.0a2 (2.1) Firmware: V180 User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0 Smart collection add and launch functions as expected.
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+][lead-review+] → [QAnalyst-Triage?][lead-review+]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?][lead-review+] → [QAnalyst-Triage+][lead-review+]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: