Closed Bug 933483 Opened 11 years ago Closed 11 years ago

[Poppit] App freezes when changing Poppit skill level

Categories

(Core :: DOM: Core & HTML, defect, P2)

x86_64
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
mozilla29
blocking-b2g koi+
Tracking Status
firefox26 --- wontfix
firefox27 --- wontfix
firefox28 --- fixed
firefox29 --- fixed
b2g-v1.2 --- fixed
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- verified

People

(Reporter: ehelton, Assigned: mrbkap)

References

Details

(Keywords: regression)

Attachments

(7 files, 6 obsolete files)

(deleted), image/png
Details
(deleted), application/zip
Details
(deleted), text/plain
Details
(deleted), text/html
Details
(deleted), text/plain
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
smaug
: review+
Details | Diff | Splinter Review
Attached image Poppit Skill Level Screen frozen.png (deleted) —
Summary: [Poppit] App freezes when changing Poppit skill level

Repro Steps:
-Open Poppit App
-Tap screen to begin game
-Press Menu button
-Select “Options” from Main Menu
-Press any level button (Easy, Medium, or Hard)

Actual: A page will come up that says “Changing skill level will erase current progress. Do you want to change the skill level?” There are buttons for ‘Cancel’ and ‘OK’ but pressing them does not produce any behavior. Also impossible to go back to previous page, so the user has to kill the app entirely. 
 
Expected: Pressing ‘Cancel’ would return the user to the previous Options page. Pressing ‘OK’ would change the skill level appropriately.

Environmental Variables
Build ID: 20131028225522
Gecko:
Gaia:
Platform Version: 26.0a2

Notes: Diagnosed on Keon device FFOS v1.2 nightly build
This is a Gaia System bug - might be a dupe of an existing bug as well. Let me look around.
Assignee: jsmith → nobody
No longer blocks: b2g-poppit
Component: Preinstalled B2G Apps → Gaia::System::Window Mgmt
Product: Tech Evangelism → Firefox OS
Pretty sure this is a regression, but someone should confirm if this reproduces on 1.1 or not.
blocking-b2g: --- → koi?
Keywords: qawanted, regression
QA Contact: sparsons
The only thing I could figure out is it's using window.confirm and something is wrong with that.
I was not able to reproduce this issue on Buri 1.1 Build ID: 20131101041316

Gaia   39b0203fa9809052c8c4d4332fef03bbaf0426fc
SourceStamp 31fa87bfba88
BuildID 20131101041316
Version 18.0

This issue does however reproduce on Buri 1.2 Build ID:20131101004000

Gaia   e717aec947571f5daf923c040a82f9f0719bb526
SourceStamp 54de309e18a9
BuildID 20131101004000
Version 26.0
Keywords: qawanted
Just realized - we can't get a regression window here easily because Poppit was OOMing until quite recently when a SkiaGL fix came in.

If we were to get a regression window here, we would need to get a reduced test case extracted out of Poppit that would not trigger the SkiaGL bug. Don't know how easy that's going to be.

I'll let a developer look into first to indicate if we need dig in deeper here or not.
Alive,

Please take a look.
blocking-b2g: koi? → koi+
Flags: needinfo?(alive)
Assignee: nobody → alive
Flags: needinfo?(alive)
Alive, what are the next steps here or do you need help with investigation or reproduction here ? Would this be resolved in the upcoming sprint deadline of (11/22)?
Flags: needinfo?(alive)
Sorry for late reply.
It's using window.confirm but that one is working in other apps, so this is not a general case for window.confirm.

I think we need gecko guy to look at the hanging since I don't think gaia has anything to do with "hanging".
Assignee: alive → nobody
Component: Gaia::System::Window Mgmt → General
Flags: needinfo?(alive)
I debug a little bit, and noticed that the app is not really hanging. We receive *only* 45 confirm prompts so if you click 45 times on the ok button then it works.

Since it works for some other apps, I wonder if we can have access to the poppip underlying code for this screen? I'm curious to see if this is a bug on our implement of mozbrowser of if it is a bug in the application code directly.
(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #9)
> I debug a little bit, and noticed that the app is not really hanging. We
> receive *only* 45 confirm prompts so if you click 45 times on the ok button
> then it works.
> 
> Since it works for some other apps, I wonder if we can have access to the
> poppip underlying code for this screen? I'm curious to see if this is a bug
> on our implement of mozbrowser of if it is a bug in the application code
> directly.

Underlying code is here - https://marketplace.firefox.com/downloads/file/218874/poppit-1.0.5.zip.

I'm doubtful this is a bug in the app's code - we aren't seeing this reproduce on 1.1. This only reproduces on 1.2+.
Attached file Poppit App (deleted) —
(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #9)
> I debug a little bit, and noticed that the app is not really hanging. We
> receive *only* 45 confirm prompts so if you click 45 times on the ok button
> then it works.
> 
> Since it works for some other apps, I wonder if we can have access to the
> poppip underlying code for this screen? I'm curious to see if this is a bug
> on our implement of mozbrowser of if it is a bug in the application code
> directly.

Vivien since you seem to debug this a bit, would you mind owning this bug and helping with investigation further ?
Flags: needinfo?(21)
(In reply to bhavana bajaj [:bajaj] from comment #12)
> 
> Vivien since you seem to debug this a bit, would you mind owning this bug
> and helping with investigation further ?

Well I didn't have any real idea about what's going on here. I just looked to see how many requests we received from the content. 

(In reply to Jason Smith [:jsmith] from comment #10)
> Underlying code is here -
> https://marketplace.firefox.com/downloads/file/218874/poppit-1.0.5.zip.
>
Thanks.
 
> I'm doubtful this is a bug in the app's code - we aren't seeing this
> reproduce on 1.1. This only reproduces on 1.2+.

That makes sense. The code also seems to just do a simple confirm. Not sure what's going on here.
Flags: needinfo?(21)
Andrew,

Can you please check and reassign appropriately?
Flags: needinfo?(overholt)
Hmm we sort of have this problem on desktop too if you manage to create multiple alerts/confirms...  I think dolske knows a bit about that code at least on desktop.
Also, if the same app code doesn't show this behavior on 1.1, then this is probably a bug in our code and somebody should bisect it.
The regression range here doesn't have the right changesets below:

Gaia   39b0203fa9809052c8c4d4332fef03bbaf0426fc
SourceStamp 31fa87bfba88
BuildID 20131101041316
Version 18.0

Sarah - Can you provide the right changesets here?
Flags: needinfo?(sparsons)
(In reply to Jason Smith [:jsmith] from comment #17)
> The regression range here doesn't have the right changesets below:
> 
> Gaia   39b0203fa9809052c8c4d4332fef03bbaf0426fc
> SourceStamp 31fa87bfba88
> BuildID 20131101041316
> Version 18.0
> 
> Sarah - Can you provide the right changesets here?

Disregard - this is the right information.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> Also, if the same app code doesn't show this behavior on 1.1, then this is
> probably a bug in our code and somebody should bisect it.

Unless we have a reduced test case to reproduce the bug, we can't bisect this. Poppit was OOMing for most of the 1.2 release, so it will prevent being able to test this bug for a regression range.
(In reply to comment #19)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> > Also, if the same app code doesn't show this behavior on 1.1, then this is
> > probably a bug in our code and somebody should bisect it.
> 
> Unless we have a reduced test case to reproduce the bug, we can't bisect this.
> Poppit was OOMing for most of the 1.2 release, so it will prevent being able to
> test this bug for a regression range.

Hmm, that's unfortunate.  Can we bisect on a device with better specs such as a Nexus 4?
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #20)
> (In reply to comment #19)
> > (In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> > > Also, if the same app code doesn't show this behavior on 1.1, then this is
> > > probably a bug in our code and somebody should bisect it.
> > 
> > Unless we have a reduced test case to reproduce the bug, we can't bisect this.
> > Poppit was OOMing for most of the 1.2 release, so it will prevent being able to
> > test this bug for a regression range.
> 
> Hmm, that's unfortunate.  Can we bisect on a device with better specs such
> as a Nexus 4?

Hmm...we might be able to do that with a Leo device, as the Leo device has more memory.

Can someone try getting a regression range here using a Leo device?
Flags: needinfo?(sparsons)
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #15)
> Hmm we sort of have this problem on desktop too if you manage to create
> multiple alerts/confirms...  I think dolske knows a bit about that code at
> least on desktop.

I'd be curious as to what the call-stack (JS+native) is when the 2nd prompt is triggered. I was sorta thinking Gecko was blocking the recursive calls, but I'm not sure we actually do (since it works but is just horrible UI if a page manages to trigger it).

Comment 9 also kind implies this is a bug in the app (for calling confirm() multiple times, which is silly), although it would be nice if we just outright blocked that.
(In reply to comment #22)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #15)
> > Hmm we sort of have this problem on desktop too if you manage to create
> > multiple alerts/confirms...  I think dolske knows a bit about that code at
> > least on desktop.
> 
> I'd be curious as to what the call-stack (JS+native) is when the 2nd prompt is
> triggered. I was sorta thinking Gecko was blocking the recursive calls, but I'm
> not sure we actually do (since it works but is just horrible UI if a page
> manages to trigger it).

Well, we protect against the simple cases (such as timeouts and intervals), but not against everything.  See this test case for example <http://people.mozilla.org/~eakhgari/confirm-dance.html> which uses XHR events dispatched to the confirm's nested event loop.

> Comment 9 also kind implies this is a bug in the app (for calling confirm()
> multiple times, which is silly), although it would be nice if we just outright
> blocked that.

Yeah.  I'm really curious to know how this app manages to do this regardless.
This issue started to occur on the Leo 1.2 Build ID: 20130926004001

Gaia   1e9470b9b6df630eddf1c4c8b25b3170ee786b0e
SourceStamp caa6b17647c5
BuildID 20130926004001
Version 26.0a2

Last working Leo 1.2 Build ID: 20130925004005

Gaia   b0e4a1333bb7bf0a749a384ba99e4f03f111e39a
SourceStamp fb764e648a8f
BuildID 20130925004005
Version 26.0a2
Can you also get a logcat?
Keywords: qawanted
So this definitely doesn't look Gaia related - there's nothing that landed in the system app codebase on 9/25/2013 on the 1.2 branch.
Attached file Poppit_933483_logcat (deleted) —
See logcat attached.
Keywords: qawanted
Hmm...the logcat I got included the following:

11-19 14:19:13.629: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "chrome://global/content/BrowserElementChildPreload.js" line: 387}]
11-19 14:21:43.869: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "app://{e8f991fc-a345-4d30-a48e-d5f3e34c668c}/lib/impact/input.js" line: 186}]
11-19 14:21:43.939: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "app://{e8f991fc-a345-4d30-a48e-d5f3e34c668c}/lib/impact/input.js" line: 211}]
11-19 14:21:43.949: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "app://{e8f991fc-a345-4d30-a48e-d5f3e34c668c}/lib/impact/input.js" line: 163}]
11-19 14:21:43.949: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "app://{e8f991fc-a345-4d30-a48e-d5f3e34c668c}/lib/impact/input.js" line: 186}]
11-19 14:21:43.949: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "app://{e8f991fc-a345-4d30-a48e-d5f3e34c668c}/lib/impact/input.js" line: 211}]
11-19 14:21:43.949: E/GeckoConsole(538): [JavaScript Error: "too much recursion" {file: "app://{e8f991fc-a345-4d30-a48e-d5f3e34c668c}/lib/impact/input.js" line: 163}]
Interestingly enough, if you click the prompt enough, it will eventually disappear, but touch events will stop working entirely in the app at that point.
bug 914776 looks like a suspicious candidate for causing this.

mwu - what do you think?
Flags: needinfo?(mwu)
Very unlikely.
Flags: needinfo?(mwu)
In BrowserElementChildPreload we call _waitForResult and this seem to cause too much recursion:


I/Gecko:ProcessPriorityManager(  179): [Poppit!™, child-id=3, pid=922] Got wake lock changed event. Now mHoldsCPUWakeLock=1, mHoldsHighPriorityWakeLock=0
I/Gecko   (  922): [Child 922] WARNING: NS_ENSURE_TRUE(nsContentUtils::IsCallerChrome()) failed: file ../../../../../content/html/content/src/HTMLAudioElement.cpp, line 235
I/Gecko   (  922): BrowserElementChildPreload - _waitForResult([object XrayWrapper [object Window @ 0xb279e940 (native @ 0xb33e6ad0)]])
I/Gecko   (  922): BrowserElementChildPreload - Entering modal state (outerWindowID=1, innerWindowID=3)
I/Gecko   (  922): BrowserElementChildPreload - Nested event loop - begin
W/AudioFlinger(  173): Thread AudioOut_2 cannot connect to the power manager service
E/FastMixer(  173): did not receive expected priority boost
D/audio_hw_primary(  173): start_output_stream: enter: usecase(1: low-latency-playback) devices(0x2)
D/audio_hw_primary(  173): select_devices: out_snd_device(2: speaker) in_snd_device(0: none)
D/audio_hw_primary(  173): enable_snd_device: sending audio calibration for snd_device(2) acdb_id(14)
D/ACDB-LOADER(  173): ACDB -> send_afe_cal
D/audio_hw_primary(  173): enable_snd_device: snd_device(2: speaker)
D/audio_hw_primary(  173): enable_audio_route: apply mixer path: low-latency-playback
D/audio_hw_primary(  173): start_output_stream: exit
I/Gecko   (  922): [Child 922] WARNING: NS_ENSURE_TRUE(nsContentUtils::IsCallerChrome()) failed: file ../../../../../content/html/content/src/HTMLAudioElement.cpp, line 235
I/Gecko   (  922): BrowserElementChildPreload - _waitForResult([object XrayWrapper [object Window @ 0xb279e940 (native @ 0xb33e6ad0)]])
I/Gecko   (  922): BrowserElementChildPreload - Entering modal state (outerWindowID=1, innerWindowID=3)
I/Gecko   (  922): BrowserElementChildPreload - Nested event loop - begin
I/Gecko   (  922): [Child 922] WARNING: NS_ENSURE_TRUE(nsContentUtils::IsCallerChrome()) failed: file ../../../../../content/html/content/src/HTMLAudioElement.cpp, line 235
I/Gecko   (  922): BrowserElementChildPreload - _waitForResult([object XrayWrapper [object Window @ 0xb279e940 (native @ 0xb33e6ad0)]])
I/Gecko   (  922): BrowserElementChildPreload - Entering modal state (outerWindowID=1, innerWindowID=3)
I/Gecko   (  922): BrowserElementChildPreload - Nested event loop - begin
I/Gecko   (  922): [Child 922] WARNING: NS_ENSURE_TRUE(nsContentUtils::IsCallerChrome()) failed: file ../../../../../content/html/content/src/HTMLAudioElement.cpp, line 235
I/Gecko   (  922): BrowserElementChildPreload - _waitForResult([object XrayWrapper [object Window @ 0xb279e940 (native @ 0xb33e6ad0)]])
I/Gecko   (  922): BrowserElementChildPreload - Entering modal state (outerWindowID=1, innerWindowID=3)
I/Gecko   (  922): BrowserElementChildPreload - Nested event loop - begin

....

And if we eventually click OK we get
I/Gecko   (  922): BrowserElementChildPreload - recvStopWaiting(outer=1, inner=3, returnValue=true)
I/Gecko   (  922): BrowserElementChildPreload - recvStopWaiting [object XrayWrapper [object Window @ 0xb279e940 (native @ 0xb33e6ad0)]]
I/Gecko   (  922): BrowserElementChildPreload - Nested event loop - finish
E/GeckoConsole(  922): [JavaScript Error: "too much recursion" {file: "chrome://global/content/BrowserElementChildPreload.js" line: 398}]
(In reply to Preeti Raghunath(:Preeti) from comment #14)
> Andrew,
> 
> Can you please check and reassign appropriately?

Discussed in IRC - Fabrice or Vivien are people that could investigate this.

Fabrice - Would you be interested in looking into this bug?
Flags: needinfo?(overholt) → needinfo?(fabrice)
Mrbkap looked at it yesterday and has more details about it. He thinks its a bug in the app itself.
(In reply to Gregor Wagner [:gwagner] from comment #37)
> Mrbkap looked at it yesterday and has more details about it. He thinks its a
> bug in the app itself.

Well, I see :
System JS : ERROR resource://gre/modules/BrowserElementParent.jsm:439 - InternalError: too much recursion
System JS : ERROR (null):0 - too much recursion
System JS : ERROR resource://gre/modules/BrowserElementParent.jsm:439 - unknown (can't convert to string)
System JS : ERROR (null):0 - too much recursion
System JS : ERROR (null):0 - too much recursion
System JS : ERROR (null):0 - too much recursion
System JS : ERROR (null):0 - too much recursion

as soon as we open the confirmation dialog, so we are broken too.
Flags: needinfo?(fabrice)
Flags: needinfo?(mrbkap)
(In reply to Fabrice Desré [:fabrice] from comment #38)
> as soon as we open the confirmation dialog, so we are broken too.

When you say "as soon as" what do you mean?

Looking at the source code of the application, here's its basic structure:

function main() { window.requestAnimationFrame(main, ...); tick(); }
function tick() { inputHandler.update(); }
inputHandler.update = (() => { if (inputState.clicked) clickedThing.handle(); inputState.reset(); });
clickedThing.handle = (() => { window.confirm(); });

Of note here: we requestAnimationFrame before calling confirm() and we clear our input state *after* the call to confirm. So, for every call to processNextEvent in confirm (or really _waitForResult), we call confirm again. This eats up the stack pretty fast and is, as far as I could tell yesterday, the reason for the "too much recursion" exceptions.
Flags: needinfo?(mrbkap)
Note - I doubtful this is a bug in the app, as this has worked on 1.1 & 1.01 fine. It's not working 1.2+.
I think I see a potential problem. I'll debug this tomorrow.

In short, my explanation in comment 39 is correct, but we should be suppressing requestAnimationFrame events while the window is showing a modal dialog. This should hopefully be easy to fix.
Assignee: nobody → mrbkap
(In reply to Blake Kaplan (:mrbkap) from comment #41)
> In short, my explanation in comment 39 is correct, but we should be
> suppressing requestAnimationFrame events while the window is showing a modal
> dialog. This should hopefully be easy to fix.

This is exactly what's happening. I don't have an explanation for why the behavior would have changed and I can't find a combination of Gecko+Gaia from the working range to debug to see why things worked then.

Note that the behavior of continuing to fire animation frame events during a modal dialog is the same on desktop.
Attached file testcase (deleted) —
Click on the button to see the behavior.
This more closely imitates what the poppit app does. In particular, the key is the alert showing that we now recursively call the request animation frame callback.

I have no idea what to make of the regression range in comment 25. Looking at the code, this is almost certainly a regression from bug 731974. Jason, can you test trunk builds from before and after that bug landed (that would be changeset 513ec84b5c88)?

Also: for what it's worth, it's possible to see a kinder, gentler version of this bug even before bug 731974: with a working build, do the steps to reproduce here and then turn off and on the screen. The result should be one duplicated confirmation per screen cycle.
Attached patch Possible fix (obsolete) (deleted) — Splinter Review
This fixes both the testcases in this bug to imitate Webkit and Blink.  It's a little scary that if there are other APIs that spin the event loop, this sort of bug will be possible with them. I'll take a look at that later.
Attachment #8337189 - Flags: review?(bugs)
Attachment #8337185 - Attachment description: better testcase → better testcase (careful: freezes Firefox)
(In reply to Blake Kaplan (:mrbkap) from comment #44)
> Created attachment 8337185 [details]
> better testcase (careful: freezes Firefox)
> 
> This more closely imitates what the poppit app does. In particular, the key
> is the alert showing that we now recursively call the request animation
> frame callback.
> 
> I have no idea what to make of the regression range in comment 25. Looking
> at the code, this is almost certainly a regression from bug 731974. Jason,
> can you test trunk builds from before and after that bug landed (that would
> be changeset 513ec84b5c88)?
> 
> Also: for what it's worth, it's possible to see a kinder, gentler version of
> this bug even before bug 731974: with a working build, do the steps to
> reproduce here and then turn off and on the screen. The result should be one
> duplicated confirmation per screen cycle.

That won't be possible - the oldest trunk builds we have don't go back to 2012.
Comment on attachment 8337189 [details] [diff] [review]
Possible fix

Make sure to push to try.
Attachment #8337189 - Flags: review?(bugs) → review+
This appears to be orange on try. I'll look more tomorrow.
(In reply to Blake Kaplan (:mrbkap) from comment #45)
> Created attachment 8337189 [details] [diff] [review]
> Possible fix
> 
> This fixes both the testcases in this bug to imitate Webkit and Blink.  It's
> a little scary that if there are other APIs that spin the event loop, this
> sort of bug will be possible with them. I'll take a look at that later.

I think the only APIs that can do that are alert(), confirm() and sync XHR.

(BTW, fantastic debugging here!)
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #49)
> (BTW, fantastic debugging here!)

Big +1 from me on that!
Component: General → DOM
Product: Firefox OS → Core
(In reply to Blake Kaplan (:mrbkap) from comment #48)
> This appears to be orange on try. I'll look more tomorrow.

Any updates? Please ask QA to help verify once we've got a working try build.
Attached patch Possible fix for the orange (obsolete) (deleted) — Splinter Review
I'm about to send this to try, but this *should* fix the orange (I can't reproduce the orange locally). Basically, we can close a window between firing off the event to send delayed events and need to deal with that case.

This applies on top of attachment 8337189 [details] [diff] [review].
Attachment #8341250 - Flags: review?(bugs)
Attachment #8341250 - Flags: review?(bugs) → review+
New try push is at https://tbpl.mozilla.org/?tree=Try&rev=f42bae68504b for those following along at home.
https://hg.mozilla.org/integration/mozilla-inbound/rev/77c1f23afc1c with a fix for the orange (I fixed some assertions in the test that was originally failing and had to update the assertion count.
Also of note (and sorry for the spam):

I didn't write a test because it's not immediately clear to me how to test a negative given our current test suite.

I squashed the two patches found here into a single patch because the first patch requires the second to not be orange.
Flags: in-testsuite?
Attached patch Green on try (obsolete) (deleted) — Splinter Review
Many thanks to Olli for his input here! This is green on try and fixes the (manual) testcases in this bug. The important bit is that we no longer send delayed events for documents that have navigated.
Attachment #8337189 - Attachment is obsolete: true
Attachment #8341250 - Attachment is obsolete: true
Attachment #8342051 - Flags: review?(bugs)
Attachment #8342051 - Flags: review?(bugs) → review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/567d2bba4f97
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Backed out for bustage. Looks like this depends on bug 932309. Please attach a branch-specific patch.
https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/1a484614916b

https://tbpl.mozilla.org/php/getParsedLog.php?id=31495004&full=1&branch=mozilla-b2g26_v1_2#error0

/usr/local/bin/ccache /builds/slave/m-b26_12-osx64-d-0000000000000/build/clang/bin/clang++ -o DOMStorageIPC.o -c  -fvisibility=hidden -DDOM_STORAGE_TESTS -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL  -DSTATIC_EXPORTABLE_JS_API -DNO_NSPR_10_SUPPORT -DOS_POSIX=1 -DOS_MACOSX=1  -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/dom/base -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/content/events/src -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/ipc/chromium/src -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/ipc/glue -I../../../ipc/ipdl/_ipdlheaders  -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/dom/src/storage -I. -I../../../dist/include  -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/obj-firefox/dist/include/nspr -I/builds/slave/m-b26_12-osx64-d-0000000000000/build/obj-firefox/dist/include/nss      -fPIC  -Qunused-arguments  -DMOZILLA_CLIENT -include ../../../mozilla-config.h -MD -MP -MF .deps/DOMStorageIPC.o.pp -Qunused-arguments  -Qunused-arguments -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Wsign-compare -Wno-invalid-offsetof -Wno-c++0x-extensions -Wno-extended-offsetof -Wno-unknown-warning-option -Wno-return-type-c-linkage -Wno-mismatched-tags -Wno-error=uninitialized -Wno-error=deprecated-declarations -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -std=gnu++0x -pthread -DNO_X11 -pipe  -DDEBUG -D_DEBUG -DTRACING -g -O3 -fno-omit-frame-pointer  -Werror  /builds/slave/m-b26_12-osx64-d-0000000000000/build/dom/src/storage/DOMStorageIPC.cpp
../../../dom/base/nsFocusManager.cpp:995:41: error: no member named 'IsCurrentInnerWindow' in 'nsPIDOMWindow'; did you mean 'GetCurrentInnerWindow'?
          !aDocument->GetInnerWindow()->IsCurrentInnerWindow()) {
                                        ^~~~~~~~~~~~~~~~~~~~
                                        GetCurrentInnerWindow
../../dist/include/nsPIDOMWindow.h:303:18: note: 'GetCurrentInnerWindow' declared here
  nsPIDOMWindow *GetCurrentInnerWindow() const
                 ^
1 error generated.
This needs to be checked in before attachment 8342051 [details] [diff] [review] and with it that patch should be fine.
Keywords: verifyme
Backed out in https://hg.mozilla.org/mozilla-central/rev/42b2a2adda8f for causing bug 946726.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Keywords: verifyme
Because 1.2 is going to be finished, should we mark this bug from koi+ to 1.3+? Thanks.
I stared at this today and can't figure out what's going on. I can't reproduce it (and Shu couldn't either).

The easiest way to get this landed would be to use a more conservative API that only suppresses request animation frame callbacks while there's a modal dialog (I think that's pretty much guaranteed to not regress anything).

I'll try to debug this again tomorrow and see if I can turn anything up before giving up...
Attached patch Rolled up patch for trunk (obsolete) (deleted) — Splinter Review
The important change here is to unsuppress events on mDoc for inner windows in FreeInnerObjects. It turns out that the sidebar, instead of removing the frame, navigates to about:blank and hides it. This change makes sure that we unsuppress events and get rid of the strong refs in the right places.
Attachment #8342051 - Attachment is obsolete: true
Attachment #8347462 - Flags: review?(bugs)
Attachment #8347462 - Flags: review?(bugs) → review+
Let's try this again. Please note the green (several times over) try run pointed to by comment 68.
Keywords: checkin-needed
Attached patch Fix for orange on top of attachment 8347462 (obsolete) (deleted) — Splinter Review
The previous patch managed to race with the patch for bug 949589, which exposed a bug in this code: we weren't properly handling mSuspendedDoc in all cases and with the fixed code in the in-content prefs pane we were leaking. In particular, we were playing fast and loose with which window owned which document. Adding to the confusion, we weren't traversing mSuspendedDoc, leading to even more leaks. I think this patch clarifies the situation.

I decided not to add mSuspendedDoc to cycle collection because we have a known lifecycle for it and I wanted this patch to be safe. In addition, the delayed events in the focus manager aren't cycle collected. Olli, should I file a followup to do that? bent seemed to think it made sense to *not* cycle collect them. Either way, this patch shouldn't leak anymore and we should be able to finally get this checked in.

I have a new try run at <https://tbpl.mozilla.org/?tree=Try&rev=c3e053a5ade2>.
Attachment #8348434 - Flags: review?(bugs)
Comment on attachment 8348434 [details] [diff] [review]
Fix for orange on top of attachment 8347462 [details] [diff] [review]

This is orange on try...
Attachment #8348434 - Attachment is obsolete: true
Attachment #8348434 - Flags: review?(bugs)
Attached patch Fix for orange on top of attachment 8347462 (obsolete) (deleted) — Splinter Review
Olli, please see comment 73 for an explanation.

My previous patch tried to enforce the "mSuspendedDoc only on the outer window" assertion by asserting (and then forwarding) calls to Enter/LeaveModalState to the outer window. That didn't work. This approach works much better according to try.
Attachment #8348547 - Flags: review?(bugs)
Blake,

In 12/17 triage we believe that this will remain a blocker for 1.2. Please land when appropriate.
Flags: needinfo?(mrbkap)
(In reply to Blake Kaplan (:mrbkap) from comment #73)
> I decided not to add mSuspendedDoc to cycle collection because we have a
> known lifecycle for it and I wanted this patch to be safe. In addition, the
> delayed events in the focus manager aren't cycle collected. Olli, should I
> file a followup to do that? bent seemed to think it made sense to *not*
> cycle collect them.
FocusManager is a service to cycle collecting stuff it owns isn't that useful.
Comment on attachment 8348547 [details] [diff] [review]
Fix for orange on top of attachment 8347462 [details] [diff] [review]

We should make nsGlobalWindow::Enter/LeaveModalState() forward to outer window.
That seems to be what the code anyway always has when using those methods.

Could you please upload a full patch. Hard to understand this one.
Attachment #8348547 - Flags: review?(bugs)
Attached patch As requested (deleted) — Splinter Review
Attachment #8347462 - Attachment is obsolete: true
Attachment #8348547 - Attachment is obsolete: true
Attachment #8349054 - Flags: review?(bugs)
Attachment #8349054 - Flags: review?(bugs) → review+
Once more with feeling: https://hg.mozilla.org/integration/mozilla-inbound/rev/08dc60299942
Flags: needinfo?(mrbkap)
https://hg.mozilla.org/mozilla-central/rev/08dc60299942
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Depends on: 956704
Depends on: 959585
Depends on: 964076
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #83)
> https://hg.mozilla.org/releases/mozilla-aurora/rev/57157e39abea
> 
> https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/f9f4be5e8036
> https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/c5eda4db2804

Why was this bug uplifted? It caused some regressions (the bugs that this one depends on) that aren't fixed yet.
Depends on: 965673
Depends on: 965690
Depends on: 1004458
No longer depends on: 1004458
The fix was verified with the following devices:

1.4F Environmental Variables:
Device: Flame 1.4F MOZ
BuildID: 20140522000200
Gaia: 233dd43b3b976e66a619dbc1b4044ed1e3ca3e34
Gecko: c3c0c57daef8
Version: 30.0
Firmware Version: v10F-3

-----------------------------------------------------

1.4 Environmental Variables:
Device: Buri 1.4 MOZ
BuildID: 20140522000200
Gaia: 233dd43b3b976e66a619dbc1b4044ed1e3ca3e34
Gecko: c3c0c57daef8
Version: 30.0
Firmware Version: v1.2-device.cfg

-----------------------------------------------------

1.4 Environmental Variables:
Device: Open_C 1.4 MOZ
BuildID: 20140522000200
Gaia: 233dd43b3b976e66a619dbc1b4044ed1e3ca3e34
Gecko: c3c0c57daef8
Version: 30.0
Firmware Version: v10F-3
Flags: in-testsuite?
Component: DOM → DOM: Core & HTML
No longer depends on: 965673
Regressions: 965673
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: