Closed
Bug 237977
Opened 21 years ago
Closed 20 years ago
Infinite javascript loop could not stopped.
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: mark, Assigned: jst)
References
()
Details
(Keywords: hang)
Attachments
(5 files, 3 obsolete files)
(deleted),
text/html
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
brendan
:
review+
brendan
:
superreview+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113
Entering http://www.netbricks.net/ into the URL field hangs all Mozilla windows
(browser and mail). The status bar shows "Transferring data from..."
The WindowsXP Task Manager shows CPU at 100% and the memory allocated to Mozilla
slowly increasing. I have left it for about 30 minutes and nothing changed (it
didn't run out of memory, but probably would after many hours).
The URL opens and displays OK in Opera (7.23) and MSIE (6.0.2800).
Reproducible: Always
Steps to Reproduce:
1. Enter http://www.netbricks.net/ in the URL windows and hit return
2. That's it!
3.
Actual Results:
Mozilla hangs.
Expected Results:
Displayed web page (or at least tried to).
Comment 1•21 years ago
|
||
Got the same result with Mozilla 1.7b
Comment 2•21 years ago
|
||
I see this also with
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7b) Gecko/20040316
I only have Process Explorer running. The stack trace of the thread shows
gklayout.dll+0xbae2
gklayout.dll+0xbb82
gklayout.dll+0x13597
gklayout.dll+0x1eebb
gklayout.dll+0x1d806
gklayout.dll+0x325d6
gklayout.dll+0x32fb7
gklayout.dll+0xddb6
gklayout.dll+0x9c52f
gklayout.dll+0xdda1d
gklayout.dll+0xe6e4c
gklayout.dll+0xe4564
gklayout.dll+0xe74bd
gklayout.dll+0xddcee
gklayout.dll+0xe06ba
gklayout.dll+0x13e577
gklayout.dll+0x13e733
xpc3250.dll+0x15ee4
js3250.dll+0x27c37
js3250.dll+0x281a5
js3250.dll+0x202bb
js3250.dll+0x1bdd3
js3250.dll+0x48b4
gklayout.dll+0x14cfa2
gklayout.dll+0xc3258
gklayout.dll+0xc3051
gklayout.dll+0xc2de5
gklayout.dll+0x1b55d8
gklayout.dll+0x1b5177
gklayout.dll+0x8f7ce
gklayout.dll+0xe7207
gklayout.dll+0xe5ba8
gkparser.dll+0xb0fa
gkparser.dll+0x8c1f
gkparser.dll+0x95d5
gkparser.dll+0x8747
gkparser.dll+0x809f
gkparser.dll+0x13440
gkparser.dll+0x13243
gkparser.dll+0x12b88
gkparser.dll+0x29364
gkparser.dll+0x5eb8
gkparser.dll+0x11dee
gkparser.dll+0x120ff
gkparser.dll+0x14158
gkparser.dll+0x12b00
ntoskrnl.exe+0x516104d
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 3•21 years ago
|
||
With a late nightly I got an endless list of errors:
CSS Error (http://www.netbricks.net/ :1.53): Unknown property 'filter'.
Declaration dropped.
'filter' is only used as a CSS property in dmenu2.js:
d.write("<div
style=\"position:absolute;visibility:hidden;z-index:1;filter:progid:DXImageTransform.Microsoft.Fade(duration=0.25)\"");
The gklayout.dll references in my previous message are reflow functions in
nsPresShell like AlreadyInQueue. The queue is growing which could be a result of
the endless number of written div's and table's in the javascript.
I think the reason is an endless recursion in buildMenu:
if(a[i][0]!=void(0)){
...
buildMenu(a[i]);
...
}
I haven't too much time to analyse the exact reason. The bug es reproducable by
renaming the dmenu2.js to .html, adding <html> and <body> tags and the the
buildMenu() javascript commands.
Maybe some one could create a minimal testcase with these information.
I think mozilla should not 'hang', even if this is a javascript implementation
error.
Comment 4•21 years ago
|
||
Updated•21 years ago
|
Assignee: general → general
Component: Browser-General → DOM: Level 0
QA Contact: general → ian
Comment 5•21 years ago
|
||
Confirmed.
Hangs on both URL and testcase.
- Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a) Gecko/20040514
Firefox/0.8.0+
- Microsoft Windows 2000 Pro 5.00.2195 SP4
Wow, quite the bug you've found here!
Firefox 0.9.1+ (20040726) Mozilla 1.8.0 (2007071408) and Netscape 7.1 all froze
with URL and testcase.
Comment 7•21 years ago
|
||
I removed some of the "d.write" statements and the result is nearly the same.
CPU performance goes up to 100% but after a few seconds Mozilla comes back with
the popup dialog "A script on this page causing mozilla to run slowly. .... Do
you want to abort the script?". This popup dialog comes also back after
restoring some of the "d.write" statements, but it just takes a bit longer. I
guess, that the dialog also popup for the URL if you will wait long enough.
I see two bugs here:
1) The endless recursion in buildMenu. (Which could be a tech evangelism.)
2) The time before the script warning dialog popups.
Comment 8•21 years ago
|
||
(In reply to comment #7)
> 1) The endless recursion in buildMenu. (Which could be a tech evangelism.)
In http://www.netbricks.com/dmenu2.js is the variable "holdi" global defined,
but it's only used (local) in "buildMenu" to remember the last menu item when
building a submenu. Unfortunately is this variable overwritten if it builds a
sub submenu. This will result in a wrong menu item that should be drawn in the
main menu. The recursion stops if "ihold" is defined local.
An other problem is that the test if an element of "a[i]" is an array is done by
comparing "a[i][0]!=void(0)". If this is true, then it is assumed that "a[i]" is
an array. This test is also true for strings in Mozilla. (I think Mozilla
handles this correct.) I think a correct type test should be based on "typeof".
Comment 9•20 years ago
|
||
> 1) The endless recursion in buildMenu. (Which could be a tech evangelism.)
I filled bug 254540 for the www.netbricks.com tech evangelism.
This bug should follow the issue, that the javascript execution could not stopped.
Comment 10•20 years ago
|
||
Some information what I found so far. I tested this with in DevStudio.
nsJSContext::DOMBranchCallback is called well and so I expected the script
warning to abort the script. The marked line 479 in
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/dom/src/base/nsJSEnvironment.cpp&rev=1.236&root=/cvsroot&mark=479,486,502,2071#460
returns false every 1 minute. Line 486 every 8 minutes. And line 502 after 15
minutes. The timeout for a script defined in line 2071 is 5 seconds. So I think
the break logic in DOMBranchCallback needs some review. Maybe an additional time
based algorithm could help.
Idea: Calculate the runtime for let say n cycles. Interpolate from this the
expected cycles for a runtime of 5 seconds.
http://www.netbricks.net/ shows the script warning dialog if I run it in the
debugger. I don't know why it does not show it if I call the URL directly in
mozilla.
I'm not be able to create a real minimized testcase. I guess that mozilla builds
(in a separate task?) the HTML-output of a very (endles) nested table. Maybe the
real problem is a timing problem or a conflict between different tasks.
Firefox 1.0PRrc also hang. I think that such javascript is rare but it is
frustrating that javscript could hang a browser. Maybe it's too late, but I
think firefox should not be shipped with something like this.
Flags: blocking-aviary1.0?
Comment 11•20 years ago
|
||
Use the right component, please.
/be
Assignee: general → general
Component: JavaScript Engine → DOM: Level 0
Comment 12•20 years ago
|
||
jst, can you have a look at this?
Assignee: general → jst
Flags: blocking-aviary1.0? → blocking-aviary1.0+
Assignee | ||
Comment 13•20 years ago
|
||
Assignee | ||
Updated•20 years ago
|
Attachment #161510 -
Attachment is obsolete: true
Assignee | ||
Comment 14•20 years ago
|
||
Assignee | ||
Comment 15•20 years ago
|
||
There's more to this than some potentially slow DOM code on the site, the JS on
the page is an infinite loop in Firefox, and not so in IE. Check out the
attached testcase. If you add a "var" before the "holdi" in the for loop in the
buildMenu() function the problem goes away. With the attached testcase we show
the dialog that lets you terminate the script, with the real site we don't tho...
Assignee | ||
Comment 16•20 years ago
|
||
So the page does a bunch of document.writes etc, and the time between branch
callbacks goes way up, so it'll take us a *long* time to get to show the dialog...
I doubt there's much we can do about this for 1.0. Brendan, any clues as to why
the simplified testcase is an infinite loop in Firefox but not in IE?
Comment 17•20 years ago
|
||
SpiderMonkey extends ECMA-262 Edition 3 by reflecting string characters as
indexed properties whose values are unit-length strings, each containing the
indexed char: for s = "hello", s[0] === 'h', s[1] === 'e', etc.
This script is testing whether a[i] is an array by testing whether a[i][0] !=
(void)0. That's a bad idea in light of the freedom given in ECMA-262 Edition 3
Section 16 to extend objects with new properties. Someone please contact the
author of this dmenu3.js file and (a) implore them to use local variables well;
(b) test using (a[i].constructor == Array) or similar to determine whether a[i]
is an array.
/be
Assignee | ||
Comment 18•20 years ago
|
||
Resetting blocker flag, and over to evangelism.
Assignee: jst → french
Component: DOM: Level 0 → French
Flags: blocking-aviary1.0+
Product: Browser → Tech Evangelism
QA Contact: pschwartau → french
Version: Trunk → unspecified
Comment 19•20 years ago
|
||
The information in comments
In addition to my comment 7:
In all simplified testcases appears the warning dialog, but I never saw it for
the "not minimized testcase". (In one case I waited over 12 hours.) The
javascript warning will shown if I run it within a debugger. (Sorry that I
didn't mentioned this before.)
As per comment 9, I already filed a bug for the evangelism (bug 254540). I
leaved this bug for the problem why JS doesnt't show a warning for this infinite
loop.
If it is possible to popup the warning dialog with the "Not minimized testcase",
then this bug should closed as INVALID or it should follow the issue that the
warning dialog should appear earlier (in an acceptable response time) and not
after 12 hours or more.
Assignee | ||
Comment 20•20 years ago
|
||
Ok, fair enough. Back to the browser.
Component: French → DOM
Product: Tech Evangelism → Browser
Version: unspecified → 1.0 Branch
Assignee | ||
Comment 21•20 years ago
|
||
I added some debugging code to see what's going on here and everything's working
as expected, sortof. The problem is that what the page is doing ends up being an
O(n^2) operation (or something of that order) where every iteration through the
infinite loop takes longer and longer, and we'd need 64k branch callbacks to
offer to stop the script. By the time I got to 20k things were going really slow
already, and I just gave up (after maybe 10k). Had I waited long enough, I bet
the dialog would have shown up (unless we'd run out of memory before that point)...
We *could*, and probably should, drop the number of required branch callbacks
before we display the dialog since I was running Firefox for some time today and
never ever saw us get past the initial filter in the branch callback function
(4096 callbacks), so we wouldn't hurt performance in general cases, and we'd get
the benefits of a browser that's harder to lock up.
Assignee | ||
Updated•20 years ago
|
Assignee: french → jst
Comment 22•20 years ago
|
||
In comment 10 I suggested to implement a time depending algorithm. I think about
something like this:
1) Perform every 4096 callbacks a check.
2) Calculate the runtime for the first check.
3) Calculate the number of checks which are expected for a runtime of 5 seconds.
I guess that in cases like O(n^2) these 5 seconds are exceeded by the first
check. (Or the number of checks is low enough.)
4) Show the dialog after the number of calculted checks.
The values are only suggestions based on the existing code. Maybe other
improvements like a recalculation of the number of checks every 10 checks are
also usefull.
Comment 23•20 years ago
|
||
Only an idea idea: Check the javascript performance of the computer (in the
startup sequence) and depending on this is the callback check threshold calculated.
Assignee | ||
Comment 24•20 years ago
|
||
(In reply to comment #23)
> Only an idea idea: Check the javascript performance of the computer (in the
> startup sequence) and depending on this is the callback check threshold
calculated.
That'll give us inconsistent behaviour since there's no way to know that what we
measured wasn't under heavy load. It won't be reliable.
Comment 25•20 years ago
|
||
This patch is only a suggestion for more (and acurate) flexible and time based
outer check could be implemented. The URL self runs for approx. 10 seconds.
before the warning appears.
Even if the machine is slow during the first check, is the correct number of
checks recalculated before the warning appears. The worst case is if the
computer s performance increases slightly, so each time is a new number of
checks calculated. I think that the algorithm self doesn't need too much time,
because the most of the calls are filtered by the first "if" statement before.
Possible problems
1) is the initial PR_Now() correct placed?
2) duration must be non zero but this is expected but not checked.
(We have 4096 calls to the function before we calculate the duration)
3) I don't know if the usage of the long macros and data types is correct.
4) At least 4096 calls are made before a time check is done.
5) A linear "time/number of callbacks" is expected.
(We will calculate a too large number for O(n^2) problems, but I think
the result is also acceptable in those cases.)
Assignee | ||
Comment 26•20 years ago
|
||
This is a simpler approach, that I think will give us more consistent results
and should IMO be good enough. What it does is to lower the number of callbacks
needed to initialize the start time of the script from 32k to 256, and from
then on check the time every 4k callbacks, that way this works fairly well
since even in O(n^2) JS code we'll get to 4096 in a fairly reasonable ammount
of time, compared to the 64k callbacks it used to take to hit the time check.
And even if these numbers seem fairly low, we don't hit even the stat time
initialization code (i.e. we don't get 256 callbacks) while starting Firefox
and doing some normal browsing on random sites. And sure, the number of time
comparison's we'll do in (pathological) tight loops goes up, but it's not like
they're *that* expensive after all.
Comment 27•20 years ago
|
||
Can we get TIBET and other intensive JS frameworks and apps tested on a build
with this patch? jst can probably furnish a Windows Firefox build.
/be
Comment 28•20 years ago
|
||
Sure, if jst can shoot me a build (Windows or Linux, doesn't matter), I'd be
more than happy to. I don't build Mozilla regularly here.
Cheers,
- Bill
Comment 29•20 years ago
|
||
Comment on attachment 162753 [details] [diff] [review]
Simpler approach.
+ if (++ctx->mBranchCallbackCount == INITIALIZE_TIME_BRANCH_COUNT)
mBranchCallbackCount can overflow so this can lead to an infinite loop.
Assignee | ||
Comment 30•20 years ago
|
||
Attachment #162753 -
Attachment is obsolete: true
Assignee | ||
Comment 31•20 years ago
|
||
Comment on attachment 167784 [details] [diff] [review]
Updated diff of the simpler approach with no wrap-around problems.
Brendan, if you r+sr I can land this and we'll see how it performs in the real
world. I've been running with this for a while now w/o ever seeing this dialog
pop up when it shouldn't.
Attachment #167784 -
Flags: superreview?(brendan)
Attachment #167784 -
Flags: review?(brendan)
Comment 32•20 years ago
|
||
I've been buried here so I haven't worked through the problems I had with the
private build that jst provided.
I would still be to put TIBET through its paces with this patch, if I could get
a functional build (I'm probably being stupid somewhere :-)). I can't think of a
library that exercises the JS engine more than we do.
I will have some time within the next couple of days to do this. Otherwise, if
Brendan wants to r+sr, I can just download a nightly and, if TIBET runs into
real problems, I'll either file against this bug or open a new one.
Whichever you guys want to do, I'm game for.
Cheers,
- Bill
Comment 33•20 years ago
|
||
Comment on attachment 167784 [details] [diff] [review]
Updated diff of the simpler approach with no wrap-around problems.
I'd rather see one test that causes an early return, so can you use the
smallest mask (INITIALIZE_TIME_BRANCH_MASK 0xff) and return if the incremented
counter is not zero masked with that?
If it is 0, then you can do the equality test with that mask + 1 (256), and do
further stuff such as the MAYBE_GC mask test.
/be
Assignee | ||
Comment 34•20 years ago
|
||
Attachment #167784 -
Attachment is obsolete: true
Assignee | ||
Updated•20 years ago
|
Attachment #167784 -
Flags: superreview?(brendan)
Attachment #167784 -
Flags: review?(brendan)
Assignee | ||
Updated•20 years ago
|
Attachment #168174 -
Flags: superreview?(brendan)
Attachment #168174 -
Flags: review?(brendan)
Comment 35•20 years ago
|
||
Comment on attachment 168174 [details] [diff] [review]
Faster than the above, per Brendan's suggestion.
Nit: INIT_TIME_BRANCH_COUNT_MASK is more consistent with the other *_COUNT_MASK
names (past, and single present example), and still readable.
r+sr=me, thanks!
/be
Attachment #168174 -
Flags: superreview?(brendan)
Attachment #168174 -
Flags: superreview+
Attachment #168174 -
Flags: review?(brendan)
Attachment #168174 -
Flags: review+
Assignee | ||
Comment 36•20 years ago
|
||
Fix checked in. Looks like the site in this bug has changed since this was
reported, so hard to use that to verify this now...
Status: NEW → ASSIGNED
Comment 37•20 years ago
|
||
Johnny, is the testcase in bug 261974 useful for verifying?
Assignee | ||
Comment 38•20 years ago
|
||
Maybe, in a build with SVG enabled, that is.
Assignee | ||
Comment 39•20 years ago
|
||
Oh, and marking FIXED.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Comment 40•20 years ago
|
||
Initial testing with the TIBET framework on:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a6) Gecko/20041213
shows that we're ok.
Still reserving the right to yell at some point :-).
Thanks Johnny.
Cheers,
- Bill
Comment 41•19 years ago
|
||
The first textcase, not minimised (attachment 144328 [details]), still hangs SeaMonkey
1.0a, without any warning dialog...
(stack : http://rafb.net/paste/results/7hbZgb99.nln.html )
Comment 42•19 years ago
|
||
(In reply to comment #41)
attachment 144328 [details] causes the slow dialog warning in FF 1.6a1/WinXP from today
with a fresh profile and the default dom.max_script_run_time of 5 seconds.
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•