Open Bug 467742 Opened 16 years ago Updated 2 years ago

Frequent random orange in Windows: test_acid3_test46.html times out

Categories

(Core :: General, defect)

x86
Windows XP
defect

People

(Reporter: roc, Unassigned)

Details

http://tinderbox.mozilla.org/showlog.cgi?tree=Firefox&errorparser=unittest&logfile=1228266174.1228270184.26607.gz&buildtime=1228266174&buildname=WINNT%205.2%20mozilla-central%20moz2-win32-slave07%20dep%20unit%20test&fulltext=1

Changeset 09195bdb8ff7 looks OK. There are failing builds with changeset c6b884676c0d. So something in that window caused it, but I've no idea what. Bug 458898 is my current best guess; I'll try backing that out. But then I need to sleep.
I'm a bit suspicious of bug 463289 though. Can't it cause deadlocks, if we try to load a component while holding a lock which the main thread is waiting on?
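To make the deadlock hypothesis in the comment above concrete, here is a minimal sketch of the pattern, not the actual Gecko code. It assumes that loading a component needs the main thread to do some work, and that the main thread needs the same lock before it can do it. All names (`demonstrate_lock_wait_deadlock`, `main_replied`, and so on) are hypothetical, and short timeouts stand in for waits that would otherwise never return.

```python
import threading


def demonstrate_lock_wait_deadlock(timeout=0.2):
    """Return (worker_finished, main_got_lock) for the hypothesized pattern.

    Hypothetical model: a worker holds a lock while waiting for the main
    thread, and the main thread needs that same lock before it can reply.
    Timeouts replace the real (unbounded) waits so the sketch terminates.
    """
    lock = threading.Lock()
    worker_holds_lock = threading.Event()
    main_replied = threading.Event()  # stands in for "component loaded"
    result = {}

    def worker():
        with lock:
            worker_holds_lock.set()
            # Still holding the lock, wait for the main thread's reply.
            result["worker_finished"] = main_replied.wait(timeout)

    thread = threading.Thread(target=worker)
    thread.start()
    worker_holds_lock.wait()

    # "Main thread": it must take the lock before it can reply.
    main_got_lock = lock.acquire(timeout=timeout / 2)
    if main_got_lock:
        main_replied.set()
        lock.release()

    thread.join()
    return result["worker_finished"], main_got_lock


if __name__ == "__main__":
    print(demonstrate_lock_wait_deadlock())
```

Neither side makes progress until a timeout fires. Without timeouts, both threads would block forever, which an outside harness would most likely see as a test timeout rather than a crash.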
I've backed out my patch for bug 458898. If that doesn't work, I recommend backing out bug 463289 next.
Did anyone catch the deadlock in a debugger? We aren't supposed to hold locks while loading components...
Did one of the backouts fix things? If so, which?
Apparently they didn't fix it. Windows tboxes have been randomly orange for some time, maybe since branching.
Is that random orange all related to this test?
I don't think so. The end result is a timeout error, but it may also happen in places other than after test_acid3_test46.html.
Are the unit test boxes on more-heavily-loaded VMs now that the branch tinderboxes are running too?
In a scan of the 12 hours before c6b884676c0d landed, I didn't see any hangs in test_acid3_test46.html. There are lots from c6b884676c0d on. So I think something did change around then.
(In reply to comment #8)
> I don't think so. The end result is timeout error, but it may happen also
> elsewhere than after test_acid3_test46.html

Yeah, the point of failure keeps moving, although it's often in the same test on different runs, which is interesting --- the failure point is not completely random. But this definitely looks like it started in the window ending at c6b884676c0d and probably (but not necessarily) starting after 09195bdb8ff7.
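For reference, a known-good/known-bad window like this can also be narrowed mechanically by bisection. That is a different technique from the targeted backouts used in this bug. The sketch below is illustrative only: the changeset names are placeholders, `run_test` is a hypothetical callback that returns True when the test passes, and because the orange is intermittent each changeset is retried several times before it is called good.

```python
def first_bad_changeset(changesets, is_bad):
    """Bisect an oldest-to-newest list of changesets.

    Assumes changesets[0] is known good and changesets[-1] is known bad,
    and returns the earliest changeset for which is_bad() is true.
    """
    lo, hi = 0, len(changesets) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(changesets[mid]):
            hi = mid
        else:
            lo = mid
    return changesets[hi]


def make_flaky_check(run_test, attempts=5):
    """Wrap run_test so one failure in `attempts` runs marks a changeset bad."""
    def is_bad(changeset):
        return any(not run_test(changeset) for _ in range(attempts))
    return is_bad


if __name__ == "__main__":
    # Placeholder history; in this simulation "cs3" introduces the failure.
    history = ["good_base", "cs1", "cs2", "cs3", "cs4", "bad_tip"]
    simulated_run = lambda cs: history.index(cs) < 3
    print(first_bad_changeset(history, make_flaky_check(simulated_run)))
```

With a truly random failure, too few attempts can mark a bad changeset as good and send the search the wrong way, so the retry count has to be chosen against the observed failure rate.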
These are the non-merge changesets in that window that haven't been backed out yet:

ce72e9a5dca0 Michael Ventnor: Bug 458031. Take dirty rect into account to limit box-shadow computation. r+sr=roc
1d84189da181 Robert Longson: Bug 465996. Use Ellipse instead of Arc to draw circles. r+sr=roc
9e1eab6135e2 Jonathan Kew: Bug 467228. Disable line start/end swashes on Mac since we don't support line-boundary shaping properly yet. r=roc (Tests)
7baaa800925d Jonathan Kew: Bug 467228. Disable line start/end swashes on Mac since we don't support line-boundary shaping properly yet. r=roc
992000e45526 Robert O'Callahan: Bug 467283. Ignore dirty rect when doing any image resampling --- it will lead to artifacts. r+sr=dbaron,r=vlad
aaeb20c61fca Robert O'Callahan: Bug 455826. Don't reconstruct textruns just because we deleted an empty nsContinuingTextFrame. r=smontagu
885dc81bc31b Robert O'Callahan: Bug 442633. Detect removal of href attribute on SVG <use> elements. r=longsonr,sr=mats
f603fec24bf7 Josh Aas: fix a drawing order glitch in the mac default plugin. b=467580 sr=jst
49a032846a3a Oleg Romashin: Bug 463872 - Cairo-qpainter build is broken after latest cairo update. missing part. r=vladimir.
4b4ee8b2dc54 Benjamin Smedberg: Bug 467579: --with-static-checking is broken in spidermonkey. There is currently no useful static checking infrastructure for spidermonkey, so disable it for the time being, r=jimb
67f8a5b06156 Peter Weilbacher: [OS/2] No Bug: add minor change and comment to gfxOS2FontGroup::FontCallback; fix debug output for missing fonts
6156d0a39763 Peter Weilbacher: Bug 466956: fix alias check in gfxFontconfigUtils::ResolveFontName for correct return value, r=karlt, sr=roc
bbf7d0e42c09 Arno Renevier: Fix npruntime sample compile problem, npupp.h -> npfunctions.h. b=464481 r=josh sr=jst
52488eb15168 Benjamin Smedberg: Change the stack-class analysis to a warning instead of an error, at least temporarily: the analysis was buggy when originally landed, and there are some heap-allocated autostrings outstanding through the tree.
211c2be2fa1e Benjamin Smedberg: Bug 466492 - test for the existence of jar.mn in make, rather than in a shell script: this allows us to avoid launching the subshell in the common case where a jar.mn is not present r=ted
9f3807b5e936 Benjamin Smedberg: Bug 442012 - Allocating more than 2GB of memory in mozilla is never a good idea. On 64-bit systems PRSize and size_t are 64-bit and so truncation from PRSize to PRUint32 could cause weird behavior errors. Prevent these huge allocations. r=wtc sr=dveditz
dcd1373d1dff Benjamin Smedberg: Bug 463420 - SIMPLE_PROGRAMS leads to bustage with generated.pdb r=ted
Most of those patches are trivially safe, not part of the build, or don't affect Windows. That leaves only:

ce72e9a5dca0 Michael Ventnor: Bug 458031. Take dirty rect into account to limit box-shadow computation. r+sr=roc
1d84189da181 Robert Longson: Bug 465996. Use Ellipse instead of Arc to draw circles. r+sr=roc
992000e45526 Robert O'Callahan: Bug 467283. Ignore dirty rect when doing any image resampling --- it will lead to artifacts. r+sr=dbaron,r=vlad
aaeb20c61fca Robert O'Callahan: Bug 455826. Don't reconstruct textruns just because we deleted an empty nsContinuingTextFrame. r=smontagu
885dc81bc31b Robert O'Callahan: Bug 442633. Detect removal of href attribute on SVG <use> elements. r=longsonr,sr=mats

Of those, I'd guess bug 455826 as being the most risky.
The failures might be related to bug 467634, especially if the test is particularly resource intensive.
Backing out bug 455826 seems to have fixed it. Also, nthomas got a stack for a mochitest crash in bug 467150 that implicates the same patch. I don't understand why this manifested as a timeout on Tinderbox. Maybe once I look more deeply into the bug, I'll figure that out.
Severity: normal → S3