839270 - crash in nsStyleContext::nsStyleContext mainly with AMD Radeon HD 6250/6310/6320

Reporter

Description

•

12 years ago

It's currently the #1 top browser crasher in the first hours of 19.0b5. It's worse than bug 837371, the Beta-4 version.

Signature: nsTArray<txExpandedNameMap_base::MapItem, nsTArrayDefaultAllocator>::IndexOf<txExpandedName, txMapItemComparator>(txExpandedName const&, unsigned int, txMapItemComparator const&) | nsStyleContext::nsStyleContext(nsStyleContext*, nsIAtom*, nsCSSPseudoEl...
UUID: 5168d641-4627-4b6f-8982-9ab4a2130207
Date Processed: 2013-02-07 19:09:03
Uptime: 489
Last Crash: 1.0 hours before submission
Install Age: 1.4 hours since version was first installed.
Install Time: 2013-02-07 17:32:42
Product: Firefox
Version: 19.0
Build ID: 20130206083616
Release Channel: beta
OS: Windows NT
OS Version: 6.1.7601 Service Pack 1
Build Architecture: x86
Build Architecture Info: AuthenticAMD family 20 model 2 stepping 0
Crash Reason: EXCEPTION_ACCESS_VIOLATION_WRITE
Crash Address: 0x533b96fc
User Comments: This is the second one to happen after updating to the newest beta release.
App Notes: AdapterVendorID: 0x1002, AdapterDeviceID: 0x9802, AdapterSubsysID: 00000000, AdapterDriverVersion: 9.12.0.0 D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+
Processor Notes: sp-processor08.phx1.mozilla.com_11482:2008; SignatureTool: signature truncated due to length
EMCheckCompatibility: True
Adapter Vendor ID: 0x1002
Adapter Device ID: 0x9802
Total Virtual Memory: 4294836224
Available Virtual Memory: 3574091776
System Memory Use Percentage: 68
Available Page File: 2722672640
Available Physical Memory: 880799744

Frame | Module | Signature | Source
0 | xul.dll | nsTArray<txExpandedNameMap_base::MapItem,nsTArrayDefaultAllocator>::IndexOf<txEx | obj-firefox/dist/include/nsTArray.h:632
1 | xul.dll | nsStyleContext::nsStyleContext | layout/style/nsStyleContext.cpp:57
2 | xul.dll | NS_NewStyleContext | layout/style/nsStyleContext.cpp:721
3 | xul.dll | nsStyleSet::GetContext | layout/style/nsStyleSet.cpp:609
4 | xul.dll | nsCSSFrameConstructor::AddFrameConstructionItems | layout/base/nsCSSFrameConstructor.cpp:5069
5 | xul.dll | nsCSSFrameConstructor::GetAnonymousContent | layout/base/nsCSSFrameConstructor.cpp:3865
6 | xul.dll | nsFrame::DidSetStyleContext | layout/generic/nsFrame.cpp:826
7 | xul.dll | nsCSSFrameConstructor::ConstructBlock | layout/base/nsCSSFrameConstructor.cpp:11056

More reports at:
https://crash-stats.mozilla.com/report/list?signature=nsTArray%3CtxExpandedNameMap_base%3A%3AMapItem%2C+nsTArrayDefaultAllocator%3E%3A%3AIndexOf%3CtxExpandedName%2C+txMapItemComparator%3E%28txExpandedName+const%26%2C+unsigned+int%2C+txMapItemComparator+const%26%29+|+nsStyleContext%3A%3AnsStyleContext%28nsStyleContext*%2C+nsIAtom*%2C+nsCSSPseudoEl%2E%2E%2E

Scoobidiver (away)

Reporter

Comment 1

•

12 years ago

It accounts for 48% of all crashes. Just in case, the regression range is: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=e815122c4b1f&tochange=3e382bb14817

Severity: critical → blocker

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Crash Signature: nsCSSPseudoEl...] → nsCSSPseudoEl...] [@ memmove | nsVoidArray::RemoveElementsAt(int, int)] [@ nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType, nsIStyleSheet*)]

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 2

•

12 years ago

Can I get someone to pull a crashdump or three for examination? Kairo?

Robert Kaiser

Comment 3

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #2)
> Can I get someone to pull a crashdump or three for examination? Kairo?

I can surely pull minidumps, but I wonder if it wouldn't be better to give you direct access to those.

Alex Keybl [:akeybl]

Comment 4

•

12 years ago

We should be targeting a fix for Monday EOD PT (if at all possible), and will very likely hold the build if a fix hasn't been identified yet. KaiRo is planning on sending you minidumps if he can't get in touch about granting you direct access, roc.

tracking-firefox19: ? → +

Alex Keybl [:akeybl]

Comment 5

•

12 years ago

Anthony/Juan - do we have access to an AMD Radeon HD 6xxx card to test?

Keywords: qawanted, steps-wanted

juan becerra [:juanb]

Updated

•

12 years ago

QA Contact: jbecerra

u279076

Comment 6

•

12 years ago

My Radeon system is failing to boot. Juan, is there a computer in the QA lab you can use to assist?

u279076

Comment 7

•

12 years ago

I managed to get my system back up and running. I'm now trying to test various Facebook games as that seems to be the most commonly mentioned phrase in the useful comments.

u279076

Comment 8

•

12 years ago

Can someone provide more specific details on what QA should be testing? I tried doing various things on Facebook in normal browsing mode and private browsing mode since that's all I have to go on in the comments. I have not crashed once.

Scoobidiver (away)

Reporter

Comment 9

•

12 years ago

(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #8)
> Can someone provide more specific details on what QA should be testing?

What is your Graphics section in about:support?

u279076

Comment 10

•

12 years ago

(In reply to Scoobidiver from comment #9)
> What is your Graphics section in about:support?

Here you go:
> Adapter Description: AMD Radeon HD 6450
> Adapter Drivers: aticfx64 aticfx64 aticfx64 aticfx32 aticfx32 aticfx32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64
> Adapter RAM: 1024
> Device ID: 0x6779
> DirectWrite Enabled: false (6.1.7600.16385)
> Driver Date: 12-19-2012
> Driver Version: 9.12.0.0
> GPU #2 Active: false
> GPU Accelerated Windows: 0/1 Basic
> Vendor ID: 0x1002
> AzureCanvasBackend: cairo
> AzureContentBackend: none
> AzureFallbackCanvasBackend: none

Scoobidiver (away)

Reporter

Comment 11

•

12 years ago

(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #10)
> > Device ID: 0x6779

Only GPUs with 0x980n device IDs are affected, that is Radeon HD 6250 (Device 9804), 6310 (Device 9802), and 6320 (Device 9806).

(In reply to Alex Keybl [:akeybl] from comment #4)
> will very likely hold the build if a fix hasn't been identified yet.

Building Beta 6 may make those crashes disappear. If that fails, you can also blocklist those devices for D2D and D3D10.
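The device-ID filter described in this comment can be sketched in Python. This is only an illustration and not Mozilla's blocklist code: `parse_app_notes` and `is_affected` are hypothetical helper names, and the sample string is the adapter part of the App Notes field from the crash report in the description.

```python
import re

# Device IDs named in the comment above: Radeon HD 6310, 6250 and 6320.
AFFECTED_DEVICE_IDS = {0x9802, 0x9804, 0x9806}
AMD_VENDOR_ID = 0x1002  # AdapterVendorID shown in the crash report


def parse_app_notes(notes):
    """Extract 'AdapterXxx: value' pairs from an App Notes string (hypothetical helper)."""
    return dict(re.findall(r"(Adapter\w+): ([0-9A-Za-z.]+)", notes))


def is_affected(notes):
    """Return True if the notes describe one of the three GPUs listed above."""
    fields = parse_app_notes(notes)
    try:
        vendor = int(fields["AdapterVendorID"], 16)
        device = int(fields["AdapterDeviceID"], 16)
    except (KeyError, ValueError):
        return False  # missing or malformed adapter fields
    return vendor == AMD_VENDOR_ID and device in AFFECTED_DEVICE_IDS


notes = ("AdapterVendorID: 0x1002, AdapterDeviceID: 0x9802, "
         "AdapterSubsysID: 00000000, AdapterDriverVersion: 9.12.0.0")
print(is_affected(notes))  # True
```

A report from the Radeon HD 6450 in comment 10 (Device ID 0x6779) would not match, which fits the failed reproduction attempts on that hardware.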

juan becerra [:juanb]

Comment 12

•

12 years ago

I haven't been able to reproduce the problem. I tried several scenarios from what I could gather in the comments:
- Updating to 19b5 from a previous beta
- Visiting most of the URLs reported and browsing within
- Playing a game while in Facebook
- Playing videos in youtube.com
- Sending and opening email in outlook.com
- Upgrading Flash
- Installing a URL enhancer add-on

Graphics information from about:support:
Adapter Description: AMD Radeon HD 6900 Series
Adapter Drivers: aticfx32 aticfx32 aticfx32 atiumdag atidxx32 atiumdva
Adapter RAM: 2048
Device ID: 0x6719
Direct2D Enabled: true
DirectWrite Enabled: true (6.1.7600.16385)
Driver Date: 9-27-2012
Driver Version: 9.2.0.0
GPU #2 Active: false
GPU Accelerated Windows: 2/2 Direct3D 10
Vendor ID: 0x1002
WebGL Renderer: Google Inc. -- ANGLE (AMD Radeon HD 6900 Series)
AzureCanvasBackend: direct2d
AzureContentBackend: direct2d
AzureFallbackCanvasBackend: cairo

Scoobidiver (away)

Reporter

Comment 13

•

12 years ago

(In reply to juan becerra [:juanb] from comment #12)
> I haven't been able to reproduce the problem.
> Device ID
> 0x6719

It's normal. See comment 11.

juan becerra [:juanb]

Comment 14

•

12 years ago

(In reply to Scoobidiver from comment #13)
> (In reply to juan becerra [:juanb] from comment #12)
> > I haven't been able to reproduce the problem.
> > Device ID
> > 0x6719
> It's normal. See comment 11.

Where can we find a list that maps device IDs to gfx card series? This comes in handy.

u279076

Comment 15

•

12 years ago

(In reply to Scoobidiver from comment #11)
> Only GPUs with 0x980n device IDs are affected, that is Radeon HD 6250
> (Device 9804), 6310 (Device 9802), and 6320 (Device 9806).

Updating the bug to more accurately reflect this fact.

Juan, do you know if we have either of these devices in the lab?

Summary: crash in nsStyleContext::nsStyleContext mainly with AMD Radeon HD 6xxx series → crash in nsStyleContext::nsStyleContext mainly with AMD Radeon HD 6250/6310/6320

Scoobidiver (away)

Reporter

Comment 16

•

12 years ago

(In reply to juan becerra [:juanb] from comment #14)
> Where can we find a list that maps device IDs to gfx card series? This comes
> in handy.

http://www.pcidatabase.com

juan becerra [:juanb]

Comment 17

•

12 years ago

(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #15)
> (In reply to Scoobidiver from comment #11)
> > Only GPUs with 0x980n device IDs are affected, that is Radeon HD 6250
> > (Device 9804), 6310 (Device 9802), and 6320 (Device 9806).
>
> Updating the bug to more accurately reflect this fact.
>
> Juan, do you know if we have either of these devices in the lab?

We don't have one of those cards in the machines I checked.

u279076

Comment 18

•

12 years ago

I did some research and these GPUs are embedded with AMD's E-series CPUs:
- Radeon 6250 comes with AMD C-50 PCs
- Radeon 6310 comes with AMD E-350 PCs
- Radeon 6320 comes with AMD E-450 PCs

These are common configurations for ASUS EeePC and Zotac mini PCs. Juan is currently trying to track one of these down at the Mountain View office.

Robert Kaiser

Comment 19

•

12 years ago

I have sent three minidumps to roc and filed bug 839717 to get him direct access to those.

juan becerra [:juanb]

Comment 20

•

12 years ago

I couldn't find any of those machines here in the office, and I wasn't able to find any machines with those gfx cards in a couple of local stores. I forwarded a request to our desktop team to see if anyone local had any.

Scoobidiver (away)

Reporter

Comment 21

•

12 years ago

You have a machine somewhere. See https://bugzilla.mozilla.org/show_bug.cgi?id=700288#c19

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Crash Signature: nsCSSPseudoEl...] [@ memmove | nsVoidArray::RemoveElementsAt(int, int)] [@ nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType, nsIStyleSheet*)] → nsCSSPseudoEl...] [@ memmove | nsVoidArray::RemoveElementsAt(int, int)] [@ nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType, nsIStyleSheet*)] [@ @0x0 | nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType nsIStyleSheet*)] [@ nsStyleSet::GatherRuleProc…

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Crash Signature: nsCSSPseudoEl...] [@ memmove | nsVoidArray::RemoveElementsAt(int, int)] [@ nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType, nsIStyleSheet*)] [@ @0x0 | nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType nsIStyleSheet*)] [@ nsStyleSet::GatherRuleProc… → nsCSSPseudoEl...] [@ memmove | nsVoidArray::RemoveElementsAt(int, int)] [@ nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType, nsIStyleSheet*) ] [@ @0x0 | nsStyleSet::AppendStyleSheet(nsStyleSet::sheetType nsIStyleSheet*) ] [@ nsStyleSet::GatherRulePr…

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Crash Signature: nsIStyleSheet*) ] [@ nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType) ] [@ @0x0 | nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType)] [@ nsDocument::FillStyleSet(nsStyleSet*) ] → nsIStyleSheet*) ] [@ nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType) ] [@ @0x0 | nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType)] [@ nsDocument::FillStyleSet(nsStyleSet*)]

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 22

•

12 years ago

Here's what I've found so far, examining the minidump for 8ba4a697-3162-490d-aa78-3a3792130209:

We crashed at 71F5A8CB. This code is at 1005A8CB in the dumpbin.exe disassembly of xul.dll. That is an invalid EIP --- it's really the middle of the instruction at 1005A8C9, as is clear from the dumpbin disassembly. Also, as the call stack says, 1005A8CB is in some nsTArray instance that has nothing to do with the style system.

The instruction at 1005A8CB is "or byte ptr [ebp+53080A54h],cl", and inspecting the value of EBP in the minidump shows that ebp+53080A54 is the reported crash address. The value of EBP in the minidump is close to ESP, which is also what you'd expect. So we really did land here at this invalid EIP and crash. The question is how.

The caller according to the crash-stats stack (which looks entirely valid to me, although VS2012 doesn't show a valid stack from the minidump) is http://hg.mozilla.org/releases/mozilla-beta/annotate/3e382bb14817/layout/style/nsStyleContext.cpp#l57, which is calling nsStyleContext::AddChild. It just so happens that nsStyleContext::AddChild is at 1005A88E, which is the function before the nsTArray function we crashed in! But inspecting the disassembly for AddChild shows that it's entirely self-contained and valid. There is no way that control flow could escape and accidentally jump to an invalid address. I'll reproduce it here for posterity:

?AddChild@nsStyleContext@@IAEXPAV1@@Z:
  1005A88E: 8B 51 1C     mov  edx,dword ptr [ecx+1Ch]
  1005A891: 83 7A 04 00  cmp  dword ptr [edx+4],0
  1005A895: 74 0C        je   1005A8A3
  1005A897: 83 C0 04     add  eax,4
  1005A89A: 8B 10        mov  edx,dword ptr [eax]
  1005A89C: 85 D2        test edx,edx
  1005A89E: 75 08        jne  1005A8A8
  1005A8A0: 89 08        mov  dword ptr [eax],ecx
  1005A8A2: C3           ret
  1005A8A3: 83 C0 08     add  eax,8
  1005A8A6: EB F2        jmp  1005A89A
  1005A8A8: 89 51 10     mov  dword ptr [ecx+10h],edx
  1005A8AB: 8B 10        mov  edx,dword ptr [eax]
  1005A8AD: 8B 52 0C     mov  edx,dword ptr [edx+0Ch]
  1005A8B0: 89 51 0C     mov  dword ptr [ecx+0Ch],edx
  1005A8B3: 8B 10        mov  edx,dword ptr [eax]
  1005A8B5: 8B 52 0C     mov  edx,dword ptr [edx+0Ch]
  1005A8B8: 89 4A 10     mov  dword ptr [edx+10h],ecx
  1005A8BB: 8B 10        mov  edx,dword ptr [eax]
  1005A8BD: 89 4A 0C     mov  dword ptr [edx+0Ch],ecx
  1005A8C0: EB DE        jmp  1005A8A0

Followed by the nsTArray code:

??$IndexOf@VtxExpandedName@@VtxMapItemComparator@@@?$nsTArray@UMapItem@txExpandedNameMap_base@@UnsTArrayDefaultAllocator@@@@QBEIABVtxExpandedName@@IABVtxMapItemComparator@@@Z:
  1005A8C2: 8B 08        mov  ecx,dword ptr [eax]
  1005A8C4: 8B 11        mov  edx,dword ptr [ecx]
  1005A8C6: 6B D2 0C     imul edx,edx,0Ch
  1005A8C9: 8D 41 08     lea  eax,[ecx+8]
  1005A8CC: 8D 54 0A 08  lea  edx,[edx+ecx+8]
  1005A8D0: 53           push ebx

I'm low on theories to explain this :-(. Looking at a number of crash reports, the crash address is always about what you'd expect when EBP is pointing to the stack and ebp+53080A54 is accessed, so it looks like we're reliably jumping to this exact same instruction in every crash.
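The address arithmetic in this comment can be checked with a short Python sketch. It is illustrative only: the preferred image base of 0x10000000 is an assumption inferred from the dumpbin addresses, the helper names are hypothetical, and the sample crash address is the one from the report in the description rather than from the minidump examined here.

```python
RUNTIME_EIP = 0x71F5A8CB   # crashing EIP in the minidump
DISASM_ADDR = 0x1005A8CB   # the same instruction in the dumpbin listing
ASSUMED_PREFERRED_BASE = 0x10000000  # assumption: image base used by dumpbin
DISPLACEMENT = 0x53080A54  # from "or byte ptr [ebp+53080A54h],cl"


def load_base(runtime_addr, disasm_addr, preferred_base=ASSUMED_PREFERRED_BASE):
    """Address at which the module was actually loaded (ASLR moves it)."""
    return runtime_addr - (disasm_addr - preferred_base)


def module_offset(addr, base):
    """Offset of an address within the module; independent of ASLR."""
    return addr - base


def ebp_from_crash_address(crash_addr, displacement=DISPLACEMENT):
    """EBP implied by a reported crash address, with 32-bit wraparound."""
    return (crash_addr - displacement) & 0xFFFFFFFF


base = load_base(RUNTIME_EIP, DISASM_ADDR)
print(hex(base))                                # 0x71f00000
print(hex(module_offset(RUNTIME_EIP, base)))    # 0x5a8cb
print(hex(ebp_from_crash_address(0x533B96FC)))  # 0x338ca8
```

The last value is a small address of the kind a 32-bit Windows thread stack typically occupies, which is in line with the observation that EBP points into the stack when the bogus instruction runs.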

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 23

•

12 years ago

The value at *ESP is exactly the return address for the call to AddChild, so it looks like ESP hasn't been disturbed since we entered AddChild (which does not touch ESP at all).

The disassembly for the instructions leading up to the crashing instruction looks like this in VS:

  71F5A8C5  11 6B D2           adc  dword ptr [ebx-2Eh],ebp
  71F5A8C8  0C 8D              or   al,8Dh
  71F5A8CA  41                 inc  ecx
  71F5A8CB  08 8D 54 0A 08 53  or   byte ptr [ebp+53080A54h],cl

Following control flow from the caller of AddChild, we can see that the caller passes ESI for aChild, which AddChild gets in ECX. (And AddChild doesn't modify ESI.) So if we passed through 71F5A8CA, then we should have ECX = ESI + 1, which in fact we do (0x1ab818f1 vs 0x1ab818f0). So we did pass through 71F5A8CA.

Also, in AddChild EAX is always the address of 'this', which would be at least 4-byte aligned. But EAX is 1B2F938D, so the "or al,8Dh" must have been executed, so we passed through 71F5A8C8 too.

I kinda suspect the adc instruction was not executed because EBX is 0x70000000 and I guess 0x70000000-0x2e was probably not mapped. EBX is some kind of flag, not an address, when AddChild is called. We could not have landed at 71F5A8C7, since it decodes to more than one byte, so 71F5A8C8 would not have executed. So I think we landed at 71F5A8C8 somehow.
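The register reasoning above can be replayed in Python as a sanity check. This is a sketch, not a disassembler: the byte string is copied from the dumpbin listing in comment 22, the instruction boundaries are taken from the VS disassembly quoted above (using image addresses rather than runtime ones), and the helper names are hypothetical.

```python
# Code bytes from the dumpbin listing, starting at image address 1005A8C4.
CODE_START = 0x1005A8C4
CODE = bytes.fromhex("8B11 6BD20C 8D4108 8D540A08 53")

# Misaligned decode reported by VS: (image address, length, text).
MISALIGNED = [
    (0x1005A8C5, 3, "adc dword ptr [ebx-2Eh],ebp"),
    (0x1005A8C8, 2, "or al,8Dh"),
    (0x1005A8CA, 1, "inc ecx"),
    (0x1005A8CB, 6, "or byte ptr [ebp+53080A54h],cl"),
]


def code_at(addr, length):
    """Return the raw bytes of the instruction starting at an image address."""
    start = addr - CODE_START
    return CODE[start:start + length]


def crashing_displacement():
    """32-bit displacement of the 'or' at 1005A8CB (opcode, ModRM, then disp32)."""
    return int.from_bytes(code_at(0x1005A8CB, 6)[2:], "little")


def registers_consistent(eax, ecx, esi):
    """Check the two register effects the comment relies on."""
    passed_inc_ecx = ecx == ((esi + 1) & 0xFFFFFFFF)          # inc ecx ran once
    passed_or_al = (eax & 0x8D) == 0x8D and eax % 4 != 0      # or al,8Dh ran
    return passed_inc_ecx, passed_or_al


print(hex(crashing_displacement()))                              # 0x53080a54
print(registers_consistent(0x1B2F938D, 0x1AB818F1, 0x1AB818F0))  # (True, True)
```

Both checks pass for the register values in the minidump, which supports the conclusion that execution entered the misaligned stream no later than 71F5A8C8.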

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 24

•

12 years ago

I can't think of anything other than some kind of weird hardware or kernel/driver fault that mutates the program counter unexpectedly. But AddChild is just leaf code, it doesn't call anything or touch anything unusual. How could it be reliably triggering some kind of fault? Would it help if we blacklisted the Radeon driver/hardware combinations implicated in this crash?

David Baron :dbaron:

Comment 25

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #24)
> I can't think of anything other than some kind of weird hardware or
> kernel/driver fault that mutates the program counter unexpectedly. But
> AddChild is just leaf code, it doesn't call anything or touch anything
> unusual. How could it be reliably triggering some kind of fault?
>
> Would it help if we blacklisted the Radeon driver/hardware combinations
> implicated in this crash?

See bug 755974, bug 722538. I thought I saw a comment in one of the other dependencies of bug 772330 suggesting something else to blocklist, though perhaps it predated one of those.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 26

•

12 years ago

Most, but not all, of these crash reports have D2D and D3D10 enabled, presumably not blocklisted.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 27

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #24)
> I can't think of anything other than some kind of weird hardware or
> kernel/driver fault that mutates the program counter unexpectedly. But
> AddChild is just leaf code, it doesn't call anything or touch anything
> unusual. How could it be reliably triggering some kind of fault?

The only way I can think of right now is that the Radeon driver (or something triggered by it) actually corrupts our code. This could happen due to it trying to patch code and failing, or perhaps due to some kind of fairly deterministic general memory corruption.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 28

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #27)
> The only way I can think of right now is that the Radeon driver (or
> something triggered by it) actually corrupts our code. This could happen due
> to it trying to patch code and failing, or perhaps due to some kind of
> fairly deterministic general memory corruption.

... but the minidump captures code around the crashing EIP and the captured code bytes are correct. So that's probably not it.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 29

•

12 years ago

All three minidumps I examined show that they all crash in the same place within libxul.dll, even though thanks to ASLR those are different addresses. Per the last paragraph of comment 22, the crash addresses reported in crash-stats suggest this is true for all the crashes. So whatever triggers this is aware of, or dependent on, where libxul is loaded.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 30

•

12 years ago

I don't have any good ideas for mitigating this problem, but I have some bad ones. One idea would be to add some padding to nsStyleContext::AddChild --- useless code that doesn't do anything and that we branch over --- to see if that makes the problem go away or at least reappear somewhere with less impact. That's shooting in the dark, and we'd need crash-stats to tell us if we were making things worse or better.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 31

•

12 years ago

It might make more sense to just keep uplifting patches to beta, do frequent beta respins and pick one to release that looks best on crash-stats.

David Baron :dbaron:

Comment 32

•

12 years ago

Except these crashes tend to be build-dependent; they go away in the build after the one they appear in. So this will likely be gone in the next beta build anyway.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 33

•

12 years ago

Right, the important thing is to ship a build where the bug has just gone, rather than where it has just come back :-).

Robert Kaiser

Comment 34

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #33)
> Right, the important thing is to ship a build where the bug has just gone,
> rather than where it has just come back :-).

Actually, scratch that. We have at least three different signatures for this bug in 19.0b5 alone:
https://crash-stats.mozilla.com/report/list?signature=nsTArray%3CtxExpandedNameMap_base%3A%3AMapItem%2C%20nsTArrayDefaultAllocator%3E%3A%3AIndexOf%3CtxExpandedName%2C%20txMapItemComparator%3E%28txExpandedName%20const%26%2C%20unsigned%20int%2C%20txMapItemComparator%20const%26%29%20%7C%20nsStyleContext%3A%3AnsStyleContext%28nsStyleContext%2A%2C%20nsIAtom%2A%2C%20nsCSSPseudoEl...
https://crash-stats.mozilla.com/report/list?signature=nsStyleSet%3A%3AAppendStyleSheet%28nsStyleSet%3A%3AsheetType%2C%20nsIStyleSheet%2A%29
https://crash-stats.mozilla.com/report/list?signature=memmove%20%7C%20nsVoidArray%3A%3ARemoveElementsAt%28int%2C%20int%29
(And more; see the Crash Signature field in this bug.)

Robert Kaiser

Comment 35

•

12 years ago

Erm, I replied to the wrong comment; I wanted to reply to my own comment in bug 772330 about one signature per build, which is just not true.

I think we never had as bad an explosion of those crashes as with 19.0b5, though. :(

Alex Keybl [:akeybl]

Comment 36

•

12 years ago

Do we really believe it's coincidence that FF19.0b4 ran into bug 837371 and b5 ran into this top crasher?

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #30)
> I don't have any good ideas for mitigating this problem, but I have some bad
> ones. One idea would be to add some padding to nsStyleContext::AddChild ---
> useless code that doesn't do anything and that we branch over --- to see if
> that makes the problem go away or at least reappear somewhere with less
> impact. That's shooting in the dark, and we'd need crash-stats to tell us if
> we were making things worse or better.

We're trying to go to build with our final beta no later than tomorrow, so it sounds like padding will be the way to go. Can you prepare a patch?

(In reply to David Baron [:dbaron] from comment #32)
> Except these crashes tend to be build-dependent; they go away in the build
> after the one they appear in. So this will likely be gone in the next beta
> build anyway.

Do you have any alternate suggestions? Should we be pulling somebody in from the gfx team to help?

tracking-firefox19: + → ?

Alex Keybl [:akeybl]

Comment 37

•

12 years ago

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #35)
> Erm, I replied to the wrong comment, wanted to reply to my own comment in
> bug 772330 about one signature per build, that's just not true.
>
> I think we never had that bad an explosion of those crashes than with
> 19.0b5, though. :(

If we go to build Friday, do you think we'll have enough early data (from users installing our FTP) in crash-stats to know whether this will be a top crasher for the release? Early b5 data did show that this was going to be a top crasher in that build.

tracking-firefox19: ? → +

juan becerra [:juanb]

Comment 38

•

12 years ago

I found a machine where I'm trying to reproduce this. I'll report back in a half hour.

Scoobidiver (away)

Reporter

Comment 39

•

12 years ago

(In reply to Alex Keybl [:akeybl] from comment #36)
> Do we really believe it's coincidence that FF19.0b4 ran into bug 837371 and
> b5 ran into this top crasher?

There is no Layout bug in the regression range, or at least none classified as such, so it means that Layout addresses have shifted again, likely because other XUL components whose addresses come before the Layout ones have changed.

(In reply to Alex Keybl [:akeybl] from comment #37)
> If we go to build Friday, do you think we'll have enough early data (from
> users installing our FTP) in crash-stats to know whether this will be a top
> crasher for the release? Early b5 data did show that this was going to be a
> top crasher in that build.

Yes. If it's still a top crasher, we can land bug 755974 and bug 722538 again for Firefox 19 only (not 20 and above).

Alex Keybl [:akeybl]

Comment 40

•

12 years ago

(In reply to Scoobidiver from comment #39)
> (In reply to Alex Keybl [:akeybl] from comment #37)
> > If we go to build Friday, do you think we'll have enough early data (from
> > users installing our FTP) in crash-stats to know whether this will be a top
> > crasher for the release? Early b5 data did show that this was going to be a
> > top crasher in that build.
> Yes. If it's still a top crasher, we can land again bug 755974 and bug
> 722538 for only Firefox 19 (not 20 and above).

Let's not wait till we get beta 6 feedback before testing bug 755974 and bug 722538 again. Let's roll back bug 790169. We'll know whether it was a re-spin (b6) or the blocklist that resolved the issue by analyzing b5 crash data.

Alex Keybl [:akeybl]

Comment 41

•

12 years ago

Filed bug 840161.

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Depends on: 840161

juan becerra [:juanb]

Comment 42

•

12 years ago

Marcia and I were able to reproduce the problem on a machine with an HD 6310 gfx card. First of all, the about:support information, in a new profile, said the gfx card was disabled, and probably because of this I had not been able to crash this morning. So we flipped the pref to force enable it and tried again. However we weren't able to crash after trying some of the sites and functionality we had tried before. Then we updated the graphics card driver to the latest, by going to the AMD web site and using their auto detection system to find whether it needed an upgrade. We tried again for a while without any luck, but then I tried to check a dummy Hotmail account email, and when I opened a new message, it crashed with the crash signature in question. I haven't tried reproducing multiple times yet.

Robert Kaiser

Comment 43

•

12 years ago

(In reply to Alex Keybl [:akeybl] from comment #37)
> If we go to build Friday, do you think we'll have enough early data (from
> users installing our FTP) in crash-stats to know whether this will be a top
> crasher for the release?

Yes, usually we see enough data from that to know at least by Tuesday morning if something this bad is going on. No guarantees, of course, but chances are good.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 44

•

12 years ago

Ok.

(In reply to Alex Keybl [:akeybl] from comment #36)
> We're trying to go to build with our final beta no later than tomorrow, so
> it sounds like padding will be the way to go. Can you prepare a patch?

I'll try. We're shooting in the dark, it might not work.

> Do you have any alternate suggestions? Should we be pulling somebody in from
> the gfx team to help?

Bringing more people in to scratch their heads is probably not going to help.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 45

•

12 years ago

Attached patch: experimental patch (obsolete) (deleted)

https://tbpl.mozilla.org/?tree=Try&rev=ab6acac39006 I will inspect those builds once they're done to see if the compiler is doing what I expect.

Attachment #712572 - Flags: review?(dbaron)

juan becerra [:juanb]

Comment 46

•

12 years ago

I've tried again with 19b5 and a fresh profile, where the information in about:support shows that acceleration is enabled by default. I can reproduce this problem somewhat consistently following these steps:
1. Log in to hotmail.com (in my case I have enabled the Outlook interface)
2. Try to open messages or write a message and try to send

There's nothing special about that account other than having created it a few days ago. It's pretty clean.

Here's my last crash:
https://crash-stats.mozilla.com/report/index/bp-8ce650d2-4d3a-4ded-964a-75bee2130211

And gfx card information from about:support:

Graphics
Adapter Description: AMD Radeon HD 6310 Graphics
Adapter Drivers: aticfx64 aticfx64 aticfx64 aticfx32 aticfx32 aticfx32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64
Adapter RAM: 384
Device ID: 0x9802
Direct2D Enabled: true
DirectWrite Enabled: true (6.1.7601.17789)
Driver Date: 12-19-2012
Driver Version: 9.12.0.0
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 10
Vendor ID: 0x1002
WebGL Renderer: Google Inc. -- ANGLE (AMD Radeon HD 6310 Graphics)
AzureCanvasBackend: direct2d
AzureContentBackend: direct2d
AzureFallbackCanvasBackend: cairo

Keywords: qawanted, steps-wanted

David Baron :dbaron:

Comment 47

•

12 years ago

Comment on attachment 712572: experimental patch

I suspect PGO is likely to move this somewhere other than where you want, but r=dbaron... though you might want to make it #ifdef XP_WIN.

Attachment #712572 - Flags: review?(dbaron) → review+

Marcia Knous [:marcia]

Comment 48

•

12 years ago

I was able to reproduce the first time I logged into Hotmail, but on subsequent tries I have not yet been able to replicate it.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 49

•

12 years ago

Comment on attachment 712572: experimental patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): unknown
User impact if declined: random crashing
Testing completed (on m-c, etc.): none
Risk to taking this patch (and alternatives if risky): basically zero risk
String or UUID changes made by this patch: none

Attachment #712572 - Flags: approval-mozilla-beta?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 50

•

12 years ago

(In reply to David Baron [:dbaron] from comment #47)
> I suspect PGO is likely to move this somewhere other than where you want,

If it does, that might help too. Or hurt.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 51

•

12 years ago

(In reply to David Baron [:dbaron] from comment #47)
> I suspect PGO is likely to move this somewhere other than where you want,
> but r=dbaron... though you might want to make it #ifdef XP_WIN.

I made it XP_WIN because the bogo-printf code causes fatal warnings on Mac at least.

Alex Keybl [:akeybl]

Comment 52

•

12 years ago

Comment on attachment 712572: experimental patch

Approving for beta. Thanks for throwing this together so quickly, roc.

Attachment #712572 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 53

•

12 years ago

Unfortunately, inspection of the PGO build shows that it didn't work. The linker guessed the code I added would not be executed much and moved it far away.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 54

•

12 years ago

At this point, I think comment #31/#32 are our best bet: i.e., keep landing beta patches until the crash goes away, and ship that.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 55

•

12 years ago

(In reply to juan becerra [:juanb] from comment #42)

If you can get a full dump for the crashed process and put it somewhere I can download it, that would be great. You should be able to do it from Task Manager:
http://blogs.msdn.com/b/junfeng/archive/2008/06/19/getting-a-full-memory-dump-for-a-process.aspx

Alternatively you could install Visual Studio on that machine and I could try VNCing in.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 56

•

12 years ago

Attached patch: another try at a patch (deleted)

Attachment #712572 - Attachment is obsolete: true

Attachment #712795 - Flags: review?(dbaron)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 57

•

12 years ago

https://tbpl.mozilla.org/?tree=Try&rev=a9db52c90ad2

David Baron :dbaron:

Comment 58

•

12 years ago

Comment on attachment 712795: another try at a patch

r=dbaron, but maybe also #ifdef _MSC_VER so it doesn't bug the people building with the mingw toolchain?

Attachment #712795 - Flags: review?(dbaron) → review+
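For readers following the guard discussion in comments 51 and 58, here is a minimal sketch of how the two conditions could be combined. The patch itself is deleted, so this does not reproduce it: `HYPOTHETICAL_LAYOUT_WORKAROUND` and `LayoutWorkaroundEnabled` are illustrative names invented here, not identifiers from the Mozilla tree. The sketch only assumes that `XP_WIN` is defined by the build for Windows targets and that `_MSC_VER` is predefined by the Microsoft compiler but not by mingw.

```cpp
// Hypothetical sketch only: these names are illustrative and do not come
// from the (deleted) patch.
// XP_WIN:   assumed to be set by the build system for Windows targets.
// _MSC_VER: predefined by the Microsoft compiler; mingw does not define it.
#if defined(XP_WIN) && defined(_MSC_VER)
#  define HYPOTHETICAL_LAYOUT_WORKAROUND 1
#else
#  define HYPOTHETICAL_LAYOUT_WORKAROUND 0
#endif

// Reports whether the guarded workaround code would be compiled into
// this build.
constexpr bool LayoutWorkaroundEnabled() {
  return HYPOTHETICAL_LAYOUT_WORKAROUND != 0;
}
```

With a guard like this, a mingw build of a Windows target (where `XP_WIN` is set but `_MSC_VER` is not) would skip the workaround, which is the effect suggested in comment 58.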

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 59

•

12 years ago

That patch worked. I will carry forward the previous approval and land on beta now.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 60

•

12 years ago

https://hg.mozilla.org/releases/mozilla-beta/rev/23f455023faf

Scoobidiver (away)

Reporter

Updated

•

12 years ago

status-firefox19: affected → fixed

u279076

Comment 61

•

12 years ago

Juan, can you please verify this is fixed on your test machine once we have 19.0b6 candidate builds? They should start appearing here today: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/19.0b6-candidates/build1/ Thanks

Keywords: verifyme

juan becerra [:juanb]

Comment 62

•

12 years ago

Using 19b6 (build1) I was not able to crash after about 20 minutes of trying using the steps in comment #46. However, I had also tried to reproduce in 19b5 for a little while and today it was a little more difficult to crash. I consider this conditionally verified pending confirmation with crash-stats data once we release this beta.

Robert Kaiser

Comment 63

•

12 years ago

So far, we're not seeing that kind of signature in early 19.0b6 data (before it was actually released publicly, though): https://crash-stats.mozilla.com/topcrasher/byversion/Firefox/19.0b6/7

Robert Kaiser

Comment 64

•

12 years ago

A recent Nightly flared up with nsStyleSet::GetContext, see https://crash-stats.mozilla.com/report/list?signature=nsStyleSet%3A%3AGetContext%28nsStyleContext%2A%2C%20nsRuleNode%2A%2C%20nsRuleNode%2A%2C%20nsIAtom%2A%2C%20nsCSSPseudoElements%3A%3AType%2C%20mozilla%3A%3Adom%3A%3AElement%2A%2C%20unsigned%20int%29

Crash Signature: nsIStyleSheet*) ] [@ nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType) ] [@ @0x0 | nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType)] [@ nsDocument::FillStyleSet(nsStyleSet*)] → nsIStyleSheet*) ] [@ nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType) ] [@ @0x0 | nsStyleSet::GatherRuleProcessors(nsStyleSet::sheetType)] [@ nsDocument::FillStyleSet(nsStyleSet*)] [@ nsStyleSet::GetContext(nsStyleContext*, nsRuleNode* nsRuleNo…

juan becerra [:juanb]

Comment 65

•

12 years ago

Following up from comment #62, I am not able to reproduce now on the release candidates for 19.0, and I can't identify a top crash signature related to this in crash-stats for either 19.0b6 or 19.0. I would mark this as verified; however, I don't know what to make of comment #64.

David Baron :dbaron:

Comment 66

•

12 years ago

What's the purpose of trying to mark the bug as verified? This bug (and all variants of bug 772330) are known to appear and disappear based on some unknown characteristic of the build. So until we either (a) figure out what that characteristic is or (b) wait a decent amount of time, we won't know if it's fixed. The patch in this bug was also landed only on beta, though I think there may also have been a blocklist change in another bug.

Alex Keybl [:akeybl]

Comment 67

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #55)
> (In reply to juan becerra [:juanb] from comment #42)
>
> If you can get a full dump for the crashed process and put it somewhere I
> can download it, that would be great. You should be able to do it from Task
> Manager:
> http://blogs.msdn.com/b/junfeng/archive/2008/06/19/getting-a-full-memory-dump-for-a-process.aspx
>
> Alternatively you could install Visual Studio on that machine and I could
> try VNCing in.

ni? on juanb to grab this info using b5.

Flags: needinfo?(jbecerra)

juan becerra [:juanb]

Comment 68

•

12 years ago

Robert, I wasn't able to get a full dump for the crashed process, but try 10.250.6.86 to VNC into the machine. It's on the MV network.

Flags: needinfo?(jbecerra)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 69

•

12 years ago

Juan reproduced a crash on that machine with FF20 (I believe), and it looks like bug 819337. I'll take analysis over there.

Scoobidiver (away)

Reporter

Comment 70

•

12 years ago

There are no crashes in 19.0.

Keywords: topcrash

Scoobidiver (away)

Reporter

Comment 71

•

12 years ago

There's a spike of crashes in 23.0a1/20130403090950.

Severity: blocker → critical

Keywords: regression

BMO Automation

Updated

•

9 years ago

Crash Signature: , nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int) ] → , nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int) ] [@ memmove | nsVoidArray::RemoveElementsAt] [@ nsStyleSet::AppendStyleSheet ] [@ @0x0 | nsStyleSet::AppendStyleSheet ] [@ nsStyleSet::GatherRuleProcessors ] […

mirh

Updated

•

7 years ago

Depends on: 1335925

Sylvestre Ledru [:Sylvestre]

Updated

•

6 years ago

status-firefox-esr52: --- → wontfix

Keywords: verifyme

BugBot [:suhaib / :marco/ :calixte]

Comment 72

•

4 years ago

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED

Closed: 4 years ago

Resolution: --- → WORKSFORME

experimental patch (deleted), patch. 12 years ago, Robert O'Callahan (:roc) (email my personal email if necessary). Flags: dbaron: review+, akeybl: approval-mozilla-beta+
another try at a patch (deleted), patch. 12 years ago, Robert O'Callahan (:roc) (email my personal email if necessary). Flags: dbaron: review+