Closed
Bug 1293501
Opened 8 years ago
Closed 7 years ago
Add debug annotation/code for the crash in ReleaseData()
Categories
(Core :: Networking: HTTP, defect, P3)
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: ting, Assigned: ting)
References
Details
Attachments
(1 file, 5 obsolete files)
(deleted),
patch
|
jcristau
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #1145613 +++ I have checked the code which set nsTSubstring_CharT::mFlags with F_SHRED, but I don't see any of them could allow null mData. Based on crash-stats, nsHttpHeaderArray has the highest chance to meet the crash. And from the minidumps, it's crashed while destructing the nsCString |mValue| of the first nsEntry in nsHttpHeaderArray::mHeaders, so I'd like to see memory content the array's header also the first two elements (which are not existed in the minidump). This bug may also be used for adding some more debug code.
Assignee | ||
Comment 1•8 years ago
|
||
:mcmanus is taking PTO til 08-15, could help to review this?
Attachment #8779142 -
Flags: review?(jduell.mcbugs)
Updated•8 years ago
|
Assignee: nobody → janus926
Whiteboard: [necko-active]
Comment 2•8 years ago
|
||
Comment on attachment 8779142 [details] [diff] [review] part1 v1: add crash annotatios for the nsHttpHeaderArray Review of attachment 8779142 [details] [diff] [review]: ----------------------------------------------------------------- Seems ok to me!
Attachment #8779142 -
Flags: review?(jduell.mcbugs) → review+
Assignee | ||
Comment 3•8 years ago
|
||
Updated reviewer. Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e647162c7cb4
Attachment #8779142 -
Attachment is obsolete: true
Assignee | ||
Comment 4•8 years ago
|
||
This is for debug purpose, probably more patches to come. But for now there's only one.
Keywords: checkin-needed,
leave-open
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/16abccc347ef Add debug annotations to track down the crash of bug 1145613. r=jduell
Keywords: checkin-needed
Comment 6•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/16abccc347ef
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8780951 -
Attachment is obsolete: true
Assignee | ||
Comment 8•8 years ago
|
||
This time I'd like to know when |mData| is cleared to null, so I put a check in ReentrantMonitorAutoEnter, which will check before/after manipulating nsHttpHeaderArray.
Updated•8 years ago
|
Attachment #8785146 -
Flags: review?(mcmanus) → review?(dd.mozilla)
Comment 9•8 years ago
|
||
mozreview-review |
Comment on attachment 8785146 [details] Bug 1293501 - Add some more crash annotations to dianose bug 1145613. https://reviewboard.mozilla.org/r/74448/#review72576 Thanks for doing this.
Attachment #8785146 -
Flags: review?(dd.mozilla) → review+
Assignee | ||
Comment 10•8 years ago
|
||
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=144653b8d740
Keywords: checkin-needed
Comment 11•8 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/5661cf2d2b85 Add some more crash annotations to dianose bug 1145613. r=mcmanus
Keywords: checkin-needed
Comment 12•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5661cf2d2b85
Assignee | ||
Comment 13•8 years ago
|
||
I'll see if I can use debug registers to add a watchpoint on |mData| like https://hg.mozilla.org/try/rev/db9d9d4e2dab9cffdfec027316381a0fb775c9f7.
Assignee | ||
Comment 14•8 years ago
|
||
Based on bug 1247982 comment 41, the debug registers would need to apply on multiple threads.
Assignee | ||
Comment 15•8 years ago
|
||
(In reply to Ting-Yu Chou [:ting] from comment #14) > Based on bug 1247982 comment 41, the debug registers would need to apply on > multiple threads. Seems not true in content process, I saw only main thread that is accessing nsHttpRequestHead.
Comment 16•8 years ago
|
||
Pushed by tchou@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/6e6697a3d251 Remove debug code for not propagating to Aurora. r=me
Comment 17•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6e6697a3d251
Assignee | ||
Comment 18•8 years ago
|
||
It'd become https://crash-stats.mozilla.com/search/?reason=~EXCEPTION_ACCESS_VIOLATION_WRITE&product=Firefox if it's captured by this code.
Attachment #8785146 -
Attachment is obsolete: true
Attachment #8798718 -
Flags: review?(dd.mozilla)
Assignee | ||
Comment 19•8 years ago
|
||
Comment on attachment 8798718 [details] [diff] [review] v4: use virtualprotect to track down possible corruption This protects only the |mData| of the |value| of nsEntry, I should protect also the memory after |header|.
Attachment #8798718 -
Attachment is obsolete: true
Attachment #8798718 -
Flags: review?(dd.mozilla)
Assignee | ||
Comment 20•8 years ago
|
||
Add padding between header and value of nsEntry, set the page (starts from the array header to the first nsEntry value's |mData|) read only when ReentrantMonitorAutoEnter destructs.
Attachment #8798764 -
Flags: review?(dd.mozilla)
Updated•8 years ago
|
Attachment #8798764 -
Flags: review?(dd.mozilla) → review+
Assignee | ||
Comment 21•8 years ago
|
||
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fc93e936346c
Comment 22•8 years ago
|
||
Pushed by tchou@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/2eebd44ff2e9 Use VirtualProtect to track down a possible memory corruption. r=dragana
Comment 23•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/2eebd44ff2e9
Assignee | ||
Comment 24•8 years ago
|
||
I am using this search to monitor the possible culprit: https://crash-stats.mozilla.com/search/?reason=~EXCEPTION_ACCESS_VIOLATION_WRITE&build_id=>%3D20161012030211&product=Firefox&version=52.0a1&platform=Windows
Assignee | ||
Comment 25•8 years ago
|
||
A more accurate filter: https://crash-stats.mozilla.com/search/?reason=~EXCEPTION_ACCESS_VIOLATION_WRITE&build_id=%3E%3D20161012030211&address=%24c&product=Firefox&version=52.0a1&platform=Windows&date=%3E%3D2016-10-07T02%3A05%3A13.000Z&date=%3C2016-11-07T02%3A05%3A13.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=address#crash-reports
Assignee | ||
Comment 26•8 years ago
|
||
The crash still happens with the VirtualProtect, see bp-5e073847-8799-4ae1-b6b2-d42f22161105.
Assignee | ||
Comment 27•8 years ago
|
||
(In reply to Ting-Yu Chou [:ting] from comment #26) > The crash still happens with the VirtualProtect, see > bp-5e073847-8799-4ae1-b6b2-d42f22161105. This one is odd, open the dump with vs2015 and it has following in ReleaseData(): &iter->header = 0x0e03300c &iter->padding = 0x0e033010 &iter->value = 0x0e034000 The array content does not start from 0x...8 (sizeof nsTArrayHeader is 8) but 0x0e03300c, which iter->value.mData is outside of the readonly page.
Assignee | ||
Comment 28•8 years ago
|
||
I was wrong for comment 27, the address is not ended with 8 (0x0e0300c) because it's the 2nd nsEntry in the array. Note the patch in comment 23 protects only the first page of the nsHttpRequestHead.mHeaders: |nsTArrayHeader|nsEntry.header|padding|nsEntry.value.mData|
Comment 29•8 years ago
|
||
Hi Ting, now that the debugging code is on Aurora too, are we seeing anything useful in newer crash reports?
Flags: needinfo?(janus926)
Assignee | ||
Comment 30•8 years ago
|
||
No, there're only 5 EXCEPTION_ACCESS_VIOLATION_WRITE crashes for 52.0a2 which have the address end with 'c' in the past 2 weeks, and none of them are '0c' or 'fc'. I'd like to wait for few more days.
Flags: needinfo?(janus926)
Assignee | ||
Comment 31•8 years ago
|
||
Between Nov 12, 2016 and Dec 12, 2016 [1], there're 6 EXCEPTION_ACCESS_VIOLATION_WRITE crashes have address ends with 0xfc (the padding shifted offset of the first nsEntry's value.mData in the array): bp-574f8db7-fa10-4ab3-8d04-fb7532161212 F_767234858____________________________ bp-2ca9629f-c698-49b5-8646-3bdd92161206 ImplCycleCollectionTraverse<T> bp-a5cbad60-913d-4396-87a1-ece372161130 js::frontend::MatchOrInsertSemicolonHelper bp-797295af-ff22-4e94-8c77-7ad852161128 kernel32.dll@0x260be bp-a7825c72-f51f-4c48-92fe-f02282161125 js::irregexp::ExecuteCode<T> bp-9189e0d4-2334-4735-97e6-66c672161125 igdusc32.dll | igd10iumd32.dll | d3d11.dll@0x63ddb There aren't many crashes so I am not sure if the root cuase is captured, but bp-2ca9629f-c698-49b5-8646-3bdd92161206 looks suspicious as it has nsTArray_base::ShiftData in the crash stack. I'll take a look at it. I am going to backout the debug patch. [1] https://crash-stats.mozilla.com/search/?build_id=%3E%3D20161012030211&reason=~EXCEPTION_ACCESS_VIOLATION_WRITE&address=%24fc&product=Firefox&version=53.0a1&version=52.0a2&version=52.0a1&platform=Windows&date=%3E%3D2016-11-12T17%3A17%3A14.000Z&date=%3C2016-12-12T17%3A17%3A14.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=address&_columns=reason#crash-reports
Comment 32•8 years ago
|
||
Pushed by tchou@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/62b7071249cf Back out the debug patch (changeset 2eebd44ff2e9). r=me
Assignee | ||
Comment 33•8 years ago
|
||
Approval Request Comment [Feature/Bug causing the regression]: none [User impact if declined]: perf regression [Is this code covered by automated tests?]: not sure [Has the fix been verified in Nightly?]: no [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no [Why is the change risky/not risky?]: just back out a debug patch [String changes made/needed]: none
Attachment #8798764 -
Attachment is obsolete: true
Attachment #8818143 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 34•8 years ago
|
||
(In reply to Ting-Yu Chou [:ting] from comment #31) > There aren't many crashes so I am not sure if the root cuase is captured, > but bp-2ca9629f-c698-49b5-8646-3bdd92161206 looks suspicious as it has > nsTArray_base::ShiftData in the crash stack. I'll take a look at it. It doesn't seem relate to the bug here because: 1) From vs2015, the crash address 0x11ba52fc is __vfptr of the argument aCallback for ImplCycleCollectionTraverse() which is not nsHttpHeaderArray. 2) The stack in crash-stats is different from the one when open the crash dump by vs2015, which does not have nsTArray_base::ShiftData(): xul.dll!ImplCycleCollectionTraverse<nsIBrowserElementAPI>(nsCycleCollectionTraversalCallback & aCallback, nsCOMPtr<nsIBrowserElementAPI> & aField, const char *) Line 1102 C++ xul.dll!nsPurpleBuffer::PurpleBlock::VisitEntries<RemoveSkippableVisitor>(nsPurpleBuffer & aBuffer, RemoveSkippableVisitor & aVisitor) Line 1035 C++ xul.dll!SnowWhiteKiller::SnowWhiteKiller(nsCycleCollector * aCollector) Line 2649 C++ xul.dll!nsTimerImpl::Fire() Line 477 C++ xul.dll!nsTimerEvent::Run() Line 293 C++ xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 1222 C++ xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 96 C++ xul.dll!MessageLoop::RunHandler() Line 226 C++ xul.dll!MessageLoop::Run() Line 206 C++ xul.dll!nsBaseAppShell::Run() Line 158 C++ xul.dll!nsAppShell::Run() Line 262 C++ xul.dll!nsAppStartup::Run() Line 283 C++ xul.dll!XREMain::XRE_mainRun() Line 4467 C++ xul.dll!XREMain::XRE_main(int argc, char * * argv, const nsXREAppData * aAppData) Line 4600 C++ xul.dll!XRE_main(int argc, char * * argv, const nsXREAppData * aAppData, unsigned int aFlags) Line 4692 C++ firefox.exe!wmain(int argc, wchar_t * * argv) Line 115 C++ [External Code] 3) The assembly around ip (0x10ed62a2) does not write to any address: 10ED62A0 mov edi,dword ptr [edx] 10ED62A2 test byte ptr [esi+4],1 10ED62A6 je ImplCycleCollectionTraverse<nsIBrowserElementAPI>+19h (10ED62B5h)
Assignee | ||
Comment 35•8 years ago
|
||
VirtualProtect() doesn't catch noticeable invalid writes, could it be something in kernel mode? I don't know how to proceed without STR...
Comment 36•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/62b7071249cf
Comment 37•8 years ago
|
||
Comment on attachment 8818143 [details] [diff] [review] v4 backout backout a debugging patch from aurora52
Attachment #8818143 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 38•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/2af052da3031
status-firefox52:
--- → fixed
Updated•8 years ago
|
status-firefox51:
affected → ---
status-firefox52:
fixed → ---
Comment 39•7 years ago
|
||
Bulk priority update: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Comment 40•7 years ago
|
||
Ting, where are we with this bug? Should it stay open?
Flags: needinfo?(janus926)
Priority: P1 → P3
Whiteboard: [necko-active]
Assignee | ||
Comment 41•7 years ago
|
||
The patches I tried here didn't catch any culprits, and I don't have any other ideas how to proceed without STR.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(janus926)
Resolution: --- → INCOMPLETE
Comment 42•7 years ago
|
||
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in
before you can comment on or make changes to this bug.
Description
•