Closed
Bug 1276636
Opened 9 years ago
Closed 8 years ago
Crash in mozilla::places::AsyncFetchAndSetIconForPage::start
Categories
(Toolkit :: Places, defect, P2)
Tracking
()
People
(Reporter: Gijs, Unassigned)
References
Details
(Keywords: crash, regression)
Crash Data
This bug was filed from the Socorro interface and is
report bp-b02e97f5-bccf-48ed-bfc1-705222160528.
=============================================================
This is on beta and aurora.
I'm wondering if the out param is somehow not happy-making, but the reports seem to be from JS calls so that doesn't make an awful lot of sense to me.
Comment 1•9 years ago
|
||
yeah, so far it's a mistery to me. event is created with new, that should be infallible and _canceler is checked with NS_ENSURE_ARG_POINTER.
The only alternative I could think of, would be to use NS_IF_ADDREF(*_canceler = event); but it should basically be the same.
As a side note, it looks like a crash on startup.
I wonder if the fact there are no crashes on 48 is just due to a limited number of users, or something has been fixed in 48, but I couldn't find anything related by a quick look.
Updated•9 years ago
|
Priority: -- → P2
Reporter | ||
Comment 2•9 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #1)
> yeah, so far it's a mistery to me. event is created with new, that should be
> infallible and _canceler is checked with NS_ENSURE_ARG_POINTER.
Is it? Where? I don't see it here:
https://hg.mozilla.org/releases/mozilla-beta/file/tip/toolkit/components/places/AsyncFaviconHelpers.cpp#l377
It wasn't part of the csets that landed for bug 1265420 so it wasn't part of the code that landed on beta.
Even so, as noted, if we're calling from JS I don't understand why the arg pointer would be unusable. :-\
Flags: needinfo?(mak77)
Comment 3•9 years ago
|
||
(In reply to :Gijs Kruitbosch from comment #2)
> Is it? Where? I don't see it here:
it's in the API calling point, http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsFaviconService.cpp#220
Flags: needinfo?(mak77)
Updated•8 years ago
|
Keywords: regression
Startup crash, regression from 47 (bug 1255270)
status-firefox47:
--- → affected
status-firefox48:
--- → affected
status-firefox49:
--- → ?
tracking-firefox47:
--- → ?
tracking-firefox48:
--- → +
The overall crash volume on this one is really low on Beta47. 12 crashes in 7 days. Unless the fix is ready in the next 8 hrs and is low risk, we might have to ship with this on Fx47.
The risk I am taking here is when we do go to release, this crash might worsen but will it be bad enough to need a dot release, we will have to wait and watch.
Comment 6•8 years ago
|
||
All the new crashes are EXCEPTION_ACCESS_VIOLATION_WRITE which could be a security vulnerability. To the extent it seems to be at startup it's probably not exploitable -- but I did find one crash at 112 seconds and another at 226.
bp-341cc34b-bc10-4d28-826c-a13e32160529
bp-851be43e-e506-40f5-b6bc-e073f2160601
Of the 20 crashes I see there are only 5 unique people based on install time. One person crashed 10 times, another 6 times, another twice, and two people crashed once each.
There are some older crashes that might be related, and might indicate the recent changes are exposing a latent problem.
Back in 43.0 I see a use-after-free crash at nsCOMPtr_base::~nsCOMPtr_base | mozilla::places::AsyncFetchAndSetIconForPage::start
bp-b421bf83-4689-445b-ad26-056282160602
(maybe the callback has gone away?)
In 46.0.1 I see two crashes that are a read access violation on the address 0x7f532015, which is also the top of the stack. (both seem to be startup, and the same person judging by install time.) This is in the constructor AsyncFetchAndSetIconForPage(), so maybe a getter on one of the params, and it's bogus? Maybe the callback again?
bp-e85cc0cd-4b65-435d-b13f-6002a2160521
Comment 7•8 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #1)
> I wonder if the fact there are no crashes on 48 is just due to a limited number of users, or
> something has been fixed in 48, but I couldn't find anything related by a quick look.
In the 20 crashes I found, one person crashed twice on 48.0a2
bp-905fbf3d-ff25-41a2-a09a-1364b2160525
bp-18e9bc20-6dd0-484b-b2e9-102682160526
I plan to keep this bug around for a few more weeks in case this becomes a dot-release driver/ride-along for Fx47. Otherwise, I will wontfix for Fx47 3 weeks from now.
Tracked for Fx47 (to retriage a week from now on whether this needs to be in a dot release or not).
Updated•8 years ago
|
Comment 10•8 years ago
|
||
Marco, can we get an assignee here or is this lower priority?
Flags: needinfo?(mak77)
Comment 11•8 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #10)
> Marco, can we get an assignee here or is this lower priority?
I'd be glad to fix it, but I already investigated the code and I cannot see anything wrong, unless I'm missing something about having to use NS_IF_ADDREF rather than .forget(). The only way to tell, for me, would be to be able to capture this at least once in a debugger, and I doubt it's possible since we have no STR.
Maybe someone with a greater expertise in xpcom may notice some detail I missed?
Nathan, could you please help me figuring if there's any wrong xpcom usage in the way we pass through _canceler and assign to it?
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsFaviconService.cpp#215
Flags: needinfo?(mak77) → needinfo?(nfroyd)
Comment 12•8 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #11)
> Nathan, could you please help me figuring if there's any wrong xpcom usage
> in the way we pass through _canceler and assign to it?
> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/
> nsFaviconService.cpp#215
There's nothing wrong with the xpcom usage.
There is something peculiar about the address we're trying to assign to in Gijs's original report, though; the address lies within the text segment of libxul, which seems a bit of an odd place. That address was fetched off the stack though; I'm not sure whether that means:
* somebody corrupted the stack;
* somebody didn't properly update the stack location; or
* the compiler has bad codegen here.
I would think that if it was the third option, we'd be seeing a lot more crashes? Unless this is just uncommonly executed code, of course...
I don't have time at the moment to undertake a detailed analysis of the assembly and figure out whether the compiler is indeed getting things from the correct location.
Flags: needinfo?(nfroyd)
Updated•8 years ago
|
No crashes showing up here for 48 in the last 3 weeks. There is one remaining crash in 47. Marking this Worksforme since we don't have a clear idea of the problem or if something we did fixed it.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•