Closed Bug 1038900 Opened 10 years ago Closed 10 years ago

Crash in SetCurrentProcessSandbox while stability testing

Categories

(Core :: Security, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla33
blocking-b2g 2.0+
Tracking Status
firefox31 --- wontfix
firefox32 --- fixed
firefox33 --- fixed
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: ggrisco, Assigned: jld)

References

Details

(Keywords: crash, Whiteboard: [b2g-crash][caf-crash 272][caf priority: p1][CR 693894])

Crash Data

Attachments

(3 files)

Attached file decoded minidump (deleted)
This crash occurred during stability testing, so no definite steps to reproduce (STR) are available.  So far we have seen this crash signature only twice.
Attached file EXTRA file attachment (with logs) (deleted) —
Given the low rate of occurrence and the lack of anything concrete here, I am not sure I'd block on this bug yet.

Nevertheless, based on the crash signature, NI on Jed Davis to get started here.

Jed, does anything stand out in the logs or minidump here?
Flags: needinfo?(jld)
Whiteboard: [CR 693894] → [caf priority: p1][CR 693894]
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.029
Moz BuildID: 20140710000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=35a9b715e7348ec738ff6c8a59f50190390a06f2
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2fb60c777d3f82d580cba249e5e01a167a01de39
https://mxr.mozilla.org/mozilla-aurora/source/security/sandbox/linux/Sandbox.cpp#295

295   if (signal(sSetSandboxSignum, SetThreadSandboxHandler) != SIG_DFL) {
296     LOG_ERROR("signal %d in use!\n", sSetSandboxSignum);
297     MOZ_CRASH();
298   }

Something set a signal handler on SIGRTMIN+3 before sandbox startup.  There's nothing else in Gecko that would do that, judging by MXR, so it must be code in some other library.  It would help to find out what it is, why it only sometimes happens, and whether it's running on the main thread.
(In reply to Jed Davis [:jld] from comment #5)
> https://mxr.mozilla.org/mozilla-aurora/source/security/sandbox/linux/Sandbox.cpp#295
> 
> 295   if (signal(sSetSandboxSignum, SetThreadSandboxHandler) != SIG_DFL) {
> 296     LOG_ERROR("signal %d in use!\n", sSetSandboxSignum);
> 297     MOZ_CRASH();
> 298   }
> 
> Something set a signal handler on SIGRTMIN+3 before sandbox startup. 
> There's nothing else in Gecko that would do that, judging by MXR, so it must
> be code in some other library.  It would help to find out what it is, why it
> only sometimes happens, and whether it's running on the main thread.

Greg, is there any way to provide the additional information Jed is asking for? I understand these stability crashes are hard to hit on your side, but I'm not sure whether there is anything else you could provide to fill the gaps here.

On the other hand, Jed, can we add more debugging code around this area to get more useful information if this crash happens again?
Flags: needinfo?(ggrisco)
Component: General → Security
Product: Firefox OS → Core
(In reply to Jed Davis [:jld] from comment #5)
> https://mxr.mozilla.org/mozilla-aurora/source/security/sandbox/linux/Sandbox.cpp#295
> 
> 295   if (signal(sSetSandboxSignum, SetThreadSandboxHandler) != SIG_DFL) {
> 296     LOG_ERROR("signal %d in use!\n", sSetSandboxSignum);
> 297     MOZ_CRASH();
> 298   }
> 
> Something set a signal handler on SIGRTMIN+3 before sandbox startup. 
> There's nothing else in Gecko that would do that, judging by MXR, so it must
> be code in some other library.  It would help to find out what it is, why it
> only sometimes happens, and whether it's running on the main thread.

I did a quick search for SIGRTMIN+3 across our codebase and only found it in the gecko sandbox code, so no luck there.

If there's any patch that you can apply to help debug this, just let me know.  Thanks.
Flags: needinfo?(ggrisco)
(In reply to Greg Grisco from comment #7)
> I did a quick search for SIGRTMIN+3 across our codebase and only found it in
> the gecko sandbox code, so no luck there.

In that case, my guess is that something is iterating over the range [SIGRTMIN,SIGRTMAX] and taking the first signal that's not in use (that has default handling).  I can make the sandbox code do the same thing, but the problem is that if two threads do that at the same time, they can still race and both try to use the same signal.  That would still be an improvement over the current situation.
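
To make the "take the first signal that's not in use" idea concrete, here is a minimal sketch. It is not the patch attached to this bug; FindUnusedRealtimeSignal and ClaimRealtimeSignal are hypothetical names chosen for illustration. It uses only the POSIX sigaction() call, which queries the current disposition without changing it when the new-action argument is null.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical helper (not from the actual patch): returns the first signal
// in [SIGRTMIN, SIGRTMAX] that still has default handling, or 0 if none.
int FindUnusedRealtimeSignal() {
  for (int signum = SIGRTMIN; signum <= SIGRTMAX; ++signum) {
    struct sigaction current;
    // A null second argument means "query only"; nothing is installed.
    if (sigaction(signum, nullptr, &current) != 0) {
      continue;  // Could not query this signal; skip it.
    }
    bool usesSigInfo = (current.sa_flags & SA_SIGINFO) != 0;
    if (!usesSigInfo && current.sa_handler == SIG_DFL) {
      return signum;
    }
  }
  return 0;
}

// Hypothetical helper: installs `handler` on the first unused realtime
// signal and returns that signal number, or 0 on failure.
int ClaimRealtimeSignal(void (*handler)(int)) {
  assert(handler != nullptr);
  int signum = FindUnusedRealtimeSignal();
  if (signum == 0) {
    return 0;
  }
  struct sigaction action = {};
  action.sa_handler = handler;
  sigemptyset(&action.sa_mask);
  if (sigaction(signum, &action, nullptr) != 0) {
    return 0;
  }
  return signum;
}
```

The query and the install are two separate calls, so this sketch has exactly the race described above: two threads can both see the same signal as unused and both claim it.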

The other change that's definitely worth making is to log the address of the unexpected signal handler, because that should be enough to identify what installed it.
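
As a rough illustration of logging the handler address, the sketch below queries the existing disposition and formats the address of whatever is installed. It is an assumption-laden example rather than the actual change: SignalIsUnused and its report buffer are hypothetical, and real code would go through the sandbox's own LOG_ERROR macro instead of snprintf into a caller-supplied buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <signal.h>

// Hypothetical helper: returns true if `signum` still has default handling.
// Otherwise writes a message containing the installed handler's address
// into `report` and returns false.
bool SignalIsUnused(int signum, char* report, size_t reportSize) {
  assert(report != nullptr && reportSize > 0);
  struct sigaction current;
  if (sigaction(signum, nullptr, &current) != 0) {
    snprintf(report, reportSize, "signal %d: could not query handler", signum);
    return false;
  }
  bool usesSigInfo = (current.sa_flags & SA_SIGINFO) != 0;
  if (!usesSigInfo && current.sa_handler == SIG_DFL) {
    return true;
  }
  // With SA_SIGINFO the three-argument handler field is the valid one.
  void* address = usesSigInfo
      ? reinterpret_cast<void*>(current.sa_sigaction)
      : reinterpret_cast<void*>(current.sa_handler);
  snprintf(report, reportSize, "signal %d in use by handler %p!", signum,
           address);
  return false;
}
```

The logged address can then be compared against the module list in the minidump (or the process's memory map) to see which library the handler lives in.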
Flags: needinfo?(jld)
This patch will probably fix this bug, and if it doesn't then we'll at least know why.  I've tested locally on emulator-x86-kk.  The patch applies cleanly to mozilla-aurora.
Assignee: nobody → jld
Attachment #8457476 - Flags: review?(gdestuynder)
blocking-b2g: 2.0? → 2.0+
Attachment #8457476 - Flags: review?(gdestuynder) → review+
https://hg.mozilla.org/mozilla-central/rev/cfca2c09feee
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Whiteboard: [caf priority: p1][CR 693894] → [b2g-crash][caf-crash 272][caf priority: p1][CR 693894]
Keywords: crash
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.040
Moz BuildID: 20140716000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=5f8b1b8a2da9e3b531eee817a669f57fa4d9b9c6
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=e00f7e464333689fcf54edb4945ece94f97f930b
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.043
Moz BuildID: 20140721000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8cb1a949f2e9650bb2c5598e78a6f24a58bbaf97
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=5f27d3ee3ccf01ac91a3efacb5e3e22ea62fd73c
(In reply to cafbot (PoC: ggrisco) from comment #15)
> Observed on: 
> Gecko:
> http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=5f27d3ee3ccf01ac91a3efacb5e3e22ea62fd73c

That version should have the patch from this bug.  Are there crash dumps?
Flags: needinfo?(ggrisco)
I created bug 1046210 and uploaded new logs there
Flags: needinfo?(ggrisco)
Flags: in-moztrap?(ychung)
No STR is available, so a test case cannot be created for this bug.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(ychung)
Flags: in-moztrap-