Closed Bug 907006 Opened 11 years ago Closed 11 years ago

seccomp-bpf doesn't fully kill processes with our kernel 3.0 backport

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jld, Assigned: jld)

References

Details

Attachments

(1 file)

When MOZ_CONTENT_SANDBOX is set but MOZ_CONTENT_SANDBOX_REPORTER is unset (which is currently the default on b2g) and a child process makes a system call which is forbidden by seccomp (e.g., bug 906996), I see the child process enter zombie state but not exit, and neither gdb nor strace is made aware that anything out of the ordinary happened to it. When MOZ_CONTENT_SANDBOX_REPORTER is set — and, thus, a SIGSYS handler is established — signal handling proceeds normally, gdb catches the signal, and (modulo bug 907002) it can be debugged normally. Filing this under b2g (and noting that I'm seeing it on keon) because I suspect kernel-specific weirdness, though I have no evidence to support that presumption at present.
zombie state could happen when MOZ_CONTENT_SANDBOX_REPORTER is defined, and a syscall is denied while in the Reporter() function (defined in /security/sandbox/Sandbox.cpp). This however should not happen since PR_LOG is allowed to use the already opened FD to write it' message, and the function doesn't do anything else (it then exits - basically, when exit is not called, the process would stay as zombie) When MOZ_CONTENT_SANDBOX_REPORTER is not set however, the kernel kills the process directly. This is defined at the top of /security/sandbox/Sandbox.ccp: struct sock_filter seccomp_filter[] = { VALIDATE_ARCHITECTURE, EXAMINE_SYSCALL, SECCOMP_WHITELIST, #ifdef MOZ_CONTENT_SANDBOX_REPORTER TRAP_PROCESS, #else KILL_PROCESS, #endif In the bpf filter, if all conditions of the macros pass (in short, the syscall isn't in SECCOMP_WHITELIST and the bpf filter goes on), the last item is KILL_PROCESS which instruct the kernel to really kill the process. Normally (and AFAIK) this shouldn't end up with a zombie, and did not in my experience on the Keon (at least, when gdb is detached). However, it might be some kernel specific stuff, indeed. Slightly related, and similar to what Chromium does, we could also consider using a runtime (as opposed to compile time) flag to "turn on debugging", which allows ptrace, process_vm_{readv,writev}, kcmp (x86) and enables the reporter as well. I generally am a little afraid of runtime flags, since someone could ship with it on by mistake easily.
(In reply to Guillaume Destuynder [:kang] from comment #1) > Normally (and AFAIK) this shouldn't end up with a zombie, and did not in my > experience on the Keon (at least, when gdb is detached). > However, it might be some kernel specific stuff, indeed. That's interesting, because I always get a zombie process when the kernel tries to SECCOMP_RET_KILL something on my keon, and we should be using the same kernel (2ca8e9dea in my case).
(In reply to Guillaume Destuynder [:kang] from comment #1) > zombie state could happen when MOZ_CONTENT_SANDBOX_REPORTER is defined, and > a syscall is denied while in the Reporter() function (defined in > /security/sandbox/Sandbox.cpp). I should also add that, so far, I haven't seen the process fail to exit cleanly in the SECCOMP_RET_TRAP case. Assuming that force_sig_info does what it looks like, if we get a second SIGSYS while in the SIGSYS handler (or otherwise blocking it), then it will be reset to SIG_DFL (and unblocked) — but also I know this wasn't happening because I added a printk to seccomp_send_sigsys.
Summary: seccomp errors can't be caught on b2g (keon) unless MOZ_CONTENT_SANDBOX_REPORTER=1 → seccomp-bpf doesn't fully kill processes with our kernel 3.0 backport
Attached patch seccomp-sigkill.mbox (deleted) — Splinter Review
This patch posts SIGKILL instead of calling do_exit, for SECCOMP_RET_KILL. It seems to work, in testing with --disable-content-sandbox-reporter and with a syscall removed from the whitelist. How do we feel about this, security-wise?
Assignee: nobody → jld
Attachment #819890 - Flags: feedback?(gdestuynder)
I looked into this some more. When seccomp calls do_exit, tracehook_notify_death is returning DEATH_DELAYED_GROUP_LEADER instead of SIGCHLD, because thread_group_empty is false, which would explain the lack of signaling. tracehook_notify_death was refactored away in commit 45cdf5cc0703c, between 3.1 and 3.2, but the corresponding logic after that change looks like it should do the same thing as before, and doesn't seem to have changed since then. So we still don't know why the behavior differs between 3.0 and 3.5, but I'm leaving my observations here in case they're useful later.
Comment on attachment 819890 [details] [diff] [review] seccomp-sigkill.mbox Review of attachment 819890 [details] [diff] [review]: ----------------------------------------------------------------- While it's not as nice and direct, I see no current security implication (to my knowledge, it doesn't mean there isn't). After discussing with :jld I would say we should take this patch "because it works" and it should not be needed anymore in a future kernel version (probably 3.2+). As per :jld: "jld | kang: It looks like tracehook_notify_death is returning DEATH_DELAYED_GROUP_LEADER instead of SIGCHLD, because thread_group_empty is false, which would explain the lack of signaling." Fixing that would be a much bigger task without a big gain for our purposes.
Attachment #819890 - Flags: feedback?(gdestuynder) → feedback+
Now that bug 979590 has landed, we're not using SECCOMP_RET_KILL (except as a non-default configure option), and in the long run we'll be moving away from ICS and 3.0-based kernels. So I think this is no longer worth the effort of trying to get the workaround accepted by manufacturers.
Status: NEW → RESOLVED
Closed: 11 years ago
Depends on: 979590
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: