Closed Bug 1470591 Opened 6 years ago Closed 5 years ago

Add a fork server for launching content processes

Categories

(Core :: IPC, enhancement, P2)

Unspecified
Linux
enhancement

Tracking


RESOLVED FIXED
mozilla73
Performance Impact none
Fission Milestone M6
Tracking Status
firefox73 --- fixed

People

(Reporter: erahm, Assigned: sinker)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

(Whiteboard: [overhead:>4MB])

Attachments

(9 files, 8 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
gsvelto
: feedback+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
As we spin up more content processes we'd like to increase the amount of shared data across processes. On Linux, read-only portions of the binary such as .text and .rodata can be shared, but portions that must be relocated, such as .data.rel.ro, cannot. .data.rel.ro accounts for ~4MB of unsharable data. vtables account for a fair amount of this data, and while we do have bugs on file for reducing the number of vtables, that's a rather tedious process with diminishing returns.

Instead I propose implementing a system that loads a minimal content process (essentially just a main loop) that is then used to fork real content processes. This should give us a sizeable memory win for the relocations, as well as other possibilities for sharing memory pages marked as copy-on-write. Prior art can be found in Chrome's zygote process [1] as well as our previous attempt, Nuwa, for B2G [2]. I'm proposing a less aggressive version of Nuwa in that we would perform the fork before initializing XPCOM and avoid dealing with threading, mutexes, polling, etc. We might be able to get larger wins by initializing some of our core libraries such as ICU, NSS, libav, and portions of SpiderMonkey prior to forking. Additionally, if we can implement something that works for Mac as well, we'd see at least a 15MB improvement.

I'm filing this in IPC, but it clearly has implications for sandboxing and XPCOM as well.

[1] https://chromium.googlesource.com/chromium/src/+/master/docs/linux_zygote.md
[2] https://wiki.mozilla.org/NuwaTemplateProcess
Jed, when you get a chance can you sketch out some of your thoughts on this?
Flags: needinfo?(jld)
In theory this should be a perf win as there's less initialization required. The Chrome folks measured ~56ms/GHz [1]. [1] https://chromium.googlesource.com/chromium/src/+/master/docs/linux_zygote.md#appendix-a_runtime-impact-of-relocations
Whiteboard: [overhead:>4MB] → [overhead:>4MB][qf]
For a basic proof-of-concept, we should be able to hook in early in main() to check for a command line flag or env var and, without starting threads (or using XPCOM, probably), run a little server that receives packets containing:

1. a list of fds (as SCM_RIGHTS) and a list of destination fds to map them to
2. environment variable settings
3. argv
4. [reserved for future expansion]

I think the IPC Pickle / ParamTraits stuff can be safely used to deserialize the data, but the fd passing would have to be hand-written. At the risk of stating the obvious: it then forks, and the child applies the fd mapping (see [1], although the CloseSuperfluousFds is a little unnecessary here) and sets the env vars (setenv is safe, because single-threaded), and continues with the provided argv; the server would send back the pid or an error.

This server would be launched normally with LaunchApp (maybe lazily the first time it's needed?) and GeckoChildProcessHost::PerformAsyncLaunchInternal would use it instead. On IRC I suggested adding options to LaunchApp, but on further thought I think it makes more sense just to write something specialized.

Things that are broken with this:

* Sandboxing as it currently exists can work, but at the moment it's factored kind of badly for this — we just want to send down the SandboxFork constructor params, but that's all abstracted inside SandboxLaunch and hidden behind the ForkDelegate abstract class. (Those params are computed by poking at a lot of XPCOM stuff in the parent process; that part needs to stay where it is.)
* Sandboxing in the future was (at some point) going to allow launching processes via a setuid helper for distributions like Debian and Arch and RHEL7 that don't allow unprivileged user namespaces by default. Chrome appears to handle this by sandbox-launching the entire zygote, which also means the renderers *start* without filesystem access if I'm reading the code correctly (among other quirks). Not insurmountable, but it definitely makes this harder. Alternately, those setups could take the memory overhead of per-process ASLR.
* Waiting for processes to exit. On Linux the server could use CLONE_PARENT to create a sibling instead of a child; portably, it could handle it as a second RPC message. (I wouldn't mind throwing out and rewriting the child process watcher code.)
* Thread creation at initializer time. This can happen if people follow NVIDIA's advice about multithreaded GL, which isn't needed for Firefox; we could detect that and scrub LD_PRELOAD. In general we'd want to be able to detect this and fall back; I don't know if there's anything more portable than interposing pthread_create. (On Linux there's a trick with the link count of /proc/self/task, but the Tor Browser people want to run with /proc unmounted. On the other hand they might also want to sacrifice memory for per-process ASLR.) TSan also creates extra threads, but we can just turn this all off.
* Not exactly broken, but doing a blocking read on the I/O thread to wait for the pid isn't ideal. Making that async or moving it to a dedicated thread would be nice; this is entangled with making the main thread not sync-wait to get the pid from the I/O thread.
* Mac, maybe. I've heard that fork-without-exec can cause problems involving Mach ports, but I don't understand the details and whether it applies to us / if there's some initialization we could defer to prevent it. (Mac sandboxing doesn't need any magic at launch time.)

A thing that is good:

* This also means that we're not forking the parent process, which imposes time costs proportional to how much writable private memory it has, which is usually a lot. I wanted to do something about this anyway. (Corollary: that blocking read to get the pid might actually be less jank than forking directly.)

The other idea I mentioned on IRC was using mozglue/linker to do the loading and modifying it to use shared memory (or MADV_MERGEABLE?) for the relocated things. That would be ELF-only (and Linux-only with KSM), but it avoids some of the fork-related problems. Also there might be reasons we can't or shouldn't do our own loading on desktop.

[1] https://searchfox.org/mozilla-central/rev/93d2b9860b3d341258c7c5dcd4e278dea544432b/ipc/chromium/src/base/process_util_linux.cc#34-54
Flags: needinfo?(jld)
I'm hearing a lot of talk about Linux (and maybe Mac), but none about Windows...and our platform priorities run in roughly the opposite direction. I guess we would win on...Android?
(In reply to Nathan Froyd [:froydnj] from comment #4)

> I'm hearing a lot of talk about Linux (and maybe Mac), but none about
> Windows...and our platform priorities run in roughly the opposite direction.
> I guess we would win on...Android?

The main issue we're trying to solve here is relocated data not being sharable across processes. That isn't a problem on Windows, because relocations are shared across separate processes. It is a big problem on Linux and OS X, though, and we can't really ignore it there. Same goes for Android.
Android/GeckoView is… interesting. We're currently launching child processes as Android services, which means that we already have Android Runtime stuff when we're started (so, probably threads), and if we want N content processes we'd have to declare ≥N services in an XML file. At present we support only one content process. It's apparently also possible to use fork/exec, but there's concern that this isn't really supported and whatever we do with that could be arbitrarily broken by OS updates. Also, exec'ing means no Android Runtime, which means no way to get a GL context, which means we'd have to do WebGL remoting, which Chrome (last I heard) does on desktop but *not* on mobile because of the overhead. (This is all secondhand from :snorp; I hope I haven't mangled it too much.)
(At this point, this doesn't sound like it's in the [qf] umbrella, but feel free to renominate with more details if needed. Knee-jerk triage decision: there will be lots of work around fission to avoid incurring perf regressions as we increase the number of content processes, and that's all worthwhile work, and we also don't want the [qf] project to scope-creep to encompass all of that work.)
Whiteboard: [overhead:>4MB][qf] → [overhead:>4MB][qf-]
(In reply to Daniel Holbert [:dholbert] from comment #7)

> (At this point, this doesn't sound like it's in the [qf] umbrella, but feel
> free to renominate with more details if needed. Knee-jerk triage decision:
> there will be lots of work around fission to avoid incurring perf regressions
> as we increase the number of content processes, and that's all worthwhile
> work, and we also don't want the [qf] project to scope-creep to encompass
> all of that work.)

I don't think this is scope creep. This is a project that benefits both memshrink and qf in unrelated ways. It benefits memshrink by allowing us to share relocated data (and some data touched by static initializers) between child processes. It benefits qf by making it much cheaper/faster to spawn new content processes, and, importantly, moving the janky fork() step from the parent process (where it's user-visible) to the fork server (where it's not).
For what it's worth, moving the fork() to a dedicated server with a minimal amount of private writable data should greatly decrease the amount of jank (and CPU usage), as well as moving it out of the parent process. There are plans (bug 1348361, bug 1461459, bug 1446161) to stop making the main thread block waiting for the I/O thread to finish the launch operation; it may also be possible to move that off the I/O thread so it doesn't block IPC message passing either. But it's a little more complicated.

Profiling on Linux, I'm seeing a gap in samples from the parent process main thread in LaunchSubprocess, flanked by pthread_cond_wait blocking on the I/O thread. I'd understand that if I were profiling the I/O thread as well, because it blocks SIGPROF in order to ensure it can make progress on forking and I believe that will hang the entire profiler for the duration… but I'm not doing that. So this suggests that the entire process gets suspended (either explicitly or as a side effect of blocking in page faults) in order to remove write permissions and do TLB shootdown.

In any case, I'm seeing 11ms of jank there in a test profile, and it would probably be more in a heavily used browser, and offloading the fork() to another process is the only real solution. Also, the parent process is going to take an ongoing performance hit as it incurs page faults to flip the momentarily copy-on-write memory back to writeable. I've observed this with perf(1) but I don't have numbers at the moment; I remember the total time was on the same order of magnitude as the fork itself.

On Mac the situation is different: we're using posix_spawn, which in theory doesn't need to do anything like fork() and can just create the new process /de novo/, but I haven't tried profiling it yet.

tl;dr: this is a jank problem on Linux (and async launch probably won't help), it may not be on Mac but there's no data yet, and Windows is out of scope for this bug (see comment #5).
Depends on: 1440207
glandium points out in bug 1480401 that we may need SandboxFork to call pthread_atfork handlers to use it like this. The fork server will definitely be single-threaded (unless we're using TSan, but in that case sandboxing is disabled and the real fork() will always be used), so the usual problems with multithreaded fork don't apply, but there might be something.
Priority: -- → P3
Priority: P3 → P2
Fission Milestone: --- → M2

Eric, this is targeting M2, can you please add an assignee and also update the status of this?

Flags: needinfo?(erahm)

(In reply to Neha Kochar [:neha] from comment #11)

Eric, this is targeting M2, can you please add an assignee and also update the status of this?

Jed's going to be looking at this. The last time we checked in a Linux PoC was still on target for M2.

Assignee: nobody → jld
Flags: needinfo?(erahm)
Fission Milestone: M2 → M3
Fission Milestone: M3 → M4

In comment #9, jld mentioned page faults in the parent process. Why do we care about that? Firefox creates a fixed number of processes; it doesn't create processes from time to time. Are you also trying to improve the launch time of the browser?

For comment #10, some libraries may create threads while they are loading. One case we had back in the B2G days: a graphics driver on a device created a thread for unknown purposes. The simplest solution is to postpone its loading.

> it doesn't create processes from time to time.

It will with process-per-origin, which is a big driver for these changes.

Attached patch forkserver-WIP.diff (obsolete) (deleted) — Splinter Review

This is a workable WIP.

The basic idea is to run a single-threaded fork server at the head of |content_process_main()| to receive fork requests from the chrome process. In the chrome process, |fork()| is replaced by a sync IPC call to the fork server for the process type |GeckoProcessType_Content|. Once the fork server receives a request, it forks a new process; the child process leaves the fork server, returns to |content_process_main()|, and continues with the remaining code after it. From then on, everything runs as a normal content process without the fork server.

There are a lot of details to deal with. These include the sandbox, the initialization of XPCOM, and shutting down the message loop on the main thread used by the fork server. Being single-threaded also causes some problems: trying to run the IO loop on the main thread runs into cases not considered by the current implementation of IPC.

The reason for using a single thread is to reduce memory usage as much as possible, and also to avoid memory leaks caused by threads abandoned after fork().

The parameters (options) used to create a new content process are passed to the fork server as part of the IPC message. These include args, environment variables, and file descriptors. On the fork server side, it initializes environment variables, shuffles file descriptors, and replaces args with the values from the IPC message.

According to the smaps of the content processes, they share about 6MB; the major contribution is from .data.rel.ro & .got of libxul.

Attached patch forkserver-wip2.diff (obsolete) (deleted) — Splinter Review

Changes:

  • change binary type of |forkserver| to |Self|,
  • support forkserver for both |Self| & |PluginContainer|,
  • move --enable-forkserver from old-configure to moz.configure,
  • fix the problem of repeated log messages caused by buffering of |stdout| & |stderr|,
  • fix the problem that the deadlock detector loses information about instances of BlockingResourceBase created in the forkserver process, and
  • some minor bugs.
Attachment #9077833 - Attachment is obsolete: true

This is awesome Thinker! A 6MB improvement is even better than what I was hoping for. It looks like you're making good progress on your proof of concept. At this point it might be a good idea to get some high level design feedback from kmag and jld before we get too stuck with any architectural decisions. I believe the main points they want to address are making process launching async and how we handle file descriptor sharing, but I'll let them speak more to that.

Kris & Jed, would you mind taking a moment to provide some high level feedback (I don't think we need to do a full on code review yet)?

Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(jld)

Just a bit of explanation to make it easier to understand:

  • The fork() in base::LaunchApp() for Linux is replaced by a sync IPC call,
  • base::LaunchApp() is supposed to be running on the |IPC Launch| thread,
    • AFAIK, |IPC Launch| is used only for forking new processes, so the sync call doesn't block other threads.

At the fork server,

  • the main thread is used both for handling IO and for the actor of PForkService,
  • there is no additional IO thread, to avoid extra overhead and memory leaks caused by a thread,
    • resources allocated for an IO thread may leak and be difficult to recover after fork().

I started looking at this back in April but I didn't get very far towards something that would be testable (or buildable); I wrote some code for serialization/deserialization and most of the client side, but not the server side. I thought I'd attach it in case it's useful. A few notes:

  • My plan was to use nothing from IPC but the Pickle class. No IPDL, no actors, no event loops or runnables, no XPCOM; just sockets and SCM_RIGHTS used directly. Ideally, as little happens in the fork server itself as possible.
  • Also no hard dependencies on anything Linux-specific, because we'll want to use this on Mac if at all possible, and support for the Tier-3 Unixes would also be nice.
  • I have an answer for dealing with the Linux sandbox (serialization for the fork delegate and delaying the chroot server setup) but it seems a little inelegant; I have some FIXME comments about that.
  • The message to the fork server passes a socket that the forked/cloned child uses to read most of the serialized data, so the fork server itself handles only a fixed-size message and can respond immediately with the pid. The fds to remap should also be passed that way, I think; what I sketched out here (passing them in the initial message) means there's a fixed limit on the number of fds and the fork server has to care about closing its unnecessary copies. Kris might have some ideas about this.
  • I'd hoped to have the fork server report when and how a process exits, as part of cleaning up how we handle process termination, but I don't have a clear picture of how to integrate that yet, and it's not strictly necessary to solve this problem generally. The requirements we currently have, as I understand it, are that opt builds will kill a process if it hasn't exited within a timeout and debug builds wait with no timeout (so that refcount / leak logging can finish).

Some points about Linux specifics:

There is actually little or nothing Linux-specific in the current implementation.

Our process_util_linux.cc is what depends on Linux, but as far as I can see it actually works for all UNIX-like systems. It implements the fork-exec model, fd shuffling, and signal handling. Nothing in it is specific to Linux.

What's different on Mac is that it uses posix_spawn() instead of fork(). But apparently we need fork() for a fork server to make sense, so when it comes to the fork server, Mac should stop using posix_spawn() and start using fork(), that is, what process_util_linux.cc implements.

The fork server itself obviously uses the fork-exec model. Based on this, so far I don't see much of a problem making it work on Linux, Mac, and other UNIX-like systems, since they all support the fork-exec model and its APIs.

I have implemented a new version without using the message loop, IPC channel, etc. However, one of my concerns is duplicating a bunch of logic from the existing IPC code, which has been improved and fixed over a long history. Now we have a new copy for the same purpose; as the saying goes, more code, more bugs.

Attached patch forkserver-wip3.diff (obsolete) (deleted) — Splinter Review

Remove dependencies of IPC::Channel, MessageChannel, MessageLoop, ...etc.

Handle transmissions of IPC messages with custom code.

Attachment #9078239 - Attachment is obsolete: true
Attached patch forkserver-wip3.diff (obsolete) (deleted) — Splinter Review

The previous one missed some files.

Attachment #9081627 - Attachment is obsolete: true
Comment on attachment 9081675 [details] [diff] [review] forkserver-wip3.diff Could you check whether the design is what you want? Basically, there is no IPDL any more; MiniTransceiver handles all messages from/to the fork server.
Attachment #9081675 - Flags: feedback?(jld)

Any estimate about when someone will have time to look at this patch? Thanks!

Attachment #9081675 - Flags: feedback?(kmaglione+bmo)
Comment on attachment 9081675 [details] [diff] [review] forkserver-wip3.diff Hi Gabriele, could you check the patch and give me some feedback? This patch implements a tiny IPC protocol without IPDL, but with parcels/Messages and a tiny transport, MiniTransceiver, to handle IPC. It is implemented with a single thread in mind in the fork server process (see class ForkServer), and with blocking IPC on the IPC Launch thread in the parent process (see class ForkServiceChild & GeckoChildProcessHost).
Attachment #9081675 - Flags: feedback?(jld) → feedback?(gsvelto)

(In reply to Thinker Li [:sinker] from comment #28)

> Could you check the patch and give me some feedback?
> This patch implements a tiny IPC protocol without IPDL, but with
> parcels/Messages and a tiny transport, MiniTransceiver, to handle IPC.
> It is implemented with a single thread in mind in the fork server process
> (see class ForkServer), and with blocking IPC on the IPC Launch thread in
> the parent process (see class ForkServiceChild & GeckoChildProcessHost).

Hi Thinker,
I've rebased your patch on top of a current mozilla-central and it's crashing on startup so I fear I might have gotten something wrong in the process. Specifically it asserts somewhere in IPC where a boolean is set to a value that is neither '1' nor '0' during the call to mozilla::ipc::ForkServiceChild::SendForkNewSubprocess(). Anyway I've spent some time going through the code and from a birds eye view it looks sound to me.

I don't know exactly what :jld expected for the IPC part but your implementation is straightforward and it seems more than sufficient to handle the case at hand. To be frank I found it easier to read than our regular IPC but maybe that's just me.

As for the forking procedure, I couldn't spot anything Linux-specific either and the logic seems sound; however, the devil is in the details. For macOS, for example, one of the reasons we moved away from fork() and switched to posix_spawn() is that the implementation seems to allocate memory when called. Since we're hooking up jemalloc at startup and tearing it down at shutdown, calling fork() too late on macOS would lead to crashes; see bug 1376567 comment 3 for an example of this. I wouldn't worry about that for now, but keep in mind that if we share this code with macOS you must be prepared to deal with these issues.

There's also some aspects of your patch I'd like to see clarified. For example I don't understand the change to void IPDLParamTraits<FileDescriptor>::Write(IPC::Message* aMsg, IProtocol* aActor, const FileDescriptor& aParam). What is it for?

There's also quite a bit of code duplication around process startup; I'd like to see that removed, and there are a few functions that are too large for my tastes, like MiniTransceiver::Send(), MiniTransceiver::Recv() and ForkServer::OnMessageReceived(). Splitting those up would improve readability.

If you make this work again on top of a recent mozilla-central I'd say this is basically ready for review. I guess :jld will want to chime in for the IPC part so I'll try pinging him today.

BTW this is so much simpler to read and deal with than Nuwa! If you can make it work I'll make sure this becomes the default on as many platforms as possible.

Hi Gabriele,
Thank you for your feedback. I am making changes according to your comments.
For the crash, I cannot reproduce it on top of mozilla-central (changeset 0a0112c2cad4).
Could you show me your mozconfig and how you start it? It would be even better if you could tell me where the assertion is.

Attachment #9081675 - Flags: feedback?(kmaglione+bmo)
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(jld)
Flags: needinfo?(gsvelto)
Flags: needinfo?(gsvelto)
Attachment #9081675 - Flags: feedback?(gsvelto) → feedback+
Attached file stack_trace.txt (obsolete) (deleted) —

This is the stack trace I'm getting. Since attachment 9081675 [details] [diff] [review] did not apply cleanly it's possible that I've done something wrong while rebasing it. I've got these options enabled in my .mozconfig:

ac_add_options --enable-debug
ac_add_options --enable-crashreporter
ac_add_options --enable-forkserver
ac_add_options --enable-application=browser
ac_add_options --enable-tests
mk_add_options MOZ_TELEMETRY_REPORTING=1

Attached patch forkserver.-wip4.diff (deleted) — Splinter Review

I have made changes according to the feedback, and rebased the patch onto the latest m-c.
So far I don't get the error that Gabriele got.

Gabriele, could you try this patch to see if it works?

Attachment #9081675 - Attachment is obsolete: true
Attachment #9091158 - Flags: feedback?(gsvelto)

I'm testing with your new patch, will give feedback ASAP.

Attachment #9090669 - Attachment is obsolete: true
Comment on attachment 9091158 [details] [diff] [review] forkserver.-wip4.diff This is looking good overall; I think it's time to move this to Phabricator for a proper review. I found only one issue in the patch which you might want to address: if a content process crashes, the associated tab gets stuck. I suppose it must be a problem with IPC, as it happens both with and without the crash reporter enabled. To repro, just browse to a regular page and then to about:crashcontent. The page will turn white and show the waiting spinner but will never reach the "This tab has crashed" page.
Attachment #9091158 - Flags: feedback?(gsvelto) → feedback+
Attached patch forkserver-wip5.diff (deleted) — Splinter Review

Fix the bug that the parent process kills the forkserver process.

Thinker, the patch is in good shape, can you upload it on Phabricator for review? I'm eager to get this in mozilla-central so people can start testing it.

Flags: needinfo?(thinker.li)

With a fork server, the parameters used to fork a new content process
are passed through a socket. This patch does the following to adapt
the sandbox to work with a fork server:

  • passing an FD for the chroot server,
  • passing the flags of SandboxFork, and
  • setting LaunchOptions and its fork_delegate field at the fork server.

Depends on D46878

An instance of AppForkBuilder creates a new content process from the
passed args and LaunchOptions. It basically does the same thing as
LaunchApp() for Linux, but divides the procedure into two parts:

  • the 1st part forks a new process, and
  • the 2nd part initializes FDs, env variables, and message loops.

Splitting it into two parts gives fork servers a chance to clean up
new processes before the initialization and before running web
content; for example, to clear sensitive data from memory.

Depends on D46879

MiniTransceiver is a simple request-response transport that always
waits for the response from the server before sending the next
request. The requests are always initiated by the client.

Depends on D46880

Class ForkServer and class ForkServiceChild are implemented. The
chrome process can ask the fork server process to create content
processes. The requests are sent by MiniTransceiver over a socket.
The fork server replies with the process IDs/handles of the created
processes.

LaunchOptions::use_forkserver is a boolean. With use_forkserver set to
true, the chrome process sends a request to the fork server instead of
forking directly.

Depends on D46881

This patch makes changes to Gecko infrastructure to run a fork server
process:

  • ForkServerLauncher is a component that creates a fork server
    process at XPCOM startup.

  • nsBrowserApp.cpp and related files have been changed to start a
    fork server in a process.

  • Logging and nsTraceRefcnt were changed to make them work with the
    fork server.

Depends on D46883

Attached file Bug 1470591 - Part 7: Enable fork server by default. (obsolete) (deleted) —

Depends on D46884

Attachment #9094777 - Attachment is obsolete: true
Flags: needinfo?(thinker.li)
Attachment #9094778 - Attachment is obsolete: true

Gabriele, would you mind suggesting reviewers?

Attachment #9094787 - Attachment is obsolete: true

Gabriele, would you mind suggesting reviewers?

Flags: needinfo?(gsvelto)

Part 1: I can do that
Part 2: :gcp or :jld
Part 3/4/5/6: An IPC peer https://wiki.mozilla.org/Modules/All#IPC

Flags: needinfo?(gsvelto)
Assignee: jld → thinker.li
Status: NEW → ASSIGNED

Hi Jed, when do you think you will have time to review?

Flags: needinfo?(jld)

Jed's out of commission; I'm trying to find another qualified reviewer. We're really sorry for the latency here.

Flags: needinfo?(jld)

If Jed's not available, unfortunately at this point your best bet might be waiting yet another week for Nathan to get back, and hoping he has the cycles to spare. I'm not very familiar with the low-level parts of IPC. If Nathan gets back and he doesn't think he'll have time, then I can try to set aside some time to work on this. Sorry again for the delays here. Unfortunately this is not an area of the code where we have a ton of people available, which usually works okay because it doesn't have to change much.

I discussed this with Gabriele, we think his r+ should be enough to land this with the following plan:

  1. Land with enabling off by default
  2. As a follow-up add a pref for enabling / disabling this at runtime if possible
  3. Pref on by default

This lets us get the code in-tree and then gives us an escape hatch if we see any issues once it makes it to beta/release. It also gives us a fair amount of time on Nightly to evaluate the effectiveness and land any patches to handle feedback we may eventually get.

Thinker, does that sound like a reasonable plan?

Flags: needinfo?(thinker.li)

Gabriele, can you take care of the final reviews to help move this along?

Flags: needinfo?(gsvelto)

Sure, I'll be on it tomorrow morning.

Flags: needinfo?(gsvelto)

Eric,
Sure! I agree.

Flags: needinfo?(thinker.li)
Fission Milestone: M4 → M5

M6 because the fork server doesn't block dogfooding (M5).

Fission Milestone: M5 → M6
Attachment #9094780 - Attachment description: Bug 1470591 - Part 2: Provide methods to recreate a delegated forker. → Bug 1470591 - Part 2: Provide methods to recreate a delegated forker. r=gsvelto

We're good to land, but since we're in the soft code freeze we should probably wait until next Monday. The forkserver is off by default so it's not a risk, but the patch is large, so it might confuse people during merges and the like. The good thing is that we should be able to turn it on quickly after we land it, since we'll be at the beginning of a new cycle.

Thanks Gabriele!

Attachment #9094779 - Attachment description: Bug 1470591 - Part 1: Add a new process type for ForkServer. → Bug 1470591 - Part 1: Add a new process type for ForkServer. r=gsvelto
Attachment #9094781 - Attachment description: Bug 1470591 - Part 3: AppForkBuilder to ceate a new content process. → Bug 1470591 - Part 3: AppForkBuilder to ceate a new content process. r=gsvelto
Attachment #9094782 - Attachment description: Bug 1470591 - Part 4: MiniTransceiver to do single-tasking IPC. → Bug 1470591 - Part 4: MiniTransceiver to do single-tasking IPC. gsvelto
Attachment #9094785 - Attachment description: Bug 1470591 - Part 5: ForkServer to create new processes. → Bug 1470591 - Part 5: ForkServer to create new processes. r=gsvelto
Attachment #9094786 - Attachment description: Bug 1470591 - Part 6: Create a fork server process. → Bug 1470591 - Part 6: Create a fork server process. r=gsvelto
Attachment #9094782 - Attachment description: Bug 1470591 - Part 4: MiniTransceiver to do single-tasking IPC. gsvelto → Bug 1470591 - Part 4: MiniTransceiver to do single-tasking IPC. r=gsvelto

The code freeze is over, land at will, Thinker!

I tried rebasing and landing your patches today, but I ran into an issue on Lando. The patches are blocked, apparently because the author is not set correctly. We have an entry in the wiki about this problem; can you have a look and update the patches with the correct author information? I'm eager to get this landed :-)

Flags: needinfo?(thinker.li)

Just updated!

Flags: needinfo?(thinker.li)
Pushed by btara@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a10772f780f7 Part 1: Add a new process type for ForkServer. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/ca1b804d404a Part 2: Provide methods to recreate a delegated forker. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/daad4d736ec0 Part 3: AppForkBuilder to ceate a new content process. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/cbac2d7dfe42 Part 4: MiniTransceiver to do single-tasking IPC. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/f80db6e63169 Part 5: ForkServer to create new processes. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/3ca19f8f388e Part 6: Create a fork server process. r=gsvelto

Backed out 6 changesets (Bug 1470591) for test_punycodeURIs & test_nsIProcess* crashes

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&searchStr=android%2Cx3&fromchange=3c08edf74d039af79f9daad8ff5b57ffb64fdab6&tochange=09111adf1bd1502668e50d0983afc0bc97b99694&selectedJob=279428789

Backout link: https://hg.mozilla.org/integration/autoland/rev/09111adf1bd1502668e50d0983afc0bc97b99694

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=279428789&repo=autoland&lineNumber=4123

[task 2019-12-03T22:18:33.850Z] 22:18:33 INFO - TEST-START | uriloader/exthandler/tests/unit/test_punycodeURIs.js
[task 2019-12-03T22:19:05.914Z] 22:19:05 INFO - TEST-FAIL | uriloader/exthandler/tests/unit/test_punycodeURIs.js | took 32064ms
[task 2019-12-03T22:19:06.239Z] 22:19:06 INFO - mozcrash Downloading symbols from: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/NXVzVJMWTJmTQTfhjq7TAw/artifacts/public/build/target.crashreporter-symbols.zip
[task 2019-12-03T22:19:08.952Z] 22:19:08 INFO - mozcrash Copy/paste: /builds/worker/workspace/build/linux64-minidump_stackwalk /tmp/tmpO8MNEO/3a4f1864-4a89-3442-d0b5-9c40b46d762a.dmp /tmp/tmpJKX2OI
[task 2019-12-03T22:19:11.313Z] 22:19:11 INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/3a4f1864-4a89-3442-d0b5-9c40b46d762a.dmp
[task 2019-12-03T22:19:11.313Z] 22:19:11 INFO - mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/3a4f1864-4a89-3442-d0b5-9c40b46d762a.extra
[task 2019-12-03T22:19:11.314Z] 22:19:11 WARNING - PROCESS-CRASH | uriloader/exthandler/tests/unit/test_punycodeURIs.js | application crashed [@ base::LaunchApp(std::__ndk1::vector<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::allocator<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > const&, base::LaunchOptions const&, int*)]
[task 2019-12-03T22:19:11.314Z] 22:19:11 INFO - Crash dump filename: /tmp/tmpO8MNEO/3a4f1864-4a89-3442-d0b5-9c40b46d762a.dmp
[task 2019-12-03T22:19:11.314Z] 22:19:11 INFO - Operating system: Android
[task 2019-12-03T22:19:11.314Z] 22:19:11 INFO - 0.0.0 Linux 3.10.0+ #260 SMP PREEMPT Fri May 19 12:48:14 PDT 2017 x86_64
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - CPU: amd64
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - family 6 model 6 stepping 3
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - 4 CPUs
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - GPU: UNKNOWN
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - Crash reason: SIGSEGV /SEGV_MAPERR
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - Crash address: 0x0
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - Process uptime: not available
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - Thread 0 (crashed)
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - 0 libxul.so!base::LaunchApp(std::__ndk1::vector<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::allocator<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > const&, base::LaunchOptions const&, int*) [process_util_linux.cc:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 268 + 0xa]
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - rax = 0x00007e4e0caaf498 rdx = 0x00007fffc3f0aae1
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - rcx = 0x0000000000000000 rbx = 0x00007fffc3f0aa70
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - rsi = 0x00007fffc3f0a1b0 rdi = 0x0000000000000000
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - rbp = 0x00007fffc3f0aba0 rsp = 0x00007fffc3f0aa00
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - r8 = 0x000000000000004b r9 = 0x00000000ffffffb5
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - r10 = 0x00007fffc3f0a230 r11 = 0x00007fffc3f0a218
[task 2019-12-03T22:19:11.319Z] 22:19:11 INFO - r12 = 0x00007fffc3f0aa70 r13 = 0x00007fffc3f0aa08
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - r14 = 0x0000000000000018 r15 = 0x00007fffc3f0aa30
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e0e2c902f
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: given as instruction pointer in context
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 1 libxul.so!nsProcess::RunProcess(bool, char**, nsIObserver*, bool, bool) [nsProcessCommon.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 522 + 0x5]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0ac60 rsp = 0x00007fffc3f0abb0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e0df5f828
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 2 libxul.so!nsProcess::CopyArgsAndRunProcess(bool, char const**, unsigned int, nsIObserver*, bool) [nsProcessCommon.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 357 + 0x18]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0acb0 rsp = 0x00007fffc3f0ac70
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e0df5f6bd
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 3 libxul.so!nsLocalHandlerApp::LaunchWithURI(nsIURI*, nsIInterfaceRequestor*) [nsLocalHandlerApp.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 97 + 0x65]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0ad60 rsp = 0x00007fffc3f0acc0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e0e698502
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 4 libxul.so!NS_InvokeByIndex + 0x8e
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0ad90 rsp = 0x00007fffc3f0ad70
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e115c19c6
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 5 libxul.so!XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) [XPCWrappedNative.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 1149 + 0xae9]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0afa0 rsp = 0x00007fffc3f0ada0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e0e594661
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 6 libxul.so!XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) [XPCWrappedNativeJSOps.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 946 + 0x8]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0b0e0 rsp = 0x00007fffc3f0afb0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e0e5952dd
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 7 libxul.so!js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 549 + 0x182]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0b1b0 rsp = 0x00007fffc3f0b0f0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e1038a5ad
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 8 libxul.so!Interpret(JSContext*, js::RunState&) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 618 + 0x9]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0b690 rsp = 0x00007fffc3f0b1c0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e103817c5
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 9 libxul.so!js::RunScript(JSContext*, js::RunState&) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 424 + 0xb]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0b6f0 rsp = 0x00007fffc3f0b6a0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e10375a38
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 10 libxul.so!js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 590 + 0xb]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0b7c0 rsp = 0x00007fffc3f0b700
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e1038a9ea
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 11 libxul.so!js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICCall_Fallback*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) [BaselineIC.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 2941 + 0xa]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0bb00 rsp = 0x00007fffc3f0b7d0
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e108be265
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 12 0x3b7fe1ea7f58
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0bb78 rsp = 0x00007fffc3f0bb10
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00003b7fe1ea7f58
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 13 0x7e4e04bcad10
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0bc18 rsp = 0x00007fffc3f0bb88
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e04bcad10
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 14 0x3b7fe1ea548f
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0bc80 rsp = 0x00007fffc3f0bc28
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00003b7fe1ea548f
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 15 libxul.so!js::jit::EnterBaselineInterpreterAtBranch(JSContext*, js::InterpreterFrame*, unsigned char*) [BaselineJIT.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 187 + 0xec]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0be70 rsp = 0x00007fffc3f0bc90
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e109599f5
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 16 libxul.so!Interpret(JSContext*, js::RunState&) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 2014 + 0x8]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0c350 rsp = 0x00007fffc3f0be80
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e10376bcb
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - 17 libxul.so!js::RunScript(JSContext*, js::RunState&) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 424 + 0xb]
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rbp = 0x00007fffc3f0c3b0 rsp = 0x00007fffc3f0c360
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - rip = 0x00007e4e10375a38
[task 2019-12-03T22:19:11.320Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - 18 libxul.so!js::ExecuteKernel(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value const&, js::AbstractFramePtr, JS::Value*) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 811 + 0x5]
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - rbp = 0x00007fffc3f0c430 rsp = 0x00007fffc3f0c3c0
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - rip = 0x00007e4e1038bcc5
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - 19 libxul.so!js::Execute(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value*) [Interpreter.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 844 + 0x19]
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - rbp = 0x00007fffc3f0c480 rsp = 0x00007fffc3f0c440
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - rip = 0x00007e4e1038be06
[task 2019-12-03T22:19:11.321Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - 20 libxul.so!JS::EvaluateDontInflate(JSContext*, JS::ReadOnlyCompileOptions const&, JS::SourceText<mozilla::Utf8Unit>&, JS::MutableHandle<JS::Value>) [CompilationAndEvaluation.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 594 + 0x4a4]
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - rbp = 0x00007fffc3f0c710 rsp = 0x00007fffc3f0c490
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - rip = 0x00007e4e1047f22c
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - 21 libxul.so!XRE_XPCShellMain(int, char**, char**, XREShellData const*) [XPCShellImpl.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 1000 + 0x13]
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - rbp = 0x00007fffc3f0caf0 rsp = 0x00007fffc3f0c720
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - rip = 0x00007e4e0e58aa84
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - 22 xpcshell!main [xpcshell.cpp:3ca19f8f388ec99c1888f5eed7176629174a66d9 : 66 + 0xc]
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - rbp = 0x00007fffc3f0cb40 rsp = 0x00007fffc3f0cb00
[task 2019-12-03T22:19:11.322Z] 22:19:11 INFO - rip = 0x00007e4e153b87cb
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - 23 libc.so + 0x1c8d5
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - rbp = 0x00007fffc3f0cc88 rsp = 0x00007fffc3f0cb50
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - rip = 0x00007e4e13b7a8d5
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - 24 0x7fffc3f0da18
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - rbp = 0x00007fffc3f0d9fc rsp = 0x00007fffc3f0cc98
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - rip = 0x00007fffc3f0da18
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - Found by: previous frame's frame pointer
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - 25 0x7fffc3f21000
[task 2019-12-03T22:19:11.323Z] 22:19:11 INFO - rbp = 0x00007fffc3f0d9fc rsp = 0x00007fffc3f0cda8
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - rip = 0x00007fffc3f21000
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - Found by: stack scanning
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - 26 0x7e4e152d5000
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - rbp = 0x00007fffc3f0d9fc rsp = 0x00007fffc3f0ce18
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - rip = 0x00007e4e152d5000
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - Found by: stack scanning
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - 27 xpcshell + 0x6f0
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - rbp = 0x00007fffc3f0d9fc rsp = 0x00007fffc3f0ce38
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - rip = 0x00007e4e153b86f0
[task 2019-12-03T22:19:11.324Z] 22:19:11 INFO - Found by: stack scanning
.....

Flags: needinfo?(thinker.li)

https://phabricator.services.mozilla.com/D46880?vs=201991&id=202713

I made a minor change to remove the crash when |exec()| fails after |fork()|.
The test harness expects a non-zero exit code rather than a crash, but the previous changes
made it crash intentionally.

Flags: needinfo?(thinker.li)
Pushed by rmaries@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5fe0f063565f Part 1: Add a new process type for ForkServer. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/7dbc37f95ab5 Part 2: Provide methods to recreate a delegated forker. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/8f40dfd4d92f Part 3: AppForkBuilder to ceate a new content process. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/a43f1a2e53ad Part 4: MiniTransceiver to do single-tasking IPC. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/7ea0650d489d Part 5: ForkServer to create new processes. r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/d58db9c67aae Part 6: Create a fork server process. r=gsvelto

Great work folks!

Comment 17 indicates a savings of 6 MB. Would this save that much per process?
Comment 19 refers to the initial proof-of-concept code. Do we need to rerun the numbers to see how things have worked out in practice?
Sorry, I haven't read every comment, but do we know whether this has had any performance impact, such as on process launch time (better or worse)?

Blocks: 1601742
Performance Impact: --- → -
Whiteboard: [overhead:>4MB][qf-] → [overhead:>4MB]