Closed Bug 1337723 Opened 8 years ago Closed 3 years ago

Handle wasm code allocation failures better

Categories

(Core :: JavaScript: WebAssembly, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
94 Branch
Tracking Status
firefox94 --- fixed

People

(Reporter: jandem, Assigned: rhunt, Mentored)

References

(Blocks 2 open bugs)

Details

(Keywords: good-first-bug)

Attachments

(2 files)

See bug 1337561 comment 12. We should throw a different exception (or at least report a warning to the console) if we are unable to allocate executable code.
Luke: is this something important? If yes you might want to mark it P1 and assign to somebody.
Flags: needinfo?(luke)
So I was actually wondering about this: it seems like the "right" fix is just the fix we've talked about for a few years now of making ReportOutOfMemory() actually attempt to create a normal InternalError exception object (which has a stack), only throwing the string "out of memory" if this InternalError allocation failed?
Flags: needinfo?(luke)
But actually, maybe we want to start by just throwing the InternalError ourselves manually from wasm since that's easy and changing all of ReportOutOfMemory() might get hairy. bbouvier, is this something you could look into?
Flags: needinfo?(bbouvier)
Yes, adding to my todo list.
Flags: needinfo?(bbouvier)
Given the comments I'm marking this as P1. Feel free to disagree and a lower priority to it.
Priority: -- → P1
This is a P1 bug without an assignee. P1 are bugs which are being worked on for the current release cycle/iteration/sprint. If the bug is not assigned by Monday, 28 August, the bug's priority will be reset to '--'.
Keywords: stale-bug
Sadly ARM is at best P2 right now.
Keywords: stale-bug
Priority: P1 → P2
Component: JavaScript Engine: JIT → Javascript: Web Assembly

We have a wasm logging infrastructure now, we should be able to at least print a warning.

Type: defect → enhancement
Priority: P2 → P3
Depends on: 1590305
Blocks: 1590305
No longer depends on: 1590305
Mentor: lhansen
Keywords: good-first-bug

I see this is marked as good-first-bug - few questions just to see if I would have any way to approach it (given my current lack of experience in the code set etc).

Are the steps to reproduce still as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1337561#c12? Or; since that is fixed as of 3 years ago would it require some other reproduction approach?

Would the structure of this fix be similar to https://hg.mozilla.org/mozilla-central/rev/55715c676a88? i.e. Define a new message, and use appropriately in the correct place?

From my as yet still very incomplete reading of some docs; it appears that the choices for builds are either full build or artifact builds - so would a full build be what is required to test fixes locally?

Are there any automated tests that relate to this area of the code?

I don't want to take up too much of anyone's time; cause there is a fairly high chance this is going to be a bit beyond my current can-do; but if I can be gently nudged down a reasonable path to replicate the issue I can see how I get on from there.

(In reply to Brent from comment #9)

Are the steps to reproduce still as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1337561#c12? Or; since that is fixed as of 3 years ago would it require some other reproduction approach?

That case might still reproduce and it would be worthwhile trying. It should be easy to concoct something simpler here, eg a small loop that creates new Wasm modules and holds onto them until it OOMs. Also bug 1645638 comes to mind but there's a lot going on there, so maybe not the easiest place to start.

Would the structure of this fix be similar to https://hg.mozilla.org/mozilla-central/rev/55715c676a88? i.e. Define a new message, and use appropriately in the correct place?

Yes, I think so.

From my as yet still very incomplete reading of some docs; it appears that the choices for builds are either full build or artifact builds - so would a full build be what is required to test fixes locally?

Mostly you can just build the JS shell (much faster to do this), you don't have to build all of Firefox. But if you want to test with Firefox then you need a full build.

Are there any automated tests that relate to this area of the code?

Probably not, so an automated test would be welcome here.

I don't want to take up too much of anyone's time; cause there is a fairly high chance this is going to be a bit beyond my current can-do; but if I can be gently nudged down a reasonable path to replicate the issue I can see how I get on from there.

That's great. For good-first-bug, the hurdles are rarely in changing the code but in getting the build system set up and working, and all the other things around that. Let us know when you need more guidance.

I am able to build the code - that was straightforward enough just taking the defaults from the setup guides.

Would a loop like this create multiple modules; or would instantiateStreaming do some caching?
for (i = 0; i < iterationCountTotal; i++) {
            WebAssembly.instantiateStreaming(fetch('global.wasm'), { js: { global } })
                .then(({ instance }) => {

Is there a flag I can set somewhere to limit the "maximum" amount of memory available; so I don't have to loop too much loading modules?

... but having looked at the code a bit more; I am starting to feel rather out of my depth; so really don't want to take time that I can't then repay in some value delivered; so also wondering if I should try find some other first bug in a slightly less complex part of the code.

instantiateStreaming might not do caching right now but it will in the (near) future.

I was thinking something like this (in the JS shell, not in the browser):

var modules = [];
var bin = wasmTextToBinary(`(module (func))`);
for ( ;; ) {
  modules.push(new WebAssembly.Module(bin))
}

which I would think should exhaust executable memory quickly in a 32-bit build. If you built for 64-bit (probably what happens by default) it will take rather longer, but it will happen. You could do something similar for the browser by fetching the .wasm binary first and then just creating new modules, basically merging your idea and mine.

Looks like the values for max memory are hardcoded: https://searchfox.org/mozilla-central/source/js/src/jit/ProcessExecutableMemory.h#17. But since you're building the browser you can tweak these numbers.

Re wasting time etc, there's no shame in looking for another bug if the learning curve on this one seems a little too steep, but it's also OK if you want to soldier on here; we're happy that you want to contribute regardless.

Bit more soldiering -

I think I lowered MaxCodeBytesPerProcess in the else block for my 64 bit builds.
Then I cobbled-up

        var modules = [];

        //for (i = 0; i < iterationCountTotal; i++) {

        fetch('simple.wasm').then(response =>
            response.arrayBuffer()
        ).then(bytes => {

            for (; ;) {
                let mod = createWasmModule(bytes);
                WebAssembly.instantiate(mod, importObject)
                    .then(result =>
                        result.exports.exported_func()
                    );
            }
        })

And ...... maybe the below is what I we are after :o

~/repos/mozilla-unified$ ./mach run
 0:00.24 /home/brent/repos/mozilla-unified/obj-x86_64-pc-linux-gnu/dist/bin/firefox -no-remote -profile /home/brent/repos/mozilla-unified/obj-x86_64-pc-linux-gnu/tmp/profile-default
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: resource:///modules/sessionstore/ContentSessionStore.jsm, line 476: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory
JavaScript error: , line 0: uncaught exception: out of memory

`next round of "please helps"

1.)
Is ContentSessionStore.jsm trying to tell me useful info that I don't understand yet?
Line is 476 is

      if (value || (key != "storagechange" && key != "historychange")) {
       476: data[key] = value;
      }

2.)
From searching the code for MaxCodeBytesPerProcess; these two file have some reference to OOMs; so perhaps the fix will be somewhere around them?
https://searchfox.org/mozilla-central/source/js/src/jit/MacroAssembler.cpp#2452
https://searchfox.org/mozilla-central/source/js/src/jit/x86-shared/AssemblerBuffer-x86-shared.h#81

3.)
Is this something I could try to hit a debug break point on? Should I read through; https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/Debugging_Mozilla_with_gdb - and then set a breakpoint on MacroAssembler.cpp#2452 and um; try and trigger an execution that hits that? I feel like that might be somewhat tricky to setup correctly / not possible - given I'm going to have my test build firefox interpreting JS to load WASM and at some point in there hitting the OOM.

4.)
Is it possible to create a vs-code debug profile (https://code.visualstudio.com/docs/cpp/launch-json-reference) so that I can launch my debuggable firefox and then set breakpoints in vs-code?

5.)
Bit more of an abstract question; other than asking for help (which is critical; and hugely appreciated) - any suggestions on how I can build my understanding of the code? Since its a "complex" code-set - when I look at past patches or find usages of the limit you pointed me at; I don't feel I am able to get a clear mental model of the code in my head. Perhaps the best is for me to look at things a bit and then ask for help & repeat. But if you have any other thoughts I would be happy to hear them.

!! Thanks for all the help so far !!

Test case looks plausible. I bet there are two simplifcations you can make: you don't need to create an instance, only a module, and you can disable the optimizing compiler to reduce confusion for now - in about:config, set javascript.options.wasm_ionjit to false. That way, I'm guessing the ContentSessionStore problem disappears temporarily.

My own guess about where to place a log message would be somewhere higher up in the call chain than you suggest, but I'm honestly not completely sure. The call to setOOM is possibly a good site but I don't think it's the best one, as it will only catch a situation where an individual compilation will overflow the budget once the assembler's output is copied into a code segment. What you really want is to find a location where we check that we've not allocated too much code memory process-wide. https://searchfox.org/mozilla-central/source/js/src/jit/ProcessExecutableMemory.cpp#596 might be such a location, though this is also used for JS and not just for Wasm.

In searchfox you can find the uses of that function, and following it up a bit you get to users of js::jit::AllocateExecutableMemory: https://searchfox.org/mozilla-central/search?q=symbol:_ZN2js3jit24AllocateExecutableMemoryEjNS0_17ProtectionSettingE12MemCheckKind%2C_ZN2js3jit24AllocateExecutableMemoryEmNS0_17ProtectionSettingE12MemCheckKind%2C_ZN2js3jit24AllocateExecutableMemoryEyNS0_17ProtectionSettingE12MemCheckKind&redirect=false. And now you see that some of those callers are in Wasm code. And it's therefore possible that any logging should be performed inside those callers; it depends a little on what they do with that information (do they already log an error? do they pass the failure up to a caller that performs logging? etc).

You could definitely set a breakpoint at the "return nullptr" in ProcessExecutableMemory.cpp:596 and then see where you are in the code when you get there, that would teach you a few things about how things work. gdb works fine on Linux; lldb on Mac; I've only had limited experience with VS code debugging but I have made it work, and probably you can too, if Windows is your thing -- the trick was always to launch the browser, then find the browser processes in the "attach to process" dialog, and just attach to every browser process; the debugger worked OK after that.

The code is hard to understand... there's no getting away from that. There's #spidermonkey and #webassembly on chat.mozilla.org if you want to ask questions interactively. Other than that, most of us get to know it by fixing bugs, reading a lot of code, and stepping through test cases in a debugger, over and over. You can search the source tree for the tags SMDOC and WASMDOC to find documentation blurbs here and there.

thanks-again for all the help so far
Working through some re-build compile errors & trying to get debugging working

Blocks: wasm-oom

This commit causes ModuleGenerator to report a warning when
we fail to allocate executable memory for a ModuleSegment. This will
get reported in the JS console in addition to the OOM error.

The one tricky change is that entry points to module compilation
needed to be reworked so that they could report warnings even if
an error or OOM was encountered.

Assignee: nobody → rhunt
Status: NEW → ASSIGNED

Tier-2 tasks are finished on a helper thread and don't have a
JSContext* to report errors/warnings to. It may be useful for
end-users or engine developers to be notified when a tier-2
task fails and why. This commit implements logging of errors
and warnings by printing to stderr.

An alternative approach would be to get a runnable on the
originating JS thread that reports the errors/warnings using
that *JSContext. I'm not sure the best way to do this, so I
went with the simpler approach for now.

Depends on D125230

Pushed by rhunt@eqrion.net: https://hg.mozilla.org/integration/autoland/rev/2ba372fbca76 wasm: Warn about executable memory allocation failure. r=lth https://hg.mozilla.org/integration/autoland/rev/ce9243279077 wasm: Report tier-2 warnings and errors to stderr. r=lth
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: