Closed
Bug 760642
Opened 12 years ago
Closed 7 years ago
IonMonkey: Refactor IonAssemblyBuffer
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: mjrosenb, Assigned: mjrosenb)
References
(Blocks 1 open bug)
Details
(Whiteboard: [ion:t] [games:p?] [patches landed in Firefox 25])
Attachments
(7 files, 10 obsolete files)
(deleted),
patch
|
jbramley
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
jbramley
:
review+
|
Details | Diff | Splinter Review |
(deleted),
application/x-gzip
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
gkw
:
feedback-
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details |
Several poor design decisions were made while writing the AssemblerBuffer code, which have been repeatedly hacked around, so I intend to undo those when the tree is stable enough I can hack on this for a week.
Things that need to be fixed:
* the API for speaking with the Assembler involves passing pointers into the AssemblerBuffer when you want to put data into a particular pool. I think I did this so there was no temptation to do arithmetic on the values, and there would be a dedicated "NONE" value. This later caused many headaches, since I was unable to just allocate a new pool, memcpy and resetting the old pool was required.
* The code for determining if pools are full feels klunky and redundant.
* There are ~10 places where we iterate over the list of subPools, taking into account their size
* On every instruction insertion, we check every subPool to see if it is full
* indexing into the instruction stream requires a linear walk every time.
Expect more bullet points as I inspect the code some more.
Updated•12 years ago
|
Whiteboard: [ion:t]
This is currently the cause of pathologically long asm.js compile times on ARM (e.g. 2.5s on desktop, 5 minutes on a Nexus 4)
Whiteboard: [ion:t] → [ion:t] [games:p1]
Assignee | ||
Comment 3•11 years ago
|
||
Despite what the name says, I haven't actually tested the perf of this patch yet, I've only tested to make sure it doesn't break anything.
Attachment #767791 -
Flags: review?
Assignee | ||
Updated•11 years ago
|
Attachment #767791 -
Flags: review? → review?(Jacob.Bramley)
Comment 4•11 years ago
|
||
Comment on attachment 767791 [details] [diff] [review]
/home/mjrosenb/patches/SuperSpeedAssemblerBuffers-r0.patch
Review of attachment 767791 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/ion/shared/IonAssemblerBuffer.h
@@ +63,3 @@
> void setNext(BufferSlice<SliceSize> *next_) {
> JS_ASSERT(this->next == NULL);
> this->next = next_;
Shouldn't this set next_->prev too, to make sure the doubly-linked list is always consistent?
The same applies to setPrev, of course.
@@ +188,5 @@
> }
> JS_ASSERT(cur != NULL);
> }
> // the offset within this node should not be larger than the node itself.
> + // this check is now completely bogus since a slightly different algorithm is used.
As discussed, this check probably still holds. If it doesn't, delete it (rather than commenting it out) and work out what's going on, and if the subsequent array lookup is actually safe.
Attachment #767791 -
Flags: review?(Jacob.Bramley) → review+
Assignee | ||
Comment 5•11 years ago
|
||
part 2, use a finger to get the last bit of perf out of getInst.
The time to compile the demo is now 1m19s, down from 2m53.254s, Yay‽
the bad news comes from the perf results:
Pre patches, the first few entries were:
61.37% js js [.] js::ion::Assembler::bind(js::ion::Label*, js::ion::BufferOffset)
3.73% js js [.] js::ion::VirtualRegister::intervalFor(js::ion::CodePosition)
3.15% js js [.] js::ion::LiveInterval::addRange(js::ion::CodePosition, js::ion::CodePosition)
2.31% js [kernel.kallsyms] [k] 0xc01240a4
1.65% js js [.] js::ion::LiveInterval::covers(js::ion::CodePosition)
1.53% js js [.] js::ion::LinearScanAllocator::findBestFreeRegister(js::ion::CodePosition*)
1.06% js js [.] js::ion::LinearScanAllocator::UnhandledQueue::enqueueBackward(js::ion::LiveInterval*)
0.95% js js [.] js::ion::EliminatePhis(js::ion::MIRGenerator*, js::ion::MIRGraph&, js::ion::Observability)
0.88% js js [.] js::ion::LinearScanAllocator::resolveControlFlow()
0.79% js js [.] js::ion::LiveRangeAllocator<js::ion::LinearScanVirtualRegister>::buildLivenessInfo()
0.79% js js [.] js::ion::LinearScanAllocator::allocateRegisters()
0.67% js js [.] js::frontend::TokenStream::getTokenInternal()
0.55% js js [.] js::ion::MDefinition::replaceAllUsesWith(js::ion::MDefinition*)
A bit of math without any testing says that the improvement was about 60%.
Concrete numbers from testing show:
9.78% js js [.] js::ion::VirtualRegister::intervalFor(js::ion::CodePosition)
8.76% js js [.] js::ion::LiveInterval::addRange(js::ion::CodePosition, js::ion::CodePosition)
5.67% js [kernel.kallsyms] [k] 0xc01329c0
4.50% js js [.] js::ion::LiveInterval::covers(js::ion::CodePosition)
3.91% js js [.] js::ion::LinearScanAllocator::findBestFreeRegister(js::ion::CodePosition*)
2.88% js js [.] js::ion::LinearScanAllocator::UnhandledQueue::enqueueBackward(js::ion::LiveInterval*)
2.58% js js [.] js::ion::EliminatePhis(js::ion::MIRGenerator*, js::ion::MIRGraph&, js::ion::Observability)
2.38% js js [.] js::ion::LinearScanAllocator::resolveControlFlow()
2.16% js js [.] js::ion::LiveRangeAllocator<js::ion::LinearScanVirtualRegister>::buildLivenessInfo()
2.13% js js [.] js::ion::LinearScanAllocator::allocateRegisters()
1.77% js js [.] js::frontend::TokenStream::getTokenInternal()
1.51% js js [.] js::ion::MDefinition::replaceAllUsesWith(js::ion::MDefinition*)
1.28% js js [.] js::ion::MBasicBlock::inherit(js::ion::MBasicBlock*, unsigned int)
1.22% js js [.] CheckExpr(FunctionCompiler&, js::frontend::ParseNode*, Use, js::ion::MDefinition**, Type*)
1.20% js js [.] js::ion::MPhi::addInputSlow(js::ion::MDefinition*, bool*)
1.16% js js [.] js::ion::MBasicBlock::discardPhiAt(js::InlineForwardListIterator<js::ion::MPhi>&)
1.09% js js [.] js::InflateUTF8StringToBuffer(JSContext*, char const*, unsigned int, unsigned short*, unsigned int*)
1.08% js js [.] js::ion::MPhi::setOperand(unsigned int, js::ion::MDefinition*)
0.92% js js [.] js::ion::ValueNumberer::lookupValue(js::ion::MDefinition*)
0.86% js js [.] js::ion::ValueNumberer::computeValueNumbers()
0.82% js js [.] js::ion::MBasicBlock::addPredecessorPopN(js::ion::MBasicBlock*, unsigned int)
0.80% js js [.] js::ion::LiveInterval::intersect(js::ion::LiveInterval*)
0.80% js js [.] JSAtom* js::AtomizeChars<(js::AllowGC)1>(JSContext*, unsigned short const*, unsigned int, js::InternBehavior)
0.77% js js [.] js::ion::LBlock::firstId()
0.75% js js [.] NameResolver::resolve(js::frontend::ParseNode*, JS::Handle<JSAtom*>)
0.75% js js [.] js::ion::Assembler::actualOffset(unsigned int) const
0.72% js js [.] JS::Compile(JSContext*, JS::Handle<JSObject*>, JS::CompileOptions, _IO_FILE*)
0.69% js js [.] PushNodeChildren(js::frontend::ParseNode*, NodeStack*)
0.65% js js [.] js::ion::MPhi::New(unsigned int)
0.64% js js [.] js::ion::ValueNumberer::eliminateRedundancies()
0.58% js js [.] js::ion::LinearScanAllocator::findBestBlockedRegister(js::ion::CodePosition*)
0.54% js js [.] js::ion::LBlock::lastId()
0.53% js js [.] js::ion::LinearScanAllocator::reifyAllocations()
tldr: these two patches eliminate all of the overhead of the assembler buffer, but there is still a ways to go.
PS: all perf numbers are from a panda board.
Attachment #767866 -
Flags: review?
Assignee | ||
Comment 6•11 years ago
|
||
just ran tests on my chromebook, which has a processor that is slightly better than the nexus 4 (but much more comparable than a pandaboard)
total time was 0m19.902s
perf's comment on the matter is:
7.44% js js [.] js::ion::VirtualRegister::intervalFor(js::ion::CodePosition)
6.60% js js [.] js::ion::LiveInterval::addRange(js::ion::CodePosition, js::ion::CodePosition)
4.45% js js [.] js::ion::LinearScanAllocator::findBestFreeRegister(js::ion::CodePosition*)
3.86% js [kernel.kallsyms] [k] 0x80496fac
3.39% js js [.] js::ion::LiveRangeAllocator<js::ion::LinearScanVirtualRegister>::buildLivenessInfo()
2.81% js js [.] js::ion::LinearScanAllocator::allocateRegisters()
2.75% js js [.] js::ion::MDefinition::replaceAllUsesWith(js::ion::MDefinition*)
2.49% js js [.] js::ion::EliminatePhis(js::ion::MIRGenerator*, js::ion::MIRGraph&, js::ion::Observability)
2.14% js js [.] js::ion::LinearScanAllocator::UnhandledQueue::enqueueBackward(js::ion::LiveInterval*)
2.02% js js [.] js::frontend::TokenStream::getTokenInternal()
1.90% js js [.] CheckExpr(FunctionCompiler&, js::frontend::ParseNode*, Use, js::ion::MDefinition**, Type*)
1.78% js js [.] js::ion::LiveInterval::covers(js::ion::CodePosition)
1.67% js js [.] js::ion::LinearScanAllocator::resolveControlFlow()
1.60% js js [.] js::InflateUTF8StringToBuffer(JSContext*, char const*, unsigned int, unsigned short*, unsigned int*)
1.30% js js [.] js::ion::ValueNumberer::lookupValue(js::ion::MDefinition*)
1.30% js js [.] js::ion::MBasicBlock::inherit(js::ion::MBasicBlock*, unsigned int)
1.26% js js [.] js::ion::MPhi::addInputSlow(js::ion::MDefinition*, bool*)
1.23% js js [.] NameResolver::resolve(js::frontend::ParseNode*, JS::Handle<JSAtom*>)
1.17% js js [.] PushNodeChildren(js::frontend::ParseNode*, NodeStack*)
1.15% js js [.] JS::Compile(JSContext*, JS::Handle<JSObject*>, JS::CompileOptions, _IO_FILE*)
1.09% js js [.] JSAtom* js::AtomizeChars<(js::AllowGC)1>(JSContext*, unsigned short const*, unsigned int, js::InternBehavior)
1.01% js js [.] js::ion::LiveInterval::addRangeAtHead(js::ion::CodePosition, js::ion::CodePosition)
0.95% js js [.] js::ion::MPhi::setOperand(unsigned int, js::ion::MDefinition*)
0.92% js js [.] js::ion::LiveInterval::intersect(js::ion::LiveInterval*)
0.83% js js [.] js::ion::MBasicBlock::addPredecessorPopN(js::ion::MBasicBlock*, unsigned int)
0.81% js js [.] js::ion::ValueNumberer::computeValueNumbers()
0.79% js js [.] js::ion::MPhi::getOperand(unsigned int) const
0.79% js js [.] js::ion::Assembler::actualOffset(unsigned int) const
0.78% js js [.] js::ion::ValueNumberer::eliminateRedundancies()
0.74% js js [.] js::ion::LBlock::lastId()
0.64% js js [.] js::ion::MNode::discardOperand(unsigned int)
0.62% js js [.] js::ion::LinearScanAllocator::findBestBlockedRegister(js::ion::CodePosition*)
0.57% js js [.] js::ion::Loop::init()
0.55% js libgcc_s.so.1 [.] __aeabi_dsub
0.54% js js [.] js::ion::LiveRangeAllocator<js::ion::LinearScanVirtualRegister>::init()
0.54% js js [.] js::Vector<js::ion::LiveInterval::Range, 1u, js::ion::IonAllocPolicy>::growStorageBy(unsigned int)
0.52% js js [.] js::ion::LinearScanAllocator::reifyAllocations()
0.51% js js [.] js::ion::MPhi::New(unsigned int)
0.50% js libc-2.17.so [.] memset
For me on the N4, this was a significant improvement, but not enough -- down to 68s. My profile looks basically the same though; 80% of the time in getInst(). This is with both patches in this bug. Also note that I'm only profiling the main thread.. I believe off main thread compilation is enabled, but now I'm not sure.
I actually see the same compilation time (~68s) whether I have profiling enabled or disabled. I thought that parallel compilation gets disabled when profiling is enabled, which then makes me wonder why I'm seeing the same compilation speed either way...
Assignee | ||
Updated•11 years ago
|
Attachment #767866 -
Flags: review? → review?(Jacob.Bramley)
Assignee | ||
Comment 9•11 years ago
|
||
small blessing in disguise: I left a print statement in there, but commented out. can you uncomment it, and give me the (rather long) log of what it prints?
Flags: needinfo?(vladimir)
Comment 10•11 years ago
|
||
Comment on attachment 767866 [details] [diff] [review]
/home/mjrosenb/patches/useAFinger-r0.patch
Review of attachment 767866 [details] [diff] [review]:
-----------------------------------------------------------------
It looks correct.
Linked lists tend to be very hard on caches, since they involve a lot of pointer chasing which is hard to predict. The finger might resolve quite a lot of that, since you can dramatically reduce the number of traversals you need to do, but it will only work well if your getInst calls have strong locality. Depending on your usage pattern, I think that you could get more performance by maintaining an index list of every chunk and not using the linked list. Your printf output should tell you that.
::: js/src/ion/shared/IonAssemblerBuffer.h
@@ +202,5 @@
> + } else {
> + // it is closest to the end
> + cur = tail;
> + cur_off = bufferSize;
> + }
It might be useful to make this a helper function ("findNearbyChunk(...)"), to aid readability.
@@ +228,4 @@
> }
> JS_ASSERT(cur != NULL);
> }
> + if (count > 2 || used_finger) {
This threshold might need tuning. For example, if the getInst accesses often alternate between two distant chunks, then updating the finger on "count > 2" will be costly since it would often be flipping between them, and each flip would trigger a lengthy lookup. On the other hand, it might work very well as it is. I think it would be useful to study the typical access patterns (if you haven't already).
@@ +235,4 @@
> // the offset within this node should not be larger than the node itself.
> // this check is now completely bogus since a slightly different algorithm is used.
> + JS_ASSERT(local_off < cur->size());
> + // printf("### %c%s took %d steps\n", sigil, used_finger ? "G" : " ", count);
Remove this (and the count++ parts) before pushing.
Attachment #767866 -
Flags: review?(Jacob.Bramley) → review+
Whoops, my previous timings were without the second patch. With the second patch (and no printf) compilation time is down to 18.9s! Definitely making solid progress.
Here's the log of the steps output, gzip'd since it's 2.5MB otherwise.
Flags: needinfo?(vladimir)
Comment 12•11 years ago
|
||
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/54c2f0363dca for android/b2g build bustage.
Relanded this for Marty with a cast:
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/379e3345913f
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/ee1918b17e98
Note: I'm not sure that cast was correct, because the result of the subraction is already a uint32. So this abs is basically a noop.
Also, there are a lot of warnings in this code that should get resolved, but in a followup.
Assignee | ||
Comment 15•11 years ago
|
||
all of the values should probably be signed, because while none of them should ever be negative, I do actually want to take the difference of two of them and get a negative value.
Comment 16•11 years ago
|
||
But
int finger_off = abs((int)(local_off - finger_offset));
doesn't take the difference and get a negative value. It's really equivalent to this:
int finger_off = abs((int)(unsigned(local_off) - finger_offset));
That is, the subtraction is done in unsigned-space, then the cast to int -- which only works correctly per spec if the subtraction result was >= 0 -- and then abs under that assumption will return the exact same value. If you want signed subtraction, you have to convert the unsigned operand to signed before subtracting.
As far as abs goes, you should use mozilla::Abs in "mozilla/MathAlgorithms.h", which returns the corresponding unsigned type. Then use SafeCast<int> from "mozilla/Casting.h" to convert to int, asserting the value will be unchanged by the cast.
Assignee | ||
Updated•11 years ago
|
Whiteboard: [ion:t] [games:p1] → [ion:t] [games:p1] [leave open]
Comment 17•11 years ago
|
||
Comment 18•11 years ago
|
||
Why not use mozilla::LinkedList?
Updated•11 years ago
|
Blocks: gecko-games
Assignee | ||
Comment 19•11 years ago
|
||
wow, that involved a whole lot more time debugging than I expected. Anyhow, this passes all jit-tests, but it likely doesn't compile on x86 or x64, due to a couple of minor api changes. An updated patch should be along later today
Attachment #8347947 -
Flags: review?(Jacob.Bramley)
Comment 20•11 years ago
|
||
Since this is a big rewrite, I think it'd be good to have somebody on the JS team review this as well. Luke, I think you and Marty have discussed this patch before, do you have some spare cycles? Else we can find somebody else or I can take it, but I'm pretty swamped atm.
Flags: needinfo?(luke)
Comment 21•11 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #20)
> Since this is a big rewrite, I think it'd be good to have somebody on the JS
> team review this as well. Luke, I think you and Marty have discussed this
> patch before, do you have some spare cycles? Else we can find somebody else
> or I can take it, but I'm pretty swamped atm.
I was going to say the same thing. I don't think I'm qualified to review a change on this scale. (I'll still do a review and give my opinions, though, but someone else should look at it too.)
Comment 22•11 years ago
|
||
I probably won't be able to get to a review until after the Christmas break, but I could start reviewing then.
In the meantime, it sounds like there is some work to be done to finish up x86/x64 and I see a number of TODOs and extraneous #ifdefs in the patch. It'd be nice to have those ironed out before starting to review.
Two other questions in the meantime:
1. Have you measured the effect of this on asm.js compile time? In particular, I'd be interested to hear about the change in the "compilation succeeded in Nms" number reported after running bullet/zlib/box2d in awfy asmjs-apps.
2. Do I recall correctly that the plan is to refactor x86/x64 buffers to use he same buffer (so that we can do fancy things like using shorter jump instructions)?
Flags: needinfo?(luke)
Comment 23•11 years ago
|
||
Comment on attachment 8347947 [details] [diff] [review]
AssemblerBufferRework-r2.patch
Review of attachment 8347947 [details] [diff] [review]:
-----------------------------------------------------------------
I've scratched the surface. As Luke said, I think I'd like to wait until the patch is in a more final state before I continue. Most of the issues I've seen so far relate to formatting and debug code.
::: js/src/jit/AsmJS.cpp
@@ +1672,5 @@
> + }
> + for (unsigned i = 0; i < module_->numExits(); i++) {
> + module_->exit(i).updateOffsets(&masm_);
> + }
> +
There's trailing whitespace here. There are other instances, but I won't note all of them.
::: js/src/jit/AsmJSModule.h
@@ +210,5 @@
> public:
> Exit() {}
> Exit(unsigned ffiIndex, unsigned globalDataOffset)
> : ffiIndex_(ffiIndex), globalDataOffset_(globalDataOffset),
> + interpCodeOffset_(UINT_MAX), ionCodeOffset_(UINT_MAX)
Should you define a meaningful name for those? Exit::offset_uninitialized perhaps.
Also, something like Exit::is(Interp|Ion)OffsetInitialized() would be a good idea, rather than checkout offset() for a magic value.
@@ +228,3 @@
> ionCodeOffset_ = off;
> }
> + void updateOffsets(jit::MacroAssembler *masm) {
This function isn't idempotent. Should it assert that it hasn't already been run?
::: js/src/jit/BaselineCompiler.cpp
@@ +718,5 @@
> return false;
> masm.bind(&noPush);
>
> // Store the start offset in the appropriate location.
> + eprintf("spsPushToggleOffset_: %d\n", spsPushToggleOffset_.offset());
Did you mean to leave this here?
::: js/src/jit/Ion.cpp
@@ +646,5 @@
> + if (getenv("MYSPEW") && !strcmp(getenv("MYSPEW"), "insns")) {
> + char cmd[4096];
> + snprintf(cmd, 4096, "gdb /proc/%d/exe %d -batch -ex 'set pagination off' -ex 'xi %d %p' 1>&2", getpid(), getpid(), insnSize_/4, code_);
> + system(cmd);
> + }
Did you mean to leave this here?
::: js/src/jit/IonCode.h
@@ +70,5 @@
> jumpRelocTableBytes_(0),
> dataRelocTableBytes_(0),
> preBarrierTableBytes_(0),
> invalidated_(false)
> + { if (getenv("MYSPEW")) fprintf(stderr, "IonCode::IonCode code=%p\n", code);}
Did you mean to leave this here?
::: js/src/jit/IonMacroAssembler.h
@@ +847,5 @@
> // the IonCode onto the stack in order to GC it correctly. exitCodePatch should
> // be unset if the code never needed to push its IonCode*.
> if (hasEnteredExitFrame()) {
> + exitCodePatch_.fixup(this);
> + eprintf("patching %p with itself\n", code->raw());
Did you mean to leave that here?
::: js/src/jit/arm/Assembler-arm.cpp
@@ +574,1 @@
> dataRelocations_.writeUnsigned(real_offset);
'offset' is never used other than to write to 'real_offset'.
@@ +580,1 @@
> jumpRelocations_.writeUnsigned(real_offset);
Ditto.
@@ +586,1 @@
> preBarriers_.writeUnsigned(real_offset);
Ditto.
@@ +598,5 @@
> + char cmd[4096];
> + snprintf(cmd, 4096, "gdb /proc/%d/exe %d --batch -ex 'set arm force-mode arm' -ex 'disas %p,+%d' -ex bt",
> + getpid(), getpid(), buffer, m_buffer.size());
> + if (m_buffer.size() > 4096 && getenv("MYSPEW"))
> + system(cmd);
Debug code.
@@ +607,5 @@
> Assembler::resetCounter()
> {
> m_buffer.resetCounter();
> }
> +#if 0
Can't you just delete the code?
@@ +2664,5 @@
> static int hit = 0;
> if (stopBKPT == hit)
> dbg_break();
> writeInst(0xe1200070 | (hit & 0xf) | ((hit & 0xfff0)<<4));
> + //writeInst(0xffff0000 | hit & 0xffff);
Did you mean to delete this?
::: js/src/jit/shared/Assembler-shared.h
@@ +562,5 @@
> }
>
> void operator = (CodeOffsetJump base) {
> raw_ = (uint8_t *) base.offset();
> + if (getenv("MYSPEW")) fprintf(stderr, "setting myself (%p) to %p\n", &raw_, raw_);
Did you mean to leave this here?
::: js/src/jit/shared/CodeGenerator-shared.cpp
@@ +394,5 @@
>
> bool
> CodeGeneratorShared::markSafepointAt(uint32_t offset, LInstruction *ins)
> {
> + // I'm decently sure this is bogus on ARM.
Why? Can you leave a reference or some explanation in case someone else stumbles across it?
@@ +418,5 @@
> // represent the join point of the call out of the function.
> //
> // At points where we want to ensure that invalidation won't corrupt an
> // important instruction, we make sure to pad with nops.
> +#ifndef JS_CPU_ARM
Why?
@@ +425,5 @@
> paddingSize -= masm.currentOffset() - lastOsiPointOffset_;
> for (int32_t i = 0; i < paddingSize; ++i)
> masm.nop();
> }
> + //JS_ASSERT(masm.currentOffset() - lastOsiPointOffset_ >= Assembler::patchWrite_NearCallSize());
Why is this commented out?
Attachment #8347947 -
Flags: review?(Jacob.Bramley) → review-
Assignee | ||
Comment 24•11 years ago
|
||
Still didn't add in the x86/x64 changes, but I remembered to deal with all of the messy allocations this time. Just need to handle OOM, which shouldn't lead to any major architectural changes. Also, got rid of a bunch more commented out code.
Attachment #8347947 -
Attachment is obsolete: true
Assignee | ||
Comment 25•11 years ago
|
||
ok, builds on x64, passes a whole boatload of tests when run under valgrind, I'm like 90 %sure that I handle all of the oom cases, and about as sure that I got the bounds checking right for falling back to long branches/calls. I'll benchmark it... soon.
Attachment #8349419 -
Attachment is obsolete: true
Attachment #8349745 -
Flags: review?(luke)
Attachment #8349745 -
Flags: review?(Jacob.Bramley)
Comment 26•11 years ago
|
||
I probably won't have time to review it until the new year I'm afraid.
(In reply to Marty Rosenberg [:mjrosenb] from comment #25)
> I'm like 90 %sure that I handle all of the oom cases, and about as sure that
> I got the bounds checking right for falling back to long branches/calls.
That 10% is quite a big window of uncertainty considering the nature of the code. (A bug in the branch handling code could easily be a security bug.) Can you do some more focussed testing to make sure that the branches are converted properly at the boundary conditions please? The JavaScript-based test suites won't be able to test that reliably.
Comment 27•11 years ago
|
||
The patch references the new file jit/shared/IonAssemblerBuffer.cpp but this does not appear to be included in the patch?
Flags: needinfo?(mrosenberg)
Assignee | ||
Comment 28•11 years ago
|
||
as Doug pointed out, I didn't add in a new .cpp file. My bad.
Attachment #8355727 -
Flags: review?(luke)
Attachment #8355727 -
Flags: review?(Jacob.Bramley)
Flags: needinfo?(mrosenberg)
Assignee | ||
Comment 29•11 years ago
|
||
unlike the second -r4, I actually qref'ed before uploading a new patch. thankfully my script realized that I've already submitted that patch.
Attachment #8349745 -
Attachment is obsolete: true
Attachment #8355727 -
Attachment is obsolete: true
Attachment #8349745 -
Flags: review?(luke)
Attachment #8349745 -
Flags: review?(Jacob.Bramley)
Attachment #8355727 -
Flags: review?(luke)
Attachment #8355727 -
Flags: review?(Jacob.Bramley)
Attachment #8355729 -
Flags: review?(luke)
Attachment #8355729 -
Flags: review?(Jacob.Bramley)
Comment 30•11 years ago
|
||
Thank you for the updated patch.
A little rebasing was needed to apply the patch to the m-c tip and fwiw the rebased patch is attached. Also noticed some trailing white space which was removed.
There is still a good deal of debug code in the patch but it's not distracting and might even be useful in the short term. There are quite a few compilation warnings that could be addressed, unused variables etc.
Initial testing on some asm.js code is positive. Ammo and BananaBread work on the ARMv7, and Ammo works on the ARMv6. Had to apply the patch in bug 861208, and this might be something to consider here. BananaBread failed on the ARMv6 with an assertion failure which might be related to FP loads from the pools:
Assembler-arm.h line 511:
Imm8VFPOffData(uint32_t imm) : data (imm) {
JS_ASSERT((imm & ~(0xff)) == 0);
}
Shall continue exploring the code and try to debug this, and test it further. Shall try the jit-tests on the ARMv6 to see if some simpler tests failure.
Noticed some crashes reloading demos repeatedly, that did not seems to occur if loading after a fresh start, but it's not yet clear if these are related to the changes here.
Great work, looks very promising.
Comment 31•11 years ago
|
||
jit-test tbpl on an ARMv6 device went very well and was a big improvement. All the previous buffer pool failures were corrected! There was only one new failure for tests/asm.js/testBasic.js
0x003d48c4 in js::jit::DtrOffImm::DtrOffImm(int) () at js/src/jit/arm/Assembler-arm.h:782
782 JS_ASSERT(mozilla::Abs(imm) < 4096);
#0 0x003d48c4 in js::jit::DtrOffImm::DtrOffImm(int) () at js/src/jit/arm/Assembler-arm.h:782
#1 0x003c5a20 in js::jit::Assembler::patchConstantPoolLoad(js::jit::Instruction*, void*) () at js/src/jit/arm/Assembler-arm.cpp:2039
jit-test tbpl on an ARMv7 device produces a good number of new failures. Too many to analyses them all, but so far they all crashed with a SIGSEGV on the following instruction sequence.
0x40ccde04: movw r0, #65535 ; 0xffff
0x40ccde08: movt r0, #65535 ; 0xffff
=> 0x40ccde0c: ldr r12, [r0, #124] ; 0x7c
Comment 32•11 years ago
|
||
uneval(this)
asserts js debug shell on m-c rev 30f3710477c2 with --ion-eager --ion-parallel-compile=off at Assertion failure: typeAlloc->isArgument(), at jit/LinearScan.cpp with the patch in comment 30.
(this was a drive-by fuzzing instance, I'd also suggest a try run on all relevant platforms prior to landing)
Updated•11 years ago
|
Attachment #8355754 -
Flags: feedback-
Comment 33•11 years ago
|
||
Comment on attachment 8355754 [details] [diff] [review]
Rebased and cleaned up some trailing white space.
Dang it. It turns out I was hitting a variant of bug 958432, and the patch there seems to fix the issue.
Attachment #8355754 -
Flags: feedback-
Comment 34•11 years ago
|
||
Comment on attachment 8355729 [details] [diff] [review]
AssemblerBufferRework-r5.patch
Sounds like there are some issues dougc found (crashes, remaining debug/todo code). Perhaps, since dougc is going to be working with this a lot, he could review the patch for logic and then give Jan the review for Ion style?
Attachment #8355729 -
Flags: review?(luke)
Assignee | ||
Comment 35•11 years ago
|
||
I updated to your rebased patch, and added in a fix for the one bug I found. it looks like it is *mostly* clean on armel as well as armhf, but I need to do a bit of investigation that may take a while. (my armel machine is acting up). Feel free to test this out, and see if it fixes the bugs you've been seeing.
Attachment #8361322 -
Flags: feedback?(dtc-moz)
Comment 36•11 years ago
|
||
Comment on attachment 8361322 [details] [diff] [review]
AssemblerBufferRework-r6.patch
Review of attachment 8361322 [details] [diff] [review]:
-----------------------------------------------------------------
Thank you, the fix helps a lot. All the jit-tests now pass on the ARMv7. There are still are few jit-test failures on the ARMv6 that appear to be related and they all appear to have the same signature:
Assertion failure: data == imm, at js/src/jit/arm/Assembler-arm.h:553
#0 0x0037ec96 in js::jit::datastore::Imm12Data::Imm12Data(unsigned int) () at js/src/jit/arm/Assembler-arm.h:553
#1 0x0037f1c0 in js::jit::DtrOffImm::DtrOffImm(int) () at js/src/jit/arm/Assembler-arm.h:780
#2 0x0035bfd0 in js::jit::Assembler::patchConstantPoolLoad(js::jit::Instruction*, void*) () at js/src/jit/arm/Assembler-arm.cpp:2022
#3 0x003862f8 in js::jit::AssemblerBufferWithConstantPool<4, js::jit::Instruction, js::jit::Assembler, js::jit::SliceKind, 1, 2>::executableCopy(unsigned char*) ()
at js/src/jit/shared/IonAssemblerBufferWithConstantPools.h:980
#4 0x00358db0 in js::jit::Assembler::executableCopy(unsigned char*) () at js/src/jit/arm/Assembler-arm.cpp:595
#5 0x002565a2 in js::jit::JitCode::copyFrom(js::jit::MacroAssembler&) () at js/src/jit/Ion.cpp:645
#6 0x0023ab14 in js::jit::JitCode* js::jit::Linker::newCode<(js::AllowGC)1>(JSContext*, JSC::ExecutableAllocator*, JSC::CodeKind) () at js/src/jit/IonLinker.h:63
#7 0x00225c32 in js::jit::JitCode* js::jit::Linker::newCode<(js::AllowGC)1>(JSContext*, JSC::CodeKind) () at js/src/jit/IonLinker.h:81
#8 0x001a2018 in js::jit::BaselineCompiler::compile() () at js/src/jit/BaselineCompiler.cpp:111
#9 0x001cbedc in js::jit::BaselineCompile(JSContext*, JS::Handle<JSScript*>) () at js/src/jit/BaselineJIT.cpp:236
Shall continue testing and explore the jit-test failures.
Attachment #8361322 -
Flags: feedback?(dtc-moz) → feedback?
Comment 37•11 years ago
|
||
The patch in bug 861208 was necessary to stop some assertion failures even on the ARMv7, so it has been added as a dependency for now. Previously bug 861208 was only seen on the ARMv6.
Some larger Odin demos run on the ARMv7, such as ammo. BananaBench compiles but crashes on the ARMv7, SIGSEGV at 0xfffffffe. Was working before the last 'fix'. Following the link register points to the call below. The buffer before the call appears to be corrupt and a call address pointer was not set.
0x88fdfc2c: beq 0x88fdfc34
0x88fdfc30: bkpt 0x45bf
0x88fdfc34: movw r12, #48992 ; 0xbf60
0x88fdfc38: movt r12, #16391 ; 0x4007
0x88fdfc3c: andcs r0, r0, r0 \ corrupt buffer?
0x88fdfc40: movwgt r6, #52213 ; 0xcbf5 |
0x88fdfc44: andge r0, r0, r0 |
0x88fdfc48: svclt 0x0091df46 |
0x88fdfc4c: andeq r0, r0, r0 |
0x88fdfc50: andeq r0, r0, r0 |
0x88fdfc54: andeq r0, r0, r0 /
0x88fdfc58: movw r12, #65535 ; 0xffff < not set?
0x88fdfc5c: movt r12, #65535 ; 0xffff <
0x88fdfc60: blx r12
0x88fdfc64: vmov d0, r0, r1
Assignee | ||
Comment 38•11 years ago
|
||
So I tried to reproduce this using a debug-only armel build, but after compiling, it seems to die a fairly normal death:
exception thrown: TypeError: GLctx is undefined,_glGetString@/home/mjrosenb/bench/BananaBread/cube2/bb.js:6435
_glGetString@/home/mjrosenb/bench/BananaBread/cube2/bb.js:6687
Module._main@./js/game-setup.js:115
callMain@/home/mjrosenb/bench/BananaBread/cube2/bb.js:16060
doRun@/home/mjrosenb/bench/BananaBread/cube2/bb.js:16099
run/<@/home/mjrosenb/bench/BananaBread/cube2/bb.js:16109
window.runEventLoop@/home/mjrosenb/bench/BananaBread/cube2/game/headless.js:73
@./js/game-setup.js:605
/home/mjrosenb/bench/BananaBread/cube2/bb.js:6435:10 TypeError: GLctx is undefined
How did you build the shell, and how did you run the benchmark?
Flags: needinfo?(dtc-moz)
Comment 39•11 years ago
|
||
(In reply to Marty Rosenberg [:mjrosenb] from comment #38)
> So I tried to reproduce this using a debug-only armel build, but after
> compiling, it seems to die a fairly normal death:
> exception thrown: TypeError: GLctx is
> undefined,_glGetString@/home/mjrosenb/bench/BananaBread/cube2/bb.js:6435
> _glGetString@/home/mjrosenb/bench/BananaBread/cube2/bb.js:6687
> Module._main@./js/game-setup.js:115
> callMain@/home/mjrosenb/bench/BananaBread/cube2/bb.js:16060
> doRun@/home/mjrosenb/bench/BananaBread/cube2/bb.js:16099
> run/<@/home/mjrosenb/bench/BananaBread/cube2/bb.js:16109
> window.runEventLoop@/home/mjrosenb/bench/BananaBread/cube2/game/headless.js:
> 73
> @./js/game-setup.js:605
>
> /home/mjrosenb/bench/BananaBread/cube2/bb.js:6435:10 TypeError: GLctx is
> undefined
>
> How did you build the shell, and how did you run the benchmark?
It was run in a browser, not a shell. The simulator in bug 959597 will run BananaBread in the browser. Let me try and sort out some of the simulator issues and then try to reproduce the crash seen using the simulator.
The movw r12, #65535, movt r12, #65535, sequence is quite distinctive and perhaps some hack code could scan all compiled buffers for this sequence to try and locate the error at compile time.
Flags: needinfo?(dtc-moz)
Assignee | ||
Comment 40•11 years ago
|
||
Ok, I think I fixed the issue with that stray write into the instruction stream. I also put in some assertions, but they may be bogus (I haven't actually tested this yet, mostly because I wasn't able to reproduce the error initially.).
Attachment #8367441 -
Flags: review?(Jacob.Bramley)
Comment 41•11 years ago
|
||
Comment on attachment 8367441 [details] [diff] [review]
AssemblerBufferRework-r7.patch
I'm not going to be able to review this in a timely manner. Doug, could you have a look please? Luke suggested you in comment 34.
Attachment #8367441 -
Flags: review?(Jacob.Bramley) → review?(dtc-moz)
Comment 42•11 years ago
|
||
Comment on attachment 8355729 [details] [diff] [review]
AssemblerBufferRework-r5.patch
This was replaced with newer patches.
Attachment #8355729 -
Flags: review?(Jacob.Bramley)
Comment 43•11 years ago
|
||
(In reply to Marty Rosenberg [:mjrosenb] from comment #40)
> Created attachment 8367441 [details] [diff] [review]
> AssemblerBufferRework-r7.patch
Marty, do you mind rebasing this patch to m-c tip? (rev cafe909f7e07 at the time of this writing)
Flags: needinfo?(mrosenberg)
Comment 44•11 years ago
|
||
Excuse some minor reformatting of text comments, and wip review annotations.
Flags: needinfo?(mrosenberg)
Comment 45•11 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #44)
> Created attachment 8372742 [details] [diff] [review]
> R7 rebased
>
> Excuse some minor reformatting of text comments, and wip review annotations.
I hit this compile error:
Traceback (most recent call last):
File "./config.status", line 419, in <module>
config_status(**args)
File "/home/fuzz5lin/Desktop/js-dbg-32-dm-ts-er-sfp-linux-mozilla-central-167616-cafe909f7e07-dgapI_-patched/compilePath/python/mozbuild/mozbuild/config_status.py", line 99, in config_status
summary = backend.consume(definitions)
File "/home/fuzz5lin/Desktop/js-dbg-32-dm-ts-er-sfp-linux-mozilla-central-167616-cafe909f7e07-dgapI_-patched/compilePath/python/mozbuild/mozbuild/backend/base.py", line 204, in consume
for obj in objs:
File "/home/fuzz5lin/Desktop/js-dbg-32-dm-ts-er-sfp-linux-mozilla-central-167616-cafe909f7e07-dgapI_-patched/compilePath/python/mozbuild/mozbuild/frontend/emitter.py", line 104, in emit
objs = list(self.emit_from_sandbox(out))
File "/home/fuzz5lin/Desktop/js-dbg-32-dm-ts-er-sfp-linux-mozilla-central-167616-cafe909f7e07-dgapI_-patched/compilePath/python/mozbuild/mozbuild/frontend/emitter.py", line 205, in emit_from_sandbox
% (symbol, src, sandbox['RELATIVEDIR']))
mozbuild.frontend.reader.SandboxValidationError: Reference to a file that doesn't exist in UNIFIED_SOURCES (jit/shared/IonAssemblerBuffer.cpp) in js/src
Flags: needinfo?(dtc-moz)
Comment 46•11 years ago
|
||
Sorry, forgot to add the new file to the patch set.
Attachment #8372742 -
Attachment is obsolete: true
Flags: needinfo?(dtc-moz)
Comment 47•11 years ago
|
||
Comment on attachment 8372763 [details] [diff] [review]
R7 rebased
(function(b, foreign) {
"use asm";
var ff = foreign.f;
function f() {
ff((.0), 0, 7, (.0), 0, (.0), (.0), (.7), (.5),
(.0), (.6), (-0), (.1), (.0), (.0), (.0), (1.), (.0), (.0),
(.0), (-0), (.0), (.0), (.0), (.0), (.0), (.0), (.7), (.0))
}
return f
})
$ ./js-dbg-32-dm-ts-er-sfp-linux-cafe909f7e07 --ion-eager testcase.js
Assertion failure: false (MOZ_ASSUME_UNREACHABLE(unsupported relocation)), at jit/arm/Assembler-arm.cpp:795
Attachment #8372763 -
Flags: feedback-
Flags: needinfo?(dtc-moz)
Comment 48•11 years ago
|
||
Tested on Ubuntu on ARM pandaboard.
Comment 49•11 years ago
|
||
(In reply to Gary Kwong [:gkw] [:nth10sd] catching up on email/bugmail from comment #47)
> Comment on attachment 8372763 [details] [diff] [review]
> R7 rebased
>
> (function(b, foreign) {
> "use asm";
> var ff = foreign.f;
> function f() {
> ff((.0), 0, 7, (.0), 0, (.0), (.0), (.7), (.5),
> (.0), (.6), (-0), (.1), (.0), (.0), (.0), (1.), (.0), (.0),
> (.0), (-0), (.0), (.0), (.0), (.0), (.0), (.0), (.7), (.0))
> }
> return f
> })
>
> $ ./js-dbg-32-dm-ts-er-sfp-linux-cafe909f7e07 --ion-eager testcase.js
>
> Assertion failure: false (MOZ_ASSUME_UNREACHABLE(unsupported relocation)),
> at jit/arm/Assembler-arm.cpp:795
This does not reproduce in the ARM simulator with the patch in bug 861208 applied. A rebased patch for bug 861208 has just been uploaded. Is it convenient to apply this too, or do you really need a combined patch?
Flags: needinfo?(dtc-moz)
Comment 50•11 years ago
|
||
I tried that additional patch and when run with --ion-eager, got:
Assertion failure: i->is<InstMovW>(), at jit/arm/MacroAssembler-arm.cpp
Flags: needinfo?(dtc-moz)
Comment 51•11 years ago
|
||
(In reply to Gary Kwong [:gkw] [:nth10sd] catching up on email/bugmail from comment #50)
> Created attachment 8375383 [details]
> second stack with additional patch from bug 861208
>
> I tried that additional patch and when run with --ion-eager, got:
>
> Assertion failure: i->is<InstMovW>(), at jit/arm/MacroAssembler-arm.cpp
Ok, thank you, I was able to reproduce this in the simulator by applying just these patches.
A small fix to the alignment code makes this failure go away, but it might not be the cause.
Flags: needinfo?(dtc-moz)
Comment 52•11 years ago
|
||
Some brackets were needed here to get the desired precedence. This just avoids extra alignment fill, the alignment was not wrong.
Comment 53•11 years ago
|
||
(In reply to Gary Kwong [:gkw] [:nth10sd] catching up on email/bugmail from comment #50)
> Created attachment 8375383 [details]
> second stack with additional patch from bug 861208
>
> I tried that additional patch and when run with --ion-eager, got:
>
> Assertion failure: i->is<InstMovW>(), at jit/arm/MacroAssembler-arm.cpp
The revised patch in bug 861208 fixes the real issue. There might be more cases that need similar handling. Some refactoring might be appropriate.
It is not necessary to apply the patch noted in comment 52, but it would make it easer to reproduce future crashes if we are using the same patches.
Thank you for testing, and further testing would help.
Comment 54•11 years ago
|
||
> It is not necessary to apply the patch noted in comment 52, but it would
> make it easer to reproduce future crashes if we are using the same patches.
So ideally I'd need the patch in comment 46, the patch in bug 861208 comment 5, and the patch in comment 52?
Flags: needinfo?(dtc-moz)
Comment 55•11 years ago
|
||
(In reply to Gary Kwong [:gkw] [:nth10sd] catching up on email/bugmail from comment #54)
> > It is not necessary to apply the patch noted in comment 52, but it would
> > make it easer to reproduce future crashes if we are using the same patches.
>
> So ideally I'd need the patch in comment 46, the patch in bug 861208 comment
> 5, and the patch in comment 52?
Yes. If your expectation is a combined patch then just let me know and I will bundled everything together?
Flags: needinfo?(dtc-moz)
Comment 56•11 years ago
|
||
> Yes. If your expectation is a combined patch then just let me know and I
> will bundled everything together?
Such a combined patch is always appreciated, as it can then be clear which problems found are for exactly which patch. :)
Or you could file a new meta bug with these combined patches to test 'em all, or whatever you're most comfortable with.
Flags: needinfo?(dtc-moz)
Comment 57•11 years ago
|
||
Comment on attachment 8367441 [details] [diff] [review]
AssemblerBufferRework-r7.patch
The work will continue of a user repo and the combined set will be reviewed when ready.
Attachment #8367441 -
Flags: review?(dtc-moz)
Flags: needinfo?(dtc-moz)
Comment 58•11 years ago
|
||
Comment on attachment 8367441 [details] [diff] [review]
AssemblerBufferRework-r7.patch
A later rebased patch has been supplied.
Attachment #8367441 -
Attachment is obsolete: true
Updated•11 years ago
|
Attachment #8361322 -
Attachment is obsolete: true
Attachment #8361322 -
Flags: feedback?
Updated•11 years ago
|
Attachment #8355754 -
Attachment is obsolete: true
Updated•11 years ago
|
Attachment #8355729 -
Attachment is obsolete: true
Comment 59•11 years ago
|
||
Comment on attachment 8375471 [details] [diff] [review]
Fix an alignment calculation
Moved to follow up work on a user repo.
Attachment #8375471 -
Attachment is obsolete: true
Comment 60•11 years ago
|
||
Attachment 8372763 [details] [diff] has been landed on a user repo to continue the work:
https://hg.mozilla.org/users/dtc-moz_scieneer.com/asmbuffer/rev/d2844e88d8d8
Comment 61•10 years ago
|
||
What's the status on this bug?
Whiteboard: [ion:t] [games:p1] [leave open] → [ion:t] [games:p?] [leave open]
Comment 62•7 years ago
|
||
I assume this bug is no longer valid and/or the patches are heavily bitrotted ... can we close this then? Is there anything actionable left for this bug?
Flags: needinfo?(Virtual)
Comment 63•7 years ago
|
||
Yep, let's mark it as FIXED, per Comment 17, as some patches landed.
If there will be some leftovers, let's track it in new bug, which will block this one.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Flags: needinfo?(Virtual)
Resolution: --- → FIXED
Whiteboard: [ion:t] [games:p?] [leave open] → [ion:t] [games:p?] [patches landed in Firefox 25]
Updated•7 years ago
|
Version: unspecified → 25 Branch
You need to log in
before you can comment on or make changes to this bug.
Description
•