Closed Bug 699415 Opened 13 years ago Closed 13 years ago

IonMonkey: improve block reordering algorithm

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jandem, Assigned: jandem)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

IM reorders basic blocks so that they are in reverse postorder. A problem with the current algorithm is that the resulting block list does not (always) reflect the original program very well. For instance, consider this function: -- function f() { for (var i=0; i<0x123;) { i += 5; } var a = i * 0x999; while (true) {} return a; } -- Here we generate the following code: -- // i < 0x123? 0x9ce167: cmp $0x123,%eax 0x9ce16d: jl 0x9ce1a3 // a = i * 0x999 0x9ce173: mov %eax,%ecx 0x9ce175: imul $0x999,%eax,%eax 0x9ce17b: jo 0x9ce004 // while (true) {} 0x9ce181: mov $0x1,%ecx 0x9ce186: test %ecx,%ecx 0x9ce188: jne 0x9ce19a // a = i, jump to return 0x9ce18e: mov $0xffffff81,%ecx 0x9ce193: mov %eax,%edx 0x9ce195: jmp 0x9ce1b9 // next iteration of second loop 0x9ce19a: mov %eax,%edx 0x9ce19c: mov %edx,%eax 0x9ce19e: jmp 0x9ce181 // i += 5 0x9ce1a3: mov %eax,%ecx 0x9ce1a5: mov %ecx,0x7c(%esp) 0x9ce1a9: add $0x5,%ecx 0x9ce1ac: jo 0x9ce009 // next iteration of first loop 0x9ce1b2: mov %ecx,%eax 0x9ce1b4: jmp 0x9ce167 // return a 0x9ce1b9: add $0x80,%esp 0x9ce1bf: ret -- The block reordering algorithm visits the blocks in post order and reverses the result to get the reverse postorder. We should either build a reverse postorder directly, or change the first step of the algorithm to visit the successors of a node in reverse order. The second step of the algorithm will then reverse the list and restore the original order where possible. Strictly speaking this may not be a "reverse postorder" anymore, but it generates more efficient code (and does not introduce new jit-test failures): -- // i < 0x123? 0x9ce167: cmp $0x123,%eax 0x9ce16d: jge 0x9ce183 // i += 5 0x9ce173: mov %eax,%ecx 0x9ce175: add $0x5,%eax 0x9ce178: jo 0x9ce004 // next iteration of first loop 0x9ce17e: jmp 0x9ce167 // i *= 0x999 0x9ce183: mov %eax,%ecx 0x9ce185: mov %ecx,0x7c(%esp) 0x9ce189: imul $0x999,%ecx,%ecx 0x9ce18f: jo 0x9ce009 // while (true) {} 0x9ce195: mov $0x1,%eax 0x9ce19a: test %eax,%eax 0x9ce19c: je 0x9ce1a7 0x9ce1a2: jmp 0x9ce195 // a = i, return a 0x9ce1a7: mov $0xffffff81,%eax 0x9ce1ac: mov %ecx,%edx 0x9ce1ae: mov %eax,%ecx 0x9ce1b0: add $0x80,%esp 0x9ce1b6: ret -- 19 instructions instead of 22, and it more closely resembles the original source code.
Nice analysis. Strict RPO is required for most phases in the pipeline, including lowering, though codegen should be free to have almost any order. Would it improve things to change the order loop successors are visited, versus normal branches, in the ReorderBlocks()?
We really want all blocks in a loop to be contiguous, which I think the above suggestion accomplishes. If we have contiguous blocks, then LSRA can allocate a single interval for the entire loop, so a chunk of nasty code starting at LinearScan.cpp:553 can vanish.
I haven't been able to find a testcase where this breaks (using Anion), but we decided that it may be better/safer to reorder blocks between regalloc and codegen. I'm going to try this tomorrow. If it works we can probably still change ReorderBlocks to use contiguous blocks for loops and simplify LSRA.
Attached patch Fix (deleted) — Splinter Review
Talked about this with sstangl, we'd like to get it in soon.
Attachment #576704 - Flags: review?(sstangl)
Comment on attachment 576704 [details] [diff] [review] Fix Review of attachment 576704 [details] [diff] [review]: ----------------------------------------------------------------- Thanks. Having loops contiguous makes stepping through generated code extremely pleasant. This probably isn't sufficient to remove code from LSRA, but it does nicely reorder the blocks in most circumstances by mimicking IonBuilder patterns. Great idea. ::: js/src/ion/IonAnalysis.cpp @@ +628,5 @@ > Vector<unsigned int, 0, IonAllocPolicy> successors; > InlineList<MBasicBlock> done; > > MBasicBlock *current = *graph.begin(); > + unsigned int nextSuccessor = current->numSuccessors() - 1; A comment explaining this strange choice of ordering may be useful =] @@ +637,5 @@ > while (true) { > if (!current->isMarked()) { > current->mark(); > > if (nextSuccessor < current->lastIns()->numSuccessors()) { Would you mind changing this code to "current->numSuccessors()"? Use of lastIns() directly is deprecated. @@ +643,5 @@ > if (!successors.append(nextSuccessor)) > return false; > > current = current->lastIns()->getSuccessor(nextSuccessor); > + nextSuccessor = current->lastIns()->numSuccessors() - 1; This can just be current->numSuccessors() - 1, as it is above.
Attachment #576704 - Flags: review?(sstangl) → review+
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: