Closed
Bug 631637
Opened 14 years ago
Closed 11 years ago
JM: Measure per-opcode codegen size
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
WONTFIX
People
(Reporter: dvander, Assigned: dvander)
(Whiteboard: [MemShrink:P3])
Attachments
(2 files)
patch (deleted)
text/plain (deleted)
This patch, for every script, emits the following statistics about each op it compiled:
(1) How many times that op was encountered
(2) The total number of inline bytes generated for that op
(3) The total number of out-of-line bytes generated for that op
(4) The total number of sync/vmcall bytes for that op, either inline or OOL
(this is a subset of 2 and 3)
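A minimal sketch of the kind of per-opcode bookkeeping described above (names and numbers are illustrative, not taken from the actual JM patch):

```python
# Hedged sketch: per-opcode codegen accounting as described in the bug.
# Field names are illustrative; the real patch's identifiers may differ.
from collections import defaultdict

class OpStats:
    __slots__ = ("count", "inline_bytes", "ool_bytes", "sync_bytes")
    def __init__(self):
        self.count = 0         # (1) times this op was compiled
        self.inline_bytes = 0  # (2) inline code generated for it
        self.ool_bytes = 0     # (3) out-of-line code generated for it
        self.sync_bytes = 0    # (4) sync/vmcall bytes (subset of 2 and 3)

stats = defaultdict(OpStats)

def record(op, inline=0, ool=0, sync=0):
    """Accumulate codegen sizes each time `op` is compiled."""
    s = stats[op]
    s.count += 1
    s.inline_bytes += inline
    s.ool_bytes += ool
    s.sync_bytes += sync

# e.g. one compiled call site, with byte counts like those in comment 1:
record("JSOP_CALL", inline=130, ool=130, sync=60)
```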
Assignee
Comment 1•14 years ago
This is the result of logging into Quora and opening four questions.
We appear to generate 10MB of inline code and 11MB of OOL code. Of that 21MB, about 9MB goes to sync/vmcalls.
The top offending opcodes are:
* call (about 130 bytes on il and ool path)
* callprop, setprop, getprop (120-160 bytes on ool path)
* name (90 bytes on ool path)
* getelem (100 bytes on il path, 150 bytes on ool path)
* lambda (65 bytes on il path, 55 in sync)
* getgname (55 bytes on il path, 81 on ool path).
Based on this, I think:
(1) We generate way too much sync code, accounting for almost 50% of generated code.
(2) Very common ops that have warmups (like CALL) should probably be purely an IC, with no inline or OOL paths.
Comment 2•14 years ago
(In reply to comment #1)
> (1) We generate way too much sync code, accounting for almost 50% of generated
> code.
Definitely. Good analysis in bug 631658.
> (2) Very common ops that have warmups (like CALL) should probably be purely an
> IC, with no inline or OOL paths.
I haven't refreshed myself on the details of compiling CALL lately, but I wonder how likely we are to hit a given CALL opcode. If likely, then it seems like compiling it in the first round is fine, because we'll compile it later, anyway. I'm not sure how likely is "likely", though. Even if it's only 70%, for CALL that could save us 0.3*4.7MB = 1.4MB, or 7% of the total jitcode allocation in this example.
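The savings estimate above works out with quick arithmetic (the 4.7MB CALL figure is from this comment, the 21MB total from comment 1; the 30% miss rate is the hypothetical from the text):

```python
# Rough check of the estimate above: if 30% of compiled CALL sites are
# never executed, skipping their up-front codegen saves that fraction
# of the CALL code. Figures are from the comments, not remeasured.
call_code_mb = 4.7        # CALL inline + OOL code in this run
never_hit_fraction = 0.3  # hypothetical miss rate (1 - "likely")
total_jitcode_mb = 21.0   # inline + OOL total from comment 1

saved_mb = never_hit_fraction * call_code_mb      # ~1.4 MB
saved_pct = 100 * saved_mb / total_jitcode_mb     # ~7% of jitcode
print(f"saved ~{saved_mb:.1f}MB (~{saved_pct:.0f}% of jitcode)")
```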
Assignee
Comment 3•14 years ago
On this Quora run, 2684865 bytes (2.5MB) went to just updating the PC in sync paths!
Assignee
Comment 4•14 years ago
On techcrunch.com, the opcode breakdown is basically identical to Quora's. Interesting.
33.5MB of inline code, 38.5MB of ool code. Of that ~70MB, 31MB is sync code, and 8.3MB of that is PC updating.
Assignee
Comment 5•14 years ago
Another techcrunch.com workload:
* 8MB to sync code
(8123504 bytes)
* 2MB to PC syncing
(2175795 bytes)
* 7MB to vmcall sequences overall (this includes PC/SP updating)...
(6951788 bytes)
A vm call is:
* 15 bytes for regs.pc update
* 9 bytes for regs.sp update
* 5 bytes for regs.fp update
* 12 bytes for move+call
* 3 bytes to move VMFrame -> arg0
So, of that 8MB of sync code,
* 29% goes to updating regs.pc
* 23% goes to call instructions
* 17% goes to updating regs.sp
* 15% goes to stack syncing
* 10% goes to updating regs.fp
* 6% goes to moving VMFrame -> arg0
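As a sanity check, the percentages above can be roughly reconstructed from the raw byte counts, assuming stack syncing is whatever sync code is not part of a vm-call sequence (small rounding differences from the figures above are expected):

```python
# Rough reconstruction of the breakdown above from the raw byte counts.
sync_bytes = 8123504    # total sync code (the "8MB" figure above)
vmcall_bytes = 6951788  # vm-call sequences (the "7MB" figure above)

# Per-vm-call byte costs listed above (44 bytes per call in total):
per_call = {"regs.pc": 15, "regs.sp": 9, "regs.fp": 5,
            "move+call": 12, "VMFrame -> arg0": 3}
per_call_total = sum(per_call.values())  # 44

# Each component's share of all sync code, split pro rata by its
# per-call byte cost:
shares = {name: vmcall_bytes * b / per_call_total / sync_bytes
          for name, b in per_call.items()}
# Treat the remainder of sync code as stack syncing:
shares["stack syncing"] = (sync_bytes - vmcall_bytes) / sync_bytes

for name, frac in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:16} {frac:6.1%}")
```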
Comment 6•14 years ago
Can we look at computing pc for those (few) native methods and accessors that do bytecode inspection using some side mapping from mjit-generated eip to bytecode pc? Does our debugger support already have something like this?
/be
Comment 7•14 years ago
Nice work, dvander!
Updated•14 years ago
Blocks: MemShrinkTools
Updated•13 years ago
Whiteboard: [MemShrink:P3]
Comment 8•11 years ago
JM was removed; Baseline shares IC stub code, and Ion generates much smaller code thanks to type information (and is only used for relatively hot code anyway).
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX