Closed
Bug 1322019
Opened 8 years ago
Closed 8 years ago
Print stack transition in dis() function output.
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla54
People
(Reporter: arai, Assigned: arai)
References
Details
Attachments
(6 files, 6 obsolete files)
(deleted),
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
Currently we have stack convention data in Opcodes.h, as comment, like this:
> * Swaps the top two values on the stack. This is useful for things like
> * post-increment/decrement.
> * Category: Operators
> * Type: Stack Operations
> * Operands:
> * Stack: v1, v2 => v2, v1
We could use similar data to print dummy stack transition, next to bytecode in dis() output.
something like this:
00000: getgname "foo" # val
00005: dup # val val
00006: callprop "hello" # val obj[name]
00011: swap # obj[name] val
00012: getgname "a" # obj[name] val val
00017: getgname "b" # obj[name] val val val
00022: add # obj[name] val (lval+rval)
00023: call 1 # rval
00026: return #
00027: retrval #
(we could do some more formatting tho)
I'm doing it manually everytime modifying bytecode, and it should be nice if it's done automatically.
Assignee | ||
Comment 1•8 years ago
|
||
or, perhaps we could create separated python script to annotate dis() output,
if introducing that to Opcodes.h makes it complicated.
Assignee | ||
Comment 2•8 years ago
|
||
I'll go with python script as a first step.
Summary: Print stack transition in dis() function output. → Add a script to print stack transition in dis() function output.
Assignee | ||
Comment 3•8 years ago
|
||
Here's WIP patch.
Assignee | ||
Comment 4•8 years ago
|
||
and output of the following command:
cat dump.txt | ./js/src/vm/show_stack.py `pwd`
where dump.txt contains the bytecode section of the output of the following:
dis(async function*() { for await (x of y) { if (x) break; } })
with bug 1331092 patch applied.
Assignee | ||
Comment 5•8 years ago
|
||
currently it cannot handle lambda/lambda_arrow, since it can print multiple-line operand.
Assignee | ||
Comment 6•8 years ago
|
||
Updated to handle more operands
also add comments for jump targets.
Attachment #8836326 -
Attachment is obsolete: true
Assignee | ||
Comment 7•8 years ago
|
||
output for dis(async function*() { for await (x of y) { if (x) break; } })
Attachment #8836327 -
Attachment is obsolete: true
Assignee | ||
Comment 8•8 years ago
|
||
now it ignores all non-disassembly line, so it can be used like
$ ./obj-dir/dist/bin/js | ./js/src/vm/show_stack.py `pwd`
dis(async function*() { for await (x of y) { if (x) break; } })
Ctrl+D
Assignee | ||
Comment 9•8 years ago
|
||
forgot to update the patch
Attachment #8836450 -
Attachment is obsolete: true
Assignee | ||
Comment 10•8 years ago
|
||
[Opcodes.h]
JSOP_SYMBOL
removed "," that means the separator between operands, while JSOP_SYMBOL
has only one operand
JSOP_STRING
renamed "string" to "atom", to associate it with "atomIndex" operand
[show_stack.py]
It receives the output of dis() function from stdin, ignoring all not-parsible lines.
It prints the calculated output when it find empty line or EOF.
The output is in the following format:
pc: opcode operands # stack
like the following for for `10 + x`:
00000: int8 10 # 10
00002: getgname "x" # 10 x
00007: add # (10 + x)
It also prints the information about jump target,
like the following for for `y = x ? 10 : 20;`:
00000: bindgname "y" # global
00005: getgname "x" # global x
00010: ifeq 23 (+13) # global
00015: jumptarget # global
00016: int8 10 # global 10
00018: goto 26 (+8) # global 10
# from ifeq @ 00010
00023: jumptarget # global
00024: int8 20 # global 20
# from goto @ 00018
00026: jumptarget # global 10
00027: setgname "y" # 10
00032: pop #
It calculates the stack for each opcode, from PC 0, by following the execution (jump, branch, try-catch-finally).
Stack is calculated like following ways (maybe too simplified):
1. let next_list be an list with (PC=0, stack=empty)
2. while next_list is not empty:
a. pick an item from next_list
b. apply Stack uses/defs to stack, for opcode pointed by PC
c. remember the stack for PC
d. for each possible jump target of current opcode:
i. if the jump target is not yet checked:
1. append (PC=jump target, stack=clone of current stack) to next_list
When applying Stack uses/defs, it tries to keep name, like:
int8 10 # 10
int8 20 # 10 20
add # (10 + 20)
for the following Stack uses/defs for JSOP_ADD:
> * Stack: lval, rval => (lval + rval)
This is done by consuming stack 10, 20 by assigning them to lval, rval, in operands_to_vars function,
and pushing "(lval + rval)" with lval and rval replaced with the value, in expand_expr function.
Also, it uses the atom from operands, like:
00005: string "b" # x "b"
for the following stack uses/defs for JSOP_STRING:
> * Operands: uint32_t atomIndex
> * Stack: => atom
by associating "SOMETHINGIndex" to "SOMETHING"
when the Stack uses/defs are variadic, or needs special handling for readability,
it's expanded to simple uses/defs in expand_stack function.
[make_opcode_doc.py]
separated code that parses Opcodes.h into opcode.py,
and also made it parse more fields (display_name, image),
and convert operands, stack_uses, stack_defs into an array (operands_array, stack_uses_array, stack_defs_array),
and also it now returns the list of opcodes.
Assignee: nobody → arai.unmht
Attachment #8836453 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8838642 -
Flags: review?(luke)
Comment 11•8 years ago
|
||
Do you suppose you could get the person who reviewed the original Opcodes.h docs to review this patch as well? I don't feel I have quite enough context.
Assignee | ||
Comment 12•8 years ago
|
||
Comment on attachment 8838642 [details] [diff] [review]
Add js/src/vm/show_stack.py to show stack transition for given bytecode.
forwarding to Waldo.
feel free to redirect.
Attachment #8838642 -
Flags: review?(luke) → review?(jwalden+bmo)
Comment 13•8 years ago
|
||
Currently we have the BytecodeParser, which for at least generating error messages for JS expressions. The bytecode parser reconstruct the stacks with the offset from which the expression comes from, and has special tags [1] for merging branches. The benefit of re-using the BytecodeParser is to spot error messages issues at the same time.
Do you think it would be possible to output this stack-data as part of the dis() output?
In which case you could convert the current script in a style-check script which validates the consistency between the Macro and the comment.
[1] http://searchfox.org/mozilla-central/source/js/src/jsopcode.cpp#318-319
Flags: needinfo?(arai.unmht)
Assignee | ||
Comment 14•8 years ago
|
||
Comment on attachment 8838642 [details] [diff] [review]
Add js/src/vm/show_stack.py to show stack transition for given bytecode.
(In reply to Nicolas B. Pierron [:nbp] from comment #13)
> Currently we have the BytecodeParser, which for at least generating error
> messages for JS expressions. The bytecode parser reconstruct the stacks
> with the offset from which the expression comes from, and has special tags
> [1] for merging branches. The benefit of re-using the BytecodeParser is to
> spot error messages issues at the same time.
>
> Do you think it would be possible to output this stack-data as part of the
> dis() output?
>
> In which case you could convert the current script in a style-check script
> which validates the consistency between the Macro and the comment.
>
> [1] http://searchfox.org/mozilla-central/source/js/src/jsopcode.cpp#318-319
thanks, looks like I can build this on BytecodeParser+ExpressionDecompiler.
I'll rewrite it.
Flags: needinfo?(arai.unmht)
Attachment #8838642 -
Flags: review?(jwalden+bmo)
Assignee | ||
Comment 15•8 years ago
|
||
Going back to original plan.
it's almost working.
now it's time to optimize some more.
Summary: Add a script to print stack transition in dis() function output. → Print stack transition in dis() function output.
Assignee | ||
Comment 16•8 years ago
|
||
when opcode defined 2 values, BytecodeParser doesn't trace which value it was.
so, the stack transition would have to say "# (A or B) (A or B)" or something like that.
or we need to modify BytecodeParser to track the index of ndefs.
Assignee | ||
Comment 17•8 years ago
|
||
by using struct for Bytecode.offsetStack array element type,
we can add the index for ndefs to track when an opcode defines 2 values.
then, the struct will become 48-64 bits (iiuc).
so we have another bits available.
now I'm thinking about making UnknownOffset and IgnoreOffset an annotation, for
given offset/index,
and display the possible value in stack dump, like following:
js> dis(() => (x && y)())
flags: LAMBDA EXPRESSION_CLOSURE ARROW
loc op
----- --
main:
00000: getgname "x" # x
00005: and 17 (+12) # x
00010: jumptarget # x
00011: pop #
00012: getgname "y" # y
# from and @ 00005
00017: jumptarget # merged<x>
00018: undefined # merged<x> undefined
00019: call 0 # merged<x>(...)
00022: return #
00023: retrval # !!! UNREACHABLE !!!
for UnknownOffset (the merge case), we don't have enough space to store 2 or more offsets/indices,
(unless storing them in separated space and pointing the index there or something similar)
so the above example shows the 1st item of the merge.
it would be better than just showing "<merged>" in stack dump.
Assignee | ||
Comment 18•8 years ago
|
||
maybe displaying all possibilities for the merge case is just overkill.
since the stack dump will become too long (or spans into multiple lines)
Assignee | ||
Comment 19•8 years ago
|
||
enumerating all possibilities will result in recursion while decompiling expression (if we want to test expression decompiler as well there)
I'll go with just using the first item, not overwriting with joining newer value.
Assignee | ||
Comment 20•8 years ago
|
||
wow, just noticed that we're emitting JSOP_RETRVAL for arrow function with expression body, that's always unreachable :P
Assignee | ||
Updated•8 years ago
|
Assignee | ||
Comment 21•8 years ago
|
||
Before applying the main change, I'd like to improve the decompiler output.
Currently unary operator is added outside of parens, like,
!(foo)
+ (foo)
but the operator should be inside the parens, like
(!foo)
(+ foo) // maybe (+foo) ?
also, "void" had no parens.
so also added enclosing parens.
Attachment #8836451 -
Attachment is obsolete: true
Attachment #8838642 -
Attachment is obsolete: true
Attachment #8840324 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 22•8 years ago
|
||
Also, supported several more opcodes.
* JSOP_DELNAME
* JSOP_DELPROP
* JSOP_DELELEM
* JSOP_STRICTDELPROP
* JSOP_STRICTDELELEM
(delete EXPR)
* JSOP_GETXPROP
OBJ.PROP or OBJ[PROP]
* JSOP_FUNAPPLY
FUNC(...) // FUNC here contains ".apply"
* JSOP_SUPERCALL
* JSOP_SPREADSUPERCALL
super(...)
* JSOP_SUPERFUN
super
* JSOP_EVAL
* JSOP_SPREADEVAL
* JSOP_STRICTEVAL
* JSOP_STRICTSPREADEVAL
eval(...)
* JSOP_NEW
* JSOP_SPREADNEW
new FUNC(...)
* JSOP_TYPEOF
* JSOP_TYPEOFEXPR
(typeof EXPR)
Attachment #8840330 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 23•8 years ago
|
||
there are several cases that opcode keeps the value on the stack, with or without removing other values.
those cases should be reflected more to BytecodeParser::simulateOp.
JSOP_CHECKISOBJ was handled in ExpressionDecompiler::decompilePC but it should be handled in BytecodeParser::simulateOp.
Attachment #8840331 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 24•8 years ago
|
||
offsetStack of BytecodeParser::simulateOp should always be non-null.
so the `if (offsetStack)` is not necessary.
Attachment #8840333 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 25•8 years ago
|
||
and here's the main change.
* in dis() output display the following 2 information
* stack dump for each opcode, that shows the stack state after the opcode
just like stack comment in BytecodeEmitter
each stack value is shown as expression decompiler output
(with dedicated mode that displays more internal state)
if the value comes from 2 or more paths, it's displayed like
merged<EXPR>
where EXPR is the first item that reaches the opcode
also, when the value is marked as "ignore", it's displayed like
ignored<EXPR>
* if the opcode is jumped from different place, the opcode, offset, and
the reason of the jump
* changed the element type of BytecodeParser::Bytecode.offsetStack to `struct OffsetAndDefIndex`.
* track the index inside `ndefs` (defIndex), to track the case the opcode defines 2 values
* store ignored case and merged case separately from offset, to use the offset in stack dump
* Added some properties to Bytecode class, only in DEBUG build that supports dis()
* stackDepthAfter/offsetStackAfter
captures the stack after the opcode,
(just like stackDepth/offsetStack that captures the stack before the opcode)
stack dump should show the state after the opcode
* jumpOrigins
holds the offsets of opcode that jumps to this opcode, with the reason of the jump
to annotate the opcode in dis() output
since this is variable length array, I used Vector,
and Bytecode ctor should receive LifoAlloc in DEBUG mode
* added dedicated stack dump mode to BytecodeParser and ExpressionDecompiler
that calculates extra information and also exposes internal state
BytecodeParser
* simulate more opcodes (BytecodeParser::simulateOp) to make stack transition clearer
* records jumps to bytecode (Bytecode.jumpOrigins)
* capture the stack after each opcode (Bytecode.offsetStackAfter)
ExpressionDecompiler
* do not bother with self-hosted code
(that tries to display argument name in normal mode)
* display more clearer name for some opcodes, like
JSOP_NEWARRAY returns "ARRAY" instead of "[]"
* handle merged/ignored cases and generate merged<EXPR> and ignored<EXPR> output
there's one place that I don't understand.
in FindStartPC, the else branch for the following code looks like not used by any caller.
I added MOZ_CRASH there and triggered try run, but no test hit it.
> // If the current PC has fewer values on the stack than the index we are
> // looking for, the blamed value must be one pushed by the current
> // bytecode, so restore *valuepc.
> if (index < size_t(parser.stackDepthAtPC(current)))
> *valuepc = parser.pcForStackOperand(current, index);
> else
> *valuepc = current;
can we just remove the else branch and assert the condition?
Attachment #8840342 -
Flags: review?(nicolas.b.pierron)
Comment 26•8 years ago
|
||
Comment on attachment 8840324 [details] [diff] [review]
Part 1: Put unary operator inside parens in expression decompilation.
Review of attachment 8840324 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit-test/tests/basic/expression-autopsy.js
@@ +89,5 @@
>
> +check("o[(!o)]");
> +check("o[(~o)]");
> +check("o[(+ o)]");
> +check("o[(- o)]");
nit: for sanity reasons, can you add:
check("o[(- (o + 1))]");
Attachment #8840324 -
Flags: review?(nicolas.b.pierron) → review+
Comment 27•8 years ago
|
||
Comment on attachment 8840330 [details] [diff] [review]
Part 2: Decompile more opcodes.
Review of attachment 8840330 [details] [diff] [review]:
-----------------------------------------------------------------
This patch is correct r=me with test cases for these new values, when possible.
If you think the following nits are worth following feel free to apply them as well:
::: js/src/jsopcode.cpp
@@ +1257,5 @@
> case JSOP_GETGNAME:
> case JSOP_GETNAME:
> case JSOP_GETINTRINSIC:
> + if (op == JSOP_DELNAME) {
> + if (!write("(delete "))
nit: I would prefer if this logic is easier to follow, such as:
case JSOP_DELNAME:
return write("( delete") &&
write(loadAtom(pc) &&
write(")");
@@ +1334,3 @@
> }
> + if (op == JSOP_DELPROP || op == JSOP_STRICTDELPROP) {
> + if (!write(")"))
nit: What do you think of:
bool hasDelete = op == JSOP_DELPROP || op == JSOP_STRICTDELPROP;
return (hasDelete ? write("(delete") : true) &&
decompilePCForStackOperand(pc, -1) &&
(IsIdentified(prop)
? write(".") && quote(prop, '\0')
: write("[") && quote(prop, '\'') && write("]")) &&
(hasDelete ? write(")") : true);
@@ +1370,5 @@
> + {
> + return false;
> + }
> + if (op == JSOP_DELELEM || op == JSOP_STRICTDELELEM) {
> + if (!write(")"))
nit: same here.
Attachment #8840330 -
Flags: review?(nicolas.b.pierron) → review+
Comment 28•8 years ago
|
||
Comment on attachment 8840331 [details] [diff] [review]
Part 3: Reflect the case that stack values are kept instead of newly pushed in BytecodeParser::simulateOp.
Review of attachment 8840331 [details] [diff] [review]:
-----------------------------------------------------------------
Nice catch!
Do you think we should remove the default case, and force any new JSOP_Opcode to be defined here as well?
Attachment #8840331 -
Flags: review?(nicolas.b.pierron) → review+
Updated•8 years ago
|
Attachment #8840333 -
Flags: review?(nicolas.b.pierron) → review+
Comment 29•8 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #28)
> Do you think we should remove the default case, and force any new
> JSOP_Opcode to be defined here as well?
To answer my-self, no because some of these instruction are mutating the objects which are given to them, and this could change the way these are display in error messages.
Comment 30•8 years ago
|
||
Comment on attachment 8840342 [details] [diff] [review]
Part 5: Print stack transition in dis() function output.
Review of attachment 8840342 [details] [diff] [review]:
-----------------------------------------------------------------
Really nice work!
This would help everybody working on the front-end for sure, while helping maintaining good error messages.
::: js/src/jsopcode.cpp
@@ +269,5 @@
> +
> + enum : uint8_t {
> + Normal = 0,
> +
> + // Ignored this value in while expression decompilation.
nit: in while -> in the
@@ +364,4 @@
> class Bytecode
> {
> public:
> + Bytecode(BYTECODE_CTOR_PARAMETER)
nit: Adding an unused argument should be fine, especially since it is likely to be inlined anyway. Add an "explicit" keyword.
@@ +1406,5 @@
> + uint32_t depth = parser->stackDepthAfterPC(pc);
> +
> + for (uint32_t i = 0; i < depth; i++) {
> + if (i) {
> + if (sp->put(" ") < 0)
nit: Spaces are already part of some de-compiled expressions, would it makes sense to use ", ", or "; " instead ?
@@ +1414,5 @@
> + const OffsetAndDefIndex& offsetAndDefIndex
> + = parser->offsetForStackOperandAfterPC(script->pcToOffset(pc), i);
> + // This will decompile the stack for the same PC many times.
> + // We'll avoid optimizing it since this is a testing function
> + // and it won't worth managing cached expression here.
nit: it won't *be* worth
@@ +1782,5 @@
> +#ifdef DEBUG
> + // For stack dump, argument name is not necessary.
> + && !isStackDump
> +#endif /* DEBUG */
> + ) {
nit: put the opening brace on a new line.
@@ +1929,5 @@
> write("(...)");
> case JSOP_NEWARRAY:
> +#ifdef DEBUG
> + if (isStackDump)
> + return write("ARRAY");
nit: write("[...]"); ?
Also add a reference to the simulateOp function.
@@ +1983,5 @@
> + if (isStackDump) {
> + // Special decompilation for stack dump.
> + switch (op) {
> + case JSOP_ARGUMENTS:
> + return write("ARGUMENTS");
nit: arguments
@@ +2091,5 @@
> + case JSOP_RESUME:
> + return write("RVAL");
> +
> + case JSOP_SPREADCALLARRAY:
> + return write("ARRAY");
nit: [...] as well?
Attachment #8840342 -
Flags: review?(nicolas.b.pierron) → review+
Comment 31•8 years ago
|
||
(In reply to Tooru Fujisawa [:arai] from comment #25)
> > // If the current PC has fewer values on the stack than the index we are
> > // looking for, the blamed value must be one pushed by the current
> > // bytecode, so restore *valuepc.
> > if (index < size_t(parser.stackDepthAtPC(current)))
> > *valuepc = parser.pcForStackOperand(current, index);
> > else
> > *valuepc = current;
>
> can we just remove the else branch and assert the condition?
Tracking this back through history highlight the following comment:
http://searchfox.org/mozilla-central/rev/5eef25c0c7e49baccf1aed118d5553e99e6b80f7/js/src/jsopcode.cpp#5811
I do not recall seeing temporary stack slots for JSOP_Opcode for a while, so maybe an assertion would be fine.
Assignee | ||
Comment 32•8 years ago
|
||
Thank you for reviewing :D
(In reply to Nicolas B. Pierron [:nbp] from comment #30)
> @@ +1406,5 @@
> > + uint32_t depth = parser->stackDepthAfterPC(pc);
> > +
> > + for (uint32_t i = 0; i < depth; i++) {
> > + if (i) {
> > + if (sp->put(" ") < 0)
>
> nit: Spaces are already part of some de-compiled expressions, would it makes
> sense to use ", ", or "; " instead ?
all spaces are now inside parens (otherwise the expression doesn't interact well with enclosing expression, and it's a bug in decompiler), so it should be clear enough.
and also I'd like to use the same format as stack comment in BytecodeEmitter.
Assignee | ||
Comment 33•8 years ago
|
||
changed JSOP_INITELEM_ARRAY and JSOP_INITELEM_INC to generate "[...]".
Attachment #8840611 -
Flags: review?(nicolas.b.pierron)
Updated•8 years ago
|
Attachment #8840611 -
Flags: review?(nicolas.b.pierron) → review+
Assignee | ||
Comment 34•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/bcb3bb41c374eaa64fd64d651148b59b70acf377
Bug 1322019 - Part 1: Put unary operator inside parens in expression decompilation. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/20cd9a2ede17439c9a8c87fc7c6542ff93945a24
Bug 1322019 - Part 2: Decompile more opcodes. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/c905752969fb007959730ae170449dc1e97cd62d
Bug 1322019 - Part 3: Reflect the case that stack values are kept instead of newly pushed in BytecodeParser::simulateOp. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/445ad080f55cd1f6a3dd99a6656645e97f45b748
Bug 1322019 - Part 4: Remove unnecessary if in BytecodeParser::simulateOp. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/2c034824894414f8c69f124588b6d9e4b655c291
Bug 1322019 - Part 5: Print stack transition in dis() function output. r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/5e464ee0fb565d7aa21536fac6a880a8c3264db0
Bug 1322019 - Part 6: Decompole NEWARRAY+INITELEM_ARRAY/INITELEM_INC to [...]. r=nbp
Assignee | ||
Comment 35•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/6f3c80a94aabb00c6dbe71490af990b7b0c28d9c
Bug 1322019 - followup: Change a testcase for decompilation to follow the change. r=bustage
Comment 36•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/bcb3bb41c374
https://hg.mozilla.org/mozilla-central/rev/20cd9a2ede17
https://hg.mozilla.org/mozilla-central/rev/c905752969fb
https://hg.mozilla.org/mozilla-central/rev/445ad080f55c
https://hg.mozilla.org/mozilla-central/rev/2c0348248944
https://hg.mozilla.org/mozilla-central/rev/5e464ee0fb56
https://hg.mozilla.org/mozilla-central/rev/6f3c80a94aab
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
You need to log in
before you can comment on or make changes to this bug.
Description
•