Closed Bug 691788 Opened 13 years ago Closed 13 years ago

Add type behavior info to PCCOUNTS

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 10 obsolete files)

Right now the script PCCOUNTS can be used to track information about execution in both a dynamic and static sense (e.g. dynamic: where are the hotspots, static: which code is ever executed). Additional metrics have been added to estimate the quality of the jitcode generated. We should add additional metrics about the types that show up during execution. This is needed to better understand and analyze TI's behavior for a writeup, and it should be useful to supplement the existing data when analyzing performance faults (more of the 'why' rather than the 'what' for the quality of generated jitcode). Mainly: - For each opcode, what are the type distributions of the pushed and/or popped values, in terms of both inferred and observed types. - Counts of property accesses and calls that require type barriers. - Counts of accesses on packed arrays and definite properties. - Counts of accesses on missing properties and integer arithmetic which overflows.
Attached patch WIP (obsolete) (deleted) — Splinter Review
Tracks information about observed types at name and property accesses, whether those accesses were inferred as monomorphic or polymorphic, and whether property accesses and calls have type barriers. Also updates pccounts output to be both human and machine readable. Still needs tracking for type behavior of arithmetic and characteristics of element and property accesses.
What is ARITH_UNKNOWN and ARITH_OTHER intended for?
What I'd intended for the ARITH ones: ARITH_INT: known int inputs, int output ARITH_OVERFLOW: known int inputs, double output ARITH_DOUBLE: known double inputs, double output ARITH_OTHER: known inputs of other types ARITH_UNKNOWN: polymorphic inputs Since this is just about the known types it saves needing to do dynamic testing to figure out how to bucket an access.
Attached patch WIP v2 (obsolete) (deleted) — Splinter Review
Counts arithmetic and elem accesses, still needs to count the type of prop accesses
Attachment #564722 - Attachment is obsolete: true
Attached patch Prop access patch (obsolete) (deleted) — Splinter Review
Couldn't figure out a way to instrument prop accesses without editing the IC stubs themselves. This patch increments the following counters for {GET,CALL}PROP under the following conditions: PROP_DEFINITE: is definite prop, baked in slot access PROP_SHAPE: shape-guarded IC hit, _including_ the 'warming' up hits to ic::GetProp/ic::CallProp stubs PROP_UNKNOWN: hits to stubs after IC has been disabled Brian, I'm not sure my understanding of IC in JM is correct or if the above is the right way to go. If not, please rewrite.
Looks good, we should be able to start getting measurements off benchmarks and websites now.
Attached patch WIP v3 (obsolete) (deleted) — Splinter Review
Updated patch. - Replace saving/restoring the counter reg using frame.addressOfTop() with masm.push/pop - Fix offset calculation in IC stubs
Attachment #565429 - Attachment is obsolete: true
Attachment #565456 - Attachment is obsolete: true
Each Firefox session dumps to its own pccounts dump file.
Both patches are diffed against m-c 78609:46a6d0fd13d5
Attached patch WIP v3.1 (obsolete) (deleted) — Splinter Review
Fix bug in arg inc/dec opcodes
Attachment #566777 - Attachment is obsolete: true
Attached patch WIP v3.1 (obsolete) (deleted) — Splinter Review
Apparently I don't know how to qref...
Attachment #567475 - Attachment is obsolete: true
Attached patch scraper (obsolete) (deleted) — Splinter Review
Script to scrape type information from a pccounts log and collate the info at a domain granularity.
Attached patch revised (obsolete) (deleted) — Splinter Review
Fix a few codegen bugs.
Attachment #567703 - Attachment is obsolete: true
Attached patch scraper (deleted) — Splinter Review
Update scraper to report ratios among each category for adding to tables.
Attachment #567755 - Attachment is obsolete: true
Attached patch patch (8aeb207c9a2f) (obsolete) (deleted) — Splinter Review
Updated patch, restructures PC count storage to use less memory and make it easier to track more refined counters for particular opcodes. As a followup I want to add an about:jitperf or something to expose this information and allow people to better understand the JIT's behavior and identify performance faults.
Attachment #567813 - Attachment is obsolete: true
Attachment #571171 - Flags: review?(sphink)
Comment on attachment 571171 [details] [diff] [review] patch (8aeb207c9a2f) Review of attachment 571171 [details] [diff] [review]: ----------------------------------------------------------------- This looks like really useful stuff. As for an about:jitperf, David Flanagan (djf) wrote CoverMonkey, a great command-line tool using the previous output format. Cedric Vivier wrote an addon that borrows the CoverMonkey analysis to give a live view of what code is executing. Cedric, I don't remember the name or details of your tool, and I couldn't dig up a link. Do you have something you could show Brian & Shu? It's well beyond any addon I'd be able to write for this stuff; I really wish we could've recorded the demo at the all-hands. This change will totally break both of their tools. I don't think that should hold up landing this patch, but I'd like to get their views on how to expose the information they were using. djf - in particular, this patch replaces the -D output with a more direct dump of the pccounts information, and in non-DEBUG builds it won't show opcode names, just the numeric opcode values. (Previously, it wouldn't work at all in a non-DEBUG build, so that's not so bad.) It also loses the various offsets that you were using to do your reachability analysis, so you'll probably end up needing both dumps (the decompilation + pccounts.) It'd probably be easiest to put the subset of the counters you care about back into the decompilation output. CoverMonkey - http://www.davidflanagan.com/2011/08/covermonkey-cod.html ::: js/src/jsinfer.cpp @@ +2308,5 @@ > void > ScriptAnalysis::addTypeBarrier(JSContext *cx, const jsbytecode *pc, TypeSet *target, Type type) > { > + target->addType(cx, type); > + return; I'm guessing this wasn't an intended part of this patch... :) ::: js/src/jsopcode.cpp @@ +370,5 @@ > + return countAccessNames[which - BASE_LAST - 1]; > + if (elementOp(op)) > + return countElementNames[which - ACCESS_LAST - 1]; > + if (propertyOp(op)) > + return countPropertyNames[which - ACCESS_LAST - 1]; nit: could this be either [which - (BASE_LAST + 1)] or [which - BASE_COUNT] (or a similar name?) Took me a second to figure out where the 1-bias was coming from. @@ +394,5 @@ > + > + int len = js_CodeSpec[op].length; > + jsbytecode *next = (len != -1) ? pc + len : pc + js_GetVariableBytecodeLength(pc); > + > + Sprint(sp, "%05u %5u", pc - script->code, JS_PCToLineNumber(cx, script, pc)); JS_PCToLineNumber is a linear scan, which makes the whole thing quadratic. If we ever need to care about performance, we should switch to a shared op iterator. Doesn't seem worth fixing right now, though. @@ -306,5 @@ > uintN len; > > SprintCString(sp, "loc "); > - if (script->pcCounters) > - Sprint(sp, "counts%*s x ", COUNTS_LEN - strlen("counts"), ""); djf, cedricv: note that this eliminates the output format you were using and moves it to a separate counts-only dump. Would you rather we left at least the 1st 3 counts here? (interpreter, trace jit, method jit) They'll soon become 2 with the removal of the tracer. ::: js/src/jsscript.h @@ -543,3 @@ > #define JS_SCRIPT_INLINE_DATA_LIMIT 4 > uint8 inlineData[JS_SCRIPT_INLINE_DATA_LIMIT]; > -#endif What's this? ::: js/src/jstracer.cpp @@ -7219,5 @@ > - if (!wasInImacro && cx->hasRunOption(JSOPTION_PCCOUNT)) { > - JSScript *script = cx->fp()->script(); > - if (script->pcCounters) { > - int offset = cx->regs().pc - script->code; > - LIns *pcCounter_addr_ins = w.nameImmpNonGC(&script->pcCounters.get(JSPCCounters::TRACEJIT, offset)); If you're removing this, then I'd rather you removed the tjit slot in the counts at the same time. ::: js/src/methodjit/Compiler.cpp @@ -2803,5 @@ > } > } > } > > - if (script->pcCounters || pcLengths) { I need pcLengths to be filled in independently of pcCounters, for JIT code registration. @@ +2950,5 @@ > + types::TypeFlags flags = types->baseFlags(); > + bool objects = !!(flags & types::TYPE_FLAG_ANYOBJECT) || !!types->getObjectCount(); > + > + if (objects && !!(flags & types::TYPE_FLAG_STRING)) > + return false; nit: I ignored another !! earlier, but these are just too noisy. (I assume these are from shu?) A boolean operator is enough to coerce both sides, and the main reason for using 'bool' in place of an 'int' is to eliminate the need for these sorts of things in assignments. @@ +2961,5 @@ > +} > + > +void > +mjit::Compiler::updatePCTypes(jsbytecode *pc, FrameEntry *fe) > +{ JS_ASSERT(script->pcCounts); @@ +2971,5 @@ > + if (frame.peekTypeInRegister(fe) && reg == frame.tempRegForType(fe)) { > + JS_STATIC_ASSERT(Registers::ReturnReg != Registers::ArgReg1); > + reg = Registers::ArgReg1; > + } > + masm.push(reg); Is there a way to know if this register is live here, to avoid the push/pop? I suppose it's not worth bothering about. @@ +2990,5 @@ > + case JSVAL_TYPE_OBJECT: counter = OpcodeCounts::ACCESS_OBJECT; break; > + default:; > + } > + if (counter) > + masm.bumpCounter(&counts.get(counter), reg); counter can't be zero or otherwise false here, can it? @@ +3051,5 @@ > + > +void > +mjit::Compiler::updateArithCounters(jsbytecode *pc, FrameEntry *fe, > + JSValueType firstUseType, JSValueType secondUseType) > +{ JS_ASSERT(script->pcCounters)
Attachment #571171 - Flags: review?(sphink) → review+
Cedric, David -- see top of comment 16.
Steve, Thanks for the heads up! Am I understanding correctly that all the same information is being collected, but that the -D dump format is no longer reporting it all? Or has the information collected changed, too? If it is just the reporting, could we keep the current format for -Dcoverage and switch to the new format with -Dtypes or something like that? I remember someone (forget who) saying that in the future the PCCOUNTS info would be available through the debugging API. That sounds like a good long term solution. I'd rather not chase an evolving textual dump format now and then switch to the debug API later, if we could keep the current dump format (even just in debug builds) as an option. But if necessary, I can probably adapt CoverMonkey to use the new format. Having to sync up a counts report with a separate disassembly sounds sort of tricky, but I haven't looked at any of this code in a couple of months, so I don't really know.
Attached file sample output (obsolete) (deleted) —
Sample dump output with the patch. Each line looks like: 00045 9 getlocal [interp: 113, mjit: 888, mjit_code: 5328, infer_poly: 888, observe_int32: 888] In order, this is the op's bytecode offset, the line number of the op, the op name, and all the counts accumulated for the op. David, does this have all the information you need? If so, I would really rather switch to this format outright --- it adds a lot of complexity to support multiple output formats, and if all the information you need is available here then it should be straightforward to update CoverMonkey. I don't expect the dump format to change after this. This format is designed so that new counts can be added while still being both human- and machine-readable. Short term, I want to expose this information to addons, for the above mentioned about:jitperf. Cedric, it would be great if I could get a look at / get the code for your addon.
Brian, I love the named fields that hold the counts and other data. If you change the square brackets to curly braces, then its JSON and becomes super-easy to parse from JS and other scripting langauges. One of the things that CoverMonkey does is a reachability analysis to identify dead code. The opcode name doesn't do me any good without a full disassembly that includes jump offsets. So with your output format, I'd have to take the bytecode offset and go look at the actual bytecodes and deal with those directly. I could do that, but it would mean that I can't just pipe the -D output to a file and then analyze it in a separate pass. I'd have to add my own flag to spidermonkey that sets -D but adds in the disassembly. Or, CoverMonkey would have to re-compile the JavaScript source to regenerate the opcodes. I'm guessing that you include the opcode name because it is useful information, but that you include only the opcode name so that you still have a readable output with fixed column widths.... How about leaving your format as is, except that you also stick another full copy of the disassembly at the end of each line, after all the counts. Switches give multiline disassembly for all their offsets in the current -D format, and I don't need that. And the current -D includes source code for anonymous functions and I certainly don't need that! Or, if you don't want to do that by default, could you do a -Dd "dump with disassembly" option that includes the extra info? Actually, instead of putting the disassembly at the end of the line, you could even include it inside the [] (or {}) along with the count info. {..., op: "goto 19"} While I'm raving about JSON, would you consdier puting the pc and linenum values inside the curly braces too and making each line an JSON object? Then each script could be a JSON array and the entire dump could be another JSON array of scripts. If you format it carefully, you could probably still get pretty good human readability of the raw output, and you would certainly make it easier to write scripts that analyze the output
(In reply to David Flanagan from comment #18) > Steve, > > Thanks for the heads up! Am I understanding correctly that all the same > information is being collected, but that the -D dump format is no longer > reporting it all? Or has the information collected changed, too? There is additional information being collected, but all the old information is still present. The -D dump format is now very different and tailored specifically to reporting the counts. It does *not* have enough information for you to do your analysis. > If it is just the reporting, could we keep the current format for -Dcoverage > and switch to the new format with -Dtypes or something like that? I'm not sure of the exact syntax, but I'm leaning towards doing exactly that. > I remember someone (forget who) saying that in the future the PCCOUNTS info > would be available through the debugging API. That sounds like a good long > term solution. I'd rather not chase an evolving textual dump format now and > then switch to the debug API later, if we could keep the current dump format > (even just in debug builds) as an option. Yes, I told you that. There's a now-bitrotted patch in bug 671602 where jorendorff did exactly that, but the debug API has changed a fair bit since then, and it only exposed the raw counts. You need additional information. (In reply to Brian Hackett from comment #19) > In order, this is the op's bytecode offset, the line number of the op, the > op name, and all the counts accumulated for the op. David, does this have > all the information you need? If so, I would really rather switch to this > format outright --- it adds a lot of complexity to support multiple output > formats, and if all the information you need is available here then it > should be straightforward to update CoverMonkey. No, it doesn't have enough information. David does a reachability analysis that requires the offsets from various branching instructions. I just attempted to explain exactly what he's doing and why, but I think I'd better let him do it. :) > I don't expect the dump format to change after this. This format is > designed so that new counts can be added while still being both human- and > machine-readable. I think we're talking about two separate use cases, though. David is (ok, was) doing code coverage tools, for which really only the interp/TM/JM (or just the sum of them) counts are relevant, plus line numbers and control flow graph information. Not that incorporating the additional counts wouldn't be cool, but it feels like if you want to do that it would be better to use both sets of outputs. Brian, any objections to keeping just the basic counts in the decompiled output? I can put a patch to re-add them in a separate bug, if you want. > Short term, I want to expose this information to addons, for the above > mentioned about:jitperf. Cedric, it would be great if I could get a look at > / get the code for your addon. Cedric really wanted control flow tracing for his addon instead of op or basic block counts, but his current addon is probably pretty close to being useful with your new info.
Whoops, collided. Listen to David, not me.
(In reply to Brian Hackett from comment #19) > Short term, I want to expose this information to addons, for the above > mentioned about:jitperf. Cedric, it would be great if I could get a look at > / get the code for your addon. Brian, The CoverMonkey tool is at https://github.com/davidflanagan/CoverMonkey It does all the analysis of the -D output that is created by a spidermonkey on exit hook. Cedric's add-on is at: https://github.com/neonux/CodeInspector It uses CoverMonkey for coverage and # of execution analysis. The trick with the add on is that it updates live, rather than waiting for -D dumps when the program exits. So it required a patched version of Firefox to get those live dumps. I forget whether that patch is now in the tree or not.
Attached patch revised patch (8aeb207c9a2f) (deleted) — Splinter Review
The JSON idea sounds really good. This patch changes the output to interleave JSON for the counts with the standard script disassembly (nothing is printed in release builds).
Assignee: general → bhackett1024
Attachment #571171 - Attachment is obsolete: true
Attached file sample output (deleted) —
Attachment #571655 - Attachment is obsolete: true
Brian, Great! Assuming that you do something reasonable with the disassembly for switch statements (like removing newlines so the interleaving strictly alternates lines) then I should quite easily be able to update CoverMonkey to use this new format. David
Opcodes with multiple lines of disassembly will be fully printed before the count information for the op: 00012: 2 tableswitch defaultOffset 33 low 0 high 3 0: 15 1: 15 2: 15 3: 24 {"interp": 1} I'll check this in after the next aurora branch next week.
(Sorry for late replies, I guess this bug got lost in my bugmail. Don't hesitate to ping me on irc if I don't reply swiftly [cedricv]) (In reply to Steve Fink [:sfink] from comment #16) > Comment on attachment 571171 [details] [diff] [review] [diff] [details] [review] > patch (8aeb207c9a2f) > > I couldn't dig up a link. Do you have something you could show Brian & Shu? > It's well beyond any addon I'd be able to write for this stuff; I really > wish we could've recorded the demo at the all-hands. Actually there is one video demo about it :) http://neonux.com/tmp/live-coverage-allhands-demo.ogv Other than that, David posted the links to get the code running (with patch attached to the bug above) (In reply to David Flanagan from comment #23) > So it required a patched version of Firefox to get > those live dumps. I forget whether that patch is now in the tree or not. The patch is attached to https://bugzilla.mozilla.org/show_bug.cgi?id=687134 It's not in the tree, and is not in a state to be in tree as it is now. Moreover clean up and better API, we'll likely also need an API to start/stop the counters. -- I really like the JSON idea as well. Otoh could we make the format less tied to the opcodes (by definition very Spidermonkey-specific - maybe even version-specific) and just expose line and column info [1] ? [1]: looks pretty close to land, https://bugzilla.mozilla.org/show_bug.cgi?id=568142
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla11
Blocks: 771118
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: