Closed
Bug 606629
Opened 14 years ago
Closed 13 years ago
JM: Fine-grained profiling
Categories
(Core :: JavaScript Engine, enhancement)
Tracking
()
RESOLVED
FIXED
People
(Reporter: sfink, Assigned: sfink)
References
Details
(Whiteboard: [fixed-in-tracemonkey])
Attachments
(4 files, 3 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
dmandelin
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dmandelin
:
review-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
dmandelin
:
review+
|
Details | Diff | Splinter Review |
In trying to track down a performance problem, I added some expensive profiling options to JM.
One increments a counter for every PC accesses, and then dumps out the bytecode stream with call counts. (It would be much simpler to turn off JITting and do this directly in the interpreter, but I was experimenting.)
Another maintains a per-script histogram of opcode execution counts for mjitted code. This could be useful for figuring out what sorts of things the mjit does badly with.
What I was trying to work up to, but didn't quite make it yet, was tracking all of the various slow(er) paths through JM code. For example, it would be interesting to track how many times each guard is hit, and even more interesting to aggregate those into types of guards.
The attached patch is just a proof of concept. It leaks memory, the dump trigger is all wrong, etc. But be gentle with me; it's my first time doing any masm code.
It also depends on my earlier comments patch, but I think it's just to demarcate my hacked-in profile incrementers in the jaegerspew output. It should be trivial to rip those 2 lines part out and apply to the current tree.
Comment 1•14 years ago
|
||
I started bug 587963 for JM profiling. Unfortunately I haven't got very far, my nascent patch isn't even worth posting here. The plan was for either per-function or per-basic-block profiling.
Assignee | ||
Comment 2•14 years ago
|
||
dmandelin: feel free to suggest another reviewer if someone else would be more appropriate.
I cleaned things up a bit. One thing that is perhaps uglier than it ought to be is that because I am annotating the disassembly output, and disassembly creates GC-able strings and things, I can't just kick off the profiling dump in JS_ShutDown or mjit::ReleaseScriptCode. So I have the shell invoking them before it's torn down too much stuff. This has a major drawback that it only works for the shell, not for the browser. It's also kind of messy.
The JSOp histogram could be done at shutdown since it doesn't rely on the disassembly, but it's also of questionable utility so I didn't bother. (I threw it in because I thought it might be interesting to correlate benchmark values with JSOp counts.)
Alternative approaches:
1. Stop using the disassembly and just dump out offset,count pairs that could be combined manually (or with a script) with the other JMFLAGS output to figure out what goes with what.
2. Hook into JSContext teardown or compartment teardown and dump out all scripts there. I attempted this, but I still couldn't manage to hit it when it was in the right state.
3. Recreate enough of the world to allow dumping, then tear it back down.
4. Rewrite disassembly to use only private memory.
5. Leave as-is. If someone wants this for the browser, they'll have to add calls at the appropriate times.
6. Dump out a profile whenever a toplevel Execute() ends, and zero out the counters.
One additional problem with the current state is that JMFLAGS=help lies if you aren't using the JS shell, because it claims that these additional profile options are available when they aren't, really.
njn: I ended up not doing anything additional for basic block profiling or function call profiling. That's mostly because I realized that the disassembly version already contains the filename *and line numbers*, so it's pretty easy to map them back to the source code. I added a header line to the disassembly to make it easier to see what's what.
So function entry counts == number of times the 1st instruction after "main:" is executed, and basic block counts == the count at the 1st instruction of each basic block. I lose the information on which instructions begin a basic block (opinfo.safePoint as per dvander via njn), and the profiling is more expensive than it could be if only basic blocks incremented the counters, so more work might make sense, but I don't know if the incremental win is worth it.
I'll wait for any comments before doing any more work on this.
Assignee: general → sphink
Attachment #485450 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #486523 -
Flags: review?(dmandelin)
Attachment #486523 -
Flags: feedback?(nnethercote)
Assignee | ||
Comment 3•14 years ago
|
||
Comment 4•14 years ago
|
||
Before I review this: We already have code for getting op histograms. There is the opmeter stuff in there, which does exactly that; and the reprmeter stuff, which does a histogram of (op, arg1_type, ..., argk_type). So that part seems redundant.
The PC profiling is a good idea.
Assignee | ||
Comment 5•14 years ago
|
||
Sounds like a very valid r- reason. This patch rips out the histogram. I was ambivalent about it anyway. We'll have to talk sometime about what types of expensive profiling would be helpful (eg PIC hit counts? slow paths?).
Attachment #486523 -
Attachment is obsolete: true
Attachment #486700 -
Flags: review?(dmandelin)
Attachment #486700 -
Flags: feedback?(nnethercote)
Attachment #486523 -
Flags: review?(dmandelin)
Attachment #486523 -
Flags: feedback?(nnethercote)
Comment 6•14 years ago
|
||
Comment on attachment 486700 [details] [diff] [review]
JaegerSpew PC profiling option
Basically looks good. I just have some nits.
> /* If pc != NULL, includes a prefix indicating whether the PC is at the current line. */
Please document what the |counts| parameter means here.
> JS_FRIEND_API(JSBool)
>-js_DisassembleAtPC(JSContext *cx, JSScript *script, JSBool lines, FILE *fp, jsbytecode *pc)
>+js_DisassembleAtPC(JSContext *cx, JSScript *script, JSBool lines, FILE *fp, jsbytecode *pc, int* counts)
> {
> jsbytecode *next, *end;
> uintN len;
Now time for the column header bikeshedding. I think "freqncy" should be "count" (seems more apt), and "offst" should be "off" (more idiomatic to SpiderMonkey code). Maybe "JSOp" should be "op", but I'm neutral on that.
>+ if (counts)
>+ fputs("freqncy x ", fp);
>+ fputs("offst ", fp);
>+ if (lines)
>+ fputs("line", fp);
>+ fputs(" JSOp\n", fp);
>+ if (counts)
>+ fputs("-------- ", fp);
>+ fputs("----- ", fp);
> #ifdef JS_METHODJIT_SPEW
>+#ifdef DEBUG
#if defined(DEBUG) && defined(JS_METHODJIT_SPEW) seems better to me.
>+ if (IsJaegerSpewChannelActive(JSpew_PCProf)) {
>+ pcProfile = (int *)cx->malloc(sizeof(int) * script->length);
Put an OOM check here. I know it's not that important for profiling code but we should maintain good habits there.
>+ memset(pcProfile, 0, script->length * sizeof(int));
>+ }
>+#endif
>+
>--- a/js/src/methodjit/MethodJIT.cpp
>+++ b/js/src/methodjit/MethodJIT.cpp
>@@ -833,23 +833,31 @@ mjit::JITScript::release()
> void
> mjit::ReleaseScriptCode(JSContext *cx, JSScript *script)
> {
> // NB: The recompiler may call ReleaseScriptCode, in which case it
> // will get called again when the script is destroyed, so we
> // must protect against calling ReleaseScriptCode twice.
>
> if (script->jitNormal) {
>+#ifdef DEBUG
>+ if (script->jitNormal->pcProfile)
>+ cx->free(script->jitNormal->pcProfile);
I find that debugging is easier if we zero out members after freeing them.
>+#endif
> script->jitNormal->release();
> script->jitArityCheckNormal = NULL;
> cx->free(script->jitNormal);
> script->jitNormal = NULL;
> }
>
> if (script->jitCtor) {
>+#ifdef DEBUG
>+ if (script->jitCtor->pcProfile)
>+ cx->free(script->jitCtor->pcProfile);
Zeroing again.
>+#endif
> script->jitCtor->release();
> script->jitArityCheckCtor = NULL;
> cx->free(script->jitCtor);
> script->jitCtor = NULL;
> }
> }
>diff --git a/js/src/methodjit/MethodJIT.h b/js/src/methodjit/MethodJIT.h
>--- a/js/src/methodjit/MethodJIT.h
>+++ b/js/src/methodjit/MethodJIT.h
>@@ -222,16 +222,20 @@ struct JITScript {
> char *jcheck = (char *)ptr;
> return jcheck >= jitcode && jcheck < jitcode + code.m_size;
> }
>
> void sweepCallICs();
> void purgeMICs();
> void purgePICs();
> void release();
>+
>+#ifdef DEBUG
>+ int *pcProfile;
Please comment about what this is. In particular, what the length is, and what pcProfile[i] means.
For non-debug code, I would also ask for this to be replaced by a C++ class that either knows the length, or requires you to pass in an argument that gives the length. That might be overkill for relatively simple debug/profiling code, though.
>+#endif
I forgot this part above:
>+ masm.move(ImmPtr(&pcProfile[0]), r1);
I think simply |pcProfile| is more idiomatic in SpiderMonkey. (Aside: ptr arithmetic also seems like the more usual style if you want a pointer to element k, although I personally like to see the array syntax there.)
Attachment #486700 -
Flags: review?(dmandelin)
Assignee | ||
Comment 7•14 years ago
|
||
>> #ifdef JS_METHODJIT_SPEW
>>+#ifdef DEBUG
>
>#if defined(DEBUG) && defined(JS_METHODJIT_SPEW) seems better to me.
The body of the #ifdef JS_METHODJIT_SPEW was actually wider than the inner #ifdef DEBUG, but I changed it to two separate sections and it's better now. Ok, you're right.
>>+#ifdef DEBUG
>>+ if (script->jitNormal->pcProfile)
>>+ cx->free(script->jitNormal->pcProfile);
>>+#endif
>> script->jitNormal->release();
>> script->jitArityCheckNormal = NULL;
>> cx->free(script->jitNormal);
>> script->jitNormal = NULL;
>
>I find that debugging is easier if we zero out members after freeing them.
I had that originally, but took it out because script->jitNormal gets NULLed a couple of lines later. Oh well. I put it back.
>>+ masm.move(ImmPtr(&pcProfile[0]), r1);
>
>I think simply |pcProfile| is more idiomatic in SpiderMonkey. (Aside: ptr
>arithmetic also seems like the more usual style if you want a pointer to
>element k, although I personally like to see the array syntax there.)
My personal habit is to use &foo[0] as documentation -- foo is meant to be an array of values, not a pointer to a single value, and more specifically we're doing something with an element of an array. And also it makes me nervous that C is implicitly converting arrays to pointers for me (it's a good thing, but I like the minimal documentation of it. Not that this is relevant in this case.)
But I'd rather conform to house style. Changed.
Thanks!
Attachment #486700 -
Attachment is obsolete: true
Attachment #486797 -
Flags: review?(dmandelin)
Attachment #486700 -
Flags: feedback?(nnethercote)
Comment 8•14 years ago
|
||
(In reply to comment #7)
> >I find that debugging is easier if we zero out members after freeing them.
>
> I had that originally, but took it out because script->jitNormal gets NULLed a
> couple of lines later. Oh well. I put it back.
Well, then it was OK to begin with--it shouldn't matter if you can't reach that field anymore.
> >>+ masm.move(ImmPtr(&pcProfile[0]), r1);
> >
> >I think simply |pcProfile| is more idiomatic in SpiderMonkey. (Aside: ptr
> >arithmetic also seems like the more usual style if you want a pointer to
> >element k, although I personally like to see the array syntax there.)
>
> My personal habit is to use &foo[0] as documentation -- foo is meant to be an
> array of values, not a pointer to a single value, and more specifically we're
> doing something with an element of an array. And also it makes me nervous that
> C is implicitly converting arrays to pointers for me (it's a good thing, but I
> like the minimal documentation of it. Not that this is relevant in this case.)
>
> But I'd rather conform to house style. Changed.
And, I was wrong about that, too--I rechecked and grep gave me a few dozen examples of &foo[0].
Anyway, I'm about to r+ your patch. Feel free to change these two things back to the way you like.
Updated•14 years ago
|
Attachment #486797 -
Flags: review?(dmandelin) → review+
Assignee | ||
Comment 9•14 years ago
|
||
Whoops, I was too slow. I'll leave it alone.
To the checkin-needed angel: a=NPOTB
Keywords: checkin-needed
Assignee | ||
Comment 10•14 years ago
|
||
Ugh. There's a problem with this patch: it changes the jsopcode.h API for the (DEBUG-only) symbols js_Disassemble and js_Disassemble1. Specifically, I added the extra 'counts' parameter with a default of NULL:
extern JS_FRIEND_API(JSBool)
js_Disassemble(JSContext *cx, JSScript *script, JSBool lines, FILE *fp, int* counts = NULL);
I figured this was safe because the old API is still present, but this header gets compiled as C code by jsd and therefore errors out during the compile. (I'd tested with release builds of the whole tree, and debug builds of just js/src, so I missed this earlier.)
I'll attach two working alternatives.
Keywords: checkin-needed
Assignee | ||
Comment 11•14 years ago
|
||
This patch adds js_DisassembleWithCounts and js_Disassemble1WithCounts to avoid mucking up the C API. It's kind of ugly and messy, but is the safest way to avoid breaking any callers.
I'm marking both patches for review, so you can r- (at least) one of them.
Attachment #489881 -
Flags: review?(dmandelin)
Assignee | ||
Comment 12•14 years ago
|
||
Alternatively, since this API is (1) DEBUG-only and (2) not called from anything in our tree when compiled as C code, we could just say that it's C++ only and be done with it.
For reference, there are a handful of js_Disassemble callers within js/src, all of them C++, and no callers outside. js/jsd/ includes the header in C source code, but contains no js_Disassemble* calls.
Opinions?
Attachment #489885 -
Flags: review?(dmandelin)
Comment 13•14 years ago
|
||
Comment on attachment 489885 [details] [diff] [review]
Restrict js_Disassemble to C++ code
It's an internal API, so I say let it be C++.
Attachment #489885 -
Flags: review?(dmandelin) → review+
Comment 14•14 years ago
|
||
Comment on attachment 489881 [details] [diff] [review]
Add to the C API of js_Disassemble
r-ing in favor of the other patch.
Attachment #489881 -
Flags: review?(dmandelin) → review-
Comment 15•14 years ago
|
||
The thing about our old C friend APIs is that they are old... and "friendly" (like too friendly).
OTOH if we say "C++ only" I think we ought to have good reason, like it uses RAII with dtors and such.
Just changing an entirely C-flavored friend API to C++ only may break more than jsd, in downstream embeddings. We can do this, for sure, but is it doing real good (e.g., improving the API based on C++'s features that aren't in C)?
/be
Assignee | ||
Comment 16•14 years ago
|
||
Just to be sure we're talking about the same thing -- this change does not make the whole API C++ only. It only makes js_Disassemble and js_Disassemble1 C++-only, which are "internal" already because they're only available when DEBUG is defined.
So for a release build, the API does not change at all, whether used from C or C++.
Does that seem reasonable to you?
Comment 17•14 years ago
|
||
(In reply to comment #15)
> The thing about our old C friend APIs is that they are old... and "friendly"
> (like too friendly).
>
> OTOH if we say "C++ only" I think we ought to have good reason, like it uses
> RAII with dtors and such.
I figured that internal APIs are going toward C++, and that making it nicer for internals hackers with a default argument would be worth it.
> Just changing an entirely C-flavored friend API to C++ only may break more than
> jsd, in downstream embeddings. We can do this, for sure, but is it doing real
> good (e.g., improving the API based on C++'s features that aren't in C)?
The one question I had for myself was, how internal is this API? If it's fully internal, then I would think going to C++ has no cost. But if it's one of those that has leaked out, then of course we have to think differently. I don't know how to tell which is which, but I figured if someone complained, we could make it usable from C again. Is there a good way to know what's *really* internal?
Assignee | ||
Comment 18•14 years ago
|
||
Comment on attachment 489885 [details] [diff] [review]
Restrict js_Disassemble to C++ code
Need to resolve the API question before landing, so pestering Brendan with sr?
Attachment #489885 -
Flags: superreview?(brendan)
Comment 19•14 years ago
|
||
Comment on attachment 489885 [details] [diff] [review]
Restrict js_Disassemble to C++ code
Sorry, was not looking for sr?me -- you don't need SR in js/src.
/be
Attachment #489885 -
Flags: superreview?(brendan)
Assignee | ||
Comment 20•14 years ago
|
||
Comment on attachment 489885 [details] [diff] [review]
Restrict js_Disassemble to C++ code
Ok, sorry. I stumbled across the way things are done here, I guess.
My problem is that I don't want to land this patch until your (brendan's) API concerns are settled.
It's not a high priority thing to land. But I wanted to mark it appropriately so I know when it's landable.
Perhaps there's a better API that would make it more worth changing? As in, a more generic way to pass in additional information that would eliminate the need to change the API when adding things like 'counts'?
Attachment #489885 -
Flags: review?(brendan.eich)
Assignee | ||
Comment 21•14 years ago
|
||
Comment on attachment 489885 [details] [diff] [review]
Restrict js_Disassemble to C++ code
Oops, wrong address.
Attachment #489885 -
Flags: review?(brendan.eich) → review?(brendan)
Comment 22•14 years ago
|
||
Comment on attachment 489885 [details] [diff] [review]
Restrict js_Disassemble to C++ code
I don't care enough to cause further change -- dmandelin's r+ is enough, any C downstreams that use js_Disassemble will have to take the pain.
/be
Attachment #489885 -
Flags: review?(brendan)
Assignee | ||
Comment 23•13 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/b6c3be49412b
Note that after all that, I intend to replace this mechanism with the one from bug 637393, which reverts the API change since I move the necessary information to the JSScript.
Whiteboard: [fixed-in-tracemonkey]
Assignee | ||
Comment 24•13 years ago
|
||
Whiteboard: [fixed-in-tracemonkey]
Assignee | ||
Comment 25•13 years ago
|
||
Whiteboard: [fixed-in-tracemonkey]
Comment 26•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/b6c3be49412b
http://hg.mozilla.org/mozilla-central/rev/5e69cb4c544b
http://hg.mozilla.org/mozilla-central/rev/86da0588d96d
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•