Closed
Bug 503424
Opened 15 years ago
Closed 15 years ago
TM/nanojit: add built-in support for compiled-trace and -guard profiling
Categories
(Core :: JavaScript Engine, defect)
Tracking
RESOLVED
FIXED
 | Tracking | Status
---|---|---
status1.9.2 | --- | beta1-fixed
People
(Reporter: n.nethercote, Assigned: jseward)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 12 obsolete files)
patch (deleted), graydon: review+
If we want to improve nanojit's codegen, it makes sense to focus on the areas where it will make a difference. So it would be useful to know which traces are hot. I don't think we have that facility at the moment.
This bug is a proposal to add a facility whereby each compiled trace has a counter that is incremented each time it is executed. Then at termination it would print out a list (and the code) of the hottest blocks (top 10, top 100, whatever), indicating what proportion of trace executions they account for. Then we can look at improving the codegen for those hot blocks. Valgrind has a facility like this and it has proven very useful.
Comment 1•15 years ago
Me want. It would also be nice to record at least the total time taken (and thus the average) by each trace.
Assignee
Comment 2•15 years ago
+1 for the block counting. The timing stuff (I presume you mean rdtsc adding-up) could be useful too, but it could confuse the issue if we get tied up in delays due to cache misses, which are mostly independent of the code generation quality.
Comment 3•15 years ago
Looks like the right thing to inspect.
Reporter
Comment 4•15 years ago
Some preliminary design notes:
- We need to add a counter for every trace.
- We need a way of incrementing the counter each time a trace is executed. This code can be added (conditionally, of course) in Assembler::gen() at the end of the loop. It's better to increment at the start of a trace rather than the end so that the trace is still counted even if there's an early exit.
- We need a way of iterating through all the traces at the end, to find the hot ones.
- We need a way of printing out LIR and asm for the hot fragments at the end. So these cannot be disposed of along the way. Or perhaps the asm can be disposed of -- if the LIR is kept we could just re-assemble it in order to print out the native code.
Because the LIR and asm code pages tend to get destroyed as we go (I think; I don't really understand the code memory handling very well), it seems best to store all the necessary stuff in a separate table. For each trace, it might contain:
- the counter
- a copy of the LIR
- a copy of the asm (unless reassembling at the end is easy)
- some info indicating where in the original JavaScript it came from, if possible
Making copies of all that code will be slow, but this doesn't have to be fast, and then we wouldn't have to worry about holding onto LIR/asm code that we would normally dispose of (which would uglify our normal code paths).
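A minimal sketch of the bookkeeping table these notes describe, assuming one entry per trace; the type and field names (TraceProfileEntry, execCount, and so on) are invented for illustration and not taken from any patch:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct TraceProfileEntry {
    uint64_t             execCount;   // bumped once per trace entry
    std::vector<uint8_t> lirCopy;     // snapshot of the LIR, kept for later printing
    std::vector<uint8_t> asmCopy;     // snapshot of the native code (optional)
    std::string          jsSourceLoc; // e.g. "file.js:line", if available
};

// keyed by some unique per-trace identifier
static std::map<uint32_t, TraceProfileEntry> traceProfiles;
```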
Comment 5•15 years ago
(In reply to comment #4)
> Some preliminary design notes:
>
> - We need to add a counter for every trace.
>
> - We need a way of incrementing the counter each time a trace is executed.
> This code can be added (conditionally, of course) in Assembler::gen() at the
> end of the loop. It's better to increment at the start of a trace rather than
> the end so that the trace is still counted even if there's an early exit.
I had assumed the generation of the counter would be in jstracer.cpp, but I think your proposal is better as it will then be available to all nanojit users.
> - We need a way of iterating through all the traces at the end, to find the hot
> ones.
>
> - We need a way of printing out LIR and asm for the hot fragments at the end.
> So these cannot be disposed of along the way. Or perhaps the asm can be
> disposed of -- if the LIR is kept we could just re-assemble it in order to
> print out the native code.
I often don't need to look at the LIR/asm--JS source location info is often enough. What I have sometimes wanted and never had is a description of the trace in terms of the JS source lines it goes through. (I see that you already mention that below, too. I believe it is fairly easy because the tracer knows what JS source position it is at when it records each bytecode.) But I'm sure the LIR/asm will be useful sometimes as well.
> Because the LIR and asm code pages tend to get destroyed as we go (I think; I
> don't really understand the code memory handling very well) it seems best to
> store all the necessary stuff in a separate table. For each trace, it might
> contain:
>
> - the counter
> - a copy of the LIR
> - a copy of the asm (unless reassembling at the end is easy)
> - some info indicating where in the original JavaScript it came from, if
> possible
>
> Making copies of all that code will be slow, but this doesn't have to be fast,
> and then we wouldn't have to worry about holding onto LIR/asm code that we
> would normally dispose of (which would uglify our normal code paths).
I think the copying is entirely reasonable. If it does have an excessive perf cost, which I doubt it will, it can be optional.
Reporter
Comment 6•15 years ago
(In reply to comment #5)
> I often don't need to look at the LIR/asm--JS source location info is often
> enough.
I assume you're mostly looking at tracing things -- for those working on NJ code generation, the LIR/asm is very important :)
Assignee
Comment 7•15 years ago
(In reply to comment #4)
Keeping a copy of the LIR and asm around sounds big and complex.
Maybe it's not necessary: all we need to store is an integer key
which uniquely identifies the fragment. When profiling, print
the code etc out in the normal TMFLAGS=blah way, along with the
key. When the profiled code is running, inc counters in the usual
way. When the profiled run finishes, just print the (key, counter)
pairs. We can then look in the logfile to find the blocks. Not
so convenient, but easier to implement.
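For illustration, a rough sketch of this key-plus-counter scheme, assuming a simple map keyed by the fragment's integer id; the function names and the map are hypothetical, and only the idea of cross-referencing the printed id against the TMFLAGS log comes from the comments here:

```cpp
#include <cstdint>
#include <cstdio>
#include <map>

static std::map<uint32_t, uint64_t> fragCounters;

// called each time a fragment is entered (profiling builds only)
void countFragEntry(uint32_t fragKey)
{
    fragCounters[fragKey]++;
}

// called at shutdown: dump (key, counter) pairs; grep the log for the key
void dumpFragCounters()
{
    for (const auto& kv : fragCounters)
        std::printf("FragID=%06u  count=%llu\n",
                    kv.first, (unsigned long long) kv.second);
}
```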
Assignee
Comment 8•15 years ago
(In reply to comment #7)
Am working on a patch to do this. (commenting here so as to check
i'm not duplicating work w/ anyone else -- yell if I am).
Reporter
Comment 9•15 years ago
(In reply to comment #8)
>
> Am working on a patch to do this. (commenting here so as to check
> i'm not duplicating work w/ anyone else -- yell if I am).
That's what the "assigned" field is for :)
Assignee: general → jseward
Assignee
Comment 10•15 years ago
Assignee
Comment 11•15 years ago
(In reply to comment #10, the attachment description):
Patch adds a simple profiler, which counts, for each fragment,
every time the fragment passes the loopEntry point (gal told
me this is the right place to put the counter increment; I have
no idea).
At the end of the run, the top 50 blocks are shown, one line for
each block, including the percentage of total block runs, and
the cumulative statistics too. This makes it very easy to see
which blocks are hot.
This of course only works for debug builds. It is integrated
with and controlled by the existing TMFLAGS= mechanism. To get
profiling, you must do TMFLAGS=fragprofile.
Each printed cost line has a "FragID=XXXXXX" part, where XXXXXX
is a 6-digit number. This makes it possible to find the fragment
in the normal debug printing, because those FragID numbers are
printed there too.
Hence the minimal useful invocation is TMFLAGS=fragprofile,minimal ...
Code from both tracer and regexps is profiled.
Only x86 is supported in this patch, although adding other archs should be
trivial. It's only a matter of implementing Assembler::asm_inc_profctr
to generate the load-inc-store insns.
The implementation is mostly trivial. class Fragment gets two new
fields, profHash (to identify the block) and profCounts, to hold the
count. The only real difficulty is to extract those fields at the
time the Fragment is destroyed. This is difficult because making
Fragment::~Fragment virtual doesn't work, since there are some Fragments
which are really VMFragments, but others within nanojit which are
vanilla Fragments, and hence whose destruction is not observable
outside nanojit. My least-worst solution (in the patch) is to
have, in Fragmento, an optional finalizer callback used when destroying a
Fragment. The profile machinery in jstracer.cpp uses this to harvest
counts any time any Fragment is destroyed.
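A sketch, under assumed types, of the harvest-on-destroy arrangement described above: the TM side registers a finalizer with Fragmento and copies the id and count out before the Fragment disappears. The struct layout and callback signature are illustrative, not the patch's actual ones; only the field names profHash and profCounts are taken from this description.

```cpp
#include <cstdint>
#include <map>

struct Fragment {          // only the two profiling fields matter for this sketch
    uint32_t profHash;     // unique id assigned when the Fragment is created
    uint64_t profCounts;   // bumped at loopEntry by the generated code
};

static std::map<uint32_t, uint64_t> harvested;

// installed as the optional Fragmento finalizer; called whenever a Fragment dies
void fragFinalizer(Fragment* f)
{
    harvested[f->profHash] += f->profCounts;
}
```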
Example output for trace-tests.js follows.
----------------- Per-fragment execution counts ------------------
Total count = 153260112
0: (32.63% 50004998) 32.63% 50004998 FragID=000893
1: (39.15% 60004996) 6.52% 9999998 FragID=005744
2: (45.68% 70004994) 6.52% 9999998 FragID=005746
3: (50.25% 77012463) 4.57% 7007469 FragID=007873
4: (54.77% 83933739) 4.52% 6921276 FragID=007779
5: (59.28% 90855015) 4.52% 6921276 FragID=007795
6: (63.80% 97776291) 4.52% 6921276 FragID=007836
7: (68.31% 104697567) 4.52% 6921276 FragID=007847
8: (72.83% 111618843) 4.52% 6921276 FragID=007859
9: (74.14% 113631756) 1.31% 2012913 FragID=007747
10: (75.46% 115644552) 1.31% 2012796 FragID=007814
11: (76.77% 117657286) 1.31% 2012734 FragID=007816
12: (78.08% 119660304) 1.31% 2003018 FragID=007737
13: (79.38% 121663210) 1.31% 2002906 FragID=007808
14: (80.61% 123536928) 1.22% 1873718 FragID=007727
15: (81.83% 125410583) 1.22% 1873655 FragID=007804
16: (82.95% 127124279) 1.12% 1713696 FragID=007731
17: (84.06% 128837912) 1.12% 1713633 FragID=007806
18: (85.15% 130493991) 1.08% 1656079 FragID=007821
19: (86.23% 132150062) 1.08% 1656071 FragID=007823
20: (87.31% 133805849) 1.08% 1655787 FragID=007757
21: (88.39% 135461605) 1.08% 1655756 FragID=007759
22: (89.33% 136901602) 0.94% 1439997 FragID=007724
23: (90.27% 138341599) 0.94% 1439997 FragID=007768
24: (91.21% 139781596) 0.94% 1439997 FragID=007784
25: (92.15% 141221593) 0.94% 1439997 FragID=007801
26: (93.08% 142661590) 0.94% 1439997 FragID=007828
27: (94.02% 144101587) 0.94% 1439997 FragID=007839
28: (94.35% 144601585) 0.33% 499998 FragID=001431
29: (94.59% 144961583) 0.23% 359998 FragID=007771
30: (94.82% 145321581) 0.23% 359998 FragID=007787
31: (95.06% 145681579) 0.23% 359998 FragID=007831
32: (95.29% 146041577) 0.23% 359998 FragID=007842
33: (95.52% 146401575) 0.23% 359998 FragID=007850
34: (95.76% 146761573) 0.23% 359998 FragID=007864
35: (95.99% 147121570) 0.23% 359997 FragID=007830
36: (96.23% 147481567) 0.23% 359997 FragID=007841
37: (96.44% 147805522) 0.21% 323955 FragID=007775
38: (96.65% 148129477) 0.21% 323955 FragID=007791
39: (96.86% 148453432) 0.21% 323955 FragID=007833
40: (97.08% 148777387) 0.21% 323955 FragID=007844
41: (97.29% 149101342) 0.21% 323955 FragID=007855
42: (97.50% 149425297) 0.21% 323955 FragID=007869
43: (97.71% 149749250) 0.21% 323953 FragID=007835
44: (97.92% 150073203) 0.21% 323953 FragID=007846
45: (98.13% 150397147) 0.21% 323944 FragID=007778
46: (98.34% 150721091) 0.21% 323944 FragID=007794
47: (98.55% 151045035) 0.21% 323944 FragID=007858
48: (98.77% 151368939) 0.21% 323904 FragID=007872
49: (98.88% 151540897) 0.11% 171958 FragID=007803
Assignee
Updated•15 years ago
Attachment #389800 - Flags: review?(gal)
Reporter
Comment 12•15 years ago
Nice! This will be extremely useful.
It gives equal weight to every fragment... that's probably good enough to start with.
Some nits about the output:
> 0: (32.63% 50004998) 32.63% 50004998 FragID=000893
> 1: (39.15% 60004996) 6.52% 9999998 FragID=005744
> 2: (45.68% 70004994) 6.52% 9999998 FragID=005746
> 3: (50.25% 77012463) 4.57% 7007469 FragID=007873
I find it easier to read with the timing columns swapped, i.e. self-time before cumulative time.
Putting a header line giving each column a title would be good. (Then you wouldn't need "FragID=" on every line.)
For the FragID, is the number in exactly the same form as that printed for the "minimal" output? I see stuff like this:
Recording starting from trace-test.js:299@1610
which doesn't look quite the same as a 6-digit FragId. If it's in exactly the same form, it makes it much easier for Vim users to bounce between the table and the actual fragment (in Vim you can use '#' and '*' to do forward/backward searches for the exact word that the cursor is currently on).
Also, you put "gal" for review rather than "gal@uci.edu"... not sure if that'll work?
Comment 13•15 years ago
Yeah, gal@uci.edu is my bugzilla account for historic reasons. :gal is the short-hand.
Random style nit: func(arg, arg), not func( arg, arg ). Also not assert ( this ).
Could we do the ranking with a script, and just output raw data? Or maybe write it to a file? The stats gathering and ranking code looks a bit out of place in jstracer.cpp (IMO).
I don't particularly like the entanglement of tracer logging/verbose bits and nanojit. Can we disentangle this a bit? I wonder if we can just use regular lir to model the counters. It would minimally affect register allocation, but probably not noticeably. If we are really worried about that we could add LIR_XINC or something like that.
The finalizer hook is also a bit ugly. I wonder whether it's necessary.
In general, I think this functionality should be in nanojit. That would solve the entanglement problem and instead of the finalizer the code can be ifd-defed straight into the destructor.
Assignee
Comment 14•15 years ago
Major update to patch. This not only profiles fragments but also
exits, making it possible to see exactly where we're leaving trace.
Also, for each fragment, it now shows the static exit count and the
static instruction byte count, so as to give a bit of a handle on
generated code bloat.
Is WIP. Do not review. Works well enough to run v8-richards, but
asserts (for debug builds) on trace-tests. Further debugging needed.
Attachment #389800 - Attachment is obsolete: true
Attachment #389800 - Flags: review?(gal)
Assignee
Comment 15•15 years ago
Cleaned up version of the patch in comment #14. Suitable for
general use/testing. Survives trace-test.js. Changes:
* don't assert when Fragments are recycled; handle correctly
* also profile exits created by jsregexp (untested)
* fix space leaks in result-printing code
* minor readability changes in output printing
Attachment #392628 - Attachment is obsolete: true
Assignee
Comment 16•15 years ago
Here's an example of what the output now looks like, for perfect.js.
Recommended usage: TMFLAGS=fragprofile,minimal,recorder
First there's a flat profile with one line per frag, containing start
counts for the frag, plus cumulative info, number of static exits and
static instructions for the frag, and the FragID so you can find it
easily in the recorder (or any other phase) output.
Then follow the per-exit counts. They're self-explanatory. Each
exit is labelled with a GuardID, which is printed as part of LIR,
so again the precise exits are findable in the debug output.
This is a pretty boring example. v8-richards or v8-crypto produce
more interesting results.
----------------- Per-fragment execution counts ------------------
Total count = 2686
0: (72.23% 1940) 72.23% 1940 (se 10, sib 405) FragID=000005
1: (90.80% 2439) 18.58% 499 (se 8, sib 322) FragID=000001
2: (100.00% 2686) 9.20% 247 (se 10, sib 405) FragID=000003
3: (100.00% 2686) 0.00% 0 (se 0, sib 0) FragID=000002
4: (100.00% 2686) 0.00% 0 (se 0, sib 0) FragID=000004
Total static instruction bytes (sib) = 1132, total static exits (se) = 28
Analysis by exit counts
FragID=000005, total count 1940:
GuardID=012 248 (12.78%)
Looped (013) 1692 (87.22%)
FragID=000001, total count 499:
GuardID=009 1 ( 0.20%)
Looped (010) 498 (99.80%)
FragID=000003, total count 247:
GuardID=012 1 ( 0.40%)
Looped (013) 246 (99.60%)
FragID=000002, total count 0:
Looped (000) 0 ( 0.00%)
FragID=000004, total count 0:
Looped (008) 0 ( 0.00%)
Assignee
Updated•15 years ago
Summary: TM/nanojit: add counters that count how many times each compiled trace is executed → TM/nanojit: add built-in support for compiled-trace and -guard profiling
Comment 17•15 years ago
>----------------- Per-fragment execution counts ------------------
>
>Total count = 2686
>
> 0: (72.23% 1940) 72.23% 1940 (se 10, sib 405) FragID=000005
> 1: (90.80% 2439) 18.58% 499 (se 8, sib 322) FragID=000001
May I suggest removing the text labels from the data rows and using column headers instead (the spacing is still bad in this version):
Exec-Count Cum-Exec-Count Static-Exits Bytes Frag-ID
(72.23% 1940) 72.23% 1940 10 405 000005
(90.80% 2439) 18.58% 499 8 322 000001
IMO we currently have a problem with hard-to-read, hard-to-machine-parse debugging output. Your example is better than most of what we have but I want it better yet. :-) I would also submit that the exit count stats are not in fact self-explanatory for someone who doesn't already kind of know what you are doing here.
Assignee
Comment 18•15 years ago
All previous versions of the patch under certain circumstances fail to harvest
profile info from Fragments that are delete'd, giving nonsense profile data
for some programs (not all). Found whilst chasing small testcases related to
v8-richards.
This version fixes it. Also incorporates output formatting suggestions per
dmandelin in comment #16.
Attachment #392693 - Attachment is obsolete: true
Assignee
Comment 19•15 years ago
Rebased to TM tip of about four hours back. Expect further
breakage in days to come, given that Fragmento is scheduled
for demolition.
Attachment #392922 - Attachment is obsolete: true
Assignee
Comment 20•15 years ago
Attachment #394128 - Attachment is obsolete: true
Assignee
Comment 21•15 years ago
.. and add an extra assertion checking consistency of counts.
Attachment #394847 - Attachment is obsolete: true
Assignee
Comment 22•15 years ago
Attachment #394943 - Attachment is obsolete: true
Assignee
Comment 23•15 years ago
Attachment #396627 - Attachment is obsolete: true
Comment 24•15 years ago
(In reply to comment #13)
> Could we do the ranking with a script, and just output raw data? Or maybe write
> it to a file? The stats gathering and ranking code looks a bit out of place in
> jstracer.cpp (IMO).
I don't think this case is really doing *so* much formatting (basically collect, sort, and print the top 50) that it warrants depending on an external processing script (that itself is likely to bitrot). If we were doing histograms and graphs, sure; but this is just a tabular listing. We already have plenty of stat-tabulating code over in js/src: gc stats, scope stats, hashtable stats, etc. Some of it even does histograms :)
> I don't particularly like the entanglement of tracer logging/verbose bits and
> nanojit. Can we disentangle this a bit?
I've filed bug 513339 and bug 513341 to try to pull some of the logging noise out of both jstracer and the rest of nanojit. I agree it's getting a bit overgrown, but gathering up all the chatter and depositing it somewhere is a different task from getting this profiler working in the first place.
> instead of the finalizer the code can be ifd-defed
> straight into the destructor.
Fragments have no destructors anymore; they're arena-allocated and released en masse. I think it's OK to keep the fragment book-keeping outside as the fragment topology and lifecycle is really only something TM knows about. We're *deleting* that knowledge from NJ presently, as it was insufficient anyway.
Comments on the patch:
- Unless I'm misreading, I think the finalizer function *pointer* stored in the tracemonitor is vestigial from an earlier version of this patch; nobody appears to store into it or call through it, so it should be removed. All finalization calls occur directly to js_FragProfiling_FragFinalizer.
- avmplus::List<> is going away, use nanojit::Seq<>
- Likewise avmplus::SortedMap<>, use nanojit::HashMap<>
- Seems you're incrementally bubble-sorting the final ranked array? We already use std::sort() and qsort() elsewhere in spidermonkey, I'm sure it's OK to do so here.
- Style: this being C++, you can loop-scope a loop variable.
for (Foo* bar = ...; ...; ...) { ... }
rather than
Foo *bar;
for (...; ...; ...) { ... }
- Style: while (true) rather than while (1)
- Style: void foo() rather than void foo(void)
- Style: as gal said, foo(Bar *b) rather than foo( Bar *bar )
- As of njn's logging work, we're directing output through to js_LogController.printf. Same meaning, but it centralizes control over that stream.
- On line 4823, you're adding a commented-out call to the finalizer. You either want to un-comment this or remove it, I'm not sure which.
- Considering that there's no hash function involved -- they're just sequential identifier-numbers -- rename profHash and guardID in GuardPI, and rename profHash in VMFragment to fragID or fragmentID. Generally try to avoid two structures with same-named fields.
- The use of a static profHashCtr is no good, we're possibly running multithreaded. If you want you can use JS_ATOMIC_INCREMENT, or else store a counter in the (per-thread) trace monitor and copy it into the VMFragment when you construct it.
Otherwise nice work, sorry the lifecycles of everything have been changing underfoot.
Reporter
Comment 25•15 years ago
(In reply to comment #24)
>
> - As of njn's logging work, we're directing output through to
> js_LogController.printf.
I can't take credit for that, I think it was jseward :)
Assignee
Comment 26•15 years ago
(In reply to comment #24)
Graydon, thanks for the comments. Should all be no problem to fix,
but one thing I had a query on:
> - The use of a static profHashCtr is no good, we're possibly running
> multithreaded. If you want you can use JS_ATOMIC_INCREMENT, or else store a
> counter in the (per-thread) trace monitor and copy it into the VMFragment when
> you construct it.
I'd love to use JS_ATOMIC_INCREMENT, but that's going to bork for the TR
embedding, no? Could we just copy JS_ATOMIC_INCREMENT into the NJ tree
so as to have an independent copy? I'd be happy to do that, and tidy up.
Comment 27•15 years ago
Perhaps then you can have the Fragment constructor take a fragID in verbose mode (default to 0) and move the profHashCtr out to jstracer.cpp; then you can use JS_ATOMIC_INCREMENT in the VMFragment constructor call to the Fragment constructor.
Comment 28•15 years ago
Atomic operations are very slow. Make sure you know what you are doing. That might skew measurements significantly. Also, we _definitely_ don't want atomic profiling updates in product builds ever (I overheard we might want to run more profiling all the time).
Comment 29•15 years ago
Yeah. The other option -- a per-TraceMonitor counter -- makes more sense to me anyway. We always have a TraceMonitor handy when we're constructing a VMFragment. Just pass one in.
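A sketch of this per-TraceMonitor scheme, assuming a plain counter field on the (per-thread) monitor that is handed to the VMFragment constructor; the field name nextFragID and the constructor shapes are hypothetical, not the actual TM signatures.

```cpp
#include <cstdint>

struct TraceMonitor {
    uint32_t nextFragID;                   // hypothetical per-thread id counter
};

struct VMFragment {
    uint32_t fragID;
    explicit VMFragment(TraceMonitor* tm)
        : fragID(++tm->nextFragID)         // plain increment: tm is per-thread, no atomics
    {}
};
```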
Assignee
Comment 30•15 years ago
Attachment #396789 - Attachment is obsolete: true
Assignee
Comment 31•15 years ago
Attachment #397852 - Attachment is obsolete: true
Assignee
Comment 32•15 years ago
(In reply to comment #31)
> Created an attachment (id=399823) [details]
> rebased; w/ correct code size measurements, & addresses review comments
Rebased w/ much help from Graydon, following deletion of
Fragmento and of distinct loop-top labels. Also incorporates
Graydon's fix for correct code size measurement as taken from
bug 512744.
Assignee
Comment 33•15 years ago
Attachment #399823 - Attachment is obsolete: true
Assignee
Updated•15 years ago
Attachment #400188 - Flags: review?(graydon)
Assignee
Comment 34•15 years ago
(In reply to comment #33)
> updated to address all remaining review issues
Regarding this, from comment #24
> - Seems you're incrementally bubble-sorting the final ranked array?
> We already use std::sort() and qsort() elsewhere in spidermonkey,
> I'm sure it's OK to here.
This is sort-of true. But the final ranked array is only N_TOP_BLOCKS
in size, regardless of how many Fragments are processed. This avoids
constructing an array with potentially thousands of entries, one for
each Fragment, sorting it, and throwing away all but the top 50
entries. Since the array is of fixed size, the cost for processing N
Fragments is O(50 x N), which isn't too bad given that this is
non-production code.
So I've left it as-is.
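For concreteness, a sketch of the bounded ranking described here: a fixed array of N_TOP_BLOCKS entries, kept sorted, into which each fragment's count is insertion-sorted as it is processed, giving O(N_TOP_BLOCKS x N) work for N fragments. Illustrative code only, not the patch's actual data structures.

```cpp
#include <cstdint>

const int N_TOP_BLOCKS = 50;

struct Ranked { uint32_t fragID; uint64_t count; };
static Ranked topBlocks[N_TOP_BLOCKS];     // zero-initialized; sorted, largest count first

void considerFragment(uint32_t fragID, uint64_t count)
{
    int i = N_TOP_BLOCKS - 1;
    if (count <= topBlocks[i].count)
        return;                            // not hot enough to enter the table
    // shift smaller entries down to make room, then insert in place
    while (i > 0 && topBlocks[i - 1].count < count) {
        topBlocks[i] = topBlocks[i - 1];
        i--;
    }
    topBlocks[i] = Ranked{ fragID, count };
}
```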
Assignee
Comment 35•15 years ago
Attachment #400188 - Attachment is obsolete: true
Attachment #400188 - Flags: review?(graydon)
Comment 36•15 years ago
Comment on attachment 400791 [details] [diff] [review]
rebased; made thread-safe; conform to house style; minor tidying
Thanks for your patience during the review cycle. I'll land from here.
Attachment #400791 - Flags: review+
Comment 37•15 years ago
"I don't particularly like the entanglement of tracer logging/verbose bits and
nanojit. Can we disentangle this a bit? I wonder if we can just use regular lir
to model the counters. It would minimally affect register allocation, but
probably not noticably. If we are really worried about that we could add
LIR_XINC or something like that."
The comment above was not addressed as far as I can tell. The checkin breaks x86-64 and sparc (dvander and leon will fix those quickly I am sure). Review from adobe for that many NJ changes would have been nice too. This isn't an r- comment. Just saying that the patch could be nicer. We can probably evolve it further as we go along. It seems useful enough to land it now instead of constantly rebasing it.
Assignee
Comment 38•15 years ago
(In reply to comment #37)
> "I don't particularly like the entanglement of tracer
> logging/verbose bits and nanojit. Can we disentangle this a bit?
It's less entangled than it was previously. Now, Fragment and
GuardRecord have counter fields, and nanojit knows to add code to
increment them if requested. You have to tell NJ where the loop label
of a trace is, and give it a unique identifier for each new Fragment,
but that's about it. The problem of extracting and presenting the
results is all on the TM side.
> I wonder if we can just use regular lir to model the counters.
That leads to an inconsistency, in which the fragment-entry counter
inc can be expressed in LIR, but the exit inc, which is part of the
exit block, can't be. At least, not without generating contorted LIR
which reads the condition just before a guard, and if true,
increments the relevant exit counter. And that would lead to a
situation where the output of each LIR_ov would be used twice, which
IIUC would break at least the x86 backend.
> The checkin breaks x86-64 and sparc
I don't have a sparc box to test on, and up till now x86-64 hasn't
been runnable from the tree.
> Review from adobe for that many NJ changes would have been nice too.
Yes, my bad there.
Comment 39•15 years ago
Phew!
http://hg.mozilla.org/tracemonkey/rev/ce354a3e3b47 for initial checkin,
http://hg.mozilla.org/tracemonkey/rev/72d4907df250 to fix the windows build (mistakenly breaking *every* browser build, because browser builds have different ... header visibility?), and so finally
http://hg.mozilla.org/tracemonkey/rev/70f5a2e93b94
A little more multi-platform testing would be good in the future.
Whiteboard: fixed-in-tracemonkey
Comment 40•15 years ago
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 41•15 years ago
Comment 42•15 years ago
Updated•15 years ago
Flags: wanted1.9.2?
Comment 43•15 years ago
status1.9.2: --- → beta1-fixed
Updated•15 years ago
Flags: wanted1.9.2? → wanted1.9.2+
Comment 44•15 years ago
js/src/trace-test/tests/basic/arith.js
js/src/trace-test/tests/basic/testRegExpTest.js
js/src/trace-test/tests/basic/testShiftLeft.js
js/src/trace-test/tests/basic/testSideExitInConstructor.js
Flags: in-testsuite+