Closed Bug 400902 Opened 17 years ago Closed 17 years ago

Specialized GC arena for double values

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED

People

(Reporter: igor, Assigned: igor)

References

Details

(Keywords: perf)

Attachments

(3 files, 45 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), text/plain
Details
(deleted), patch
igor
: review+
Details | Diff | Splinter Review
Currently GC to allocate double values uses a common GC arena structure. This is inefficient for the following reasons. 1. Since the minimal allocation cell for GC arena is 2 CPU words, this allocates almost twice memory then necessary on 64 bit CPU. 2. It adds the finalization phase for double values. Such phase is not necessary since the double values do not have a finalizer. As such it should be possible to discover free double cells during the allocation. 3. It uses one byte of GC flags per jsdouble when with a specialized arena only the single bit to denote marked/used double cell is necessary. 4. A specialized GC arena should allow to minimize the number of ifs taken during the allocation and speed-up numerically-intensive scripts. Thus it would nice to support such specialized GC arena for double values.
Attached patch v 0.5 (obsolete) (deleted) — Splinter Review
Here is a patch prototype that implements a specialized GC arena for double values using one bit per double for bookkeeping besides JSGCArenaInfo. In addition the patch provides function to allocate double values bypassing weak roots when the caller already has a strong root. For numerically intensive test case like: function test(loops, N) { var sum = 0.0; do { for (var j = 0; j != N; ++j) sum += 1 / (j + 0.1); gc(); } while (--loops != 0); return sum; } var sum = test(40, 20*1000); print(sum); the patch speedups its execution using an optimized build of js shell on my 1.1 GHz Pentium-M laptop with Fedora Core 7 from 804 ms down to 656 ms or by 18% according to the best of 5 runs using: /bin/bash -c 'time ~/m/trunk/mozilla/js/src/Linux_All_OPT.OBJ/js ~/m/b.js > /dev/null' I have not done any testing of the patch.
Attached patch v 0.51 (obsolete) (deleted) — Splinter Review
This is the previous patch with a small optimization to clear marked bits for double arenas only when there is at least one live double value. It decreased the run time for the test down to 631ms or made it faster by 21%.
Attachment #285948 - Attachment is obsolete: true
Keywords: perf
Recording patch dependency: using just one bit of flags per double requires hiding js_GetGCThingFlags from the rest of code.
Depends on: 400687
Attached patch v 0.6 (obsolete) (deleted) — Splinter Review
New version cleanups usage of various number allocation functions to make it explicit when function weakly roots its result or when it requires a rooted pointer to store the result.
Attachment #285949 - Attachment is obsolete: true
Attached patch v 0.9 (obsolete) (deleted) — Splinter Review
Patch with fixes and cleanups that passes the test suite and bloat tests.
Attachment #286097 - Attachment is obsolete: true
Attached patch v 0.91 (obsolete) (deleted) — Splinter Review
The new version fixed a bug in js_IsAboutToBeFinalized code that caused crashes after using GMail for a while.
Attachment #286221 - Attachment is obsolete: true
the last patch made the test from bug 161110 comment 1 to run faster by 11%. This is with an optimized build of JS shell under Fedora 7 on Pentium-M 1.1GHz laptop. Note that the patch when run under the shell does not measure the performance of GC, it is affected only by the allocation time of double values.
Attached patch v 0.92 (obsolete) (deleted) — Splinter Review
The new version fixes the assert in gc_root_traversal to take into account the list of specialized arenas for double values.
Attachment #286223 - Attachment is obsolete: true
I did measurements with n-body.js. Using a patch for bug 401254 I set the maximum GC heap to various sizes. Given the structure of the benchmark it effectively gives the amount of memory consumed by newly allocated double values before the GC runs. I run the benchmark using ~/m/trunk/mozilla/js/src $ /bin/time ./Linux_All_OPT.OBJ/js -e 'gcparam("maxBytes", N)' ~/m/n-body.js 50000 where N is given in the table bellow. On my 1.1 GHz Pentium-M laptop under Fedora 7 I got for an optimized build of js shell N 20K 24K 32K 320K 3200K base 15.0 12.2 10.6 9.2 9.3 with patch 16.7 12.2 10.2 8.6 8.7 difference -11% 0% 4% 7% 6% The slow down at low N is expected. The JS initialization/compilation of the script would take 5 4K arenas, 1 for objects, 1 for functions and 3 for double and strings. Without the patch the string and double GC things are allocated from the single arena which gives 2 full and one almost empty arena. So the engine has almost the whole arena for allocation of doubles before it need to run GC. With the patch the string and double arenas no longer shared so one end up with one full and one almost full string arenas and one half full double arena. That arena has less room to allocate the double things before the GC runs. Thus the GC runs more frequently. Since the GC time in this case is dominated by string scanning the slow down is inevitable. For big N (the one that happens in practice) N the speedup is 6-7%.
Attached patch v0.92b (obsolete) (deleted) — Splinter Review
Update the depends on the patch from bug 400687 comment 7.
Moving to blocking list because of perf
Flags: blocking1.9+
Attached patch v 0.95 (obsolete) (deleted) — Splinter Review
The new version comes with better comments.
Attachment #286285 - Attachment is obsolete: true
Attachment #287412 - Attachment is obsolete: true
Priority: -- → P2
In reply to some concerns raised in email: > Take, for instance, bug 400902: the bug report doesn't have > memory allocation profiles to indicate how much improvement this would be > for real-world JS; we need do those measurements before approving/landing > work like this. This bug was motivated by poor performance of numerically intensive scripts. It should not affect the memory profile as from my experience there are very few long-term heap-allocated numbers in SM. For example, after the startup that number was less than 100 on a build couple of month ago and after having 10 or so pages open that stayed below 200. On the other hand for numerically intensive scripts it reduces the heap usage by 10% and make scripts run faster. I will redo the measurements and add a comment/profiles to the bug.
I'm in favor of Igor's approach here. If it can be done for small strings too and show measurable wins on benchmarks *and* Firefox tinderbox'ed metrics (Ts, Tdhtml, etc.), it should go into 1.9. Numbers stored as doubles and small strings are leaves in the live thing graph, so less scary to optimize than objects -- assuming GC safety (rooting) models do not change. /be
Attached patch counter of double values (deleted) — Splinter Review
This a patch against a trunk to measure the amount of double values during a browser run. After each GC the patch adds a line to log describing how many values were allocated between GC runs, how many survived GC and how many were freed. Here is a typical profile (I annotated few initial events): new marked freed 28, 21, 7 28, 39, 10 51, 64, 26 startsup 92, 112, 44 10, 121, 1 1, 121, 1 1, 121, 1 after idle 267, 177, 211 192, 176, 193 227, 169, 234 1, 169, 1 gmail started 254, 246, 177 60, 211, 95 11, 211, 11 1, 211, 1 187, 274, 124 53, 300, 27 489, 470, 319 124, 491, 103 80, 491, 80 93, 439, 145 0, 439, 0 82, 213, 308 0, 213, 0 196, 213, 196 0, 213, 0 214, 249, 178 110, 285, 74 21, 284, 22 228, 320, 192 720, 321, 719 804, 352, 773 8, 353, 7 3963, 1901, 2415 137, 1901, 137 1848, 1901, 1848 22, 1907, 16 873, 384, 2396 1, 384, 1 17, 384, 17 121, 420, 85 74, 438, 56 10, 438, 10 1, 438, 1 13, 448, 3 0, 448, 0 1517, 448, 1517 127, 473, 102 22, 473, 22 21, 473, 21 1, 473, 1 24, 473, 24 1, 473, 1 1, 473, 1 1, 473, 1 2, 473, 2 1, 473, 1 1, 473, 1 The numbers confirms that there are relatively few long-term double values.
here more numbers after running random benchmarks with tests for numbers: new marked freed ... running http://www.computerbytesman.com/js/jsbench/index.htm 1, 459, 1 52705, 462, 52702 8861, 460, 8863 26, 460, 26 running http://celtickane.com/projects/jsspeed.php 2, 447, 2 1055, 474, 1028 110045, 472, 110047 1109, 471, 1110 1, 471, 1
Igor we still making progress here? Beta2 freeze is coming up - this ready for that our should we get into b3?
(In reply to comment #17) > Igor we still making progress here? Beta2 freeze is coming up - this ready for > that our should we get into b3? > I need to add more comments top the patch. Otherwise it is ready and I will ask for a review within day or two.
Igor - ping - sorry to be a pest just want to get this in well ahead of b3 to shake stuff out..
Attached patch v1 (quilt patch) (obsolete) (deleted) — Splinter Review
Here is a patch update to sync with CVS. It depends on the patch from bug 413744 comment 6.
Attachment #288593 - Attachment is obsolete: true
Attached patch v1 (obsolete) (deleted) — Splinter Review
This is a patch against CVS. It would be interesting to know how does it affects the benchmarks.
Attachment #300102 - Attachment is obsolete: true
Attached file sunspider in js shell (obsolete) (deleted) —
I think the slowdowns in crypto and date benchmarks are concerning, but otherwise the wins we expected are there.
Attached patch v2 (obsolete) (deleted) — Splinter Review
I should have learned that lengthy bit manipulation is expensive even if it does not have any branches. Thus the new patch constructs a free list for doubles to do the bit manipulation only once in a tight loop.
Attachment #300148 - Attachment is obsolete: true
svn checkout http://svn.webkit.org/repository/webkit/trunk/SunSpider SunSpider cd SunSpider ./sunspider --shell=/path/to/js
Attached patch v3 (obsolete) (deleted) — Splinter Review
Apparently the benchmarks is rather sensitive to the structure of STORE_(NUMBER|INT|UINT) macros. The new version optimizes them to get a better performance.
Attachment #300200 - Attachment is obsolete: true
Attached patch v4 (obsolete) (deleted) — Splinter Review
The new patch uses uint8, not jsuword, as a base bitmap unit for double flags. In this way the unit corresponds to 8 doubles of 64 bytes. That corresponds to one cache line on the latest CPUs or two lines on a relatively recent CPU. Since the allocation of double values tends to come in bursts, in most cases the flag byte have all its bits unset meaning that the free list construction puts all the 8 bytes into the list. That should be very quick due to cache friendliness of the code.
Attachment #300287 - Attachment is obsolete: true
Attached file sunspider in js shell (thread-safe build) (obsolete) (deleted) —
Here is the results for an optimized build of the js shell compiled with -Os and JS_THREADSAFE. I used 30 runs instead of default 10 as otherwise there are way too much noise on my 1.1GHz Pentium-M laptop under Fedora8.
Attachment #300155 - Attachment is obsolete: true
I plan to ask for a review of the latest patch later today after testing it better.
Status: NEW → ASSIGNED
Attached patch v5 (obsolete) (deleted) — Splinter Review
The new version replaces: -extern JSBool -js_NewDoubleValue(JSContext *cx, jsdouble d, jsval *rval); +extern jsval +js_NewUnrootedDoubleValue(JSContext *cx, jsdouble d); where JSVAL_NULL is used as an error indicator. The also moved the operation counter incrementing to RefillDValFreeList to do bulk increments. It makes SunSpider faster over the previous
Attachment #300364 - Attachment is obsolete: true
Attached file sunspider in js shell (thread-safe build) (obsolete) (deleted) —
This is an updated result for for an optimized build of the js shell compiled with -Os and JS_THREADSAFE with --runs=30.
Attachment #300365 - Attachment is obsolete: true
Apparently the patch managed to speedup some benchmarks by over 40% : nsieve-bits: 1.53x as fast sha1: 1.41x as fast
Attached patch v6 (obsolete) (deleted) — Splinter Review
Here is version for a review. Patch consists of 2 parts: The first part adds a special GC arena for double values. Since a double value do not need a finalizer, the patch uses just one bit of flags for each double to indicate that the corresponding GC cell is occupied. The bits are reset during the marking phase of garbage collection. If after the marking phase a GC arena does not have any bits set, it is is released. Otherwise it is added to the global list of dval arenas. Later the allocator (see RefillDValFreeList) scans the flags area of each arena on the list looking for free cells. If necessary, a new arena is allocated and the last ditch GC is run to get more space for double values. The scanning code stops when it finds a byte with at least one bit unset indicating that among 8 corresponding cells at least one is unused. After that the code puts all free cells from this chunk of 8 cells to the context-local list. The code also sets all the bits in the flag byte to 1 to avoid setting them later when allocating the double values. The second part of the patch changes internal API for allocating double values to be as direct as possible. In particular, the patch adds: /* * GC-allocate a new jsdouble number. Returns JSVAL_NULL when the allocation * fails. Otherwise the caller must root the result immediately after the * the call. */ extern jsval js_NewUnrootedDoubleValue(JSContext *cx, jsdouble d); The implementation just takes values from the context-local list delegating all the rooting to the caller.
Attachment #300387 - Attachment is obsolete: true
Attachment #300483 - Flags: review?(brendan)
Attached patch diff -b for v6 (obsolete) (deleted) — Splinter Review
Attached patch v7 (obsolete) (deleted) — Splinter Review
The new version makes sure that JS_GCMETER includes the stats about the doubles and makes the stats more useful. In addition there is a small optimization in js_NewUnrootedDoubleValue/RefillDValFreeList to store only the second element on the local free list in cx->dvalFreeList as the first element would be immediately consumed in any case.
Attachment #300483 - Attachment is obsolete: true
Attachment #300513 - Attachment is obsolete: true
Attachment #300681 - Flags: review?(brendan)
Attachment #300483 - Flags: review?(brendan)
Attached patch diff -b for v7 (obsolete) (deleted) — Splinter Review
Comment on attachment 300681 [details] [diff] [review] v7 DVal => Double (things still line up in structs since this is only two more chars), same for DVAL => DOUBLE. Weak => WeaklyRooted to match Unrooted (longer, but less likely to confuse -- Weak is broader, without Root in context to narrow its meaning). jsbit.h nit: put backslashes in column 79 per prevailing style. jsgc.c big intro comment nit: s/the double numbers/doubles/g and rewrap as if by vim Qj down to end of paragraph. Related nit: a blank comment line on each side of excerpted code or equation is nicer to read (the DVALS_PER_ARENA comment does this, it's very readable). Another nearby nit: "Boundary's offset from the start..." should be "The boundary's offset from the start...." I missed the rationale for changing GC_ARENA_SHIFT if JS_GC_USE_MMAP. Big-time nit: one-line major comments before members in JSGCArenaInfo could be single-line comments (no /* and */ on their own lines). Another nit: this solution wants to be indented two spaces inside its comment-line: * n_max = floor(M/(B*s + 1)) * B + floor((max(M % (B*s + 1), 1) - 1) / s) should fit in 79 columns if I'm not misreading. s/The function/This function/ in comments introducing the GetGCThingFlags* functions. I will finish reviewing after lunch. /be
Attached patch v8 (obsolete) (deleted) — Splinter Review
Attachment #300681 - Attachment is obsolete: true
Attachment #300682 - Attachment is obsolete: true
Attachment #300735 - Flags: review?(brendan)
Attachment #300681 - Flags: review?(brendan)
(In reply to comment #38) > Created an attachment (id=300735) [details] > v8 That patch addresses the nits from the comment 37.
Attached patch v8 for real (obsolete) (deleted) — Splinter Review
The previous patch was not a normal CVS diff.
Attachment #300735 - Attachment is obsolete: true
Attachment #300737 - Flags: review?(brendan)
Attachment #300735 - Flags: review?(brendan)
Attached patch diff -b for v8 (obsolete) (deleted) — Splinter Review
Attached patch v7 - v8 delta (obsolete) (deleted) — Splinter Review
Here is a delta showing how the nits for v7 were addressed.
Comment on attachment 300737 [details] [diff] [review] v8 for real Nits, only a few left: #define JS_TEST_BIT(map,bit) ((map)[(bit) >> JS_BITS_PER_WORD_LOG2] & \ ((jsbitmap)1 << ((bit)&(JS_BITS_PER_WORD-1)))) etc. Suggest pulling the macro bodies for this one and its next two siblings leftward two spaces, but keeping \ in column 79 as usual, to avoid the overflow line touching column 79 or 80. * For doubles we use a specialized arena which avoids the type bits as the Suggest comma before "we use" and "as the". jsuword untracedThings; /* bitset for fast search of marked but not yet traced things. */ I was ok with capitalized full-stopped sentence comments before each member declaration, but this is ok too -- except tradition has these "after-member" comments not use periods at end -- often the words form a sentence fragment. * B is number of bits in the byte or JS_BITS_PER_BYTE, s/in the/per/ ((uint32) (jsuword) (thing) & GC_ARENA_MASK) / sizeof(jsdouble)) The (uint32) (jsuword) cast chain seems unnecessary unless you need the entire result chopped to uint32, in which case the right operand of (uint32) needs to be parenthesized (all of the remaining expression). Similar comment later for: index = (((uintN) (jsuword) doubleFlags & GC_ARENA_MASK) - DOUBLES_ARENA_BITMAP_OFFSET) * JS_BITS_PER_BYTE; in RefillDoubleFreeList (extra nit here: - is at end of first line but * is at start of third -- could fit on two lines IMHO to avoid issue). Total nit, just wondering whether it is better to parenthesize outer cast's right operand. * This function return null when thing is jsdouble. s/return/returns/ * the mark flags contains a garbage as it is only initialized when s/a garbage/all garbage/ s/only initialized/initialized only/ if (JS_Now() - *timestamp > acx->runtime->gcStackPoolLifespan * 1000) JS_FinishArenaPool(&acx->stackPool); Brace the then clause because of multiline condition + consequent here. Also better to split at > with > at end of line? Maybe it all fits if you pre-scale gcStackPoolLifespan by 1000 -- I should have suggested this to crowder, since without an (int64) or (double) cast in this expression, the * 1000 will make a large lifespan wrap around in 32 bits anyway. Yeah, please pre-scale and comment accordingly, if you agree -- JS_SetGCParameter can't fail, but it could assert so DEBUG builds would catch any overlarge lifespan. Or we could just make the JSRuntime member be int64, if that packs nicely. * If d is int, return it as jsval. Otherwise allocate a new double, weekly "weakly" misspelled. r=me with these nits picked. Thanks, /be
Attachment #300737 - Flags: review?(brendan) → review+
Attached patch v9 (obsolete) (deleted) — Splinter Review
This version addresses the nits from the comment 43.
Attachment #300737 - Attachment is obsolete: true
Attachment #300738 - Attachment is obsolete: true
Attachment #300739 - Attachment is obsolete: true
Attached patch v8 - v9 delta (obsolete) (deleted) — Splinter Review
Attachment #300763 - Flags: review+
Comment on attachment 300763 [details] [diff] [review] v9 + * To allocate doubles we use a specialized arena. It can only contain numbers s/only contain/contain only/ ;-) This is more readable, thanks: + age = JS_Now() - *timestamp; + if (age > acx->runtime->gcStackPoolLifespan * 1000) but the right operand of > is of type uint32, result of * on uint32 and int, so it can overflow (imagine a lifespan of 24*3600*1000 ms or a day). Casting the left term of * to (int64) would fix this without bloating JSRuntime. There, I'm done nit-picking. To be more helpful, I think we should get this into b3 for maximum mileage before final fx3. /be
Attachment #300763 - Flags: approval1.9b3?
Attachment #300763 - Flags: approval1.9+
We pass all tests on js test suite with this?
Comment on attachment 300763 [details] [diff] [review] v9 a+ schrep assuming we have good test coverage here - and because we want to spread our js checkin's out a little :-)
Attachment #300763 - Flags: approval1.9b3? → approval1.9b3+
Sorry I missed that timestamp overflow in bug 408113. :(
Depends on: 415207
I filed bug 415207 about the time overflow code as this is not related to this bug.
Attached patch v10 (obsolete) (deleted) — Splinter Review
This is the v9 adjusted for the patch from bug 415207 landed and with English grammar in comments fixed.
Attachment #300763 - Attachment is obsolete: true
Attachment #300765 - Attachment is obsolete: true
Attachment #300889 - Flags: review+
Attachment #300889 - Flags: approval1.9b3+
Attachment #300889 - Flags: approval1.9+
Attached patch v10 for real (obsolete) (deleted) — Splinter Review
The previous attachment had a wrong patch.
Attachment #300889 - Attachment is obsolete: true
Attachment #300893 - Flags: review+
Attachment #300893 - Flags: approval1.9b3+
Attachment #300893 - Flags: approval1.9+
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Igor, could this have caused ecma/TypeConversion/9.5-2.js to fail with (Math.pow(2,31)+1) << 0 expected: -2147483647 actual: 2147483649 reason: wrong value ?
(In reply to comment #54) > Igor, could this have caused ecma/TypeConversion/9.5-2.js to fail with > (Math.pow(2,31)+1) << 0 expected: -2147483647 actual: 2147483649 reason: wrong > value ? I backed out the patch due to this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch v11 (obsolete) (deleted) — Splinter Review
v10 has too much cut and paste in STORE_INT and STORE_UINT macros. Here is is the delta: --- /home/igor/m/ff/mozilla/quilt.A13010/js/src/jsinterp.c 2008-02-01 13:02:59.000000000 -0800 +++ js/src/jsinterp.c 2008-02-01 12:08:43.000000000 -0800 @@ -138,33 +138,33 @@ #define STORE_INT(cx, n, i) \ JS_BEGIN_MACRO \ jsval v_; \ \ if (INT_FITS_IN_JSVAL(i)) { \ v_ = INT_TO_JSVAL(i); \ } else { \ SAVE_SP_AND_PC(fp); \ - v_ = js_NewUnrootedDoubleValue(cx, d); \ + v_ = js_NewUnrootedDoubleValue(cx, (jsdouble) (i)); \ ok = v_ != JSVAL_NULL; \ if (!ok) \ goto out; \ } \ STORE_OPND(n, v_); \ JS_END_MACRO #define STORE_UINT(cx, n, u) \ JS_BEGIN_MACRO \ jsval v_; \ \ if ((u) <= JSVAL_INT_MAX) { \ v_ = INT_TO_JSVAL(u); \ } else { \ SAVE_SP_AND_PC(fp); \ - v_ = js_NewUnrootedDoubleValue(cx, d); \ + v_ = js_NewUnrootedDoubleValue(cx, (jsdouble) (u)); \ ok = v_ != JSVAL_NULL; \ if (!ok) \ goto out; \ } \ STORE_OPND(n, v_); \ JS_END_MACRO #define FETCH_NUMBER(cx, n, d) \
Attachment #300893 - Attachment is obsolete: true
Attachment #300918 - Flags: review?(brendan)
Comment on attachment 300893 [details] [diff] [review] v10 for real Removing approval for now until we can confirm why it's failing. Please renom when ready.
Attachment #300893 - Attachment is obsolete: false
Attachment #300893 - Flags: approval1.9b3+ → approval1.9b3-
Attached file sunspider in js shell (thread-safe build) (obsolete) (deleted) —
(In reply to comment #31) > Apparently the patch managed to speedup some benchmarks by over 40% : > > nsieve-bits: 1.53x as fast > sha1: 1.41x as fast > This was all wrong due to the bug fixed in the patch from comment 56. That bug would return to scripts wrong value that much more frequently fits integers resulting in less allocation. The real numbers are here: ** TOTAL **: 1.032x as fast ... crypto: 1.018x as fast aes: *1.006x as slow* md5: 1.038x as fast sha1: 1.038x as fast Those numbers were way too high to be true in any case.
I wondered about this, after finally reading the patch (and coming down with the flu) last night. Alternative idea: maintain a doubleStack for temporary double operands only, tagged as jsvals using pseudo-booleans where the top bits index the doubleStack. Specialize jsinterp.c macros to allocate from doubleStack instead of GC heap iff making a double that is inevitably pushed at fp->sp today. This helps only if most doubles are not stored in objects. Extension to the current patch's idea: extend GC_POKE (as the get/set/call optimizations bug is doing) to be a write barrier for heap slots, use it to bound the set of doubles referenced only via DOUBLE-tagged jsvals on the operand stack (across multiple frames). Run a local GC when filling up the page from which local doubles are allocated, and run a more specialized copying GC to move a local double to the GC heap only when it escapes (detected via the write barrier). I had a patch along the latter lines at one point. It may be in a bug. But first, comments and measurements to distinguish temporary operand doubles from those that escape to object slots in the GC heap is needed. /be
Status: REOPENED → ASSIGNED
My last comment did not define terms, but it should be clear that a local double can be referenced from multiple operand stack slots: function f() { let x = 22, y = z; return (x/y) + (x/y); } and if the local double is written to an argv or vars slot, it could be written to several, or propagated from one to another. These stores should not need a write barrier. /be
(In reply to comment #59) > Alternative idea: maintain a doubleStack for temporary double operands only, > tagged as jsvals using pseudo-booleans where the top bits index the > doubleStack. This should go to another bug. This patch does improved n-body by 18% and 3bit-bits-in-byte by 24%. Plus it makes GC-ing the garbage with doubles really faster. For example, a benchmark like: function test() { var N = 1 << 24; var sum = 0; for (var i = 0; i != N; ++i) { sum += i + 0.1; } var timestamp = Date.now(); gc(); print(Date.now() - timestamp); } test(); prints without the patch in 98ms and with the patch 14ms, a speedup by a factor of 7.
Another bug is the right thing. The false speedup measured with the buggy patch does suggest uint32 values that do not fit in jsval-ints are important, at least for some benchmarks. We could target that type separately and avoid the FPU where possible: uintStack as well as doubleStack (these need not be very big). But of course going through the write barrier will convert to double, so we want fast doubles too. Since the interpreter has to type-check all values, extra tag checks including pseudo-boolean ones, so long as they are well-ordered and predicted (target code in the pipe), should not hurt the common case performance. /be
Comment on attachment 300918 [details] [diff] [review] v11 Thinking about the speedup this wins, and the work that remains to do some kind of local GC, requiring a write barrier, I'm reminded of Ken Thompson's epigram: "When in doubt, use brute force". The brute force solution that's most "brute" is to make jsval big enough to hold jsdouble. This is what tamarin-tracing does, and I hear it's what Lua is going to do. For 64-bit pointers you have to make some restrictions, but it seems survivable. If we adopted tamarin-tracing's value-boxed-in-double scheme, we would inflate the dslots vectors of live objects by 2x. This could be bad, for sure, but it might not be in light of other heap workloads surrounding the dslots allocations that are malloc'ed (small-enough dslots are GC-alloc'ed). The double-sized jsval could accomodate true int32 and uint32 values. Probably SunSpider results would greatly improve. Food for thought, perhaps even an experiment to measure perf win. /be
Attachment #300918 - Flags: review?(brendan)
Attachment #300918 - Flags: review+
Attachment #300918 - Flags: approval1.9b3?
Attachment #300918 - Flags: approval1.9+
(In reply to comment #62) > The false speedup measured with the buggy patch does suggest uint32 values that > do not fit in jsval-ints are important, at least for some benchmarks. See bug 415455 for a proof.
It would be interesting to know the SunSpider stats on a 64-bit platform where Without the patch SpiderMonkey uses 16 bytes per each double. The patch fixes that to use 8 bytes as in the 32-bit case.
R.e. approval are you ok waiting to land this until Mon or Tues? We've been setting down for b3 and if we can kick builds off tomorrow am for them we can get B3 out the door soon. This leaves a little more bake time if there are any last-minute issues with this patch.
I can land this later today.
Let's get this and bug 415455 fixed for b3 -- would like to get bacon in too, but at least these two. /be
Comment on attachment 300918 [details] [diff] [review] v11 Sure - esp since Igor is out this week
Attachment #300918 - Flags: approval1.9b3? → approval1.9b3+
Status: ASSIGNED → RESOLVED
Closed: 17 years ago17 years ago
Resolution: --- → FIXED
I backed out again due to test failures.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Crowder is this something you can take a look at since Igor is out for the next few days?
With this patch applied, I am not getting the same mochitest failure.
(In reply to comment #74) > With this patch applied, I am not getting the same mochitest failure. > With v11 that was checked in an failed on the unit test boxes?
Yes, with v11. Sorry I was not more precise in my previous comment. I am having other trouble running MochiTests, but this on at least seems to consistently succeed here, -with- the v11 patch applied. My other failures seem to be unrelated.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1202092920.1202097295.8927.gz#err0 I think -this- is the correct log of the failed test (from tbox).... I was able to get this test to fail once locally, but not reliably.
I definitely think sayrer's suggestion on IRC of valgrinding this could prove productive, but I don't have the ability to do that at the moment. If someone else could do so, that would be great.
I wonder if this is a new kind of GC-hazard. Perhaps an unrooted double value has been collected, the slot reused by some other double value, and then referenced again, producing sometimes-inconsistent results. I tried eyeballing the patch for hazards along those lines, but didn't see any offhand. I also tried running the MochiKit-Color tests with gczeal set to 2, without any failure.
Attachment #300763 - Flags: review+
Attachment #300763 - Flags: approval1.9b3+
Attachment #300763 - Flags: approval1.9+
Attachment #300889 - Flags: review+
Attachment #300889 - Flags: approval1.9b3+
Attachment #300889 - Flags: approval1.9+
Attachment #300737 - Flags: review+
Attachment #300893 - Flags: review+
Attachment #300893 - Flags: approval1.9+
Attachment #300918 - Flags: review+
Attachment #300918 - Flags: approval1.9b3+
Attachment #300918 - Flags: approval1.9+
Attached patch v12 (obsolete) (deleted) — Splinter Review
This is a minimal patch that just adds support for specialized arena allocation for double values. It does not introduce special API for allocating unrooted double values. That will go to a separated bug as it is not clear what caused the tinderbox test regression. It could be potentially a GC hazard exposed by the patch when the code stopped using the weak roots for doubles. The new patch contains essentially the same code in jsgc.[ch] and jscntxt.h and adjust the rest of code to use js_NewDoubleGCThing(cx) as a replacement for js_NewGCThing(cx, GCX_DOUBLE, 0). The patch also removes js_LockGCThing as it is no longer used.
Attachment #300893 - Attachment is obsolete: true
Attachment #300918 - Attachment is obsolete: true
Attachment #302130 - Flags: review?(brendan)
Attached file sunspider in js shell (thread-safe build) (obsolete) (deleted) —
I forgot to mention in the comment 80 that v12 patch keeps the size of GC arena at 4K. In the previous patches I decreased that to 2K to avoid potential internal fragmentation. But that was unsounded as I do not see changes in the fragmentation rates with that switch. On the other hand 4K arena improves the timing for some tests. For example, nbody.js now runs 1.22 faster even without weak root avoidance optimization.
Attachment #300389 - Attachment is obsolete: true
Attachment #300951 - Attachment is obsolete: true
Attached patch v13 (obsolete) (deleted) — Splinter Review
The new version is a diff against the current CVS.
Attachment #302130 - Attachment is obsolete: true
Attachment #303721 - Flags: review?(brendan)
Attachment #302130 - Flags: review?(brendan)
I see rather strange results with the patch from comment 82 under jsshell with bits-in-byte benchmarks. Here is the results before and after the patch for an optimized single-threaded shell build with GCC 4.1.2 under Fedora8 on my Pentium-M 1.1 GHz laptop: TEST COMPARISON FROM TO DETAILS ============================================================================= ** TOTAL **: 1.080x as fast 473.6ms +/- 0.2% 438.6ms +/- 0.2% significant ============================================================================= bitops: 1.080x as fast 473.6ms +/- 0.2% 438.6ms +/- 0.2% significant 3bit-bits-in-byte: 1.085x as fast 217.6ms +/- 0.4% 200.5ms +/- 0.2% significant bits-in-byte: 1.075x as fast 255.9ms +/- 0.2% 238.1ms +/- 0.3% significant And here the same repeated for a build compiled with JS_THREADSAFE=1: TEST COMPARISON FROM TO DETAILS ============================================================================= ** TOTAL **: *1.108x as slow* 415.1ms +/- 0.2% 459.9ms +/- 0.2% significant ============================================================================= bitops: *1.108x as slow* 415.1ms +/- 0.2% 459.9ms +/- 0.2% significant 3bit-bits-in-byte: *1.064x as slow* 184.0ms +/- 0.2% 195.8ms +/- 0.2% significant bits-in-byte: *1.143x as slow* 231.2ms +/- 0.2% 264.1ms +/- 0.2% significant That is, the patch makes bits-in-byte to run 7.5% faster without JS_THREADSAFE. With JS_THREADSAFE it runs 14% slower. This is reproducible. This are strange results given the test in question: // Copyright (c) 2004 by Arthur Langereis (arthur_ext at domain xfinitegames, tld com) // 1 op = 2 assigns, 16 compare/branches, 8 ANDs, (0-8) ADDs, 8 SHLs // O(n) function bitsinbyte(b) { var m = 1, c = 0; while(m<0x100) { if(b & m) c++; m <<= 1; } return c; } function TimeFunc(func) { var x, y, t; for(var x=0; x<350; x++) for(var y=0; y<256; y++) func(y); } TimeFunc(bitsinbyte); The test never allocates anything when run and the patch does not touch jsinterp.c. In fact during the test the patch never calls any functions outside the interpreter except for possible malloc calls when allocating the inlined frames. Initially I suspected that an extra JSContext/JSRuntime fields that the patch add sare reposnsible for the slowness. But with the field moved toward the end of JSContext, I got exactly the same results. Plus it does not explain why thread-unsafe build is faster with the patch. To add to the mystery the patch continues to allocate the same single 16K chunk to allocate all the GC arenas to initialize the runtime and run the patch. Moreover, the initialization/shutdown cost meassured as the time to run js -e 'Math.sin()' is only about 11 ms (4%) and stays the same with and without the patch.
I played with the benchmark a little bit. If I change a source little bit so it will have different bytecodes, then depending on the changes the results with-patch/without-patch shifts by that 10% margin both ways for threadsafe and non-threadsafe builds. The only reasonable explanation that I can come with is TLB cache misses for the code. js_Interpret is 40-50 K and does not fit TLB AFAIK. So depending on the bytecodes and their position adding/removing code before js_Interpret in the linked image may shift the code implementing bytecode cases to different pages and cause extra TLB hits. To verify this it would be nice to force js_Interpret to be aligned on 4K boundary. But GCC 4.1.2 on my Fedora-8 laptop does not support that.
I confirm that bitops regression is caused by Translation Look-Aside Buffer misses. For that I grouped the bytecodes that the benchmark uses into the same place in the interpreter. After that change a shell built with or without patch, with or without JS_THREADSAFE run the benchmark with the same timing. I guess this means 2 things: 1. The interpreter should group bytecodes based on their frequency. 2. It would be nice to force js_Interpret to start on a page boundary.
It is not only TLB misses in js_Interpret that contributes to flotation of benchmark results. It was possible to affect the performance of bits-in-byte and other benchmarks that never calls other functions when in js_Interpret except for the name lookup during CALLNANE just by moving various bytecode cases. The difference in performance was up to 20%. I guess cache associativity (both for TLB and L1) plays a role here as well.
Attached patch v14 (obsolete) (deleted) — Splinter Review
This is just a sync with CVS
Attachment #304206 - Flags: review?(brendan)
Attachment #303721 - Attachment is obsolete: true
Attachment #303721 - Flags: review?(brendan)
Comment on attachment 304206 [details] [diff] [review] v14 >+ * A GC arena contains fixed number of flag bits for each thing in its heap, "a fixed number" >+ * We can not use do-while loop here as a->u.untracedThings can be "cannot" (generaly nit, pre-existing here, I know). Looks great otherwise -- please land ASAP after mochitest js/tests and general firefox surfing all look good. /be
Attachment #304206 - Flags: review?(brendan)
Attachment #304206 - Flags: review+
Attachment #304206 - Flags: approval1.9+
Igor, what's the bug # for the special API for allocating unrooted double values?
I have not files one.
(In reply to comment #88) > please land ASAP after mochitest js/tests and general > firefox surfing all look good. The trouble is that I was not able to reproduce the mochi test failures. That is why I removed all that weak root avoidance optimizations as the test failure could indicate that some code in the interpreter relied on weak rooting.
Attached patch v14b (obsolete) (deleted) — Splinter Review
Here is a version to check in with that addresses the nits from comment 88: /* - * A GC arena contains fixed number of flag bits for each thing in its heap, + * A GC arena contains a fixed number of flag bits for each thing in its heap, * and supports O(1) lookup of a flag given its thing's address. * @@ -2164,19 +2164,19 @@ TraceDelayedChildren(JSTracer *trc) /* - * We can not use do-while loop here as a->u.untracedThings can be - * zero before the loop as a leftover from the previous iterations. - * See comments after the loop. + * We cannot use do-while loop here as a->u.untracedThings can be zero + * before the loop as a leftover from the previous iterations. See + * comments after the loop. */
Attachment #304206 - Attachment is obsolete: true
Attachment #305339 - Flags: review+
Attachment #305339 - Flags: approval1.9+
(In reply to comment #91) > (In reply to comment #88) > > please land ASAP after mochitest js/tests and general > > firefox surfing all look good. I do not have time to watch tinderbox until Monday.
I cheecked in the patch from comment 92 to the trunk: Checking in js/src/js.msg; /cvsroot/mozilla/js/src/js.msg,v <-- js.msg new revision: 3.90; previous revision: 3.89 done Checking in js/src/jsapi.c; /cvsroot/mozilla/js/src/jsapi.c,v <-- jsapi.c new revision: 3.421; previous revision: 3.420 done Checking in js/src/jsatom.c; /cvsroot/mozilla/js/src/jsatom.c,v <-- jsatom.c new revision: 3.124; previous revision: 3.123 done Checking in js/src/jscntxt.h; /cvsroot/mozilla/js/src/jscntxt.h,v <-- jscntxt.h new revision: 3.195; previous revision: 3.194 done Checking in js/src/jsdate.c; /cvsroot/mozilla/js/src/jsdate.c,v <-- jsdate.c new revision: 3.103; previous revision: 3.102 done Checking in js/src/jsgc.c; /cvsroot/mozilla/js/src/jsgc.c,v <-- jsgc.c new revision: 3.280; previous revision: 3.279 done Checking in js/src/jsgc.h; /cvsroot/mozilla/js/src/jsgc.h,v <-- jsgc.h new revision: 3.104; previous revision: 3.103 done Checking in js/src/jsnum.c; /cvsroot/mozilla/js/src/jsnum.c,v <-- jsnum.c new revision: 3.106; previous revision: 3.105 done Checking in js/src/jsnum.h; /cvsroot/mozilla/js/src/jsnum.h,v <-- jsnum.h new revision: 3.42; previous revision: 3.41 done
Status: REOPENED → RESOLVED
Closed: 17 years ago17 years ago
Resolution: --- → FIXED
I had to back this out. Something's still wrong. I'll see if I can reproduce it. On qm-xserve01 it stopped in the middle of content/base/test/test_bug352728.xhtml (not .html), continues with content/base/test/test_bug353334.html, then skips everything from content/base/test/test_bug401662.html onward. On qm-win2k3-01 we got *** 2142 ERROR FAIL | Test timed out. | | /tests/content/base/test/test_bug397234.html *** 2261 ERROR FAIL | Table right error | got 108, expected 10 | /tests/content/base/test/test_bug414190.html *** 2324 ERROR FAIL | Test timed out. | | /tests/content/base/test/test_bug5141.html On qm-centos5-01 something weird happened: ... *** 1275 INFO Running /tests/content/base/test/test_bug276037-1.html... ... *** 1316 INFO PASS | document.getElementsByTagName('SPAN').length: 6; element count: 6 *** 1317 INFO PASS | ; document.getElementsByTagNameNS('null', 'SPAN').length: 6; element count: 6 The following six lines disappeared: *** 1318 INFO PASS | document.getElementsByTagName('SPAN').length: 6; element count: 6 *** 1319 INFO PASS | ; document.getElementsByTagNameNS('null', 'SPAN').length: 6; element count: 6 *** 1320 INFO PASS | document.getElementsByTagName('SCRIPT').length: 21; element count: 21 *** 1321 INFO PASS | ; document.getElementsByTagNameNS('null', 'SCRIPT').length: 21; element count: 21 *** 1322 INFO PASS | document.getElementsByTagName('SCRIPT').length: 21; element count: 21 *** 1323 INFO PASS | ; document.getElementsByTagNameNS('null', 'SCRIPT').length: 21; element count: 21 as did the following 6 lines: *** 1394 INFO PASS | document.getElementsByTagName('script').length: 10; element count: 10 *** 1395 INFO PASS | ; document.getElementsByTagNameNS('http://www.w3.org/1999/xhtml', 'script').length: 10; element count: 10 *** 1396 INFO PASS | document.getElementsByTagName('script').length: 10; element count: 10 *** 1397 INFO PASS | ; document.getElementsByTagNameNS('http://www.w3.org/1999/xhtml', 'script').length: 10; element count: 10 *** 1398 INFO PASS | document.getElementsByTagName('span').length: 7; element count: 7 *** 1399 INFO PASS | ; document.getElementsByTagNameNS('http://www.w3.org/1999/xhtml', 'span').length: 7; element count: 7 but it seemed to have finished the other tests from that file. Then from content/base/test/test_bug367164.html onward all tests were skipped.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 305339 [details] [diff] [review] v14b >Index: js/src/jsgc.c >=================================================================== >+/* >+ * Macros for the specialized arena for doubles. >+ * Could you maybe add here somewhere that, unlike the regular arenas, the bitmap indexes from the beginning, not from the end? >+#define DOUBLE_ARENA_BITMAP(arena) \ >+ ((uint8 *) arena - DOUBLES_ARENA_BITMAP_SIZE) When I was trying to understand this I got confused by the name |arena|. |arenaInfo| or |info| would've been more informative (though I think there's precedent for calling it |arena|). >+/* >+ * This function must not be called when thing is jsdouble. >+ */ > static uint8 * > GetGCThingFlags(void *thing) > { > JSGCArenaInfo *a; > uint32 index; > > a = THING_TO_ARENA(thing); > index = THING_TO_INDEX(thing, a->list->thingSize); > return THING_FLAGP(a, index); > } > While in the case of thing being a jsdouble a->list would be null and we'd crash, maybe assert it anyway for a clearer message? >+ doubleFlags = rt->gcDoubleArenaList.nextDoubleFlags; >+ for (;;) { >+ if (((jsuword) doubleFlags & GC_ARENA_MASK) == >+ ARENA_INFO_OFFSET) { You could use |if (IS_ARENA_INFO_ADDRESS(doubleFlags)) {| here. >+ /* >+ * We delegate assigning cx->doubleFreeList to js_NewUnrootedDoubleValue as >+ * it immediately consumes the head of the list. >+ */ js_NewDoubleGCThing? >Index: js/src/jsgc.h >=================================================================== >+/* >+ * Return a pointer to a new GC-allocated and weakly rootted jsdouble number Nit: rooted
Attached patch v15 (obsolete) (deleted) — Splinter Review
The new version addresses the comments and uses JS_INLINE functions, not macros, to implement access to double flags. It has more asserts. I plan to add a separated patch that will add explicit checks for JSVAL_TO_ macros that the corresponding GC thing is a valid one. Then I will run the tests with such patch.
Attachment #305339 - Attachment is obsolete: true
Attached patch v14-v15 delta (obsolete) (deleted) — Splinter Review
Attached patch v16 (obsolete) (deleted) — Splinter Review
The new version fixes a bug in IsDoubleBitmapEnd and adds new asserts about a valid arena. The new test passes js test suite in a debug shell.
Attachment #305419 - Attachment is obsolete: true
Attached patch v17 (obsolete) (deleted) — Splinter Review
The new version fixed just introduced bug in IsAboutToBeFinalizedDouble pointed by Peter Annema.
Attachment #305422 - Attachment is obsolete: true
Attached file Test case for the bug (deleted) —
[ Thanks for Peter Annema for finding this in a join IRC session. I owe him his favorite beverage (within a reasonable price tag ;) )] Here is the problem. The number of bits in the occupation bitmap in general is bigger then the number of doubles that can be fit into the arena. Thus the last byte of the bitmap will contain some unused bits. The code in the current patches clears those bits when initializing the whole bitmap with zeros. Now later, when the code searches for free double cells in an arena, it will stop on this last byte assuming that the unset bits indicates the free cells. The bug happens when all cells matching the bits from the last byte are occupied. There is the code that supposed to protect against the bug. But it made the bug significantly worse and harder to debug. The end result is that RefillDoubleFreeList would return null without reporting out-of-memory. It will be propagated as stop asap condition. Depending on where it happens it either stops the script or can be even ignored if the allocation is happened under xpconnect. So here it is. No craches, asserts etc. Just random mochi test suite failures that may not even happen. The attached test case demonstrates the problem. If it does not print OK, then there is a bug.
Attached patch v18 (obsolete) (deleted) — Splinter Review
The patch with a fix. I will test it thoroughly before asking for a review. The patch relies on JS_INLINE for an optimal performance but the fix can be easily backported to v14, the version that was temporary landed. But with JS_INLINE it is so much nicer...
Attached patch v19 (obsolete) (deleted) — Splinter Review
Here is the version that works and which continues to use macros, not inlined functions. Here the changes since v14b, the temporary landed version: 1. The bug is fixed. 2. RefillDoubleFreeList is optimized for the case when converting 8 consequent double cells into the linked list. This is the most common case since the most doubles comes from newly allocated arenas. It was tempting to completely unroll the loop since those 8 doubles is the exactly one cache line, but I delegated that to compiler. 3. Peter Annema observed that the formula calculating DOUBLES_PER_ARENA can be replaced by much simpler expression: #define DOUBLES_PER_ARENA \ ((ARENA_INFO_OFFSET * JS_BITS_PER_BYTE) / (JS_BITS_PER_DOUBLE + 1)) instead of the monstrous: #define DOUBLES_PER_ARENA \ (ARENA_INFO_OFFSET / (JS_BITS_PER_DOUBLE + 1) * JS_BITS_PER_BYTE + \ (JS_MAX(ARENA_INFO_OFFSET % (JS_BITS_PER_DOUBLE + 1), 1) - 1) / \ sizeof(jsdouble)) the new patch uses the new formula and updates the proof.
Attachment #305750 - Flags: review?(brendan)
Attached patch v14b - v19 delta (obsolete) (deleted) — Splinter Review
This is a delta of changes from the last reviewed and temporally landed patch.
Attachment #302135 - Attachment is obsolete: true
Attachment #305420 - Attachment is obsolete: true
Attachment #305485 - Attachment is obsolete: true
Attachment #305586 - Attachment is obsolete: true
Attached patch v19b (obsolete) (deleted) — Splinter Review
The new version improves the proof of the formula for DOUBLES_PER_ARENA based on input by Peter Annema.
Attachment #305750 - Attachment is obsolete: true
Attachment #305751 - Attachment is obsolete: true
Attachment #305759 - Flags: review?(brendan)
Attachment #305750 - Flags: review?(brendan)
Attached patch v14b - v19b delta (obsolete) (deleted) — Splinter Review
Here is the delta from v14b, the previously reviewed patch.
Blocks: 419632
Comment on attachment 305759 [details] [diff] [review] v19b Thanks to jag for great reviewing. >+ * n*s + ceil(n/B) <= M (1) >+ * ceil(n*(B*s + 1)/B) <= M (2) >+ * n <= floor(M*B/(B*s + 1)), (3) >+ * floor(M*B/(B*s + 1)), (4) Align (2). >+#define DOUBLES_PER_ARENA \ >+ ((ARENA_INFO_OFFSET * JS_BITS_PER_BYTE) / (JS_BITS_PER_DOUBLE + 1)) Much better! >+ * Get the start of the bitmap area containing double mark flages in the >+ * arena. To access the flag the code uses s/flages/flags/ and rewrap, it fits: * Get the start of the bitmap area containing double mark flags in the arena. 12345678901234567890123456789012345678901234567890123456789012345678901234567890 r/a=me with nits picked. /be
Attachment #305759 - Flags: review?(brendan)
Attachment #305759 - Flags: review+
Attachment #305759 - Flags: approval1.9+
Attached patch v19c (deleted) — Splinter Review
The new version addresses the nits from the comment 107: --- /home/igor/m/ff/mozilla/quilt.zy1099/js/src/jsgc.c 2008-02-26 20:16:51.000000000 +0100 +++ js/src/jsgc.c 2008-02-26 20:09:00.000000000 +0100 @@ -381,7 +381,7 @@ static JSBool js_gcUseMmap = JS_FALSE; * * n*B*s/B + ceil(n/B) <= M, * ceil(n*B*s/B + n/B) <= M, - * ceil(n*(B*s + 1)/B) <= M (2) + * ceil(n*(B*s + 1)/B) <= M (2) * * We define a helper function e(n, s, B), * @@ -454,8 +454,8 @@ JS_STATIC_ASSERT(UNUSED_DOUBLE_BITMAP_BI JS_ASSERT(!(arenaInfo)->list)) \ /* - * Get the start of the bitmap area containing double mark flages in the - * arena. To access the flag the code uses + * Get the start of the bitmap area containing double mark flags in the arena. + * To access the flag the code uses * * bitmapStart[index / JS_BITS_PER_BYTE] & JS_BIT(index % JS_BITS_PER_BYTE) *
Attachment #305759 - Attachment is obsolete: true
Attachment #305760 - Attachment is obsolete: true
Attachment #305810 - Flags: review+
Attachment #305810 - Flags: approval1.9+
Status: REOPENED → RESOLVED
Closed: 17 years ago17 years ago
Resolution: --- → FIXED
Comment on attachment 305810 [details] [diff] [review] v19c > JSBool > js_UnlockGCThingRT(JSRuntime *rt, void *thing) ... >+ if (--lhe->count != 0) >+ goto out; >+ JS_DHashTableOperate(rt->gcLocksHash, thing, JS_DHASH_REMOVE); > } > > rt->gcPoke = JS_TRUE; >-out: > METER(rt->gcStats.unlock++); >+ out: > JS_UNLOCK_GC(rt); > return JS_TRUE; It looks like this patch causes METER(rt->gcStats.unlock++); not to happen when decrementing an entry in rt->gcLocksHash to a nonzero value; that looks like a bug to me. (That said, not calling METER in the other "goto out" case seems correct.)
(In reply to comment #110) > >-out: > > METER(rt->gcStats.unlock++); > >+ out: > > JS_UNLOCK_GC(rt); > > return JS_TRUE; > > It looks like this patch causes METER(rt->gcStats.unlock++); not to happen when > decrementing an entry in rt->gcLocksHash to a nonzero value; This is intentional as rt->gcStats.unlock counts the number of locked things, not the total sum of lock counters.
Blocks: 422432
Flags: in-testsuite-
Flags: in-litmus-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: