Closed Bug 563938 Opened 15 years ago Closed 15 years ago

cache DST offsets to improve SunSpider score

Categories

(Core :: JavaScript Engine, defect)
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla1.9.3a5

People

(Reporter: sayrer, Assigned: Waldo)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(6 files, 1 obsolete file)

Everyone else is doing it, so I guess we have to as well. Here's what I got by caching the result of PRMJ_DSTOffset in DaylightSavingTA, from jsdate.cpp:

TEST                  COMPARISON       FROM              TO               DETAILS
=============================================================================
** TOTAL **:          1.67x as fast    61.8ms +/- 2.8%   36.9ms +/- 3.3%
=============================================================================
  date:               1.67x as fast    61.8ms +/- 2.8%   36.9ms +/- 3.3%
    format-xparb:     1.67x as fast    61.8ms +/- 2.8%   36.9ms +/- 3.3%
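The idea, in a minimal sketch (not the actual TraceMonkey patch; `expensiveDSTOffset`, the toy offsets, and the `WINDOW` constant here are all invented stand-ins for the real platform call and its thirty-day probing):

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the expensive platform query (PRMJ_DSTOffset in the real
// code); this toy version says DST adds an hour between "days" 100 and 200.
static int platformCalls = 0;

int64_t expensiveDSTOffset(int64_t t) {
    ++platformCalls;
    return (t >= 100 && t < 200) ? 3600000 : 0;  // offset in milliseconds
}

// Minimal single-range cache: remember one [start, end] interval over which
// the offset is believed constant, and answer queries inside it for free.
struct DSTOffsetCache {
    int64_t rangeStart = 0, rangeEnd = -1;  // empty range to start
    int64_t offset = 0;

    int64_t getDSTOffset(int64_t t) {
        if (rangeStart <= t && t <= rangeEnd)
            return offset;                       // hit: no platform call

        // Miss: probe t and a point WINDOW ahead; if the offset matches at
        // both, assume it is constant over [t, t + WINDOW] and cache that.
        const int64_t WINDOW = 30;               // "thirty days" in the sketch
        int64_t here = expensiveDSTOffset(t);
        int64_t ahead = expensiveDSTOffset(t + WINDOW);
        rangeStart = t;
        rangeEnd = (here == ahead) ? t + WINDOW : t;
        offset = here;
        return offset;
    }
};
```

A real implementation also has to grow and shift the range on near misses and invalidate when the system time zone changes; this only shows the hit/miss skeleton.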
How do we plan to invalidate the cached value over time?
We could do it on each GC, and everything would probably work out OK.
(In reply to comment #2)
> We could do it on each GC, and everything would probably work out OK.

That's not good news for my nuclear missile silo. Could end up skipping over my launch date!
Exactly. Be careful what you do here. jsc has some code that uses a notification from osx. It clears the cache when DST changes.
I would be happy to discuss license terms for our Spidermonkey DST Manager Enterprise Edition product, which features FIPS and DOE/NNSA-Q certification, and is suitable for a wide variety of mission- and life-critical applications.
Google says the WM_TIMECHANGE message will get this progressively more correct as Windows versions advance.
splendid. let's do it. love the speedup. what an awesome benchmark.
Assignee: general → jwalden+bmo
Status: NEW → ASSIGNED
(In reply to comment #4)
> Exactly. Be careful what you do here. jsc has some code that uses a
> notification from osx. It clears the cache when DST changes.

I didn't find that code. It looks like they just clear their cache on every invocation.
I have a patch (several patches, actually) for this; it doesn't seem to move the needle at all at the moment. Whether this is because my caching strategy doesn't anticipate SunSpider's needs or whether it's some other reason (including a broken implementation) remains to be seen. I'll post the first couple deckchair-rearranging patches in a second; the actual win is to be resumed tomorrow morning after my eyes un-dilate a bit...
This is just rearrangement and a constant-rename or two, plus a very slight dose of LL_* macro removal, because those macros are just fugly and hard to read.
Attachment #447442 - Flags: review?(sayrer)
The current code converts milliseconds into microseconds for PRMJ, which then un-converts in the opposite direction. Brilliant! Be clear about what's what at all times, and use the natural dimensionality at the entry point to the cache. Inside it we then use the natural dimensionality for calculating the DST offset, but that's private and thus not a problem.
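One way to illustrate the dimensionality point (these wrapper types are invented for the example, not names from the patch): distinct types make an accidental ms/us mixup a compile error rather than a silent double conversion.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative wrapper types: a Milliseconds value cannot be passed where
// Microseconds is expected, so every conversion must be written explicitly.
struct Milliseconds { int64_t value; };
struct Microseconds { int64_t value; };

Microseconds toMicroseconds(Milliseconds ms) {
    return Microseconds{ms.value * 1000};
}

Milliseconds toMilliseconds(Microseconds us) {
    return Milliseconds{us.value / 1000};
}
```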
Attachment #447443 - Flags: review?(sayrer)
Results with this are as follows:

TEST                  COMPARISON        FROM              TO               DETAILS
=============================================================================
** TOTAL **:          1.180x as fast    94.8ms +/- 0.6%   80.4ms +/- 0.7%   significant
=============================================================================
  date:               1.180x as fast    94.8ms +/- 0.6%   80.4ms +/- 0.7%   significant
    format-tofte:     1.031x as fast    54.1ms +/- 1.0%   52.4ms +/- 0.9%   significant
    format-xparb:     1.46x as fast     40.8ms +/- 0.9%   28.0ms +/- 1.1%   significant

The xparb win isn't as much as in the first comment because 500 of 4000ish calls are cache misses now. Maybe that can be improved still, although claims about the test in the office suggest the only way to do so is to double down on this strategy by, say, expanding the cache range by more than a month at a time (which eventually runs into its own problems of over-zealous or unsound expansion). We'll see; more investigating to do.

The JS test suite passes, but that's unlikely to exercise this caching logic in interesting ways, so I need to write some very targeted tests to address this by checking for cache consistency with reality. There are also a couple other tasks to be completed: changing some comments, maybe adding a couple more; benchmarking the browser proper; seeing what sort of DST caching stats are on actual websites and not just SunSpider, Benchmarketia; and as noted, further investigating the tests' MOs. To be continued tomorrow...
Attachment #447442 - Flags: review?(sayrer) → review+
Attachment #447443 - Flags: review?(sayrer) → review+
Patches 1 and 2 pushed to TM, with a followup bustage fix: http://hg.mozilla.org/tracemonkey/rev/8e8837abeab5 http://hg.mozilla.org/tracemonkey/rev/33b1321c6b8f http://hg.mozilla.org/tracemonkey/rev/aa32a6921eaa Further work yet remains here, of course.
Attached file Stats from running with this a little (deleted) —
I built a browser with this and surfed around for a little bit (some Gmail reading, some Reader reading, some random browsing, some searching on Bing) to see what sort of caching results I got with it -- the result is the attached file, with a little simplification. Most contexts don't even attempt to get the DST offset (65 of 73 contexts), so I removed them entirely. I also cut out all other lines noting 0 instances for brevity.

In summary: most contexts created (65 of 73) never even attempted to get the DST offset. Of the remainder, most pages only attempted a handful of times (1, 7, 4, 1, 5). Most of these hit the cache slightly less to more often than not -- some success, but not enough loss to matter. The 1s never hit, but that's hardly a surprise. The remaining contexts had these stats:

  hit:
    in range: 19
  misses:
    decrease range start: 3
    increase, offset change, expand: 5
    increase, offset change, new range: 7
    decrease, offset change, expand: 35
    decrease, offset change, new range: 8
    large increase: 5
    large decrease: 6
  total: 88

  hit:
    in range: 20
  misses:
    increase range end: 1
    decrease range start: 3
    increase, offset change, expand: 5
    increase, offset change, new range: 7
    decrease, offset change, expand: 35
    decrease, offset change, new range: 8
    large increase: 7
    large decrease: 6
  total: 92

  hit:
    in range: 118
  misses:
    increase range end: 1
    decrease range start: 1
    large increase: 1
  total: 121

The last one obviously benefits quite a bit. The first two, which quite possibly are from the same site performing the same action (I'm hooking in right before context destruction and don't know who's doing what in these), hit a little over 20% of the time -- not great, possibly a speedup.

It's important to note that not all misses are alike. Either a wide miss or a narrow miss at an end of the range that's not within thirty days of a DST change results in one costly offset computation -- exactly what already happens.
So the cost of those misses is basically a few integer arithmetic operations, pretty small. (These are "{in,de}crease range end" and "large {in,de}crease" in the logging.) It's the misses where we compute two costly offsets that are more worrisome. Those happen when a narrow miss occurs but thirty days from that end of the range crosses a DST change. At that point we only know that the offset changed, but since we don't know *where* it changed we have to re-query one more time to get the offset for the originally requested time. Then, since it must equal the offset for the range or for the 30-days-off date, we can start our new range.

So how do we do on the two-lookup count? 55 of 88 and 55 of 92 lookups miss, and miss semi-expensively -- but the miss-expensively case is partially balanced, and overall it's pretty rare.

Other browsing sessions I ran through on various other sites (Digg, Facebook, Twitter, &c.) broadly reflected the overall verdict: most places never compute a single DST offset, some of the ones that do hit the cache nearly every time, some compute so few DST offsets that for them the overall effect is a wash but is already negligible, and in almost every case caching DST offset computations -- at least at the current date -- only very rarely results in doubled computation.

So, overall, the cache is either of negligible value or is hit often enough that it does demonstrate wins -- not a huge amount, but enough to say the optimization is basically worthwhile independent of its being in a benchmark. It's way way way off the beaten path of what the real web does, to be sure, but it has motivation beyond its applicability to a benchmark.
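The narrow-miss logic described above might look roughly like this -- a sketch under invented toy assumptions, not the patch itself (`expensiveDSTOffset` stands in for the platform call, `WINDOW` for the thirty-day probe, and only misses past the range's end are handled; the real code treats the range's start symmetrically):

```cpp
#include <cassert>
#include <cstdint>

static int platformCalls = 0;
const int64_t WINDOW = 30;  // stand-in for "thirty days"

// Toy platform query: DST adds an hour between "days" 100 and 200.
int64_t expensiveDSTOffset(int64_t t) {
    ++platformCalls;
    return (t >= 100 && t < 200) ? 3600000 : 0;
}

struct RangeCache {
    int64_t rangeStart, rangeEnd, offset;

    explicit RangeCache(int64_t t)
      : rangeStart(t), rangeEnd(t), offset(expensiveDSTOffset(t)) {}

    int64_t getDSTOffset(int64_t t) {
        if (rangeStart <= t && t <= rangeEnd)
            return offset;                       // hit: free

        if (t > rangeEnd && t <= rangeEnd + WINDOW) {
            // Narrow miss past the end: probe thirty "days" out.
            int64_t probe = expensiveDSTOffset(rangeEnd + WINDOW);
            if (probe == offset) {
                rangeEnd += WINDOW;              // one lookup, range expands
                return offset;
            }
            // A DST change lies somewhere in the window, but we don't know
            // where, so a second lookup pins down which side t falls on.
            int64_t here = expensiveDSTOffset(t);
            rangeStart = rangeEnd = t;
            offset = here;
            return here;
        }

        // Wide miss (or a miss before the range): start a fresh range.
        offset = expensiveDSTOffset(t);
        rangeStart = rangeEnd = t;
        return offset;
    }
};
```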
This updates the comments and includes some testing, but it may still need a little work. (The test might need to be split up, for example.) Nevertheless it's nearly done.
Okay, this should be good to go. I bumped the cache-testing depth up by one to catch any weird hysteretic bugs that three iterations wouldn't catch, dropping the number of timestamps tested a bit to compensate for overall test-execution time. I also intentionally flipped various conditions in the algorithm to test that the tests would fail from the resulting bugs, so I think there's little danger of the test not being correct and thereby missing a bug.
Attachment #448156 - Attachment is obsolete: true
Attachment #448611 - Flags: review?(sayrer)
Comment on attachment 448611 [details] [diff] [review]
Patch 3, v1: Cache offsets for previously-seen, nearly-the-same dates, test cache behavior

The implementation of purge() seems fairly brittle. I would probably set a mustFillCache flag or something. But up to you.
Attachment #448611 - Flags: review?(sayrer) → review+
http://hg.mozilla.org/tracemonkey/rev/edbf0ecd1d1a Will work on a blog post on this for the benefit of the web-developing world at large, regarding the nature of our perf-winning efforts...
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: fixed-in-tracemonkey
Target Milestone: --- → mozilla1.9.3a5
(In reply to comment #18)
> http://hg.mozilla.org/tracemonkey/rev/edbf0ecd1d1a
>
> Will work on a blog post on this for the benefit of the web-developing world
> at large, regarding the nature of our perf-winning efforts...

Let's save it for when we are winning. We have an unrelated problem... the time tests are kind of slow. Can we add them to the Slow set?
The point wasn't to say we were fast, just to detail the sort of optimizations involved in making SunSpider speedier. But fair enough, there's no particular reason it need or should happen now rather than at some later point, as a retrospective.

As for slow, looking at this now; run in isolation the tests are all fine for me, but run with -j4 they start to be flaky. I'm trying to figure out, at least on my local system, where the crunch point is to see how much time I need to gain back to make this work, which will help with determining which possible solutions make more sense and which make less.
Hm, so -j2 they take 1:40ish each. -j4 they take 3:40ish each. Clearly some sort of parallel-performance cliff for this handful of tests. Still thinking...
Can we add --no-slow-tests or --slow-tests? This is totally killing the usefulness of our test suite for quick sanity checking right before checkin.
This is a no-brainer -- Waldo, you were rightly in favor of kicking the O(n^2) measuring tests out of the default list. This is as bad or worse, in my recent experience. /be
Split up the test into eight parts rather than four. Run with -j4, any time for any individual part that I can get is a good minute under the timeout limit. For an interesting data point, these tests when split into eight, run with -j4, take from 1:05 to 1:29 to run. Running one of those splits manually, in parallel with nothing, takes 8.89s. So *something* about running in parallel means you get a 10x slowdown. Fun times.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Attached patch android-build (deleted) — Splinter Review
changeset http://hg.mozilla.org/mozilla-central/rev/edbf0ecd1d1a broke the build of jscntxt.cpp on the Android platform. Moving the added __STDC_LIMIT_MACROS define before all system includes solves this.
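For context: C99's <stdint.h> gates its limit macros behind __STDC_LIMIT_MACROS when included from (pre-C++11) C++, so the define only takes effect if it precedes the first inclusion of the header anywhere in the translation unit. A minimal illustration of the ordering that the fix enforces (not the patch itself):

```cpp
// The define must precede the *first* inclusion of <stdint.h> in the
// translation unit; defining it after a system header has already pulled
// <stdint.h> in does nothing, because of include guards.
#define __STDC_LIMIT_MACROS
#include <stdint.h>

// With the macro defined in time, the limit macros are visible:
uint32_t maxU32() { return UINT32_MAX; }
int64_t  maxI64() { return INT64_MAX; }
```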
Attachment #450854 - Flags: review?
Comment on attachment 450854 [details] [diff] [review]
android-build

Harry, can you land the patch?
Attachment #450854 - Flags: review? → review+
(In reply to comment #27)
> (From update of attachment 450854 [details] [diff] [review])
> Harry, can you land the patch?

I don't have committer access to the m-c repo.
Keywords: checkin-needed
Re-opened so we don't forget to land the 1-liner.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Followup bug is much better for auditing and bisecting and such. Sayrer already merged into m-c and resolved FIXED. Please file a new bug and re-resolve this one. /be
Bug 571751 will do the rest.
(In reply to comment #31)
> Bug 571751 will do the rest.

Thanks, Harry. /be
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Keywords: checkin-needed