Closed Bug 563938 Opened 15 years ago Closed 15 years ago

cache DST offsets to improve SunSpider score

Categories

(Core :: JavaScript Engine, defect)
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla1.9.3a5

People

(Reporter: sayrer, Assigned: Waldo)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(6 files, 1 obsolete file)

Everyone else is doing it, so I guess we have to as well. Here's what I got by caching the result of PRMJ_DSTOffset in DaylightSavingTA, from jsdate.cpp:

TEST                  COMPARISON       FROM              TO               DETAILS
=============================================================================
** TOTAL **:          1.67x as fast    61.8ms +/- 2.8%   36.9ms +/- 3.3%
=============================================================================
  date:               1.67x as fast    61.8ms +/- 2.8%   36.9ms +/- 3.3%
    format-xparb:     1.67x as fast    61.8ms +/- 2.8%   36.9ms +/- 3.3%
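The idea, in a minimal sketch (not the actual TraceMonkey patch; `expensiveDSTOffset`, the toy offsets, and the `WINDOW` constant here are all invented stand-ins for the real platform call and its thirty-day probing):

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the expensive platform query (PRMJ_DSTOffset in the real
// code); this toy version says DST adds an hour between "days" 100 and 200.
static int platformCalls = 0;

int64_t expensiveDSTOffset(int64_t t) {
    ++platformCalls;
    return (t >= 100 && t < 200) ? 3600000 : 0;  // offset in milliseconds
}

// Minimal single-range cache: remember one [start, end] interval over which
// the offset is believed constant, and answer queries inside it for free.
struct DSTOffsetCache {
    int64_t rangeStart = 0, rangeEnd = -1;  // empty range to start
    int64_t offset = 0;

    int64_t getDSTOffset(int64_t t) {
        if (rangeStart <= t && t <= rangeEnd)
            return offset;                       // hit: no platform call

        // Miss: probe t and a point WINDOW ahead; if the offset matches at
        // both, assume it is constant over [t, t + WINDOW] and cache that.
        const int64_t WINDOW = 30;               // "thirty days" in the sketch
        int64_t here = expensiveDSTOffset(t);
        int64_t ahead = expensiveDSTOffset(t + WINDOW);
        rangeStart = t;
        rangeEnd = (here == ahead) ? t + WINDOW : t;
        offset = here;
        return offset;
    }
};
```

A real implementation also has to grow and shift the range on near misses and invalidate when the system time zone changes; this only shows the hit/miss skeleton.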
How do we plan to invalidate the cached value over time?
We could do it on each GC, and everything would probably work out OK.
(In reply to comment #2)
> We could do it on each GC, and everything would probably work out OK.

That's not good news for my nuclear missile silo. Could end up skipping over my launch date!
Exactly. Be careful what you do here. jsc has some code that uses a notification from osx. It clears the cache when DST changes.
I would be happy to discuss license terms for our Spidermonkey DST Manager Enterprise Edition product, which features FIPS and DOE/NNSA-Q certification, and is suitable for a wide variety of mission- and life-critical applications.
Google says the WM_TIMECHANGE message will get this progressively more correct as Windows versions advance.
splendid. let's do it. love the speedup. what an awesome benchmark.
Assignee: general → jwalden+bmo
Status: NEW → ASSIGNED
(In reply to comment #4)
> Exactly. Be careful what you do here. jsc has some code that uses a
> notification from osx. It clears the cache when DST changes.

I didn't find that code. It looks like they just clear their cache on every invocation.
I have a patch (several patches, actually) for this; it doesn't seem to move the needle at all at the moment. Whether this is because my caching strategy doesn't anticipate SunSpider's needs or whether it's some other reason (including a broken implementation) remains to be seen. I'll post the first couple deckchair-rearranging patches in a second; the actual win is to be resumed tomorrow morning after my eyes un-dilate a bit...
This is just rearrangement and a constant-rename or two, plus a very slight dose of LL_* macro removal, because those macros are just fugly and hard to read.
Attachment #447442 - Flags: review?(sayrer)
The current code converts milliseconds into microseconds for PRMJ, which then un-converts in the opposite direction. Brilliant! Be clear about what's what at all times, and use the natural dimensionality at the entry point to the cache. Inside it we then use the natural dimensionality for calculating the DST offset, but that's private and thus not a problem.
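One way to illustrate the dimensionality point (these wrapper types are invented for the example, not names from the patch): distinct types make an accidental ms/us mixup a compile error rather than a silent double conversion.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative wrapper types: a Milliseconds value cannot be passed where
// Microseconds is expected, so every conversion must be written explicitly.
struct Milliseconds { int64_t value; };
struct Microseconds { int64_t value; };

Microseconds toMicroseconds(Milliseconds ms) {
    return Microseconds{ms.value * 1000};
}

Milliseconds toMilliseconds(Microseconds us) {
    return Milliseconds{us.value / 1000};
}
```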
Attachment #447443 - Flags: review?(sayrer)
Results with this are as follows:

TEST                  COMPARISON        FROM              TO               DETAILS
=============================================================================
** TOTAL **:          1.180x as fast    94.8ms +/- 0.6%   80.4ms +/- 0.7%   significant
=============================================================================
  date:               1.180x as fast    94.8ms +/- 0.6%   80.4ms +/- 0.7%   significant
    format-tofte:     1.031x as fast    54.1ms +/- 1.0%   52.4ms +/- 0.9%   significant
    format-xparb:     1.46x as fast     40.8ms +/- 0.9%   28.0ms +/- 1.1%   significant

The xparb win isn't as much as in the first comment because 500 of 4000ish calls are cache misses now. Maybe that can be improved still, although claims about the test in the office suggest the only way to do so is to double down on this strategy by, say, expanding the cache range by more than a month at a time (which eventually runs into its own problems of over-zealous or unsound expansion). We'll see; more investigating to do.

The JS test suite passes, but that's unlikely to exercise this caching logic in interesting ways, so I need to write some very targeted tests to address this by checking for cache consistency with reality. There are also a couple other tasks to be completed: changing some comments, maybe adding a couple more; benchmarking the browser proper; seeing what sort of DST caching stats are on actual websites and not just SunSpider, Benchmarketia; and as noted, further investigating the tests' MOs. To be continued tomorrow...
Attachment #447442 - Flags: review?(sayrer) → review+
Attachment #447443 - Flags: review?(sayrer) → review+
Patches 1 and 2 pushed to TM, with a followup bustage fix: http://hg.mozilla.org/tracemonkey/rev/8e8837abeab5 http://hg.mozilla.org/tracemonkey/rev/33b1321c6b8f http://hg.mozilla.org/tracemonkey/rev/aa32a6921eaa Further work yet remains here, of course.
Attached file Stats from running with this a little (deleted) —
I built a browser with this and surfed around for a little bit (some Gmail reading, some Reader reading, some random browsing, some searching on Bing) to see what sort of caching results I got with it -- the result is the attached file, with a little simplification. Most contexts don't even attempt to get the DST offset (65 of 73 contexts), so I removed them entirely. I also cut out all other lines noting 0 instances for brevity.

In summary: most contexts created (65 of 73) never even attempted to get the DST offset. Of the remainder, most pages only attempted a handful of times (1, 7, 4, 1, 5). Most of these hit the cache slightly less to more often than not -- some success, but not enough loss to matter. The 1s never hit, but that's hardly a surprise. The remaining contexts had these stats:

  hit:
    in range: 19
  misses:
    decrease range start: 3
    increase, offset change, expand: 5
    increase, offset change, new range: 7
    decrease, offset change, expand: 35
    decrease, offset change, new range: 8
    large increase: 5
    large decrease: 6
  total: 88

  hit:
    in range: 20
  misses:
    increase range end: 1
    decrease range start: 3
    increase, offset change, expand: 5
    increase, offset change, new range: 7
    decrease, offset change, expand: 35
    decrease, offset change, new range: 8
    large increase: 7
    large decrease: 6
  total: 92

  hit:
    in range: 118
  misses:
    increase range end: 1
    decrease range start: 1
    large increase: 1
  total: 121

The last one obviously benefits quite a bit. The first two, which quite possibly are from the same site performing the same action (I'm hooking in right before context destruction and don't know who's doing what in these), hit a little over 20% of the time -- not great, possibly a speedup.

It's important to note that not all misses are alike. Either a wide miss or a narrow miss at an end of the range that's not within thirty days of a DST change results in one costly offset computation -- exactly what already happens.
So the cost of those misses is basically a few integer arithmetic operations, pretty small. (These are "{in,de}crease range end" and "large {in,de}crease" in the logging.) It's the misses where we compute two costly offsets that are more worrisome. Those happen when a narrow miss occurs but thirty days from that end of the range crosses a DST change. At that point we only know that the offset changed, but since we don't know *where* it changed we have to re-query one more time to get the offset for the originally requested time. Then, since it must equal the offset for the range or for the 30-days-off date, we can start our new range.

So how do we do on the two-lookup count? 55 of 88 and 55 of 92 lookups miss, and miss semi-expensively -- but the miss-expensively case is partially balanced, and overall it's pretty rare.

Other browsing sessions I ran through on various other sites (Digg, Facebook, Twitter, &c.) broadly reflected the overall verdict: most places never compute a single DST offset, some of the ones that do hit the cache nearly every time, some compute so few DST offsets that for them the overall effect is a wash but is already negligible, and in almost every case caching DST offset computations -- at least at the current date -- only very rarely results in doubled computation.

So, overall, the cache is either of negligible value or is hit often enough that it does demonstrate wins -- not a huge amount, but enough to say the optimization is basically worthwhile independent of its being in a benchmark. It's way way way off the beaten path of what the real web does, to be sure, but it has motivation beyond its applicability to a benchmark.
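The narrow-miss logic described above might look roughly like this -- a sketch under invented toy assumptions, not the patch itself (`expensiveDSTOffset` stands in for the platform call, `WINDOW` for the thirty-day probe, and only misses past the range's end are handled; the real code treats the range's start symmetrically):

```cpp
#include <cassert>
#include <cstdint>

static int platformCalls = 0;
const int64_t WINDOW = 30;  // stand-in for "thirty days"

// Toy platform query: DST adds an hour between "days" 100 and 200.
int64_t expensiveDSTOffset(int64_t t) {
    ++platformCalls;
    return (t >= 100 && t < 200) ? 3600000 : 0;
}

struct RangeCache {
    int64_t rangeStart, rangeEnd, offset;

    explicit RangeCache(int64_t t)
      : rangeStart(t), rangeEnd(t), offset(expensiveDSTOffset(t)) {}

    int64_t getDSTOffset(int64_t t) {
        if (rangeStart <= t && t <= rangeEnd)
            return offset;                       // hit: free

        if (t > rangeEnd && t <= rangeEnd + WINDOW) {
            // Narrow miss past the end: probe thirty "days" out.
            int64_t probe = expensiveDSTOffset(rangeEnd + WINDOW);
            if (probe == offset) {
                rangeEnd += WINDOW;              // one lookup, range expands
                return offset;
            }
            // A DST change lies somewhere in the window, but we don't know
            // where, so a second lookup pins down which side t falls on.
            int64_t here = expensiveDSTOffset(t);
            rangeStart = rangeEnd = t;
            offset = here;
            return here;
        }

        // Wide miss (or a miss before the range): start a fresh range.
        offset = expensiveDSTOffset(t);
        rangeStart = rangeEnd = t;
        return offset;
    }
};
```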
This updates the comments and includes some testing, but it may still need a little work. (The test might need to be split up, for example.) Nevertheless it's nearly done.
Okay, this should be good to go. I bumped the cache-testing depth up by one to catch any weird hysteretic bugs that three iterations wouldn't catch, dropping the number of timestamps tested a bit to compensate for overall test-execution time. I also intentionally flipped various conditions in the algorithm to test that the tests would fail from the resulting bugs, so I think there's little danger of the test not being correct and thereby missing a bug.
Attachment #448156 - Attachment is obsolete: true
Attachment #448611 - Flags: review?(sayrer)
Comment on attachment 448611 [details] [diff] [review]
Patch 3, v1: Cache offsets for previously-seen, nearly-the-same dates, test cache behavior

The implementation of purge() seems fairly brittle. I would probably set a mustFillCache flag or something. But up to you.
Attachment #448611 - Flags: review?(sayrer) → review+
http://hg.mozilla.org/tracemonkey/rev/edbf0ecd1d1a Will work on a blog post on this for the benefit of the web-developing world at large, regarding the nature of our perf-winning efforts...
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: fixed-in-tracemonkey
Target Milestone: --- → mozilla1.9.3a5
(In reply to comment #18)
> http://hg.mozilla.org/tracemonkey/rev/edbf0ecd1d1a
>
> Will work on a blog post on this for the benefit of the web-developing world
> at large, regarding the nature of our perf-winning efforts...

Let's save it for when we are winning. We have an unrelated problem... the time tests are kind of slow. Can we add them to the Slow set?
The point wasn't to say we were fast, just to detail the sort of optimizations involved in making SunSpider speedier. But fair enough, there's no particular reason it need or should happen now rather than at some later point, as a retrospective.

As for slow, looking at this now; run in isolation the tests are all fine for me, but run with -j4 they start to be flaky. I'm trying to figure out, at least on my local system, where the crunch point is to see how much time I need to gain back to make this work, which will help with determining which possible solutions make more sense and which make less.
Hm, so -j2 they take 1:40ish each. -j4 they take 3:40ish each. Clearly some sort of parallel-performance cliff for this handful of tests. Still thinking...
Can we add --no-slow-tests or --slow-tests? This is totally killing the usefulness of our test suite for quick sanity checking right before checkin.
This is a no-brainer -- Waldo, you were rightly in favor of kicking the O(n^2) measuring tests out of the default list. This is as bad or worse, in my recent experience. /be
Split up the test into eight parts rather than four. Run with -j4, any time for any individual part that I can get is a good minute under the timeout limit. For an interesting data point, these tests when split into eight, run with -j4, take from 1:05 to 1:29 to run. Running one of those splits manually, in parallel with nothing, takes 8.89s. So *something* about running in parallel means you get a 10x slowdown. Fun times.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Attached patch android-build (deleted) — Splinter Review
changeset http://hg.mozilla.org/mozilla-central/rev/edbf0ecd1d1a broke the build of jscntxt.cpp on the Android platform. Moving the added __STDC_LIMIT_MACROS define before all system includes solves this.
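For context: C99's <stdint.h> gates its limit macros behind __STDC_LIMIT_MACROS when included from (pre-C++11) C++, so the define only takes effect if it precedes the first inclusion of the header anywhere in the translation unit. A minimal illustration of the ordering that the fix enforces (not the patch itself):

```cpp
// The define must precede the *first* inclusion of <stdint.h> in the
// translation unit; defining it after a system header has already pulled
// <stdint.h> in does nothing, because of include guards.
#define __STDC_LIMIT_MACROS
#include <stdint.h>

// With the macro defined in time, the limit macros are visible:
uint32_t maxU32() { return UINT32_MAX; }
int64_t  maxI64() { return INT64_MAX; }
```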
Attachment #450854 - Flags: review?
Comment on attachment 450854 [details] [diff] [review]
android-build

Harry, can you land the patch?
Attachment #450854 - Flags: review? → review+
(In reply to comment #27)
> (From update of attachment 450854 [details] [diff] [review])
> Harry, can you land the patch?

I don't have committer access to the m-c repo.
Keywords: checkin-needed
Re-opened so we don't forget to land the 1-liner.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Followup bug is much better for auditing and bisecting and such. Sayrer already merged into m-c and resolved FIXED. Please file a new bug and re-resolve this one. /be
Bug 571751 will do the rest.
(In reply to comment #31)
> Bug 571751 will do the rest.

Thanks, Harry. /be
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Keywords: checkin-needed