Closed
Bug 1004923
Opened 11 years ago
Closed 11 years ago
Refactor and optimize Windows implementation of PRMJ_Now
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla32
People
(Reporter: jandem, Assigned: jandem)
References
(Blocks 2 open bugs)
Details
Attachments
(12 files)
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
As the numbers in bug 603872 comment 0 indicate, this new Windows 8 API can make "new Date()" more than 4x faster on Windows. This is important for benchmarks like Peacekeeper.
But first I'm going to post some prmjtime.* cleanup patches, with some simple changes the code is a lot easier to understand.
Assignee | ||
Comment 1•11 years ago
|
||
This patch uses "double" instead of "long double", uses C++-style casts, and kills some minor C-isms.
Note that 'long double' is exactly the same as 'double' on Windows:
"Previous 16-bit versions of Microsoft C/C++ and Microsoft Visual C++ supported the long double, 80-bit precision data type. In Win32 programming, however, the long double data type maps to the double, 64-bit precision data type."
http://msdn.microsoft.com/en-us/library/9cx8xs15.aspx
Attachment #8416360 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 2•11 years ago
|
||
Attachment #8416379 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 3•11 years ago
|
||
Rewrite comments to use //, move variables to the point where they are used.
Attachment #8416386 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 4•11 years ago
|
||
The outer do-while loop has condition "needsCalibration", so if we set needsCalibration to false and set returnedTime, we can instead return returnedTime directly.
Attachment #8416391 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 5•11 years ago
|
||
Invert condition to make it easier to read.
Attachment #8416394 -
Flags: review?(n.nethercote)
Updated•11 years ago
|
Attachment #8416360 -
Flags: review?(n.nethercote) → review+
Comment 6•11 years ago
|
||
Comment on attachment 8416379 [details] [diff] [review]
Part 2 - Cleanup NowCalibrate
Review of attachment 8416379 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/prmjtime.cpp
@@ +80,5 @@
> + static const int64_t TimeToEpochIn100ns = 0x19DB1DED53E8000;
> + t -= TimeToEpochIn100ns;
> +
> + // Divide by 10 to convert to microseconds.
> + return double(t) * 0.1;
Did you multiple because fmul is faster than fdiv?
Attachment #8416379 -
Flags: review?(n.nethercote) → review+
Updated•11 years ago
|
Attachment #8416386 -
Flags: review?(n.nethercote) → review+
Comment 7•11 years ago
|
||
Comment on attachment 8416391 [details] [diff] [review]
Part 4 - Add some early returns
Review of attachment 8416391 [details] [diff] [review]:
-----------------------------------------------------------------
What a horrible function.
Attachment #8416391 -
Flags: review?(n.nethercote) → review+
Updated•11 years ago
|
Attachment #8416394 -
Flags: review?(n.nethercote) → review+
Comment 8•11 years ago
|
||
> Did you multiple because fmul is faster than fdiv?
Oh, I see that code existed elsewhere and you just moved it.
Assignee | ||
Comment 9•11 years ago
|
||
According to the documentation [0], QueryPerformanceFrequency never fails on Windows XP or newer and it also returns a non-zero frequency. I googled a bit and I don't see any reason to believe this is false, so I think it's safe to take advantage of it.
This means we can remove another indentation level in PRMJ_Now.
Also: "The frequency of the performance counter is fixed at system boot and is consistent across all processors. Therefore, the frequency need only be queried upon application initialization, and the result can be cached."
I'll optimize this more in a later patch.
[0] http://msdn.microsoft.com/en-us/library/windows/desktop/ms644905%28v=vs.85%29.aspx
Attachment #8416406 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 10•11 years ago
|
||
There's a bug in the original code:
/* On some dual processor/core systems, we might get an earlier time
so we cache the last time that we returned */
calibration.last = js::Max(calibration.last, int64_t(highresTime));
returnedTime = calibration.last;
This value for returnedTime is never returned.. We return either highresTime or lowresTime, or try again.
I thought this was a problem with one of my patches but they just make this a lot more obvious. I think I'll just remove this code for now; I don't feel confident enough to change this (I'm also worried about DST etc changing the clock on purpose)
Assignee | ||
Comment 11•11 years ago
|
||
See the last comment; if we don't use this calibration.last value we may as well remove it. Also, on Windows 8 and newer we want to use a different API function anyway.
Attachment #8416430 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 12•11 years ago
|
||
This patch adds a PRMJ_NowInit function and calls it from JS_Init, so that we can eliminate the PR_CallOnce call in PRMJ_Now.
This patch also moves the QueryPerformanceFrequency call to PRMJ_NowInit, as the frequency is fixed at system boot, so that NowCalibrate doesn't have to check for this.
Attachment #8416436 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 13•11 years ago
|
||
Attachment #8416448 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 14•11 years ago
|
||
The code has two critical sections: data_lock and calibration_lock. Now whenever we take calibration_lock, we also take data_lock, so calibration_lock is redundant.
Attachment #8416462 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 15•11 years ago
|
||
These locks have different spincount, that was probably the reason there was a separate calibration lock (it has spincount 0 so that other threads will sleep immediately, as calibration can take a while).
But we also use MUTEX_SETSPINCOUNT(&calibration.data_lock, 0) on the data_lock before we calibrate (and restore it afterwards), so that basically has the same effect.
Assignee | ||
Comment 16•11 years ago
|
||
We're calling GetSystemTimeAdjustment to determine the kernel tick frequency, but this relies on undocumented behavior:
if (GetSystemTimeAdjustment(&timeAdjustment, &timeIncrement, &timeAdjustmentDisabled)) {
if (timeAdjustmentDisabled)
skewThreshold = timeAdjustment / 10.0;
else
skewThreshold = timeIncrement / 10.0;
}
According to MSDN [0], timeAdjustment and timeIncrement only have meaning if timeAdjustmentDisabled is false. I tested this and I got a 15.6 ms value (and nothing else) in this case, so apparently it works, but it's still undocumented.
GetSystemTimeAdjustment is also super slow: after removing the call the time for a Date.now() micro-benchmark improves from 2750 ms to 712 ms (!) on Windows 7. No wonder new Date has performance problems on Windows (bug 603872).
Furthermore, a stack overflow answer [1] confirms this is not the right function to use here:
"Summary: GetSystemTimeAdjustment is not the function to look at. This function only tells how and if time-changes are done. Depending on the setting of the multimedia timer interface timeBeginPeriod, the progress of time may be done more often and in smaller portions. Use NtQueryTimerResolution to receive the actual time increment."
Unfortunately NtQueryTimerResolution is an undocumented API. This patch instead hardcodes the tick frequency at 15.6 ms, this seems to be the max possible value. It means we'll recalibrate the clock if the highres and lowres timers diverge by more than 30 ms, I think that's fine.
[0] http://msdn.microsoft.com/en-us/library/windows/desktop/ms724394%28v=vs.85%29.aspx
[1] http://stackoverflow.com/a/11743614
Attachment #8416527 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 17•11 years ago
|
||
Use GetSystemTimePreciseAsFileTime if it's available (Windows 8 and newer).
Attachment #8416543 -
Flags: review?(n.nethercote)
Assignee | ||
Comment 18•11 years ago
|
||
Parts 1-5:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e5f2be747968
https://hg.mozilla.org/integration/mozilla-inbound/rev/6c474c23b773
https://hg.mozilla.org/integration/mozilla-inbound/rev/a6a9419d99f2
https://hg.mozilla.org/integration/mozilla-inbound/rev/59b8b6908582
https://hg.mozilla.org/integration/mozilla-inbound/rev/32b0aa8855c8
njn, thanks for the (quick) reviews :)
Keywords: leave-open
Assignee | ||
Updated•11 years ago
|
Summary: Use GetSystemTimePreciseAsFileTime for PRMJ_Now on Windows 8 → Refactor and optimize Windows implementation of PRMJ_Now
Comment 19•11 years ago
|
||
> njn, thanks for the (quick) reviews :)
You got lucky that I had some time free on Friday night. The rest will have to wait until Monday, I'm afraid.
Assignee | ||
Comment 20•11 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #19)
> You got lucky that I had some time free on Friday night. The rest will have
> to wait until Monday, I'm afraid.
Sure, that's fine. I probably won't have time to land them before Monday anyway.
Assignee | ||
Comment 21•11 years ago
|
||
Parts 6-12 are also green on Try: https://tbpl.mozilla.org/?tree=Try&rev=947dc057d061
Comment 22•11 years ago
|
||
Comment 23•11 years ago
|
||
Comment on attachment 8416406 [details] [diff] [review]
Part 6 - QueryPerformanceFrequency never fails on Windows XP+
Review of attachment 8416406 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/prmjtime.cpp
@@ +211,4 @@
> #ifdef JS_THREADSAFE
> PR_CallOnce(&calibrationOnce, NowInit);
> #endif
> + while (true) {
Good -- I was wondering if this loop could become a |while (true)| loop.
Attachment #8416406 -
Flags: review?(n.nethercote) → review+
Comment 24•11 years ago
|
||
Comment on attachment 8416430 [details] [diff] [review]
Part 7 - Kill calibration.last, returnedTime
Review of attachment 8416430 [details] [diff] [review]:
-----------------------------------------------------------------
Can CalibrationData::last be removed now? (Maybe you do that in a later patch.)
Attachment #8416430 -
Flags: review?(n.nethercote) → review+
Updated•11 years ago
|
Attachment #8416436 -
Flags: review?(n.nethercote) → review+
Updated•11 years ago
|
Attachment #8416448 -
Flags: review?(n.nethercote) → review+
Comment 25•11 years ago
|
||
Comment on attachment 8416462 [details] [diff] [review]
Part 10 - Remove calibration lock
Review of attachment 8416462 [details] [diff] [review]:
-----------------------------------------------------------------
r=me, but I'm a bit hazy on this one -- I don't entirely follow how these spinlocks work.
Attachment #8416462 -
Flags: review?(n.nethercote) → review+
Comment 26•11 years ago
|
||
Comment on attachment 8416527 [details] [diff] [review]
Part 11 - Remove GetSystemTimeAdjustment call
Review of attachment 8416527 [details] [diff] [review]:
-----------------------------------------------------------------
Windows... sigh.
> This patch instead hardcodes the tick frequency at 15.6 ms, this seems to be the max
> possible value. It means we'll recalibrate the clock if the highres and lowres timers
> diverge by more than 30 ms, I think that's fine.
Can you put these reasoning into a comment? r=me with that.
Attachment #8416527 -
Flags: review?(n.nethercote) → review+
Updated•11 years ago
|
Attachment #8416543 -
Flags: review?(n.nethercote) → review+
Comment 27•11 years ago
|
||
Code cleanups like this are painful but worthwhile. Thanks for doing it.
Assignee | ||
Comment 28•11 years ago
|
||
Parts 6-10 (will land 11 and 12 in a few hours, to help bisection and analysis of possible perf changes):
https://hg.mozilla.org/integration/mozilla-inbound/rev/607823013467
https://hg.mozilla.org/integration/mozilla-inbound/rev/49af8c91e026
https://hg.mozilla.org/integration/mozilla-inbound/rev/d52cc82aade3
https://hg.mozilla.org/integration/mozilla-inbound/rev/844972bf5334
https://hg.mozilla.org/integration/mozilla-inbound/rev/ba31cd1620bd
(In reply to Nicholas Nethercote [:njn] from comment #24)
> Can CalibrationData::last be removed now? (Maybe you do that in a later
> patch.)
It's removed in this patch :)
(In reply to Nicholas Nethercote [:njn] from comment #26)
> Can you put these reasoning into a comment? r=me with that.
Good idea.
(In reply to Nicholas Nethercote [:njn] from comment #27)
> Code cleanups like this are painful but worthwhile. Thanks for doing it.
No problem; thanks again for the fast reviews.
Assignee | ||
Comment 29•11 years ago
|
||
And 11 + 12:
https://hg.mozilla.org/integration/mozilla-inbound/rev/897c46ceceaf
https://hg.mozilla.org/integration/mozilla-inbound/rev/49957fbf74e9
Keywords: leave-open
Comment 30•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/607823013467
https://hg.mozilla.org/mozilla-central/rev/49af8c91e026
https://hg.mozilla.org/mozilla-central/rev/d52cc82aade3
https://hg.mozilla.org/mozilla-central/rev/844972bf5334
https://hg.mozilla.org/mozilla-central/rev/ba31cd1620bd
https://hg.mozilla.org/mozilla-central/rev/897c46ceceaf
https://hg.mozilla.org/mozilla-central/rev/49957fbf74e9
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
(In reply to Jan de Mooij [:jandem] from comment #16)
> "Summary: GetSystemTimeAdjustment is not the function to look at. This
> function only tells how and if time-changes are done. Depending on the
> setting of the multimedia timer interface timeBeginPeriod, the progress of
> time may be done more often and in smaller portions. Use
> NtQueryTimerResolution to receive the actual time increment."
Note that we do use timeBeginPeriod/timeEndPeriod in the RefreshDriver to get a smaller inteval for high frequency updates; but I don't think that should matter here. Date.now() can be inaccurate, window.performance.now() is the preferred more accurate timing method.
You need to log in
before you can comment on or make changes to this bug.
Description
•