Open
Bug 1028274
Opened 10 years ago
Updated 2 years ago
IonMonkey: Range Analysis is error prone
Categories
(Core :: JavaScript Engine: JIT, defect, P5)
Core
JavaScript Engine: JIT
Tracking
()
NEW
People
(Reporter: nbp, Unassigned)
Details
Attachments
(1 file)
(deleted),
patch
|
Details | Diff | Splinter Review |
We need to add a top-comment which gives a few examples of possible representation of Ranges and which suggests that the code should be reviewed carefully by *at least* 2 peers.
I think everybody who modified this code made mistakes, and quite frequently these modifications caused some critical security issues.
What do you think?
Reporter | ||
Comment 1•10 years ago
|
||
Attachment #8443666 -
Flags: review?(sunfish)
Attachment #8443666 -
Flags: review?(hv1989)
Reporter | ||
Updated•10 years ago
|
Attachment #8443666 -
Flags: feedback?(jwalden+bmo)
Comment 2•10 years ago
|
||
Comment on attachment 8443666 [details] [diff] [review]
Add an overview of Range Analysis Ranges.
Review of attachment 8443666 [details] [diff] [review]:
-----------------------------------------------------------------
This top-level comment can probably replace the old comment starting with "Absolute ranges" currently inside the RangeAnalysis class.
::: js/src/jit/RangeAnalysis.h
@@ +6,5 @@
>
> +// Attention: Any modification to this file should be reviewed carefully by *at
> +// least* 2 peers of the Range Analysis modules.
> +//
> +// Range Analysis is the process which estimates the bounds of every operations.
"Estimates" is not the right word here. Range Analysis is the code which computes bounds and other properties of values.
@@ +18,5 @@
> +// if (x <= 20) {
> +// // x range is [10, 20]
> +// y = x + 5; // y range is [15, 25]
> +// }
> +// }
This section talking about Beta nodes feels a little out of place here. I think it would be good to start by introducing the more basic range analysis concepts first, e.g. how x&255 is always in [0,255] and so on, and then cover beta node insertion as a method for enhancing the range analysis.
@@ +24,5 @@
> +// Range Analysis goal is to replace operations by faster math operations (such
> +// as Integer) as long as we can keep the same semantic as the Double
> +// arithmetic. To do so, Range Analysis is precise on the Int32 range, and and
> +// approximate the Int52 range based with the exponent of the range (52 is
> +// coming from the size of the mantissa of Double).
"to replace operations by faster math operations" sounds like we're replacing non-math with math. If you say "to replace math operations by faster math operations", it's more clear. And it would be nice to elaborate a little, something along the lines of "For example,
@@ +37,5 @@
> +//
> +// true = F or I = false
> +// true = [ lower , upper ] = true
> +// false = ] ? ? [ = false
> +// (< pow(2, max_exponent_ + 1))
Unfortunately, I'm mystified by this diagram. It's not clear to me what either the vertical or horizontal alignment is supposed to convey.
@@ +43,5 @@
> +//
> +// - If an Int32 bound is not set, then we use the max_exponent as a sustitute.
> +//
> +// - The fractional part is always within the lower and upper bound when they exists,
> +// but it is not included in the exponent.
What does "it is not included in the exponent" mean? It sounds like it means something that isn't true; the exponent is always a conservative bound on the actual value, even when it has a fractional part.
@@ +46,5 @@
> +// - The fractional part is always within the lower and upper bound when they exists,
> +// but it is not included in the exponent.
> +//
> +// The max_exponent is not an upper bound, it is the power of 2 of the upper bit
> +// of the absolute value of integral part. For example:
This is confusing because in practice, the purpose of the max_exponent is to imply an upper bound (as well as a lower bound). I suggest describing the max_exponent in terms of the IEEE-754 floating-point representation. It's the maximum possible value of the exponent.
@@ +47,5 @@
> +// but it is not included in the exponent.
> +//
> +// The max_exponent is not an upper bound, it is the power of 2 of the upper bit
> +// of the absolute value of integral part. For example:
> +// max_exponent_ = max(exp(round(max(abs(l), abs(u)))), 0)
Is this C-ish code? If so this should use fmax and fabs instead of max and abs.
Also, this example is not correct. It should use log2 rather than exp. Also, it should round down to the nearest power of 2 (it's possible that's what you meant by round, but it's not clear).
@@ +53,5 @@
> +// INT32_MIN - 1 : max_exponent_ == 31
> +// INT32_MIN : max_exponent_ == 31 (abs(INT32_MIN) == 2 ** 31)
> +// INT32_MIN + 1 : max_exponent_ == 30 (abs(0x80000001) == 0x7fffffff)
> +// -2 : max_exponent_ == 1
> +// -1.5 : max_exponent_ == 0 (round(1.5) == 1 == 2 ^ 0)
You used ** for exponentiation above and ^ here. I kind of prefer ** because even though it's not a C++ operator, it's at least not syntactically the same as a C++ operator that does something different.
@@ +55,5 @@
> +// INT32_MIN + 1 : max_exponent_ == 30 (abs(0x80000001) == 0x7fffffff)
> +// -2 : max_exponent_ == 1
> +// -1.5 : max_exponent_ == 0 (round(1.5) == 1 == 2 ^ 0)
> +// -1 : max_exponent_ == 0
> +// -0.5 : max_exponent_ == 0 (max(-1, 0) == 0)
It's not clear why max(-1, 0) is interesting here.
@@ +67,5 @@
> +// INT32_MAX + 1 : max_exponent_ == 31 (INT32_MAX + 1 == abs(INT32_MIN))
> +//
> +//
> +// The lower_ bound is set (hasInt32LowerBound_) when the range of the value is
> +// above or equal the content of the lower_ bound value. Similarly for the
Nit: "or equal the content" should be "or equal to the content"
@@ +68,5 @@
> +//
> +//
> +// The lower_ bound is set (hasInt32LowerBound_) when the range of the value is
> +// above or equal the content of the lower_ bound value. Similarly for the
> +// upper bound.
It would be good to indicate how NaN is handled here, even if you just say "ordered and above" instead of just "above".
@@ +70,5 @@
> +// The lower_ bound is set (hasInt32LowerBound_) when the range of the value is
> +// above or equal the content of the lower_ bound value. Similarly for the
> +// upper bound.
> +//
> +// When we do not not have bound, we represent it with an open set, and when the
Nit: "not not"
@@ +71,5 @@
> +// above or equal the content of the lower_ bound value. Similarly for the
> +// upper bound.
> +//
> +// When we do not not have bound, we represent it with an open set, and when the
> +// bound exists we represent it with a close set, as it includes the fractional
Nit: "close" should be "closed"
@@ +72,5 @@
> +// upper bound.
> +//
> +// When we do not not have bound, we represent it with an open set, and when the
> +// bound exists we represent it with a close set, as it includes the fractional
> +// part. Constants which are cannot fit in the Int32 range are represented as
Nit: "which are cannot" should be "which cannot"
@@ +75,5 @@
> +// bound exists we represent it with a close set, as it includes the fractional
> +// part. Constants which are cannot fit in the Int32 range are represented as
> +// either being higher than INT32_MAX, or lower than INT32_MIN.
> +//
> +// INT32_MIN - 1 is represented as I ] ? , INT32_MIN] (< pow(2, 31 + 1))
The backwards-brace notation is inconsistent with what Range::print prints. I could be convinced that it's better, but in that case we should change Range::print to do the same thing.
@@ +88,5 @@
> +// INT32_MAX is represented as I [INT32_MAX, INT32_MAX]
> +// INT32_MAX + 1 is represented as I [INT32_MAX, ? [ (< pow(2, 31 + 1))
> +//
> +// Note, that the exponent might be more precise than the range for power of 2
> +// bounds.
Nit: This should also say "For example:" to introduce the next line, which is an example.
@@ +97,5 @@
> +// both infinities:
> +//
> +// NaN is represented as ] ?, ? [ (U inf U NaN)
> +// +Inf is represented as [INT32_MIN, ? [ (U inf)
> +// -Inf is represented as ] ?, INT32_MAX] (U inf)
It might be good to discuss the magic IncludesInfinity and IncludesInfinityAndNaN values here.
@@ +100,5 @@
> +// +Inf is represented as [INT32_MIN, ? [ (U inf)
> +// -Inf is represented as ] ?, INT32_MAX] (U inf)
> +//
> +// Corner cases:
> +// - Any known exponent is bound at infinity. (except for NaN)
I can't tell what this sentence means.
@@ +102,5 @@
> +//
> +// Corner cases:
> +// - Any known exponent is bound at infinity. (except for NaN)
> +//
> +// - An exponent at infinity should not be shrinked (unless truncated)
Nit: The past tense of shrink is shrunk.
At a higher level, I'm not sure what that word means in this context. What situation of shrinking is this?
Attachment #8443666 -
Flags: review?(sunfish)
Comment 3•10 years ago
|
||
Comment on attachment 8443666 [details] [diff] [review]
Add an overview of Range Analysis Ranges.
Review of attachment 8443666 [details] [diff] [review]:
-----------------------------------------------------------------
Since sunfish is requesting a lot of changes. I think it is better to wait to review a more up to date version.
Now it would be really good to have this extra information. Thanks!
Attachment #8443666 -
Flags: review?(hv1989)
Comment 4•10 years ago
|
||
Comment on attachment 8443666 [details] [diff] [review]
Add an overview of Range Analysis Ranges.
Review of attachment 8443666 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/RangeAnalysis.cpp
@@ +6,5 @@
>
> +// Attention: Any modification to this file should be reviewed carefully by *at
> +// least* 2 peers of the Range Analysis modules.
> +//
> +// see the comment in RangeAnalysis.h
Bold. I like this.
Since when is there a range analysis module? I don't disagree, just didn't know there was one!
::: js/src/jit/RangeAnalysis.h
@@ +6,5 @@
>
> +// Attention: Any modification to this file should be reviewed carefully by *at
> +// least* 2 peers of the Range Analysis modules.
> +//
> +// Range Analysis is the process which estimates the bounds of every operations.
Range analysis is the process of determining, for every numeric expression, a range that contains every value that expression might be at runtime.
?
@@ +18,5 @@
> +// if (x <= 20) {
> +// // x range is [10, 20]
> +// y = x + 5; // y range is [15, 25]
> +// }
> +// }
Very, very much agreed about starting with the basic concept first. Beta node issues are *so* not anything that should appear in at least the first half (or more) of any comments.
@@ +22,5 @@
> +// }
> +//
> +// Range Analysis goal is to replace operations by faster math operations (such
> +// as Integer) as long as we can keep the same semantic as the Double
> +// arithmetic. To do so, Range Analysis is precise on the Int32 range, and and
"""
The goal of range analysis is to recognize when an operation may be sped up, without loss of correctness, because its inputs occupy narrower ranges than simply "every possible double" or "every possible int32_t". Such speedups may be achieved by replacing slow operations with faster operations (that might be unsound in general, but which are provably correct in particular circumstances).
"""
@@ +24,5 @@
> +// Range Analysis goal is to replace operations by faster math operations (such
> +// as Integer) as long as we can keep the same semantic as the Double
> +// arithmetic. To do so, Range Analysis is precise on the Int32 range, and and
> +// approximate the Int52 range based with the exponent of the range (52 is
> +// coming from the size of the mantissa of Double).
Too early to talk about exactness, on int32_t or int52_t or any other. Examples first! Then you can dig down into the vagaries of our particular version of it, why they exist, and what they support.
@@ +37,5 @@
> +//
> +// true = F or I = false
> +// true = [ lower , upper ] = true
> +// false = ] ? ? [ = false
> +// (< pow(2, max_exponent_ + 1))
Agreed. I might be able to puzzle out the meaning of this if I stared at it. But that's only because I know roughly what it's talking about already. Not so helpful to someone new, wanting to understand range analysis with less or not background.
@@ +47,5 @@
> +// but it is not included in the exponent.
> +//
> +// The max_exponent is not an upper bound, it is the power of 2 of the upper bit
> +// of the absolute value of integral part. For example:
> +// max_exponent_ = max(exp(round(max(abs(l), abs(u)))), 0)
I'd assume this is pure pseudocode, with abs/max/exp taking their normal meanings. *round*, on the other hand, isn't clear -- should be roundDown, roundToZero, floor, or something else like that.
@@ +49,5 @@
> +// The max_exponent is not an upper bound, it is the power of 2 of the upper bit
> +// of the absolute value of integral part. For example:
> +// max_exponent_ = max(exp(round(max(abs(l), abs(u)))), 0)
> +//
> +// INT32_MIN - 1 : max_exponent_ == 31
In the interests of getting this back to you, since sunfish wants a bunch of changes that'll substantially change this: how many of these esoteric details should be deferred to class-overview documentation for Range specifically, rather than as an overview comment for the entire file? I would keep the range analysis idea/concept to the file overview, and I'd keep particular edge-case details of how Range specifically works in the class overview.
Attachment #8443666 -
Flags: feedback?(jwalden+bmo)
Reporter | ||
Updated•8 years ago
|
Priority: -- → P5
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•