Closed Bug 1204027 Opened 9 years ago Closed 9 years ago

Escape sequences are not allowed in reserved words

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: anba, Assigned: Waldo)

References

Details

Attachments

(2 files, 1 obsolete file)

Test cases: --- js> (class {constructor(){} st\u0061tic m(){return 0}}).m js> for (var foo o\u0066 [1]) ; js> ({g\u0065t x(){return 0}}).x js> function f() { return n\u0065w.target } js> function f() { return new.t\u0061rget } js> function f() { return n\u0065w Array } js> \u0064o { } while (0) --- Expected: Throws SyntaxError Actual: No SyntaxError ES2015, 5.1.5 Grammar Notation: > Terminal symbols of the lexical, RegExp, and numeric string grammars are shown in fixed width font, both > in the productions of the grammars and throughout this specification whenever the text directly refers to such a > terminal symbol. These are to appear in a script exactly as written. ES2015, 11.6.2 Reserved Words > NOTE The ReservedWord definitions are specified as literal sequences of specific SourceCharacter elements. A code > point in a ReservedWord cannot be expressed by a \ UnicodeEscapeSequence.
I don't think this demonstrates a requirement to throw an exception. If anything does, it's the wording in 11.6, but even that I'm not convinced requires an error. IIRC, the spec language has been the same here for a long time, at least since ES3. My memory is that Java and JS differ: In Java, a unicode escape sequence is replaced by the code point before keyword determination is done, whereas in JS a unicode escape sequence precludes the token being a keyword. Thus in JS, n\u0065w is an identifier. It is clearly a valid IdentifierName, and by 11.6.2 an Identifier is an IdentifierName that is not a Reserved Word, and clearly the escape precludes it being a Reserved Word.
(In reply to Lars T Hansen [:lth] from comment #1) > Thus in JS, n\u0065w is an identifier. It is clearly a valid > IdentifierName, and by 11.6.2 an Identifier is an IdentifierName that is not > a Reserved Word, and clearly the escape precludes it being a Reserved Word. Thanks for reminding me about this case! n\u0065w is an IdentifierName, but not an Identifier because of 12.1.1: > 12.1.1 Static Semantics: Early Errors > Identifier : IdentifierName but not ReservedWord > - It is a Syntax Error if StringValue of IdentifierName is the same String value as the StringValue of any > ReservedWord except for yield.
(In reply to André Bargull from comment #2) > n\u0065w is an IdentifierName, but not an Identifier because of 12.1.1: > > > 12.1.1 Static Semantics: Early Errors > > Identifier : IdentifierName but not ReservedWord > > - It is a Syntax Error if StringValue of IdentifierName is the same String value as the StringValue of any > > ReservedWord except for yield. Ah ok. That clause seems to be new in ES6. Strange to add a backward incompatibility for what is essentially an obscure corner case - maybe a major engine already had the restriction?
(In reply to Lars T Hansen [:lth] from comment #3) > Ah ok. That clause seems to be new in ES6. Strange to add a backward > incompatibility for what is essentially an obscure corner case - maybe a > major engine already had the restriction? It changed back and forth when comparing ES3, ES5 and ES6. ES3 disallowed to create identifiers whose name is a reserved word (ES3, 7.6), ES5 allowed them (ES5, 7.6) and ES6 reverted that again. IE never implemented the ES5 identifier syntax updates (bug 744784, comment 7), that's why TC39 probably considered it safe to make this change [1]. [1] https://github.com/tc39/tc39-notes/blob/master/es6/2013-11/nov-20.md#42-clarification-of-the-interaction-of-unicode-escapes-and-identification-syntax
If js> ({g\u0065t x(){return 0}}).x is supposed to be an error, is js> ({g\u0065t: 0}) supposed to *not* be an error? I suspect so, but that's going to be a bit ugh to deal with, because it means we can't just put the requirement here in the tokenizer, in a single place that deals with everything. :-\
Attached patch Patch (obsolete) (deleted) — Splinter Review
Attachment #8660915 - Flags: review?(arai.unmht)
Assignee: nobody → jwalden+bmo
Status: NEW → ASSIGNED
Comment on attachment 8660915 [details] [diff] [review] Patch Review of attachment 8660915 [details] [diff] [review]: ----------------------------------------------------------------- Changes to ReservedWord looks good. It might be better to use another error message for non-ReservedWord, like |as|, |of|, etc, since it's an SyntaxError for different reason. https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/js/src/frontend/Parser.cpp#8106 > Parser<ParseHandler>::comprehensionFor(GeneratorKind comprehensionKind) > { > ... > if (!tokenStream.matchContextualKeyword(&matched, context->names().of)) This is out of ES6 spec tho, might be better to call checkUnescapedName here too? https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/js/src/frontend/Parser.cpp#4822 > Parser<FullParseHandler>::exportDeclaration() > { > ... > if (tt == TOK_NAME && tokenStream.currentName() == context->names().from) { We should check escapeness here and if it's escaped, we should treat it as not-|from|. I expect following returns 2 statements. Reflect.parse("export { x }\nfro\\u006D", {target:"module"}); ::: js/src/frontend/TokenStream.cpp @@ +1226,5 @@ > } > > + if (const KeywordInfo* kw = FindKeyword(chars, length)) { > + // Represent keywords as keyword tokens unless told otherwise. > + if (modifier != KeywordIsName) { Maybe we could swap above 2 |if|s? ::: js/src/tests/ecma_6/extensions/keyword-unescaped-requirement-modules.js @@ +23,5 @@ > +function memberVariants(code) > +{ > + return [classesEnabled() ? "(class { constructor() {} " + code + " });" : "@", > + "({ " + code + " })"]; > +} classSyntax and memberVariants could be removed. @@ +31,5 @@ > + "\\u0069mport f from g", > + "i\\u006dport g from h", > + "import * \\u0061s foo", > + "import {} fro\\u006d bar", > + "import { x \\u0061s y } from baz", token after |from| should be string literal. @@ +35,5 @@ > + "import { x \\u0061s y } from baz", > + > + "\\u0065xport function f() {}", > + "e\\u0078port function g() {}", > + "export * fro\\u006d fnord", This is syntax error because of |fnord|, not |from| now. it should be a string literal.
Attachment #8660915 - Flags: review?(arai.unmht) → feedback+
FromClause handling in the different export rules varies, so we should split up export * from export { as far as parsing goes.
Attachment #8661738 - Flags: review?(arai.unmht)
Attached patch Updated (deleted) — Splinter Review
It seems to me that as/of and so on are keywords in context, so the error message is quite fine. (Plus if I wanted to change that, I'd have to thread a message number through stuff that's just muddied up by having it.) The point about the out-of-spec bits suggested to me that matchContextualKeyword itself should check for escapes. I made that change, and when I noticed this newly-powerful method could then be used a bunch of places in the existing code, I went and made those changes. Braced-export from-handling is so finicky. The swapped ifs made sense in an earlier iteration of this patch. ;-) This is one reason we have reviewing, to pick up the occasional forgotten gunk. I adjusted the |from "str"| bits everywhere. Too bad without super-finicky offset testing you can't detect that sort of error-in-wrong-place issue.
Attachment #8661742 - Flags: review?(arai.unmht)
Attachment #8660915 - Attachment is obsolete: true
Comment on attachment 8661738 [details] [diff] [review] Rejigger export-parsing code to make a subsequent change simpler Review of attachment 8661738 [details] [diff] [review]: ----------------------------------------------------------------- Totally reasonable change :)
Attachment #8661738 - Flags: review?(arai.unmht) → review+
Comment on attachment 8661742 [details] [diff] [review] Updated Review of attachment 8661742 [details] [diff] [review]: ----------------------------------------------------------------- Thank you for updating! r=me with minor fixes. (In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #9) > It seems to me that as/of and so on are keywords in context, so the error > message is quite fine. (Plus if I wanted to change that, I'd have to thread > a message number through stuff that's just muddied up by having it.) I see :) ::: js/src/frontend/Parser.cpp @@ +4858,5 @@ > } else { > tokenStream.ungetToken(); > } > > + if (!MatchOrInsertSemicolon(tokenStream, TokenStream::Operand)) Thank you for fixing this :D ::: js/src/tests/ecma_6/Syntax/keyword-unescaped-requirement.js @@ +49,5 @@ > + ...memberVariants("s\\u0065t 42() {}"), > + "for (var foo o\\u0066 [1]) ;", > + "for (var foo \\u006ff [1]) ;", > + "for (var foo i\\u006e [1]) ;", > + "for (var foo \\u006e9n [1]) ;", s/e// ::: js/src/tests/ecma_6/extensions/keyword-unescaped-requirement-modules.js @@ +26,5 @@ > + "\\u0065xport function f() {}", > + "e\\u0078port function g() {}", > + "export * fro\\u006d 'fnord'", > + "export d\\u0065fault var x = 3;", > + "export { q } fro\\u006d 'qSupplier' };", would be better to remove " };" part, to make sure the syntax error is thrown by |fro\\u006d|. @@ +74,5 @@ > + > + // Soon to be not an extension, maybe... > + "(for (x \\u006ff [1]) x)", > + "(for (x o\\u0066 [1]) x)", > + ]; I don't think this file is the right place to test them. keyword-unescaped-requirement.js or a new file would be better.
Attachment #8661742 - Flags: review?(arai.unmht) → review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: