Closed Bug 1154809 Opened 10 years ago Closed 9 years ago

don't use regexps to parse css

Categories

(DevTools :: Inspector, defect)

defect
Not set
normal

Tracking

(firefox40 affected, firefox41 fixed)

RESOLVED FIXED
Firefox 41
Tracking Status
firefox40 --- affected
firefox41 --- fixed

People

(Reporter: tromey, Assigned: tromey)

References

Details

Attachments

(7 files, 35 obsolete files)

(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
(deleted), patch
tromey
: review+
Details | Diff | Splinter Review
Once bug 1153305 lands, we should stop using regexps to parse css.
Instead, we should use the tokenizer (bug 1152033) perhaps in conjunction
with ad hoc parsers for part of the css grammar.
Mostly it is dead code, but there is also CssLogic.prettifyCSS.
And let's not forget output-parser.js
Bug 1120616 is going to add a use of CSS_PROP_RE, so this bug should wait
until that goes in so I can change it while I'm doing the rest.
Depends on: 1120616
Here are some more candidates:
- tokenizeComputedFilters in FilterWidget.js
  http://mxr.mozilla.org/mozilla-central/source/browser/devtools/shared/widgets/FilterWidget.js#680
- isValidTimingFunction in CubicBezierWidget.js (possibly other places in this file)
  http://mxr.mozilla.org/mozilla-central/source/browser/devtools/shared/widgets/CubicBezierWidget.js#822
OS: Linux → All
Hardware: x86_64 → All
While searching for css regex-based parsing in the devtools code, I found 3 regular expressions that aren't used anymore:
const GRADIENT_RE = /\b(repeating-)?(linear|radial)-gradient\(((rgb|hsl)a?\(.+?\)|[^\)])+\)/gi;
const BORDERCOLOR_RE = /^border-[-a-z]*color$/ig;
const BORDER_RE = /^border(-(top|bottom|left|right))?$/ig;
here:
http://mxr.mozilla.org/mozilla-central/source/browser/devtools/shared/widgets/Tooltip.js#37
A patch that removes this code would be nice.
OMG, so much dead code, we have these other 3:

// Used to split on css line separators
const CSS_LINE_RE = /(?:[^;\(]*(?:\([^\)]*?\))?[^;\(]*)*;?/g;

// Used to parse a single property line.
const CSS_PROP_RE = /\s*([^:\s]*)\s*:\s*(.*?)\s*(?:! (important))?;?$/;

// Used to parse an external resource from a property value
const CSS_RESOURCE_RE = /url\([\'\"]?(.*?)[\'\"]?\)/;

in http://mxr.mozilla.org/mozilla-central/source/browser/devtools/styleinspector/rule-view.js#34
CSS_RESOURCE_RE is used but in a function that isn't used (getResourceURI), the other 2 aren't used.
css-color.js is another good candidate with the color parsing regexps: http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/css-color.js
And again, more unused css parsing regexps:
const RX_UNIVERSAL_SELECTOR = /\s*\*\s*/g;
const RX_NOT = /:not\((.*?)\)/g;
const RX_PSEUDO_CLASS_OR_ELT = /(:[\w-]+\().*?\)/g;
const RX_CONNECTORS = /\s*[\s>+~]\s*/g;
const RX_ID = /\s*#\w+\s*/g;
const RX_CLASS_OR_ATTRIBUTE = /\s*(?:\.\w+|\[.+?\])\s*/g;
const RX_PSEUDO = /\s*:?:([\w-]+)(\(?\)?)\s*/g;
http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/styleinspector/css-logic.js#45
Attached patch remove dead css-parsing code (obsolete) (deleted) — Splinter Review
This patch removes the dead code.
Attached patch rewrite prettifyCSS to use tokenizer (obsolete) (deleted) — Splinter Review
This rewrites prettifyCSS to use the tokenizer.

Note that there is still a hack in here: prettifyCSS uses a regexp to
remove HTML comments:

  text = text.replace(/(?:^\s*<!--[\r\n]*)|(?:\s*-->\s*$)/g, "");

The lexer can handle these just fine -- but I left this here because
the contract of this function was not completely clear to me.  In
particular I think the "early" exit case means that any "{" followed
by a newline will abort the reformatting.

I think maybe a better approach here would be to follow the debugger:
display the sources as-is, and give the user a button to explicitly
reformat.

No tests yet, I've just tried it interactively a bit.
Also see: https://bugzilla.mozilla.org/show_bug.cgi?id=971234#c18
The last test there doesn't work with the current code, but ought to
work after the lexer-based rewrite.  The test case is already in the
patch in that bug, just commented out.
Note that prettifyCSS was also rewritten in bug 724505
(In reply to Tom Tromey :tromey from comment #11)
> Also see: https://bugzilla.mozilla.org/show_bug.cgi?id=971234#c18
> The last test there doesn't work with the current code, but ought to
> work after the lexer-based rewrite.  The test case is already in the
> patch in that bug, just commented out.

While updating output-parser.js, I looked into this, and concluded that
it should be handled in a new bug, as I think this requires new code in
OutputParser and/or FilterWidget.
Blocks: 1158288
Attached patch remove dead css-parsing code (obsolete) (deleted) — Splinter Review
This removes just the dead code.
Attachment #8593985 - Attachment is obsolete: true
Attached patch remove last CSS regexp from rule-view.js (obsolete) (deleted) — Splinter Review
This fixes up the last use of CSS_PROP_RE; fallout from bug 1120616.
Attached patch rewrite prettifyCSS to use tokenizer (obsolete) (deleted) — Splinter Review
Update prettifyCSS.  I think this makes the output a bit nicer in some
cases as well.
Attachment #8593987 - Attachment is obsolete: true
Change FilterWidget to use the lexer.
Attached patch rewrite CubicBezierWidget to use CSS tokenizer (obsolete) (deleted) — Splinter Review
Attached patch avoid regexps in css-color (obsolete) (deleted) — Splinter Review
I've attached the WIP patches.  I have not yet run them through the test suite,
though interactively they seem to work ok.  The output-parser.js patch
also depends on bug 1154356 now, to avoid patch conflicts.
Depends on: 1154356
Attached patch remove dead css-parsing code (obsolete) (deleted) — Splinter Review
Attachment #8597967 - Attachment is obsolete: true
Attached patch remove last CSS regexp from rule-view.js (obsolete) (deleted) — Splinter Review
Attachment #8597968 - Attachment is obsolete: true
Attached patch rewrite prettifyCSS to use CSSLexer (obsolete) (deleted) — Splinter Review
Attachment #8597969 - Attachment is obsolete: true
Attachment #8597970 - Attachment is obsolete: true
Attached patch rewrite CubicBezierWidget to use CSSLexer (obsolete) (deleted) — Splinter Review
Attachment #8597971 - Attachment is obsolete: true
Attached patch avoid regexps in css-color (obsolete) (deleted) — Splinter Review
Attachment #8597972 - Attachment is obsolete: true
Attached patch rewrite output-parser.js to use CSSLexer (obsolete) (deleted) — Splinter Review
Attachment #8600019 - Attachment is obsolete: true
Attachment #8597973 - Attachment is obsolete: true
Attachment #8600019 - Attachment is obsolete: false
Attached patch remove dead css-parsing code (obsolete) (deleted) — Splinter Review
Attachment #8600013 - Attachment is obsolete: true
Attachment #8600014 - Attachment is obsolete: true
Attachment #8600015 - Attachment is obsolete: true
Attachment #8600016 - Attachment is obsolete: true
Attachment #8600018 - Attachment is obsolete: true
Attachment #8600019 - Attachment is obsolete: true
Attachment #8600021 - Attachment is obsolete: true
In earlier versions of the patch I made this use parseDeclarations.
However, I no longer think this is the best approach.  This regexp is
now only used for splitting up user input in the search filter; so
instead just simplify the regexp a bit and rename it to make this
clear.
Attached patch rewrite prettifyCSS to use CSSLexer (obsolete) (deleted) — Splinter Review
Attached patch rewrite CubicBezierWidget to use CSSLexer (obsolete) (deleted) — Splinter Review
One subtlety in this patch is the change to the test case.  1.45 is
not exactly representable, and CSS and JS disagree about the
representation.

Since the actual value is unimportant, I changed the test.  However I
also filed bug 1163047 to change the CSS tokenizer to agree with JS.
Attached patch avoid regexps in css-color (obsolete) (deleted) — Splinter Review
Attached patch rewrite output-parser.js to use CSSLexer (obsolete) (deleted) — Splinter Review
This one has been updated a bit from previous iterations to handle the
case where a color-taking function appears in the value.
Attachment #8603500 - Flags: review?(pbrosset)
Attachment #8603502 - Flags: review?(pbrosset)
Attachment #8603503 - Flags: review?(pbrosset)
Attachment #8603504 - Flags: review?(pbrosset)
Attachment #8603505 - Flags: review?(pbrosset)
Attachment #8603506 - Flags: review?(pbrosset)
Attachment #8603508 - Flags: review?(pbrosset)
Attachment #8603500 - Flags: review?(pbrosset) → review+
Comment on attachment 8603502 [details] [diff] [review]
rename and clarify last CSS regexp in rule-view.js

Review of attachment 8603502 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/styleinspector/rule-view.js
@@ +28,5 @@
>  const PROPERTY_NAME_CLASS = "ruleview-propertyname";
>  const FILTER_CHANGED_TIMEOUT = 150;
>  
> +// This is used to parse user input when filtering.
> +const FILTER_PROP_RE = /\s*([^:\s]*)\s*:\s*(.*?)\s*;?$/;

Why did you get rid of the last part of the regexp that deals with !important?
Attachment #8603502 - Flags: review?(pbrosset)
Comment on attachment 8603505 [details] [diff] [review]
rewrite CubicBezierWidget to use CSSLexer

Review of attachment 8603505 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/shared/widgets/CubicBezierWidget.js
@@ +818,2 @@
>   */
> +function parseTimingFunction(value) {

I like the changes in this function, it should be a lot more robust now.
Can you add some unit tests for this function in: /browser/devtools/shared/test/unit/ ?

@@ +822,4 @@
>    }
>  
> +  let tokenStream = DOMUtils.getCSSLexer(value);
> +  let getToken = function() {

Maybe rename this to 'getNextToken' and turn into a fat arrow function:

let getNextToken = () => {
  ...
};

@@ +824,5 @@
> +  let tokenStream = DOMUtils.getCSSLexer(value);
> +  let getToken = function() {
> +    while (true) {
> +      let token = tokenStream.nextToken();
> +      console.log(JSON.stringify(token));

Get rid of this console.log
Attachment #8603505 - Flags: review?(pbrosset) → review+
Comment on attachment 8603508 [details] [diff] [review]
rewrite output-parser.js to use CSSLexer

Review of attachment 8603508 [details] [diff] [review]:
-----------------------------------------------------------------

I wanted to apply the series of patches locally but it wouldn't. Complained about applying this last patch as output-parser.js has changed since you created it.
Can you rebase please? I'm especially interested in testing this one locally.

::: toolkit/devtools/output-parser.js
@@ +21,5 @@
> +                                "-moz-repeating-linear-gradient",
> +                                "radial-gradient",
> +                                "-moz-radial-gradient",
> +                                "repeating-radial-gradient",
> +                                "-moz-repeating-radial-gradient"];

Is this list complete or only contains what we support right now?
It seems to at least lack the drop-shadow function.
Attachment #8603508 - Flags: review?(pbrosset)
Comment on attachment 8603503 [details] [diff] [review]
rewrite prettifyCSS to use CSSLexer

Review of attachment 8603503 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/devtools/styleinspector/css-logic.js
@@ +1002,5 @@
> +  // * After a "{" or ";" symbol, ensure there is a newline and
> +  //   indentation before the next non-comment, non-whitespace token.
> +  // * Additionally after a "{" symbol, increase the indentation.
> +  // * A "}" symbol ensures there is a preceding newline, and
> +  //   decreases the indentation level.

I think one more rule should be: before a "{" symbol, ensure there is a whitespace token.

@@ +1005,5 @@
> +  // * A "}" symbol ensures there is a preceding newline, and
> +  //   decreases the indentation level.
> +  //
> +  // This approach can be confused sometimes, but should do ok on a
> +  // minified file.

When can it be confused? Can you elaborate a bit more?

@@ +1010,4 @@
>    let indent = "";
>    let indentLevel = 0;
> +  let tokens = domUtils.getCSSLexer(text);
> +  let result = '';

nit: double quotes here.

@@ +1019,5 @@
> +    let endIndex = undefined;
> +    let anyNonWS = false;
> +    let isCloseBrace = false;
> +    let token;
> +    while (true) {

These 2 nested while (true) loops may be a little bit scary to the reader.
Could you extract this one into a function before the first loop (just after let pushbackToken = undefined)?
Like:

let readUntilNewLineNeeded = () => {
};

This function should return the token and set the various variables like anyNonWS, isCloseBrace, ... as it does now.

The nice thing is it would make the code in the outer while(true) loop shorter and easier to read.

@@ +1082,5 @@
> +    // Now it is time to insert a newline.  However first we want to
> +    // deal with any trailing comments.  So, we read tokens, looking
> +    // for the next non-comment, non-whitespace token.
> +    pushbackToken = undefined;
> +    while (true) {

Similarly here, it'd be nice to simply be able to see something like this in the code:
token = readUntilComment();

This would, again, make the outer while(true) less daunting.

@@ +1101,5 @@
> +    }
> +
> +    // "Early" bail-out if the text does not appear to be minified.
> +    // This heuristic was copied over from an earlier approach to
> +    // prettifying.

I don't think it helps the reader knowing that this was copied over from a previous version of the code.
Attachment #8603503 - Flags: review?(pbrosset)
Attachment #8603504 - Flags: review?(pbrosset) → review+
Attachment #8603506 - Flags: review?(pbrosset) → review+
(In reply to Patrick Brosset [:pbrosset] [:patrick] from comment #36)

> > +// This is used to parse user input when filtering.
> > +const FILTER_PROP_RE = /\s*([^:\s]*)\s*:\s*(.*?)\s*;?$/;
> 
> Why did you get rid of the last part of the regexp that deals with
> !important?

I did it so it is possible to search for "!important".

The history is that this regexp was in the code before the search filter
was implemented.  It was unused.  It was written this way because,
according to the comments, it was copied from firebug.  Then, the search
filter used it, I believe because it was a simple way to go.

Note that the previous regexp was also incorrect, as it required
a single space in "! important", but CSS allows more forms.

If you think it's important to keep, I suggest changing that part to
"!\s*important"
(In reply to Tom Tromey :tromey from comment #40)
> (In reply to Patrick Brosset [:pbrosset] [:patrick] from comment #36)
> 
> > > +// This is used to parse user input when filtering.
> > > +const FILTER_PROP_RE = /\s*([^:\s]*)\s*:\s*(.*?)\s*;?$/;
> > 
> > Why did you get rid of the last part of the regexp that deals with
> > !important?
> 
> I did it so it is possible to search for "!important".
> 
> The history is that this regexp was in the code before the search filter
> was implemented.  It was unused.  It was written this way because,
> according to the comments, it was copied from firebug.  Then, the search
> filter used it, I believe because it was a simple way to go.
> 
> Note that the previous regexp was also incorrect, as it required
> a single space in "! important", but CSS allows more forms.
> 
> If you think it's important to keep, I suggest changing that part to
> "!\s*important"
No, just needed to know :) Thanks for the explanation, that's very clear now.
Attachment #8603502 - Flags: review+
Attached patch rewrite CubicBezierWidget to use CSSLexer (obsolete) (deleted) — Splinter Review
This addresses the review comments for the CubicBezierWidget patch.
Adding unit tests revealed a couple of missing checks in
parseTimingFunction.  Thanks for asking for this.

I'm going to carry over the r+.
Attachment #8603505 - Attachment is obsolete: true
(In reply to Patrick Brosset [:pbrosset] [:patrick] from comment #38)

> I wanted to apply the series of patches locally but it wouldn't. Complained
> about applying this last patch as output-parser.js has changed since you
> created it.
> Can you rebase please? I'm especially interested in testing this one locally.

You have to apply the patch from bug 971234 first.
Hmm, I thought I mentioned that, but I guess not.  Sorry :(
I had to develop these on a single branch to avoid difficulties
with ordering and conflicts.

> ::: toolkit/devtools/output-parser.js
> @@ +21,5 @@
> > +                                "-moz-repeating-linear-gradient",
> > +                                "radial-gradient",
> > +                                "-moz-radial-gradient",
> > +                                "repeating-radial-gradient",
> > +                                "-moz-repeating-radial-gradient"];
> 
> Is this list complete or only contains what we support right now?
> It seems to at least lack the drop-shadow function.

That's bug 1158288.  It is separate because it requires additional changes.
Attachment #8604121 - Flags: review+
This adds a test for the "!important" case.
Attachment #8603502 - Attachment is obsolete: true
Comment on attachment 8604134 [details] [diff] [review]
rename and clarify last CSS regexp in rule-view.js

On irc Patrick said that carrying over the r+ was fine.
Attachment #8604134 - Flags: review+
(In reply to Patrick Brosset [:pbrosset] [:patrick] from comment #39)

> I think one more rule should be: before a "{" symbol, ensure there is a
> whitespace token.

Ok, I've implemented this, and made the other changes.
I've also added a unit test.

> > +  // This approach can be confused sometimes, but should do ok on a
> > +  // minified file.
> 
> When can it be confused? Can you elaborate a bit more?

If it sees a reason to insert a newline, but then sees a comment,
then it assumes that the comment is meant to go at the end of the
line:

    name: value; /* like this */


But this may not always be true.
This isn't likely to be a problem in minified files.

It does not always replace whitespace -- it only does so in selected
spots.  So, I think you could wind up with weird-looking output if you
pass in some kinds of strangely-formatted input.

Overall I just tried to make this do a reasonable job on "all on
one line" input, and I tried to make it work a bit better than the
status quo ante (as well as I understood it, of course).

I think this should probably all be replaced with something more
like what the debugger does: give the user a prettify button and
never prettify by default.  This would let us remove the bail-out
heuristic.  I think there's another bug for this.

And, to go the full IDE route, we should probably just be plugging
in existing 3rd party CSS formatters, not rolling our own.
Attached patch rewrite prettifyCSS to use CSSLexer (obsolete) (deleted) — Splinter Review
Attachment #8603503 - Attachment is obsolete: true
Comment on attachment 8604241 [details] [diff] [review]
rewrite prettifyCSS to use CSSLexer

This one still needs a review, I believe.
Attachment #8604241 - Flags: review?(pbrosset)
(In reply to Tom Tromey :tromey from comment #43)

> You have to apply the patch from bug 971234 first.

This is in inbound now so it should be simpler.
Comment on attachment 8604241 [details] [diff] [review]
rewrite prettifyCSS to use CSSLexer

Review of attachment 8604241 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good now. Thanks.
Attachment #8604241 - Flags: review?(pbrosset) → review+
Comment on attachment 8603508 [details] [diff] [review]
rewrite output-parser.js to use CSSLexer

Review of attachment 8603508 [details] [diff] [review]:
-----------------------------------------------------------------

I was able to apply the patch series fine now that the other bug has landed on fx-team.
I couldn't find anything wrong with a few manual tests, and I like the code changes.
Attachment #8603508 - Flags: review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ed189498b82e
Attached patch rewrite prettifyCSS to use CSSLexer (obsolete) (deleted) — Splinter Review
The try build pointed out that I missed browser_styleeditor_pretty.js.
It needed adapting for inserting a space before a "{"; and it pointed
out that the "early exit" heuristic is fooled by trailing newlines; so
I added a trim to prettifyCSS.

I also added a unit test for this.

Carrying over the r+, I think the changes are minor.
Attachment #8604241 - Attachment is obsolete: true
Attachment #8604777 - Flags: review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=395c2b754ab7
Attached patch rewrite prettifyCSS to use CSSLexer (obsolete) (deleted) — Splinter Review
Yet another minor fix to prettifyCSS.
The previous fix regressed the style editor; this one takes a more
targeted approach to handling trailing whitespace that can confuse the
minification detection.
Attachment #8604856 - Flags: review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=149fbd4ebc3b
Attached patch remove dead css-parsing code (obsolete) (deleted) — Splinter Review
Rebased.
Attachment #8603500 - Attachment is obsolete: true
Attachment #8603504 - Attachment is obsolete: true
Attachment #8603506 - Attachment is obsolete: true
Attachment #8603508 - Attachment is obsolete: true
Attachment #8604121 - Attachment is obsolete: true
Attachment #8604134 - Attachment is obsolete: true
Attachment #8604777 - Attachment is obsolete: true
Attachment #8604856 - Attachment is obsolete: true
Attached patch rewrite prettifyCSS to use CSSLexer (obsolete) (deleted) — Splinter Review
Attached patch rewrite CubicBezierWidget to use CSSLexer (obsolete) (deleted) — Splinter Review
Attached patch avoid regexps in css-color (obsolete) (deleted) — Splinter Review
Attached patch rewrite output-parser.js to use CSSLexer (obsolete) (deleted) — Splinter Review
Attachment #8605813 - Flags: review+
Attachment #8605815 - Flags: review+
Attachment #8605816 - Flags: review+
Attachment #8605817 - Flags: review+
Attachment #8605818 - Flags: review+
Attachment #8605819 - Flags: review+
Attachment #8605820 - Flags: review+
Keywords: checkin-needed
Backed out for xpcshell failures.
https://treeherder.mozilla.org/logviewer.html#?job_id=3107845&repo=fx-team

Also, please add the reviewer information while you're revising (since you're attaching new patches anyway). It's a pain adding it to each patch when landing.
Backout:
https://hg.mozilla.org/integration/fx-team/rev/cb21fefe169a
Attached patch remove dead css-parsing code (deleted) — Splinter Review
Attachment #8605813 - Attachment is obsolete: true
Attachment #8605815 - Attachment is obsolete: true
Attachment #8605816 - Attachment is obsolete: true
Attachment #8605817 - Attachment is obsolete: true
Attachment #8605818 - Attachment is obsolete: true
Attachment #8605819 - Attachment is obsolete: true
Attachment #8605820 - Attachment is obsolete: true
Fix test_prettifyCSS.js to use the correct newline flavor on input as
well.
Attached patch avoid regexps in css-color (deleted) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2595a9689a0d
Attachment #8607050 - Flags: review+
Attachment #8607051 - Flags: review+
Attachment #8607052 - Flags: review+
Attachment #8607053 - Flags: review+
Attachment #8607054 - Flags: review+
Attachment #8607056 - Flags: review+
Attachment #8607057 - Flags: review+
Keywords: checkin-needed
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: