Closed
Bug 1494069
Opened 6 years ago
Closed 6 years ago
[mozlint] Be more explicit with arguments and return values in `filterpaths`
Categories
(Developer Infrastructure :: Lint and Formatting, enhancement)
Developer Infrastructure
Lint and Formatting
Tracking
(firefox64 fixed)
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: ahal, Assigned: ahal)
References
Details
Attachments
(8 files, 1 obsolete file)
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details |
Currently the pathutils.filterpaths function accepts a linter definition and lintargs. It then extracts the values out of those that it needs. It can also modify these dictionaries in-place sometimes.
This can lead to surprising behaviour and make debugging more difficult. I'd like to change it so that we need to explicitly pass in only the values that it needs to perform the filtering. This way not only will it be easier to keep track of, but we won't accidentally modify any references in-place. Instead, we'll need to explicitly return the values that the greater mozlint module expects.
This will give us clear insight into the inputs and outputs of this function, in a sense making it more "dumb".
Assignee | ||
Comment 1•6 years ago
|
||
Scandir is a faster implementation of os.listdir since it caches file metadata
as it works. Using listdir and then calling things like os.path.isfile after
the fact will result in multiple system calls. Whereas with scandir, there will
only be a single system call per file.
Scandir is part of the stdlib in Python 3.6+, so the following can be used for
Python 2/3 compatible code:
try:
from os import scandir, walk
except ImportError:
from scandir import scandir, walk
Assignee | ||
Comment 2•6 years ago
|
||
Often we specify globs in our exclude patterns, e.g:
exclude:
- **/node_modules
- obj*
However, these globs get expanded out to *every* file that matches them. This
can sometimes be thousands or even tens of thousands of files.
We then pass these paths on to the underlying linters and tell them to
exclude them all. This causes a lot of overhead and slows down performance.
This commit implements a "collapse" function. Given a set of paths, it'll
collapse them into the smallest set of parent directories that contain the
original set, and that don't contain any extra files.
For example, given a directory structure like this:
a
-- foo.txt
-- b
-- bar.txt
-- baz.txt
-- c
-- ham.txt
-- d
-- spam.txt
Then the following will happen:
>>> collapse(['a/foo.txt', 'a/b/bar.txt', 'a/c/ham.txt', 'a/c/d/spam.txt'])
['a/foo.txt', 'b/bar.txt', 'c']
Since all files under directory 'c' are specified by the original set (both
'c/ham.txt' and 'c/d/spam.txt'), we can collapse it down to just 'c'. However
not all files under 'b' are specified (we're missing 'a/b/baz.txt'), so we
can't collapse 'b' (and by extension also can't collapse 'a').
If we had included 'a/b/baz.txt':
>>> collapse(['a/foo.txt', 'a/b/bar.txt', 'a/b/baz.txt', 'a/c/ham.txt', 'a/c/d/spam.txt'])
['a']
In both cases, the smallest set of paths that contains the original set (and
only the original set) is computed.
The collapse function has a little bit of overhead but it's not too bad.
For example collapsing all files matched by '**/node_modules' takes ~0.015s.
Collapsing two full objdirs, takes ~0.6s. But a follow up commit is planned to
make sure we stop using 'obj*' to reduce that overhead.
Depends on D7738
Assignee | ||
Comment 3•6 years ago
|
||
When using globs in exclude directorives, FileFinder will return every *file*
that gets matched. This is can be thousands of files in the case of an objdir.
While we now collapse these files down to highest possible directories, this
collapse operation can still take a noticeable amount of time (0.6s). This
simply scans topsrcdir for files that start with 'obj' to avoid the glob.
This also moves the '_activate_virtualenv' call to the top of the function
because in CI, this will cause an objdir to be created (to store the
virtualenv). If this happens *after* calculating the global excludes, we won't
catch it since it doesn't exist yet. This will result in the objdir's
virtualenv being linted and erroneous failures.
Depends on D7739
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → ahal
Status: NEW → ASSIGNED
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a2956764213e
Vendor scandir==1.9.0 to third_party/python, r=davehunt
https://hg.mozilla.org/integration/autoland/rev/fe0fb280dbfc
[mozlint] Collapse exclude paths into their smallest possible set, r=egao
https://hg.mozilla.org/integration/autoland/rev/9752f179b9c3
[lint] Explicitly list out objdirs rather than depend on 'obj*' in the global exclude, r=rwood
Comment 5•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a2956764213e
https://hg.mozilla.org/mozilla-central/rev/fe0fb280dbfc
https://hg.mozilla.org/mozilla-central/rev/9752f179b9c3
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Backout by ncsoregi@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/580c03c8ae38
Backed out 3 changesets for blocking 1498215. a=backout
Assignee | ||
Updated•6 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 7•6 years ago
|
||
Turns out vendoring scandir causes problems if people happen to have an older version of it installed on their system. Also I decided to try using os.listdir and do some profiling comparisons. It turns out that using scandir isn't even much of a perf improvement for mozlint's common case (where we use ** with FileFinder). There likely are edge cases where we would see some bigger improvements, but it's not really worth the headaches that vendoring scandir causes.
So I'm just going to re-implement this with os.listdir. If we ever end up vendoring scandir for some other purpose (or start requiring python 3), we can switch back to using it at that point.
Assignee | ||
Updated•6 years ago
|
Keywords: leave-open
Updated•6 years ago
|
Attachment #9014452 -
Attachment is obsolete: true
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b7e586708ecc
[mozlint] Collapse exclude paths into their smallest possible set, r=egao
https://hg.mozilla.org/integration/autoland/rev/2064477895c3
[lint] Explicitly list out objdirs rather than depend on 'obj*' in the global exclude, r=rwood
Comment 9•6 years ago
|
||
bugherder |
Assignee | ||
Comment 10•6 years ago
|
||
Patch dump incoming. This bug kind of ended up being a bit of a catch-all for all the minor issues I found with path filtering as I was working through my mozlint refactor. The good news is that I've been adding tests along the way and I'm now way more confident that the filtering logic is doing what it's supposed to.
Also while there may be a large number of patches, I hope this makes them a lot easier to review than one giant monolithic patch.
Assignee | ||
Comment 11•6 years ago
|
||
I missed an edge case where the computed base itself was specified in the
paths input.
Assignee | ||
Comment 12•6 years ago
|
||
This fixes a latent bug that is currently not being hit (by sheer luck). Basically
the 'ignore' argument of a FileFinder object needs all paths to be relative to the
base. As luck would have it, most of the time it worked out that way if you were
running |mach lint| from the root of the repo.
However there are edge cases where this will cause an 'exclude' directive to get
missed. Plus this bug is about to be exposed 100% of the time in the next commit :).
Depends on D8842
Assignee | ||
Comment 13•6 years ago
|
||
Right now there are excludes defined in the linter definition (via the .yml
files), as well as excludes defined in lintargs (via the mach_commands.py).
This is a minor simplification that extends each linter definition's local
excludes with the global ones right off the bat. This just makes it a bit
easier to keep track of.
Depends on D5863
Assignee | ||
Comment 14•6 years ago
|
||
We were currently adding exclude paths to the "discard" set if the path contains
the include, but we weren't adding them if the include contains the path.
Depends on D5863
Assignee | ||
Comment 15•6 years ago
|
||
Currently pathutils.filterpaths computes some extra paths that the underlying linter
should try to exclude if it can. It sets these on lintargs['exclude'], and relies on
the fact that it was passed by reference to work.
However, it seems like pytest does some magic under the hood that prevents this value
from being propagated back. It will also fail if 'filterpaths' was invoked in a
subprocess.
It's much simpler and cleaner to pass this value back directly, rather than rely on a
reference.
Depends on D5862
Updated•6 years ago
|
Attachment #9017500 -
Attachment description: Bug 1494069 - [mozlint] Return extra 'excludes' directly from pathutils.filterpaths, r=rwood → Bug 1494069 - [mozlint] Verify the expected_exclude paths in test_filterpaths, r?rwood
Updated•6 years ago
|
Attachment #9017500 -
Attachment is obsolete: true
Assignee | ||
Comment 16•6 years ago
|
||
Currently pathutils.filterpaths computes some extra paths that the underlying linter
should try to exclude if it can. It sets these on lintargs['exclude'], and relies on
the fact that it was passed by reference to work.
However, it seems like pytest does some magic under the hood that prevents this value
from being propagated back. It will also fail if 'filterpaths' was invoked in a
subprocess.
It's much simpler and cleaner to pass this value back directly, rather than rely on a
reference.
Updated•6 years ago
|
Attachment #9017500 -
Attachment is obsolete: false
Updated•6 years ago
|
Attachment #9017496 -
Attachment description: Bug 1494069 - [mozlint] Fix edge case in pathutils.collapse, r?egao → Bug 1494069 - [mozlint] Fix edge case in pathutils.collapse, r?rwood
Comment 18•6 years ago
|
||
Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fbd19192e4fc
[mozlint] Fix edge case in pathutils.collapse, r=rwood
https://hg.mozilla.org/integration/autoland/rev/3cbd82a01bc2
[mozlint] Properly exclude files in LineType linters, r=rwood
https://hg.mozilla.org/integration/autoland/rev/9c587734513a
[mozlint] Return extra 'excludes' directly from pathutils.filterpaths, r=rwood
https://hg.mozilla.org/integration/autoland/rev/53abbe1d64d4
[mozlint] Append global 'excludes' to each linter at parse time, r=rwood
https://hg.mozilla.org/integration/autoland/rev/b9beb91e9992
[mozlint] Verify the expected_exclude paths in test_filterpaths, r=rwood
https://hg.mozilla.org/integration/autoland/rev/8a1fd7460ad5
[mozlint] Make sure exclude paths always get discarded when necessary, r=rwood
Comment 19•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fbd19192e4fc
https://hg.mozilla.org/mozilla-central/rev/3cbd82a01bc2
https://hg.mozilla.org/mozilla-central/rev/9c587734513a
https://hg.mozilla.org/mozilla-central/rev/53abbe1d64d4
https://hg.mozilla.org/mozilla-central/rev/b9beb91e9992
https://hg.mozilla.org/mozilla-central/rev/8a1fd7460ad5
Assignee | ||
Updated•6 years ago
|
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Keywords: leave-open
Resolution: --- → FIXED
Updated•2 years ago
|
Product: Firefox Build System → Developer Infrastructure
You need to log in
before you can comment on or make changes to this bug.
Description
•