Open Bug 1446555 Opened 7 years ago Updated 5 years ago

Allow --less-context to only be applied to large files in a commit

Categories

(Conduit :: Phabricator, defect, P5)

defect

Tracking

(Not tracked)

People

(Reporter: emilio, Unassigned)

References

()

Details

(Keywords: conduit-triaged, conduit-upstream, conduit-upstream-pending)

First of all sorry if I filed in the wrong component, I haven't seen anything else :). Submitting patches to MANIFEST.json (the wpt manifest) gives you a prompt like: Diff for 'testing/web-platform/meta/MANIFEST.json' with context is 6,027,967 bytes in length. Generally, source changes should not be this large. If this file is a huge text file, try using the '--less-context' flag. If the file is not a text file, you can mark it 'binary'. Mark this file as 'binary' and continue? [y/N] If you reply "y", then the changes in the MANIFEST.json file don't show up, which makes them impossible to review. If you reply "N", then it says: Usage Exception: Aborted generation of gigantic diff. Using --less-context, it shows the diff, but suddenly the rest of the files also don't have any context, which is annoying.
Product: Developer Services → Conduit
How big is your change? Is it a big change, or just a small change to a huge file?
Flags: needinfo?(emilio)
Component: General → Phabricator (Upstream)
It's generally a very small change to a huge file. See: https://hg.mozilla.org/mozilla-central/rev/f2149d69f47cd9e7a2d0b478c2c63cb020dbfda9 for an example.
Flags: needinfo?(emilio)
(fwiw we are palnning to move this file out of tree in the summer, but it's possible that when we actually do that it will turn out that people don't like the tradeoffs, so it's not certain if the change will work)
The Phabricator docs mention some solutions for large changesets: https://secure.phabricator.com/book/phabricator/article/differential_large_changes/ If changes to MANIFEST.json must always be reviewed by a human, here is a possible (untested) workaround: have the author push the changeset to a public location, mark the file as binary using arc, and provide a link to the MANIFEST.json changeset in the review description. That way Differential can still be used to discuss the change contents with full context for everything except the large file. Reviewers can follow the link to the large file to view it. I'm unfamiliar with the MANIFEST.json contents from comment 2 and I'd like to know if changes to MANIFEST.json must always be reviewed by a human. If the file was marked binary and ignored by the reviewer would some other process catch errors in the file contents? If these large file changes happen often, and the large file changes can't be ignored, then I think the 'provide a link in the review' workaround could become tedious for authors. It might be nice if '--less-context' took an argument for these cases, like 'arc diff --less-context=BIGFILE.json'.
We have automated lints for MANIFEST.json, but I think they should generally be reviewed yes... This file changes very frequently (every time you add / modify / remove a wpt test), so I'm not sure posting links around is a great solution :/. I agree that the best so far would be to allow submitting that file without context, but the rest of them with.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #5) > We have automated lints for MANIFEST.json, but I think they should generally > be reviewed yes... > > This file changes very frequently (every time you add / modify / remove a > wpt test), so I'm not sure posting links around is a great solution :/. > > I agree that the best so far would be to allow submitting that file without > context, but the rest of them with. It's not really possible to do with a simple option to "arc diff" right now. You can workaround it, but it's kind of a pain: - Generate a diff (A) locally with full context - Generate a second diff (B) locally with very little context - Hack up the diffs locally to combine them in the format you want - Pass the combined diff to stdin on "arc diff --raw"
Here's an attempt at comment 6. I used --preview so you can check the result in the browser. This might make a good small shell script. $ hg diff -U 10000 -X '*BIGFILE.json*' > full.patch $ hg diff BIGFILE.json > part.patch $ cat full.patch part.patch | arc diff --raw --preview Or extra fun, as a one-liner: $ cat <(hg diff -U 10000 -X '*BIGFILE.json*') <(hg diff BIGFILE.json) | arc diff --raw --preview
Component: Phabricator Upstream → Phabricator
We can and probably should inquire with upstream as to whether they have any new ideas or plans for dealing with this situation. I'm willing to bet, though, that if they do, a proper solution is probably not coming soon. We should, for now at least, document these workarounds in the Phabricator user guide, as I expect it'll come up now and then. smacleod: could you, or could you find someone, to send a short request upstream about this, at some point?
Component: Phabricator → Documentation
Flags: needinfo?(smacleod)
Summary: phabricator: Unable to submit a patch to MANIFEST.json that shows the diff and keeps context for other files. → Document workarounds for large files
Looks like they are happy to do what was mentioned in comment 5. Not sure about a timeline but they said it fits into their plans. Moving this to the backlog but I'll close it out when the upstream feature is added.
Keywords: conduit-triaged
Whiteboard: [phabricator-backlog] [phabricator-upstream]
Component: Documentation → Phabricator
Summary: Document workarounds for large files → Allow --less-context to only be applied to large files in a commit
This is a bigger issue than just the review process: Since the file is uploaded as binary, if changes to that file have landed in the meantime, Lando / Phabricator overrides them when landing the patch. See for example the removals that https://hg.mozilla.org/integration/autoland/rev/8abcae437121 contains. I had to fix them up in https://hg.mozilla.org/integration/autoland/rev/9bbc18da7365. Mark, this would cause a lot of pain to anyone writing WPT tests, to the point that patches that touch WPT tests can only land via lando iff they have just been rebased. Could we get something to make at least landing patches here work?
Flags: needinfo?(mcote)
FWIW bug 1473915 is waiting for review from gps and should make this no longer an issue for the MANIFEST.json file specifically by moving it out of the source tree and making it download on demand. But it might be a problem for other files, idk.
I hit the same problem as comment 12 in bug 1486252 as well... The currently situation is pretty terrible for touching MANIFEST.json, especially because it is huge and changed so frequently.
It seems that this is only a problem when changes are too far away from each other. Specifically, MANIFEST.json has two sections, one for the test list, the other for hashes of the files. If we can get rid of the hashes (bug 1488311), this problem may become less severe.
After talking with Phacility a little while ago, they agreed to put it on their road map. However we haven't heard anything since. We plan on syncing up with them again soon concerning the growing number of requests we've given them and will attempt to get them to prioritize this issue. Meanwhile, if any improvements land that do indeed render this issue less of a problem, please let me know.
Flags: needinfo?(mcote)
Whiteboard: [phabricator-backlog] [phabricator-upstream]
No longer blocks: phab-switch
Keywords: conduit-backlog
Priority: -- → P5
You need to log in before you can comment on or make changes to this bug.