Closed Bug 1514770 Opened 6 years ago Closed 6 years ago

Improve clang-format git hook support

Categories

(Developer Infrastructure :: Lint and Formatting, enhancement, P2)

enhancement

Tracking

(firefox68 fixed)

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: Sylvestre, Assigned: Sylvestre)

References

Details

Attachments

(2 files, 2 obsolete files)

Traceback (most recent call last): File ".git/hooks/pre-commit", line 48, in <module> sys.exit(git()) File ".git/hooks/pre-commit", line 44, in git return run_clang_format(hooktype, []) TypeError: run_clang_format() takes exactly 3 arguments (2 given) https://bugzilla.mozilla.org/show_bug.cgi?id=1507007#c12
Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d7507f185a63 Fix the clang-format hook for git r=ahal
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
This still doesn't work. It looks like you're using get_outgoing_files which is for files that have already been committed, and so is unsuitable for a pre-commit hook. I think you probably want to be using get_changed_files if you're trying to make a pre-commit hook. Although I'm not sure if you want to use "staged" or "unstaged" - staged would be more appropriate for a "git add ...; git commit" workflow whereas unstaged would be more appropriate for a "git commit -a" workflow. So I don't really know what the right answer here, just that the current thing doesn't seem to work :)
When I implemented this change, it worked. However, we might have different definition of "work" here :) Currently, and this is probably wrong, the hook will run "mach clang-format" with the -s option. This option doesn't touch the commit and just provides a diff of the change. I think we should just amend the current commit. Ehsan, what do you think about that?
Flags: needinfo?(ehsan)
Based on my understanding of the code, it might work for the second local commit, but I don't see how it would ever work in a pre-commit hook for the first local commit, since get_outgoing_files should always return an empty set for that. In my case, "not working" means clang-format didn't even get run, as far as I can tell. That is, no output to stdout and no files were modified, even though the files I was committing needed formatting.
Indeed, I usually have several local commits ... well spotted!
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Thinking about it a bit more, I think a good workflow would be to make the hook a post-commit hook which adds the results of clang-formatting to the index, and then tells the user to run `git commit --amend` to apply the formatting to the commit that was just made. This is good because (1) it allows the user to review the formatting changes and reject them if necessary, and (2) doesn't clobber any existing changes the user has made, because right after a commit the index should be empty. We can additionally add a pref or option to make the tool run `git commit --amend` automatically for people who don't want to bother with reviewing the formatting changes.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8) > Thinking about it a bit more, I think a good workflow would be to make the > hook a post-commit hook which adds the results of clang-formatting to the > index, and then tells the user to run `git commit --amend` to apply the > formatting to the commit that was just made. This is good because (1) it > allows the user to review the formatting changes and reject them if > necessary, and (2) doesn't clobber any existing changes the user has made, > because right after a commit the index should be empty. Do you know how that would work with people doing a rebase (or an interactive rebase) by any chance? Also, I'm curious, what would be a good reason why people would reject the changes? Wouldn't it be better if the results were amended automatically?
Flags: needinfo?(ehsan)
(In reply to :Ehsan Akhgari from comment #10) > Do you know how that would work with people doing a rebase (or an > interactive rebase) by any chance? No idea :( > Also, I'm curious, what would be a good reason why people would reject the > changes? Wouldn't it be better if the results were amended automatically? In general I would expect that people wouldn't reject the changes. But sometimes they might want to manually reformat to something slightly different that is also compatible with clang-format, or to apply //clang-format off things.
I minus'd the patch for a problem specific to the patch, but while experimenting I discovered other pitfalls: - It looks like the hook is actually running async somehow, so the commit proceeds anyway, and the call to `./mach clang-format` happens in a background job. Which is kind of a disaster if you're running it on a tree that hasn't been built yet, because running `./mach clang-format` will trigger a call to `make -f client.mk -s configure` which will spew all the configure output to your terminal from a background task. - Apropos the last point, do we want to be using `./mach clang-format` if it has this behaviour of triggering `configure`? I don't know how often it happens but I'm sure some people will write code and commit it on a clean tree, and it will be surprising if it seems to start a build from a pre-commit hook. Can we just run the `clang-format` tool directly instead of going through the `mach` wrapper? Maybe this is a moot point since people who opt-in to using the commit hook will likely have an objdir, but it certainly surprised me when I tested this patch. Overall I think it's worth spending some time to do this right, or to not do it at all. Things that affect people's workflow (and certainly something that happens per commit) can have a significant impact on developer productivity. It's also a particularly sad feeling when the tools you're using get in your way rather than helping you out.
So putting this in my .git/hooks/pre-commit does exactly what I want: git diff --no-color -U0 --cached | $HOME/.mozbuild/clang-tools/clang-tidy/share/clang/clang-format-diff.py -p1 -binary=$HOME/.mozbuild/clang-tools/clang-tidy/bin/clang-format | git apply --cached -p0 i.e. it takes the diff that I'm about to turn into a commit (and nothing else that I've changed), and runs it through clang-format-diff which reformats the diff, and then applies it back onto the index so that when git creates the commit, it has the changes that I made, but formatted properly. It also has the nice side-effect of not actually touching any of my on-disk files, so it's not going to trigger spurious rebuilds. I haven't checked to see how well it plays with complex workflows like rebasing but for the simple commit case it seems to work well so far, and so I'll use that until I run into problems.
I like your approach but i am not completely convinced by Clang-format-diff itself. As it only acts on the context of the diff, it might miss some important changes (for example, a new level indentation introduced by an if or a loop). This is why i didnt go this road.
(In reply to Sylvestre Ledru [:sylvestre] from comment #15) > I like your approach but i am not completely convinced by Clang-format-diff > itself. > As it only acts on the context of the diff, it might miss some important > changes (for example, a new level indentation introduced by an if or a loop). > > This is why i didnt go this road. It is relatively easy to use a different approach though that doesn't rely on clang-format-diff.py and runs clang-format itself on the modified files. For example, see how this project on github does it: https://github.com/andrewseidl/githook-clang-format/blob/master/clang-format.hook But here is the thing... As the author of this hook correctly states (https://github.com/andrewseidl/githook-clang-format), doing things this way will result in a state where your commits will pick up other formatting changes that others haven't committed yet. For example if you change line 10 of a file and line 200 also requires a formatting fix, running clang-format-diff.py will keep your commit only changing line 10, but running clang-format itself will add a hunk for line 200 as well. So perhaps for the commit hook, using clang-format-diff.py is actually a better choice?
I also looked at this project (https://github.com/barisione/clang-format-hooks), which takes a similar approach by default (formatting only the modified lines).
Depends on: 1516733
Blocks: 1511904
Doesn't work yet on hg but works with git
Attachment #9032734 - Attachment is obsolete: true
Looks like I forgot to submit the comment. > For example if you change line > 10 of a file and line 200 also requires a formatting fix, running > clang-format-diff.py will keep your commit only changing line 10, but > running clang-format itself will add a hunk for line 200 as well. I think the later is what we want. We should consider that the tree always follows the current coding style. Therefore, we should not worry about running only clang-format only on the diff. > https://github.com/staktrace/moz-scripts/blob/master/gecko.pre-commit-hook With this script, looks like you will reformat also thirdparty code.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8) > Thinking about it a bit more, I think a good workflow would be to make the > hook a post-commit hook which adds the results of clang-formatting to the > index, and then tells the user to run `git commit --amend` to apply the > formatting to the commit that was just made. FWIW I for one would not like that. I don't want my commits slowed down, especially when I'll almost never want to modify whatever changes clang-format makes. If I have something that has sensitive formatting and may require 'clang-format off' I'll likely run './mach clang-format' manually before committing. Perhaps the hook can be flexible enough to act both as a pre-commit or post-commit hook so that everyone can choose what works best for them?
So, I tried something different in https://phabricator.services.mozilla.com/D15118 the main idea is to run clang-format at commit phase and merged the changes directly in the commit. This is what I implemented in my previous work http://gitweb.scilab.org/?p=scilab.git;a=blob;f=git_hooks/pre-commit#l161 (using astyle as clang-format wasn't here) and it worked well for years. So, for git, the following commands: $ echo "" >> dom/presentation/Presentation.cpp $ git commit -m "foo" dom/presentation/Presentation.cpp will create an empty commit as the first change will be reverted by clang-format For hg, I don't know how to perform changes on the file in the pretxncommit hook phase. Last but not least, I cannot use vcs.outgoing because it will return all local patches while we want to work only on the files which will be part of the current commit.
Summary: git: clang-format hook is broken → Improve clang-format hook support
(In reply to Sylvestre Ledru [:sylvestre] from comment #19) > Looks like I forgot to submit the comment. > > > For example if you change line > > 10 of a file and line 200 also requires a formatting fix, running > > clang-format-diff.py will keep your commit only changing line 10, but > > running clang-format itself will add a hunk for line 200 as well. > I think the later is what we want. > We should consider that the tree always follows the current coding style. > Therefore, we should not worry about running only clang-format only on the > diff. > Then why not just reformat the whole tree on every commit? Why stop at the files being modified? Or if that's too expensive then just run a cron job somewhere that reformats every day. Either way, I still don't think it makes sense to reformat on a per-file basis - we have many files in the tree many thousands of lines long and just because I change one line somewhere doesn't mean I want my patch to include unrelated changes for other lines in the file. > > https://github.com/staktrace/moz-scripts/blob/master/gecko.pre-commit-hook > With this script, looks like you will reformat also thirdparty code. True. Personalized I don't modify third-party C++ code so it doesn't matter, but yeah for a production hook we would want to exclude those. (In reply to Jonathan Watt [:jwatt] from comment #20) > (In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8) > > Thinking about it a bit more, I think a good workflow would be to make the > > hook a post-commit hook which adds the results of clang-formatting to the > > index, and then tells the user to run `git commit --amend` to apply the > > formatting to the commit that was just made. > > FWIW I for one would not like that. I don't want my commits slowed down, > especially when I'll almost never want to modify whatever changes > clang-format makes. If I have something that has sensitive formatting and > may require 'clang-format off' I'll likely run './mach clang-format' > manually before committing. The latest version of my hook doesn't do this anymore, the commit includes the formatting by default. I agree it is more convenient that way. But at the same time, do you want your on-disk copy of e.g. nsLayoutUtils.h being modified every time you commit a change to it? Hello expensive commits (because they trigger a rebuild)! > Perhaps the hook can be flexible enough to act both as a pre-commit or > post-commit hook so that everyone can choose what works best for them? Probably better to have two hooks on that case and let people decide which they want.
(sorry for typos, phone keyboard is atrocious and fennec caret placement adjusting is hard)
FWIW, I completely disagree with Sylvestre in comment 19 that the correct behaviour of the hook would be to commit unrelated formatting changes into your patch. Personally, if our hook did this, I would never use it. The reason is that in my experience, our reviewers will consistently ask people (rightly so) to remove unrelated changes from their commits since having a commit than does two or more unrelated things is not helpful for keeping each commit small and understandable. I personally always require this in the review requests that I receive, and I plan to continue to ask people to remove unrelated formatting changes from the review requests that I will receive in the future. Having our hook generate patches that neither I will accept as a reviewer nor I can use as an author is quite counterproductive. In my humble opinion, the correct behaviour of the hook is to do what clang-format-diff.py does, that is, only touch the modified lines in the commit. (Sure, with that, there will be files without proper formatting in the repo, but there will also be those files coming from people who don't have the hook enabled, who don't run ./mach bootstrap, etc. That is an orthogonal problem to solve.)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22) > just because I change one line somewhere doesn't mean I want my patch > to include unrelated changes for other lines in the file. Why would that happen? Presumably we block anyone from pushing unformatted changes to the trees, so there should never be any such unrelated changes. (In reply to :Ehsan Akhgari from comment #24) > FWIW, I completely disagree with Sylvestre in comment 19 that the correct > behaviour of the hook would be to commit unrelated formatting changes into > your patch. I don't think that's what he said or what he meant since, as you say, that would be completely broken. I read his comment as "that can't happen, so we don't need to worry about that". > (Sure, with that, there will be files without proper formatting in the repo, > but there will also be those files coming from people who don't have the > hook enabled, who don't run ./mach bootstrap, etc. That is an orthogonal > problem to solve.) Umm, so maybe my assumptions are wrong? Do we not have push hooks preventing unformatted changes from landing in our trees? If not, why not? Does that mean we've gone to all this trouble and disruption to format the trees, but now are going to allow the formatting to become all broken?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22) > (In reply to Jonathan Watt [:jwatt] from comment #20) > > FWIW I for one would not like that. I don't want my commits slowed down, > > especially when I'll almost never want to modify whatever changes > > clang-format makes. If I have something that has sensitive formatting and > > may require 'clang-format off' I'll likely run './mach clang-format' > > manually before committing. > > The latest version of my hook doesn't do this anymore, the commit includes > the formatting by default. I agree it is more convenient that way. But at > the same time, do you want your on-disk copy of e.g. nsLayoutUtils.h being > modified every time you commit a change to it? Hello expensive commits > (because they trigger a rebuild)! That would be annoying. But what's the way around that? Always remember to format a file after every change before you build? Trigger formatting automatically when you build? Even then I'd want the fallback hook to make sure I'm not putting up any unformatted patches for review or landing in case neither of those things occur.
> I don't think that's what he said or what he meant since, as you say, that would be completely broken. I read his comment as > "that can't happen, so we don't need to worry about that". Indeed (about the hook touching unrelated part of the code :) If/when we have a bot reformatting all changes which landed without the correct formatting, you should not see that as the files should be always formatted correctly when you start working on the files. I would not like to get unrelated changes in a patch either.
(In reply to Jonathan Watt [:jwatt] from comment #25) > (In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22) > > just because I change one line somewhere doesn't mean I want my patch > > to include unrelated changes for other lines in the file. > > Why would that happen? Presumably we block anyone from pushing unformatted > changes to the trees, so there should never be any such unrelated changes. AFAIK we currently do not block unformatted changes to the trees. I could be wrong, or maybe there's some patches somewhere that do this blocking but I'm not aware of them. Even if there is such blocking, the following scenario could occur: - edit two different parts of nsLayoutUtils.h, introducing unformatted changes in two hunks - git add -p one of the hunks - commit With Sylvestre's proposal, this will include formatting for both hunks. Or worse, the patch generated by clang-format won't even apply because you only staged one of the hunks. Basically any time you're using clang-format instead of clang-format-diff you have this problem where the `git add -p; git commit` workflow breaks in some way.
(In reply to Sylvestre Ledru [:sylvestre] from comment #19) > > For example if you change line > > 10 of a file and line 200 also requires a formatting fix, running > > clang-format-diff.py will keep your commit only changing line 10, but > > running clang-format itself will add a hunk for line 200 as well. > I think the later is what we want. > We should consider that the tree always follows the current coding style. > Therefore, we should not worry about running only clang-format only on the > diff. Note that if we *do* have a hook that prevents unformatted code from landing, then this scenario (where line 200 requires a formatting fix without any local changes) would never occur. So in this scenario running clang-format-diff is just as good as running clang-format on the file. So I propose that we should first wait for the hook to be installed, and then update this git pre-commit hook to do a clang-format-diff and apply that directly to the index. (In reply to Jonathan Watt [:jwatt] from comment #26) > That would be annoying. But what's the way around that? The way I'm proposing is to have the formatting changes go straight into the git index, which means they go into the commit but don't touch your on-disk file. So if you make some changes and commit, the commit will have the formatted changes while your on-disk version will whatever you actually typed in. This produces the slightly unexpected result where e.g. `git diff` after a `git commit -a` can produce a non-empty diff. But IMO this is the least bad of the solutions we have so far.
Severity: normal → enhancement

How can we make progress here? I've kind of lost track of what the latest proposal and status is... :-)

I am happy with this change:
https://phabricator.services.mozilla.com/D15521
except:

  • add the support of "git add -p "
  • and improve the hg hook

I will try to finish it this week

For git add -p something like the following should work:

git stash --keep-index && $(git rev-parse --show-toplevel)/mach clang-format && git add -u && git stash pop

Or something of that sort.

I think the only case it wouldn't handle is the case where there's nothing to save (both the commit and the staging area is empty), which you could do with git commit --allow-empty, because in that case the stash will actually not save anything and the pop will pop whatever was on the stack beforehand. You can detect that easily before-hand though.

Not sure if there's a better way to do it.

Ah! A much faster way could be:

git diff --staged | clang-format-diff.py -p 1 | git apply -p 0 --cached

(You may want to apply the patch both with and without --cached, so that it gets reflected both in the index and the working tree)

Thanks for the suggestion
clang-format-diff.py isn't perfect but this is better than nothing

And also, on the hg hook, the issue is that it isn't trivial to retrieve the list of touched files in the commit hook.

pretxncommit would be perfect as it provides the node id (which is necessary to retrieve the list of files) except that it has no info about the files being added.

For example, using pretxnopen and pretxnclose don't provide the node id.
Sheeshan suggested that I use $HG_NODE but it doesn't exist where I am.

I could use the hg python interface to retrieve these data but it seems pretty overkill.

Sheehan told me the following:

so you would use pretxncommit as the hook phase, add node=None to your function signature, and then do the steps a mentioned
above to get the list of changed files

Summary: Improve clang-format hook support → Improve clang-format git hook support

I took a different approach for hg:
https://phabricator.services.mozilla.com/D18883

and maybe move the git hook in bash to make it easier

Here is that I used to test the work:

git checkout dom/presentation/
git reset dom/presentation/AvailabilityCollection.cpp
sed -i -e "s|;$| ;|g" dom/presentation/AvailabilityCollection.cpp

echo "int foo( ){}" >> dom/presentation/AvailabilityCollection.cpp
# No action as no file added to git
git commit -m "foo - should not do anything"

# Should have the execution
git add dom/presentation/
git commit -m "foo - should commit the file with the change"
git show

sed -i -e "s|#include |#include    |g" dom/presentation/AvailabilityCollection.cpp
echo "int foo( ){}" >> dom/presentation/AvailabilityCollection.cpp
git diff dom/presentation/
git add -p dom/presentation/
git commit -m "foo - will commit everything when it should not"
git show

sed -i -e "s|#include |#include    |g" dom/presentation/AvailabilityCollection.cpp
echo "int bar( ){}" >> dom/presentation/AvailabilityCollection.cpp
# Nothing should happen (just renaming)
git commit --amend
# update the title + reformat it
git commit --amend -m "updated" dom/presentation/AvailabilityCollection.cpp

I would like to land that first, and then work on "git add -p" support

Priority: -- → P2
Pushed by sledru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cc1347e7b69b hooks_clang_format for git: Run clang-format on the changed files r=sheehan
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
git diff --staged | clang-format-diff.py -p 1 | git apply -p 0 --cached

To explain why I am not using this:

  • because clang-format-diff.py doesn't always have the whole context, clang-format-diff.py can provide different results (we had an issue with that)
  • "mach clang-format" has more features (third party ignore, parallel execution, etc)
Attachment #9033762 - Attachment is obsolete: true
Regressions: 1541409
Depends on: 1542629
Product: Firefox Build System → Developer Infrastructure
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: