Update and adjust pre-commit hooks #1953

Merged (10 commits) on Aug 17, 2024

EliahKagan (Contributor) commented Aug 16, 2024

This updates the pre-commit hooks to their latest stable versions and adjusts exclusions and tool configuration to avoid false positives. It also adds one more hook, which, among other benefits, catches problems caused by false positives of an existing hook.

Bumping versions

Ruff 0.6 has come out, with some improvements, so we may as well use it. Updating its hook achieves that, and while doing so, I think it makes sense to update the others.

Over time, hooks that are not updated fall further out of date, so more breakages have to be resolved at once when a bug or feature consideration eventually requires an update. No changes were needed to accommodate Ruff 0.6. The newer version of codespell does produce more false positives, though. This addresses them by explicitly ignoring a few more words in its pyproject.toml configuration.

Avoiding symlink path corruption

Separate from any changes introduced by those updates, there is a problem with the way the end-of-file-fixer operates. As covered in more detail in commit messages, on Windows, symlinks are often checked out as regular files. But these files should not have newlines appended to the end of them, because when such a change is committed, it is adding a newline to the end of the path the repository stores for the symlink. Then, when the symlink is actually checked out, it is broken.

This excludes files with license-like names from being scanned and "fixed" by end-of-file-fixer. License files are where the problem currently happens, and they are also the most likely place for it to arise in the future without being noticed through ordinary use of the repository.

It also adds a pre-commit hook (not used in this repository before) to check all symlinks. This will never find that error in a local repository where it arises, because it only checks actual symlinks, but it will find it on CI after it arises if it ever does again. There may also be a further benefit to it, since dangling symlinks could be accidentally introduced in the future for other reasons.

On Windows, when `core.symlinks` is `false` or unset (since it
defaults to `false` on Windows), Git checks out symbolic links as
regular files whose contents are symlinks' target paths. Modifying
those regular files and committing the changes alters the symlink
target in the repository, and when they are checked out as actual
symlinks, the targets are different.
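
One way to see which tracked paths are affected is to look at the index, where Git records symlinks with file mode 120000 regardless of how they are checked out. The following is a minimal sketch and not part of this PR: `symlink_paths_from_index` is a hypothetical helper, and the sample listing (including the `README.md` entry and the placeholder object IDs) is made up for illustration. It parses text in the format printed by `git ls-files -s` and does not handle quoted paths.

```python
def symlink_paths_from_index(ls_files_output: str) -> list[str]:
    """Return paths whose index entry has mode 120000 (a symlink).

    Expects text in the format of `git ls-files -s`:
    "<mode> <object> <stage>\t<path>", one entry per line.
    """
    paths = []
    for line in ls_files_output.splitlines():
        if not line.strip():
            continue
        # The metadata fields and the path are separated by a tab.
        meta, _, path = line.partition("\t")
        mode = meta.split()[0]
        if mode == "120000":
            paths.append(path)
    return paths


# Made-up sample listing; the object IDs are placeholders.
sample = "\n".join(
    [
        f"100644 {'a' * 40} 0\tREADME.md",
        f"120000 {'b' * 40} 0\tfuzzing/LICENSE-BSD",
    ]
)
print(symlink_paths_from_index(sample))
```

In a real checkout, the input could come from running `git ls-files -s` through `subprocess.run(..., capture_output=True, text=True)` and passing its `stdout` to the helper.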

But the `end-of-file-fixer` pre-commit hook automatically adds
newlines to the end of regular files that lack them. It doesn't do
this on actual symlinks, but it does do it to regular files that
stand in for symlinks. This carries a risk of breaking symlinks
if it is run on Windows and the changes are committed, and it is
easy to miss that this will happen because `git diff` output shows
it the same way as other additions of absent newlines.
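
The effect can be illustrated with a simplified model. This is a sketch only: `ensure_trailing_newline` is a hypothetical stand-in that mimics just the "append a missing newline" behavior described above, not the hook's actual implementation, and `../LICENSE` is an assumed target path chosen for illustration.

```python
def ensure_trailing_newline(content: bytes) -> bytes:
    """Append a newline if non-empty content lacks one (simplified model)."""
    if content and not content.endswith(b"\n"):
        return content + b"\n"
    return content


# With core.symlinks false, the checked-out file holds only the target path.
stand_in = b"../LICENSE"  # assumed target, for illustration
fixed = ensure_trailing_newline(stand_in)

# If committed, these bytes become the symlink's new target.
print(repr(fixed.decode()))
print(fixed != stand_in)
```

The "fixed" content differs from the original by a single byte, which is why the change looks like any other missing-newline fix in a diff.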

This deliberately commits the change that end-of-file-fixer makes
to the `LICENSE-BSD` symlink, in order to allow a mitigation beyond
just excluding that symlink (or replacing it with a regular file)
to be tested. This change must be undone, of course.

Rationale:

- Small but likely benefit in general, since there are no currently
  foreseen intentional use cases for committing broken/dangling
  symlinks in this project. So any such symlinks that arise are
  likely unintentional.

- If the end-of-file-fixer hook has run on a Windows system where
  `core.symlinks` has *not* been set to `true`, and symlinks' paths
  have not been excluded, then a newline character is added to the
  end of the path held in the regular file Git checks out to stand
  in for the symlink. Because it is not actually a symlink, this
  will not detect the problem at that time (regardless of the order
  in which this and that hook run relative to each other). But when
  it is then run on CI on a system where symlinks are checked out,
  it will detect the problem.
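
The kind of check this relies on can be sketched as follows. This is an illustrative approximation, not the hook's source: `find_dangling_symlinks` is a hypothetical helper, the file names are made up, and the demo assumes a POSIX-like system that permits creating symlinks. It reports paths that are symlinks whose targets do not exist, including a target that ends in a stray newline.

```python
import os
import tempfile


def find_dangling_symlinks(paths):
    """Return the paths that are symlinks but whose targets do not exist."""
    return [p for p in paths if os.path.islink(p) and not os.path.exists(p)]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "LICENSE")
    with open(target, "w") as f:
        f.write("license text\n")

    good = os.path.join(tmp, "good-link")
    bad = os.path.join(tmp, "bad-link")
    os.symlink("LICENSE", good)   # resolves to the real file
    os.symlink("LICENSE\n", bad)  # target corrupted by a trailing newline

    dangling = find_dangling_symlinks([target, good, bad])
    print([os.path.basename(p) for p in dangling])
```

A regular file standing in for a symlink fails the `os.path.islink` test, which is why a check like this cannot flag the problem on the system where the newline was added.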

The unanchored `LICENSE` and `COPYING` alternatives match the
pattern anywhere, and therefore exclude the currently used path
`fuzzing/LICENSE-BSD`.
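
Since pre-commit treats `exclude` as a Python regular expression that is searched for within each path, rather than matched against the whole path, the difference between anchored and unanchored alternatives can be checked directly. This is a minimal sketch; the patterns below are simplified illustrations, not the exact `exclude` value from this PR.

```python
import re

unanchored = re.compile(r"LICENSE|COPYING")
anchored = re.compile(r"^(LICENSE|COPYING)")

path = "fuzzing/LICENSE-BSD"

# An unanchored alternative can match anywhere in the path.
print(bool(unanchored.search(path)))

# An anchored alternative only matches at the start of the path.
print(bool(anchored.search(path)))
```

The unanchored form is what lets a single pattern cover license files in subdirectories as well as at the top level.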

License files are more likely than other files in this project to
be introduced as symlinks, and less likely to be noticed
immediately if they break. Symlinks can be checked out as regular
files when `core.symlinks` is set to `false`, which is rare outside
of Windows but is the default behavior when unset on Windows.

This exclusion fixes the current problem that end-of-file-fixer
breaks those links by adding a newline character to the end (the
symlinks are checked out broken if that is committed). It also
guards against most future cases involving licenses, though
possibly not all, and not other unrelated cases where symlinks may
be used for other purposes.

Although the pre-commit-hooks repository also provides a
destroyed-symlinks hook that detects the situation of a symlink
that has been replaced by a regular file, this does not add that
hook, because this situation is not inherently a problem. The code
here does not require symlinks to be checked out to work, and
adding that would break significant uses of the repository on
Windows.

Note that this leaves the situation where a license file may be a
symlink to another license file and may thus be checked out as a
regular file containing that file's path. However, it is easy to
understand that situation and manually follow the path. That
differs from the scenario where a symlink is created but broken,
because attempting to open it gives an error, and the error message
is often non-obvious, reporting that a file is not found but giving
the name of the symlink rather than its target.
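
The confusing error can be reproduced in a few lines. This is a sketch under assumptions: it requires a POSIX-like system that permits creating symlinks, and the link name and target are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "LICENSE-BSD")
    # Target has a stray trailing newline, so nothing exists at that path.
    os.symlink("../LICENSE\n", link)

    try:
        with open(link) as f:
            f.read()
    except FileNotFoundError as error:
        reported = error.filename
        # The error names the symlink that was opened, not the missing target.
        print(os.path.basename(reported))
```

Nothing in the exception points at the corrupted target, so finding the cause means inspecting the link itself, for example with `os.readlink`.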

EliahKagan marked this pull request as ready for review on August 16, 2024, 01:04

Byron (Member) left a comment

Thanks a lot for your help, much appreciated.

EliahKagan (Contributor, Author)

No problem! Should anything further be done on it before it is ready to be merged?

Byron (Member) commented Aug 17, 2024

I might have failed to press the button 🤦‍♂️

@Byron Byron merged commit 900fc33 into gitpython-developers:main Aug 17, 2024
22 checks passed

EliahKagan (Contributor, Author)

No problem!

EliahKagan deleted the pre-commit branch August 17, 2024 18:32

EliahKagan (Contributor, Author)

I've noticed that the merge commit failed CI on the main branch. But this appears unrelated to any of the changes here, because it is a random failure due to #1676. I expect that the failed check would succeed if re-run.

2 participants