Migrate lint checks to pre-commit?
ScottTodd opened this issue
Current lint checks
We have a lint workflow and script, both of which require some manual setup and can get out of sync:
- https://github.com/iree-org/iree/blob/main/.github/workflows/lint.yml
- https://github.com/iree-org/iree/blob/main/build_tools/scripts/lint.sh
We currently run:
Tool | Description |
---|---|
bazel_to_cmake | Our custom tool for translating BUILD.bazel files to CMakeLists.txt files |
buildifier | Formatting BUILD.bazel files |
black | Formatting Python files |
pytype | Analyzing types in Python files. This has enough false positives to remove; we could switch to mypy |
clang-format | Formatting C/C++ files |
tabs | Our custom script to check source files for tabs where spaces should be used |
yamllint | Linting yaml files |
markdownlint | Linting markdown files |
path_lengths | Checking for long path lengths (problematic on Windows) |
generated_cmake_files | Checking that our test suite generation scripts have been run |
build_file_names | Checking that Bazel files are named BUILD.bazel instead of BUILD. Remove? Helps with integration into Google's downstream repo |
Pre-commit
Pre-commit is "a multi-language package manager for pre-commit hooks" that can be run easily on developer machines (no need to separately install specific versions of each tool) and on GitHub Actions via either an action or a service.
We've started using pre-commit on a few related projects and it has been helpful. See also this recent discussion on Discord.
Tip for installing pre-commit: use pipx: https://packaging.python.org/en/latest/guides/installing-stand-alone-command-line-tools/
Started testing this out in IREE with #17534 and this .pre-commit-config.yaml file:
exclude: "third_party/"
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
        exclude_types: ["image", "jupyter"]
      - id: check-yaml
        exclude: "mkdocs.yml" # Extensions aren't included in the schema: https://github.com/squidfunk/mkdocs-material/issues/6378
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
So far I'm liking the ergonomics.
Setup on Windows:
py -m pip install --user pipx
py -m pipx ensurepath
pipx install pre-commit
pre-commit run --all-files
Buildifier has some annoying gotchas:
- bazelbuild/buildtools#914 makes it replace Windows style line endings with Linux line endings, marking all files as dirty when using git's `core.autocrlf=true`. For some reason I can format Bazel files in VSCode without issues, but when I run `buildifier` directly or via pre-commit I get a diff for every `BUILD.bazel` file:
The file will have its original line endings in your working directory
warning: LF will be replaced by CRLF in compiler/plugins/target/LLVMCPU/Builtins/BUILD.bazel.
The file will have its original line endings in your working directory
warning: LF will be replaced by CRLF in compiler/plugins/target/LLVMCPU/internal/BUILD.bazel.
The file will have its original line endings in your working directory
warning: LF will be replaced by CRLF in compiler/plugins/target/LLVMCPU/test/BUILD.bazel.
- despite `buildifier` being written in Go and pre-commit supporting package manager features, the community pre-commit hooks assume that it is already installed
`git add . --renormalize` fixes line ending stuff for me whenever I have issues - maybe could run that after
(git line ending handling confuses me, so I use `python build_tools/bazel_to_cmake/bazel_to_cmake.py && git add . --renormalize` as my bazel-to-cmake command)
Got this mostly done. Remaining work:
- Land #17538
- Switch "required checks" in protected branch settings from previous lint jobs to pre-commit job
- Enable `end-of-file-fixer` check
- Enable `trailing-whitespace` check
- Figure out how to enable `buildifier` check in a way that doesn't break Windows
- Figure out how to enable `generate_cmake_files` check in a way that doesn't break Windows
- Remove old `pytype` check
- Add new `mypy` (https://mypy-lang.org/) check and ensure repo passes it
- Test git commit hook mode (ensure it doesn't add significant overhead, a few checks are running on all files instead of changed files - may want to refactor those checks)
- Add documentation to https://iree.dev/developers/general/contributing/#coding-style-guidelines
> `git add . --renormalize` fixes line ending stuff for me whenever I have issues - maybe could run that after
> (git line ending handling confuses me, so I use `python build_tools/bazel_to_cmake/bazel_to_cmake.py && git add . --renormalize` as my bazel-to-cmake command)
Okay, this sort of works. A few ideas for working within the limits of pre-commit:
- Add a `run_buildifier.py` script that runs on all files in a single step then runs `git add . --renormalize`
- Add a `run_buildifier.py` script that runs on a single file then runs `git add {FILENAME} --renormalize` (watch for git process lock if running in parallel)
- Make the buildifier check optional (https://pre-commit.com/#confining-hooks-to-run-at-certain-stages makes it manual, might prefer a way to just skip it on Windows) and have the CI opt-in
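For reference, a minimal sketch of what the `run_buildifier.py` idea could look like. Nothing like this exists in the repo yet: the function names are made up for illustration, and it assumes `buildifier` and `git` are already on PATH. It covers both list items, since it accepts any number of file paths.

```python
"""Hypothetical run_buildifier.py: format Bazel files, then renormalize them."""
import subprocess


def build_commands(files):
    """Return the commands to run, in order, for the given file paths."""
    if not files:
        return []
    return [
        # buildifier rewrites the listed files in place.
        ["buildifier", *files],
        # Re-stage the same files so git reapplies its line-ending rules
        # (e.g. core.autocrlf) instead of reporting each file as modified.
        ["git", "add", "--renormalize", "--", *files],
    ]


def main(argv):
    """Run each command in order; stop at the first failure."""
    for command in build_commands(argv):
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0
```

A hook entry point would call `sys.exit(main(sys.argv[1:]))`. Because the second command touches the git index, the hook would probably need pre-commit's `require_serial: true` setting to avoid the lock contention mentioned above.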
Started looking at mypy on https://github.com/ScottTodd/iree/tree/pre-commit-mypy
Will need to either fix some issues or figure out how to limit how much of the repo it looks at
D:\dev\projects\iree (pre-commit-mypy)
λ pre-commit run --all-files mypy
mypy.....................................................................Failed
- hook id: mypy
- exit code: 2
build_tools\benchmarks\comparisons\common\__init__.py: error: Duplicate module named "common" (also at "build_tools\benchmarks\common\__init__.py")
build_tools\benchmarks\comparisons\common\__init__.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info
build_tools\benchmarks\comparisons\common\__init__.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH
Found 1 error in 1 file (errors prevented further checking)
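To get a sense of how many of these collisions exist before deciding what to exclude, something like the sketch below could help. It only approximates mypy's default path-to-module mapping (walk up while the parent directory has an `__init__.py`) and ignores namespace packages, `--explicit-package-bases`, and MYPYPATH; the function names are made up for illustration.

```python
import os
from collections import defaultdict


def module_name(init_path):
    """Approximate the dotted module name for a package's __init__.py.

    Walks upward while the parent directory is itself a package; the
    first directory without __init__.py is treated as the import root.
    """
    package_dir = os.path.dirname(os.path.abspath(init_path))
    parts = [os.path.basename(package_dir)]
    parent = os.path.dirname(package_dir)
    while (
        parent != os.path.dirname(parent)  # stop at the filesystem root
        and os.path.isfile(os.path.join(parent, "__init__.py"))
    ):
        parts.append(os.path.basename(parent))
        parent = os.path.dirname(parent)
    return ".".join(reversed(parts))


def find_duplicate_packages(root):
    """Map each module name found more than once to its directories."""
    seen = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        if "__init__.py" in filenames:
            name = module_name(os.path.join(dirpath, "__init__.py"))
            seen[name].append(os.path.relpath(dirpath, root))
    return {name: sorted(dirs) for name, dirs in seen.items() if len(dirs) > 1}
```

Running `find_duplicate_packages("build_tools")` from the repo root should list every package name that resolves twice under this approximation, which gives a rough candidate list for `--exclude`.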
Most of the hooks are fast and only run on individual files when relevant. The slowest right now appears to be bazel_to_cmake, taking several seconds. I just set bazel_to_cmake to always run regardless of which files changed since it does its own walk through to discover Bazel and CMake source files. Could speed that up by teaching the python script to run on individual source files (bazel source -> generate cmake, cmake source -> look for corresponding bazel and edit in-place) and then let pre-commit handle finding affected files.
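A rough sketch of the per-file mapping described above, assuming pre-commit passes repo-relative paths with forward slashes. The function name and the `exists` parameter are hypothetical; the real `bazel_to_cmake.py` has no such mode today.

```python
import os
from pathlib import PurePosixPath


def bazel_files_to_convert(changed_paths, exists=os.path.isfile):
    """Map changed BUILD.bazel / CMakeLists.txt paths to BUILD.bazel files.

    - A changed BUILD.bazel is converted directly.
    - A changed CMakeLists.txt is mapped to the BUILD.bazel next to it,
      but only if that file exists (some CMake files have no Bazel twin).
    - Anything else is ignored.
    """
    targets = set()
    for raw in changed_paths:
        path = PurePosixPath(raw)
        if path.name == "BUILD.bazel":
            targets.add(str(path))
        elif path.name == "CMakeLists.txt":
            sibling = str(path.with_name("BUILD.bazel"))
            if exists(sibling):
                targets.add(sibling)
    return sorted(targets)
```

With this, pre-commit would decide which files are affected and the script would only regenerate the directories in the returned list, instead of walking the whole tree on every commit.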
Performance is particularly important when running as a git hook, since `git commit` should take milliseconds, not seconds.
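One way to put numbers on this is to time individual hooks. The helper below is a generic sketch (the name `time_command` is made up), and the hook id in the comment is an assumption about how the hook is named in the config.

```python
import subprocess
import time


def time_command(command, repeats=3):
    """Run `command` `repeats` times; return the fastest wall time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        timings.append(time.perf_counter() - start)
    return min(timings)


# Example (assumes a hook with id "bazel_to_cmake" exists in the config):
# print(time_command(["pre-commit", "run", "bazel_to_cmake", "--all-files"]))
```

Taking the minimum of a few runs filters out one-off noise such as pre-commit setting up a hook environment on first use.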
Got a few devs specifically request getting the buildifier check running seamlessly (all platforms, ideally no manual download) here on Discord.
Can also check if there is an automatic deps adder we can use in open source. Google had an internal one but it sometimes was wrong or added extra deps (mostly across selects, which we barely use, IIRC). If such a tool / run mode exists we could try having pre-commit run it automatically.
Classic... there are two hooks for running buildifier listed at https://pre-commit.com/hooks.html, both of which assume that buildifier is already installed (despite pre-commit being a package manager), then https://github.com/bazelbuild/buildtools itself uses a third hook: https://github.com/keith/pre-commit-buildifier. That one claims it downloads buildifier, but instead of working like other pre-commit hooks and building from source (again - package manager) it uses a bash wrapper script that only supports macOS and Linux 🤦
Except for one last change (#17619) and pytype (which will be replaced with mypy), this migration is complete 🥳
All done! (except for setting up mypy for python type checking)