Meta issue: plugin system

sobolevn opened this issue 2 years ago · comments

I think that the main thing why Flake8 is so popular is its plugin system.
You can find plugins for every possible type of problems and tools.

Right now docs state:

Beyond rule-set parity, ruff suffers from the following limitations vis-à-vis Flake8:

1. Flake8 has a plugin architecture and supports writing custom lint rules.

I propose designing and implementing plugin API.
This way ruff can compete with flake8 in terms of adoption and usability.

Plugin API

I think that there are some flake8 problems that should be fixed, and also some unique challenges that should be addressed.

Explicit "opt-in" for plugins. Right now flake8 suffers from a problem: when you install some tool that ships a flake8 plugin definition, that plugin is automatically enabled due to how setuptools hooks work. I think that all rules must be explicit. So, eslint's explicit plugins: key looks like a better way.
Special "fix" API and tooling: so many typical problems can be solved easily by plugin authors.
Plugin order. Since plugins will change the source code, we must be very strict about what order they run in. Do they run in parallel while checking files?
Plugin configuration: the current way of configuring everything in the [flake8] section can cause conflicts between plugins. Probably, the way eslint does that is better.
Packaging: how to build and package Rust extensions? How to build wheels?
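
To make the opt-in point concrete, here is a minimal Python sketch that separates discovery from activation. It assumes the `flake8.extension` entry-point group that flake8 plugins register under; the `enabled_plugins` helper and the idea of an explicit list coming from configuration are hypothetical illustrations, not an existing ruff or flake8 API.

```python
from __future__ import annotations

from importlib.metadata import entry_points


def discovered_plugins(group: str = "flake8.extension") -> dict[str, str]:
    """Return every installed entry point in `group` as {name: "module:attr"}."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        selected = eps.select(group=group)
    else:  # older versions return a plain dict keyed by group
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


def enabled_plugins(discovered: dict[str, str], opted_in: list[str]) -> dict[str, str]:
    """Hypothetical opt-in step: keep only the plugins the user listed explicitly."""
    missing = sorted(set(opted_in) - set(discovered))
    if missing:
        raise LookupError(f"plugins listed but not installed: {missing}")
    return {name: discovered[name] for name in opted_in}
```

With this split, installing a package only makes a plugin discoverable; nothing runs until its name appears in the explicit list.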

Please, share your ideas and concerns.

Charlie Marsh · Answer 1 · Thu Sep 29 2022 20:11:09 GMT+0800 (China Standard Time)

Strongly agree. Will write up some thoughts on this later!

Charlie Marsh · Answer 2 · Fri Sep 30 2022 04:22:38 GMT+0800 (China Standard Time)

The first big decision is: should plugins be written in Rust? Or in Python? I believe that either could be possible (though I haven't scoped out the work at all), e.g., using pyo3. It may even be easier to support plugins in Python given that loading code dynamically is much easier in a scripted language...

However, I'm partial to requiring plugins be written in Rust. It will lead to a more cohesive codebase, allow us to maintain a focus on performance, and avoid requiring extensive cross-language FFI. I'm open to being convinced otherwise here though.

Here are a few relevant resources on implementing a plugin system for Rust:

One of the main challenges seems to be around the lack of ABI stability in Rust. In many of the above write-ups, they discuss how both the plugins and calling library need to use the same versions of Rust in order to be compatible, which feels like a tall order. (From that perspective, one thing that's interesting to me is: could we compile plugins to WASM?)

Nikita Sobolev · Answer 3 · Fri Sep 30 2022 04:33:12 GMT+0800 (China Standard Time)

I think that instead of Python based plugins, it is better to provide some kind of query language to make easy plugins very easy. Like in https://github.com/hchasestevens/astpath

In my opinion, any complex stuff should be in Rust. This way it can reuse existing APIs and be fast. But, I don't know how many Python developers actually know Rust 🤔

I think another way of dealing with it is to ask existing flake8 plugin authors about their preferred way of writing it. Their feedback would be very valuable!

Charlie Marsh · Answer 4 · Fri Sep 30 2022 07:08:47 GMT+0800 (China Standard Time)

Interesting, Fixit / LibCST has something kind of like that too. It's not quite a distinct query language, but it's effectively a DSL (in Python) to pattern-match against AST patterns.
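
As a rough illustration of what a pattern-matching DSL over the AST could look like, here is a minimal sketch built on Python's standard `ast` module. The pattern format (`type` plus a `where` mapping) and the names `matches`, `find` and `NO_PRINT` are made up for this example; this is not how astpath, Fixit or LibCST express their matchers.

```python
import ast


def matches(node: ast.AST, pattern: dict) -> bool:
    """True if `node` has the named type and every listed attribute matches."""
    if type(node).__name__ != pattern["type"]:
        return False
    for attr, expected in pattern.get("where", {}).items():
        actual = getattr(node, attr, None)
        if isinstance(expected, dict):
            # Nested pattern: the attribute must itself be a matching node.
            if not isinstance(actual, ast.AST) or not matches(actual, expected):
                return False
        elif actual != expected:
            return False
    return True


def find(source: str, pattern: dict) -> list:
    """Sorted line numbers of all nodes in `source` that match `pattern`."""
    tree = ast.parse(source)
    return sorted(getattr(n, "lineno", 0) for n in ast.walk(tree) if matches(n, pattern))


# A declarative rule: flag every call to the bare name `print`.
NO_PRINT = {
    "type": "Call",
    "where": {"func": {"type": "Name", "where": {"id": "print"}}},
}
```

Because the rule is plain data, it could equally be loaded from YAML or TOML, which is roughly the appeal of the query-language approach for easy plugins.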

Josh Cannon · Answer 5 · Tue Oct 25 2022 23:25:35 GMT+0800 (China Standard Time)

My 2c. flake8's plugin support is pretty rudimentary (operates on tokens/lines plus a handful of metadata). Therefore if you supported Python plugins, you likely could craft it in a way that supporting flake8 plugins out-of-the-box(-ish) would be feasible.

Then you could have most flake8 plugins available through ruff, and wouldn't need to support new plugins by porting the code to Rust (only if you want to because yummy yummy perf).

(lightly related to #414)

Charlie Marsh · Answer 6 · Thu Oct 27 2022 20:21:10 GMT+0800 (China Standard Time)

Another idea: we could build a plug-in system atop https://github.com/ast-grep/ast-grep. This would allow users to express lint rules in YAML or via a simple DSL.

Charlie Marsh · Answer 7 · Thu Oct 27 2022 20:29:08 GMT+0800 (China Standard Time)

(That tool is itself built atop tree-sitter.)

Sigurd Ljødal · Answer 8 · Wed Nov 09 2022 05:18:18 GMT+0800 (China Standard Time)

I’m coming to this from a position of having written a flake8 plugin for a very specific need at work, and as part of a larger project. This is not something generic, so it’d never make sense as a built-in feature in ruff. I’d love a way to write plugins to ruff in Python, mostly because it’s convenient as someone familiar with Python and not so much rust, but also because it would be nice to keep a project in pure Python even while interacting with rust.

The specific plugin in my case, oida, was first written as a standalone thing before I discovered how easy it was to add it as a plugin to flake8. It also uses LibCST, for its ability to round trip code, where we do codemodding for its. If it would be possible to expose a similar ast based Python-interface for plugins that would be awesome.

I also have use cases where I’d like to do auto fixing, which it would also be nice to support. The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or ast nodes and the path to the file so I can locate it in relation to other files on the system.

I understand that writing plugins in Python would be a slowdown compared to writing them in rust, but I think that tradeoff would be very much acceptable in many cases.

Charlie Marsh · Answer 9 · Wed Nov 09 2022 05:28:20 GMT+0800 (China Standard Time)

Very helpful and all makes sense.

Maybe just as another data point for the thread: when I was at Spring Discovery, we wrote a few Flake8 plugins to enforce highly codebase-specific rules.

For example:

"Always late-import TensorFlow" (i.e., import it within a function that depends on it, rather than at module top-level)
"If you ever import module X, make sure the file also imported module Y"
"Imports to module Z should always use import from structure"
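
For a sense of how small such a codebase-specific check can be, here is a minimal sketch of the "always late-import" rule using the standard `ast` module. The function name `top_level_imports` is made up for this example, and the sketch only looks at statements directly in the module body, so imports nested in a top-level `if` or `try` are not reported.

```python
import ast


def top_level_imports(source: str, banned: str = "tensorflow") -> list:
    """Line numbers of module-level imports of `banned` or its submodules."""

    def is_banned(name: str) -> bool:
        return name == banned or name.startswith(banned + ".")

    lines = []
    for stmt in ast.parse(source).body:  # module-level statements only
        if isinstance(stmt, ast.Import):
            if any(is_banned(alias.name) for alias in stmt.names):
                lines.append(stmt.lineno)
        elif isinstance(stmt, ast.ImportFrom):
            # level == 0 means an absolute import such as `from tensorflow import x`
            if stmt.level == 0 and stmt.module and is_banned(stmt.module):
                lines.append(stmt.lineno)
    return lines
```

Imports inside a function body are never visited, so late imports pass the check by construction.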

Charlie Marsh · Answer 10 · Wed Nov 09 2022 05:29:19 GMT+0800 (China Standard Time)

So in that light, I think there are different categories of plugins:

Plugins that are custom to a codebase
Plugins that may apply to many codebases, but don't make sense to include directly in Ruff (e.g., Django-specific stuff could qualify here)

Charlie Marsh · Answer 11 · Wed Nov 09 2022 05:30:00 GMT+0800 (China Standard Time)

I think most of those "custom" plugins / checks could be built atop something like ast-grep, but more complex checks (like rewriting absolute and relative imports) would be limited by that approach.

Charlie Marsh · Answer 12 · Wed Nov 09 2022 05:30:20 GMT+0800 (China Standard Time)

The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or ast nodes and the path to the file so I can locate it in relation to other files on the system.

(Separately: this could arguably make sense to include in Ruff directly.)

Sigurd Ljødal · Answer 13 · Wed Nov 09 2022 06:03:35 GMT+0800 (China Standard Time)

I think most of those "custom" plugins / checks could be built atop something like ast-grep, but more complex checks (like rewriting absolute and relative imports) would be limited by that approach.

Yeah, I wouldn’t really be able to implement any of Oida using ast-grep, as all the rules depend on the context of the project. I use in-process caching to keep that state ready between files in the current flake8 plugin btw, forgot to mention that above, so the flake8 interface isn’t ideal for that kind of plugin.

The first thing I’d like to do is normalize import statements (relative vs absolute). In order to do that I’d need an interface where I get import statements or ast nodes and the path to the file so I can locate it in relation to other files on the system.

(Separately: this could arguably make sense to include in Ruff directly.)

I guess some rules could be, again not for my specific case. What we’re considering at work is to enforce relative imports within a Django app and use absolute imports for everything else. Our structure will be project.component.app or project.app, so it would be very specific to our use case how that rule should be applied. I’ve already played with implementing it in isort, but I found that code base hard to navigate and would love a clean ast/cst based plugin interface where I could add this logic :)
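
The interface described here (import nodes plus the file path) is enough to work out what a relative import points at. A minimal sketch, assuming the path is given relative to the import root and using the standard `importlib.util.resolve_name`; the helper names `package_of` and `absolute_targets` are made up for illustration:

```python
import ast
from importlib.util import resolve_name
from pathlib import PurePosixPath


def package_of(path: str) -> str:
    """Dotted package name for a file path relative to the import root."""
    return ".".join(PurePosixPath(path).parent.parts)


def absolute_targets(source: str, path: str) -> list:
    """(line, absolute module) for every relative `from ... import` in `source`."""
    package = package_of(path)
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            relative = "." * node.level + (node.module or "")
            found.append((node.lineno, resolve_name(relative, package)))
    return sorted(found)
```

A project-specific rule (for example, relative inside one Django app and absolute elsewhere) could then compare each resolved module against the current file's package and decide which form to require.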

Peter Cock · Answer 14 · Tue Nov 15 2022 04:30:58 GMT+0800 (China Standard Time)

You asked for feedback from other flake8 plugin authors, so:

https://github.com/peterjc/flake8-black (620k downloads/month on PyPI), not needed if you can run black directly as well as running flake8, for example via the tool pre-commit or otherwise. Currently this reloads each Python file from disk (scope here to refactor to let black use its cache), it would not be possible to use the AST from flake8 directly. Does not make sense to plug into ruff.
https://github.com/peterjc/flake8-rst-docstrings/ (238k downloads/month on PyPI), uses the AST to extract docstrings, which are passed as strings to the Python library docutils to be validated as RST. My code is essentially a wrapper, and since docutils is written in Python that would have to be used internally if this plugin were to be ported to ruff.
https://github.com/peterjc/flake8-sfs (15k downloads/month on PyPI), uses the AST directly looking for particular kinds of node. Probably could be done in either Python or Rust, although unlikely to be popular enough to deserve including in ruff itself.

Charlie Marsh · Answer 15 · Tue Nov 15 2022 10:35:00 GMT+0800 (China Standard Time)

Thank you @peterjc! Really appreciate your engagement here as a plugin author!

(Regarding RST: it looks like there's at least one Rust crate for parsing RST, though it doesn't look super popular.)

Ofek Lev · Answer 16 · Tue Nov 15 2022 11:00:16 GMT+0800 (China Standard Time)

Is this possible, or supported currently? https://github.com/adamchainz/flake8-tidy-imports

Charlie Marsh · Answer 17 · Thu Nov 17 2022 23:07:07 GMT+0800 (China Standard Time)

@ofek - Not currently supported but it’s a pretty small surface area so should be easy to add some time in the next few days.

Ofek Lev · Answer 18 · Thu Nov 17 2022 23:21:16 GMT+0800 (China Standard Time)

Thanks! I've been enforcing absolute imports recently (except in tests) https://github.com/pypa/hatch/blob/b0911bb0eaa8d331c24eda940b97bf244ecd5ac3/.flake8#L8-L11

After that I'll switch over, and make new projects generated by Hatch use this.

Charlie Marsh · Answer 19 · Thu Nov 17 2022 23:39:08 GMT+0800 (China Standard Time)

Sweet! The banned relative import rule I can definitely do today.

Charlie Marsh · Answer 20 · Fri Nov 18 2022 01:39:51 GMT+0800 (China Standard Time)

@ofek -- I252 (banned relative imports) just went out in v0.0.125.

You can use it in Hatch by adding this to your pyproject.toml:

[tool.ruff]
select = [
  "B",
  "C",
  "E",
  "F",
  "W",
  # Ruff doesn't have this, but it does have E722.
  # "B001",
  "B003",
  "B006",
  "B007",
  # These don't exist in newer flake8-bugbear versions IIUC.
  # "B301",
  # "B305",
  # "B306",
  # "B902",
  "Q000",
  "Q001",
  "Q002",
  "Q003",
  "I252",
]
ignore = [
  "B027",
  # "E203",
  # "E722",
  # "W503",
]
line-length = 120
# tests can use relative imports
per-file-ignores = {"tests/*" = ["I252"], "tests/**/*" = ["I252"]}

[tool.ruff.flake8-tidy-imports]
ban-relative-imports = "all"

Let me know if it works, or doesn't! :)

Ofek Lev · Answer 21 · Mon Nov 21 2022 01:55:03 GMT+0800 (China Standard Time)

Thank you!!! pypa/hatch#607

Sigurd Ljødal · Answer 22 · Tue Nov 29 2022 01:09:42 GMT+0800 (China Standard Time)

@charliermarsh You wrote somewhere that libcst is significantly slower than the current ast implementation in ruff (can't find it right now). Do you know why? Is it because it's a cst or is it because the classes it exposes are Python "compatible"?

I'm asking because I've started looking into pyo3 and from what I see the only way to expose an ast to a Python plugin would be to make the ast classes Python classes in pyo3. If that's what's slow with libcst I guess there's not really any point in investigating that route too much, but if we could make that fast enough I guess it could be one way to make plugins work.

That doesn't resolve auto-fixing, but as I suggested in another thread I think maybe doing auto-fixing on the token level could be made to work. Maybe an interface like this:

import ast

def visit_Import(node: ast.Import, tokens: list[str]) -> list[str]:
    # Check the ast (or tokens) for violations and return the updated tokens
    return ["import", " ", "foo"]

Or maybe have tokens as an attribute on the ast nodes 🤔
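
To show how a host could drive a hook of that shape, here is a minimal sketch. It is only an illustration: the "tokens" are a crude whitespace split rather than real lexer tokens, the example hook just collapses extra whitespace, `apply_fixes` is a made-up name, and the offset arithmetic assumes ASCII source (`ast` column offsets are UTF-8 byte offsets).

```python
import ast
import re


def visit_Import(node: ast.Import, tokens: list) -> list:
    """Example plugin hook: collapse each whitespace run to a single space."""
    return [" " if tok.isspace() else tok for tok in tokens]


def apply_fixes(source: str) -> str:
    """Call `visit_Import` for every import statement and splice in the result."""
    # Start offset of each line, so (lineno, col) pairs map to string indices.
    starts = [0]
    for line in source.splitlines(keepends=True):
        starts.append(starts[-1] + len(line))

    imports = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Import)]
    # Rewrite from the end of the file so earlier offsets stay valid.
    for node in sorted(imports, key=lambda n: (n.lineno, n.col_offset), reverse=True):
        begin = starts[node.lineno - 1] + node.col_offset
        end = starts[node.end_lineno - 1] + node.end_col_offset
        tokens = re.findall(r"\s+|\S+", source[begin:end])
        source = source[:begin] + "".join(visit_Import(node, tokens)) + source[end:]
    return source
```

Working on token lists keeps everything outside the visited statement byte-for-byte intact, which is the main attraction of fixing at the token level rather than regenerating code from the AST.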

Charlie Marsh · Answer 23 · Tue Nov 29 2022 02:48:48 GMT+0800 (China Standard Time)

@ljodal - This was all based on LibCST as a Rust crate, with no Python FFI -- so I think it's just the CST and parser, and not anything to do with the serialization. (I also hacked in some RustPython vs. LibCST benchmarks into the existing LibCST benches and got similar results. As with all benchmarking, though, I could definitely be doing something wrong!)

Charlie Marsh · Answer 24 · Tue Nov 29 2022 02:49:20 GMT+0800 (China Standard Time)

@ljodal - I don't have great intuition for whether the PyO3 FFI would add much overhead and what the performance impact would be. I think it's worth exploring!

Sigurd Ljødal · Answer 25 · Tue Nov 29 2022 16:30:22 GMT+0800 (China Standard Time)

Aight, then I'll continue investigating :)

I haven't written any rust before, so it's slow going (thinking of doing advent of code in rust to get a kickstart). My plan was to use the Python ASDL definitions to generate AST classes, but it's been years since last I touched compilers so I'll have to see how I go about the tokenization and conversion to ast

Charlie Marsh · Answer 26 · Wed Nov 30 2022 05:12:42 GMT+0800 (China Standard Time)

You might be interested in some of the stuff in RustPython -- they generate the Rust AST definitions from an ASDL file here and here.

Sigurd Ljødal · Answer 27 · Wed Nov 30 2022 06:00:47 GMT+0800 (China Standard Time)

Nice, thanks for the links, I'll take a look :) I guess if I base it on that and just tweak it to be Python compatible it should be fairly easy to see the overhead as well

Steve Dignam · Answer 28 · Wed Jan 04 2023 10:42:55 GMT+0800 (China Standard Time)

Another idea: we could build a plug-in system atop https://github.com/ast-grep/ast-grep. This would allow users to express lint rules in YAML or via a simple DSL.

something like this would be useful to replicate eslint's https://eslint.org/docs/latest/rules/no-restricted-syntax which is handy for little one off things

Ryan Morshead · Answer 29 · Wed Jan 11 2023 09:35:41 GMT+0800 (China Standard Time)

As a flake8 plugin author myself who is not presently acquainted with Rust, I think it's important that Ruff's plugin system support plugins implemented in Python. While I understand the benefits of simplicity and performance that would come with requiring plugins to be implemented in Rust, I suspect that the eventual ecosystem of Ruff plugins would ultimately suffer for it. I think this because Ruff is a linter for Python and as such, many, and probably a significant majority of its users, will come to Ruff not knowing Rust. As a consequence, if only Rust plugins are supported, when Ruff users who are not familiar with Rust find a need or idea for a new plugin, in the case of the former they may switch to Flake8, and in the case of the latter they may do the same or just not develop a plugin at all. Chances are, prospective plugin authors care first about having something that works and only second about whether or not it is performant. Thus, it seems prudent to support the tools that Ruff's future plugin authors likely already know. That being Python.

My own personal anecdote is that I'm the author of a Flake8 plugin for another one of my projects. Both these projects are small in scale at the moment, so take this with a grain of salt, but I think it's unlikely that I would author a plugin for Ruff if I had to do it in Rust unless and until Ruff became more popular than Flake8. While I like the idea of learning Rust in order to port my plugin, I just don't think I could justify taking the time to do so. Rather, I would prefer to maintain a generic set of linting heuristics that I could use in both my Flake8 plugin and a potential Ruff plugin. Doing so would both save on maintenance effort and allow users to pick the linting tool of their choice. While my plugin and project don't have many users, I expect that my rationale would probably be even more applicable to large projects with maintainers who have even less time to spend supporting what is, at present, a fairly niche linting tool like Ruff.

With all this said, if Ruff supported plugins in Python and Rust, I think Ruff plugins could be a gateway to learn Rust (if an unlikely one) since authors of Python plugins like myself may find a need or demand for performance in the future.

Sondre Lillebø Gundersen · Answer 30 · Fri Jan 20 2023 00:38:08 GMT+0800 (China Standard Time)

I maintain a plugin that I would like to rewrite in rust, to be able to run it with ruff.

I just wanted to ask, what happens to flake8-plugins ported to ruff that need significant upgrading/maintenance for, e.g., new python versions; who has the maintenance burden once it has been ported? 🙂

Would it be possible to port it to ruff (if deemed generally useful), then split out as a plugin if/once the architecture for that is in place?

Charlie Marsh · Answer 31 · Fri Jan 20 2023 00:52:09 GMT+0800 (China Standard Time)

Oh rad! It looks like a great plugin -- a bunch of people have asked for it (#1785), I was actually gonna look into porting it soon myself given all the demand. But if you're up for owning the port, and / or willing to help out, that would be amazing, and I'd be happy to support you however I can.

Who has the maintenance burden once it has been ported?

This is a great question. Anything that we merge into Ruff (and that stays in Ruff), I'm signing up to be backstop maintainer. If other contributors are able to help maintain, merge in improvements, fix bugs, and support those bigger upgrades, it's obviously much appreciated and welcome. But I'm not merging in plugins with the expectation that others are required to maintain them in perpetuity.

(This means there are some limits on what we'd merge into Ruff. I haven't defined those yet since, frankly, I haven't seen anything proposed yet that feels out of place.)

A little off-topic, but it might be useful to note that, because of the "bundling" that we're doing with Ruff, plugins can actually share a bunch of functionality, which in some ways makes them easier to maintain as a group. For example, pydocstyle and flake8-unused-arguments both rely on the ability to determine whether a function or method is public or private. In Ruff, we can use a single mechanism for that tracking, and share the inferred visibility with the individual rules. As another example, I'd hope that some of the stuff we have around import tracking and annotation detection would be useful for flake8-type-checking.

Would it be possible to port it to ruff (if deemed generally useful), then split out as a plugin if/once the architecture for that is in place?

Yes, absolutely.
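
The "bundling" point about shared functionality can be sketched in a few lines of Python: one analysis pass computes visibility, and two otherwise unrelated rules consume it. This is a deliberately simplified illustration. The names are made up, "public" here just means "no leading underscore", and the unused-argument rule is a hypothetical variant that only reports on public functions; none of this mirrors Ruff's actual Rust internals.

```python
import ast


def functions(tree: ast.AST) -> list:
    """All function definitions in the tree, sync or async."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef)
    return [n for n in ast.walk(tree) if isinstance(n, kinds)]


def function_visibility(tree: ast.AST) -> dict:
    """Shared analysis: map each function name to True if it looks public."""
    return {f.name: not f.name.startswith("_") for f in functions(tree)}


def missing_docstrings(tree: ast.AST, visibility: dict) -> list:
    """Docstring rule: public functions without a docstring."""
    return sorted(
        f.name for f in functions(tree)
        if visibility[f.name] and ast.get_docstring(f) is None
    )


def unused_arguments(tree: ast.AST, visibility: dict) -> list:
    """Argument rule: positional arguments never read inside a public function."""
    found = []
    for f in functions(tree):
        if not visibility[f.name]:
            continue  # this sketch skips private functions
        used = {n.id for n in ast.walk(f) if isinstance(n, ast.Name)}
        found += [(f.name, a.arg) for a in f.args.args if a.arg not in used]
    return sorted(found)
```

Both rules receive the same `visibility` mapping, so the definition of "public" lives in exactly one place.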

Sondre Lillebø Gundersen · Answer 32 · Fri Jan 20 2023 03:09:07 GMT+0800 (China Standard Time)

Ah how cool! Glad I stumbled onto this then 🥳 I'll direct remaining questions to that issue 👍

Ville Skyttä · Answer 33 · Tue Jan 24 2023 05:36:07 GMT+0800 (China Standard Time)

When I found out about ruff last week, two things resonated with me immediately:

The uncanny performance, obviously 🚀
No need to look for a plethora of $linter plugins, and spend time keeping them working e.g. across breaking API changes, Python versions, etc.

I realize there are good use cases for plugins (specific/private use cases perhaps the best one), but the fewer of them the merrier, as long as their functionality is available and the ecosystem thrives, all corners get the love they need, things end up in hands of users as quickly as they should etc. There are limits to how far that can scale, and not all itches can ever be scratched, but for the time being people working on ruff seem to be doing an awesome job on this front, too.

I'm not arguing against a plugin system, but rather just hoping that the proliferation of plugins there is in the traditional Python linter scene doesn't repeat itself here. Having read the above commentary, I've no reason to think it will 👍

Yak · Answer 34 · Tue Jan 24 2023 17:43:47 GMT+0800 (China Standard Time)

2. No need to look for a plethora of $linter plugins, and spend time keeping them working e.g. across breaking API changes, Python versions, etc.

Yes! This is part of what makes ruff an amazing project imo. You can just replace so many dependencies (and their sub-dependencies) with one self-contained package that already has (almost) all the rules you need available.

Matthew Gamble · Answer 35 · Wed Jan 25 2023 09:12:04 GMT+0800 (China Standard Time)

In the Javascript world, when using eslint, there are a myriad of plugins to support various use cases. Some of those are about enabling Typescript support, some about integrating prettier, some about strictness, etc etc

There is also a tool named xo that essentially wraps all of this up into a set of opinionated defaults, that automatically configures all of the above for you. I've started using this in my Typescript projects, and am loving it.

I would personally prefer to see an xo-style tool for python that wraps flake8, black and isort, but still provides all of the power of those things. People are inevitably going to want custom plugins for various reasons, and a good tool shouldn't block that IMO. I also don't want to be beholden to the decisions of a single dev team to decide what linting rules I should and shouldn't have, who also has to weigh up whether or not it's worth breaking people's builds to introduce some new checks (once things have stabilised a bit).

Unfortunately the big blocker for this happening with flake8 right now is the lack of a good configuration system. Eslint's config system is very powerful and supports presets. This empowers people to wrap it effectively without reducing the power of the underlying system.

Charlie Marsh · Answer 36 · Wed Jan 25 2023 10:51:45 GMT+0800 (China Standard Time)

I appreciate all the input here :)

I'll just chime in briefly to make a few quick comments:

I do see the "bundling" that's happening in Ruff as a feature and not a bug. The ability to replace multiple tools with one unified tool has resonated a lot with users! That makes me happy and has led me to continue pushing in that direction.
- (There are some "economies of scale" here too: if two rules need a notion of public / private visibility tracking, or docstring detection, they can share that logic. This wouldn't be impossible to achieve with a plugin system, but it's certainly harder to anticipate all the data a plugin might need.)
I would still like to support third-party plugins in some form.
- Broadly, I could see two APIs for plugins: (1) some kind of DSL, as in ast-grep, that doesn't require writing code, but is more limited in what you can flag; (2) a Rust and/or Python API, similar to the plugin systems you see in tools like ESLint, Babel, Rollup, etc.
However, given the success we've had with our current approach, plugins aren't a top priority for me right now.
- Introducing a plugin system will also put a lot of constraints on the project. As soon as we have non-Ruff code that depends on any sort of public API, we'll be far more limited in the kinds of changes and refactors we can make. So I don't even want to start hacking on a plugin system until we're confident that the API won't see significant churn.

All this is to say that I would like to support plugins at some point, but it likely won't happen soon enough for me to put any sort of timeline on it.

Matthew Brown · Answer 37 · Mon Mar 06 2023 13:52:20 GMT+0800 (China Standard Time)

Offering input from someone who doesn't use Ruff or Python, but who has rewritten a static analysis tool (with a large plugin ecosystem) in Rust that was originally in an interpreted language, I think you should stick with Rust as much as possible for plugins.

The path I took (wrapping the Rust tool in a custom wrapper that uses documented plugin hooks) probably doesn't make sense for a tool with this large a community, because it would require users to recompile ruff.

SWC uses an alternative system — plugins are written in Rust and run using WASM, which is slightly slower but not as slow as firing up the Python interpreter. Colleagues have written one or two internal SWC plugins, allowing us to migrate away from slower Node-based tools.

Michael Bayer · Answer 38 · Wed Mar 08 2023 03:30:15 GMT+0800 (China Standard Time)

Ideas department:

RustPython is a full Python interpreter written in Rust and is embeddable in Rust programs. Ruff could optionally embed this interpreter in order to use outlier flake8 plugins. If no such plugins are needed then the interpreter wouldn't be loaded.

I think being able to use existing flake8 plugins as is, as well as to be able to write plugins in Python in general, would be a big win.

Sean Mackesey · Answer 39 · Mon Apr 10 2023 21:39:52 GMT+0800 (China Standard Time)

Just chiming in here to say that over at Dagster the ability to write custom ruff lint rules would be extremely useful. Would also be fine if it required rust.

Christopher Bailey · Answer 40 · Tue May 02 2023 14:47:36 GMT+0800 (China Standard Time)

I definitely think support for Python based plug-ins is critical. Probably more important than Rust plugin support (at least for prioritization and initial support, not saying never implement Rust support).

The repo / open source community has already shown that the tool itself is willing to accept a ton of popular plugin linting rules. So if the rules are open source, they can be written in Rust and added right into the repo without a plugin.

However, there are going to be a lot of rules that are not open source or popular. Especially ones written by orgs. The bar to entry here probably needs to be as low as possible. What if the org/devs do not know Rust? They know Python for sure since this is a Python linter.

Philipp A. · Answer 41 · Mon Jun 19 2023 17:12:53 GMT+0800 (China Standard Time)

peterjc/flake8-rst-docstrings (238k downloads/month on PyPI), uses the AST to extract docstrings, which are passed as strings to the Python library docutils to be validated as RST. My code is essentially a wrapper, and since docutils is written in Python that would have to be used internally if this plugin were to be ported to ruff.

(Regarding RST: it looks like there's at least one Rust crate for parsing RST, though it doesn't look super popular.)

Yeah, my crate hasn’t received much love, which I think is due to rST in general not receiving much love, especially outside the python community.

I’m not sure if it would be up to the task of validating rST grammar, since there is no formal rST grammar, so I had to do an ad-hoc one, which is not finished. My crate shares that issue with other (equally partial) rST implementations.

Henning · Answer 42 · Fri Jul 28 2023 03:03:19 GMT+0800 (China Standard Time)

Hello, the following tutorial about ruff mentions some plugin mechanism, which is not documented anywhere else:

https://dev.to/ken_mwaura1/enhancing-python-code-quality-a-comprehensive-guide-to-linting-with-ruff-3d6g#creating-custom-linting-rules-in-ruff

Can anyone confirm or deny this feature? I doubt that this issue is outdated.

Charlie Marsh · Answer 43 · Fri Jul 28 2023 03:09:13 GMT+0800 (China Standard Time)

I can confirm that the mentioned feature doesn't exist. I'm not sure where that code snippet came from, but we don't have a Python API or support custom rules like that.

Ryan Morshead · Answer 44 · Fri Jul 28 2023 05:45:38 GMT+0800 (China Standard Time)

Seems like something an AI would generate.

Philipp A. · Answer 45 · Fri Jul 28 2023 16:23:00 GMT+0800 (China Standard Time)

This reddit thread has a few approaches for plugins: https://www.reddit.com/r/rust/comments/144zmwk/how_can_i_add_dynamic_loading_to_do_plugins_for/

The options are basically

Rust-to-Rust by using an ABI stable type crate
Some other language interface like WASM or C
- There’s wrappers like Extism for WASM
Some embedded scripting language like Lua or one of the Rust ones like Rhai or dyon

Predrag Gruevski · Answer 46 · Fri Jul 28 2023 23:56:24 GMT+0800 (China Standard Time)

Another option is to embed just a query engine, as a lighter-weight option than a full scripting language. I've built such an engine (Trustfall) and several linters like this (cargo-semver-checks for linting semver in Rust, company-internal linter for internal lints).

The query engine route makes it possible to separate lints (business logic) from implementation details (how is the AST stored under the hood, etc.). Specifying lints as declarative queries makes maintainability and performance optimization easier: cargo-semver-checks recently became 2000x faster without any changes to lints.

It also makes it easy to spin up a playground without worrying about someone injecting eval(). Here's a playground where you can query the contents of popular Rust crates as well as internals like std and core: https://play.predr.ag/rustdoc

The oxc Javascript linter also recently adopted Trustfall as an approach for both internal and custom lints. Here's an example Trustfall-based lint from oxc: https://github.com/web-infra-dev/oxc/pull/627/files#diff-23497bb392f82d20572ed7744d6d0b80061ff480d9e4c913792cf2c911bde5cd

Zanie Blue · Answer 47 · Sat Jul 29 2023 06:18:15 GMT+0800 (China Standard Time)

There is also a lot of interesting discussion about plugins in the Rust ecosystem over at helix-editor/helix#3806

It may be essential for us to allow plugins to be authored in Python, regardless of the machinery between the Python API and our Rust API.

Adam Azarchs · Answer 48 · Tue Sep 26 2023 06:08:24 GMT+0800 (China Standard Time)

Definitely not a mainstream idea but just wanted to throw this out there for consideration: you could consider using starlark for plugins.

The language is essentially a non-Turing-complete subset of python with strong sandboxing and safe multithreaded execution, designed for having an interpreter embedded in another hosting process. There is in fact an interpreter implementation for rust.

In some ways this would provide the "best of both worlds" in allowing you to keep a near-python syntax while avoiding many of the performance and maintenance issues inherent to supporting python plugins. For one thing, you don't need to be ABI-compatible with python, since you'd be self-hosting the interpreter. The main downside would be lack of availability of arbitrary python packages, but from a performance perspective that's probably a good thing.

The main differences between starlark and python:

Top-level variables (including functions) are frozen after import, and are single-assignment. This makes concurrent execution easy, since there can be no mutable state shared between invocations of a function; something that ruff would I think very much want to be able to take advantage of. It also permits certain forms of ahead of time "compilation" and static checking which are impossible to do reliably in a language as dynamic as python, e.g. checking for undefined names during load rather than at runtime on every reference.
No try/except. Errors are always hard failures. This significantly simplifies the runtime and again allows for more "precompilation".
Disallowing of various bug-prone patterns like modification during iteration.
No unbounded loops. for x in y is allowed but no while loops and, by default, no recursion. This is a little awkward at times but allows the runtime to ensure that all starlark programs will eventually terminate.
No OS access out of the box. While the hosting interpreter can expose methods for things like reading files, none is provided by default, meaning it should be safe to run untrusted starlark code. It also prevents plugin authors from "going around" your provided APIs.
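
To make two of the restrictions above concrete, here is a rough sketch in plain Python (not Starlark, and not an existing tool) that uses the standard `ast` module to flag constructs from the list that a Starlark interpreter would reject: `while` loops and `try`/`except`. The helper name `starlark_incompatibilities` is made up for illustration, and the check covers only these two items; it says nothing about freezing, recursion, or sandboxing.

```python
import ast

# Node types for two constructs mentioned above that Starlark leaves out.
# This is an illustrative subset, not a full Starlark conformance check.
DISALLOWED = {
    ast.While: "while loop",
    ast.Try: "try/except",
}


def starlark_incompatibilities(source: str) -> list[tuple[int, str]]:
    """Return (line, description) pairs for constructs Starlark would reject."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        for node_type, description in DISALLOWED.items():
            if isinstance(node, node_type):
                findings.append((node.lineno, description))
    return sorted(findings)


# A hypothetical plugin file: the first function stays within the subset,
# the second uses both disallowed constructs.
PLUGIN_SOURCE = """
def check(names):
    bad = []
    for name in names:
        if name.startswith("_"):
            bad.append(name)
    return bad

def retry(fn):
    while True:
        try:
            return fn()
        except ValueError:
            pass
"""

for line, description in starlark_incompatibilities(PLUGIN_SOURCE):
    print(f"line {line}: {description}")
```

The first function would likely port to Starlark with little change, while the second relies on an unbounded loop and exception handling and would need to be restructured.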

Ultimately, I don't think any of this would provide much of a benefit over and above using WASM for plugins. WASM already enables users to write their code in any language of their choice that supports wasm as a compilation target. However, python is not one of those languages, and as has been pointed out, most ruff users work primarily, if not exclusively, in python, so having something at least near-python may have some value. It still wouldn't enable things like flake8-rst-docstrings delegating out to a python rst-parsing library, but personally I would consider that to be a good thing, as python dependency trees can quickly grow out of control and become difficult to maintain and keep up to date.

Ofek Lev · Answer 49 · Tue Sep 26 2023 06:22:21 GMT+0800 (China Standard Time)

If we do go down the Starlark route, the PyOxidizer project(s) can serve as an extensive example of usage in Rust.

Predrag Gruevski · Answer 50 · Tue Sep 26 2023 06:27:39 GMT+0800 (China Standard Time)

Small update if you might be considering Trustfall: at RustConf last week, @estebank and a few other folks expressed interest in using Trustfall to query Rust HIR as a way to support custom lints for Rust 👀

Gnosnay · Answer 51 · Thu Dec 14 2023 18:00:10 GMT+0800 (China Standard Time)

Learned a lot from this long thread. For now, if we want to define our own lint checks with ruff, how should we go about it?
If anyone can suggest a way, I would really appreciate it.

Miranda Van Minnen · Answer 52 · Wed Feb 14 2024 07:00:52 GMT+0800 (China Standard Time)

It sounds like the jury is still out when it comes to creating custom rules, is that correct? I've seen custom linting rules be a valuable tool when modularizing a monolith, with rules very customized for the codebase you're working in. As far as I can tell, Ruff does not allow you to develop custom rules at this time, so we'd have to run another linter alongside Ruff for that ability.

We just switched our codebase to using Ruff, and are also looking to start modularizing. I'm trying to figure out if I need to choose a second tool alongside Ruff for customizations.

Adam Azarchs · Answer 53 · Wed Feb 14 2024 07:25:37 GMT+0800 (China Standard Time)

I think the main point of contention is not whether it should be allowed but rather how.

IMO if people want to author a plugin in python they should probably use a python-based tool (e.g. flake8 or pylint) to run it. One of the things I like about ruff is that it doesn't have a dependency on a python runtime, and not unrelatedly that it is very fast. A plugin architecture for ruff would be nice, certainly, but I'd advocate for it being either native plugins (e.g. .so libs that can be dynamically loaded into the process and register themselves), WASM, or maybe some kind of DSL (ast-grep was mentioned earlier).

I strongly suspect that a DSL would be sufficient for 80+% of the kind of use cases people are describing where a repo has rules very specific to their code base that wouldn't be sufficiently broadly applicable to upstream. Especially nice about that is that such custom rules could just be included in the pyproject.toml (toml is isomorphic to yaml, though it does get awkward compared to yaml for more deeply nested structures).
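
As a rough illustration of the declarative approach, the sketch below treats custom rules as plain data, of the kind that could be read from a table in pyproject.toml, and evaluates them with Python's standard `ast` module. Everything here is hypothetical: ruff has no such setting, and the rule keys (`code`, `forbid-import`, `message`), the module names, and the function `run_rules` are invented for the example.

```python
import ast

# Hypothetical rule definitions, shaped like data parsed from a TOML table.
# None of these keys exist in ruff; they are assumptions for this sketch.
CUSTOM_RULES = [
    {
        "code": "X001",
        "forbid-import": "billing.internal",
        "message": "import billing's public API instead of billing.internal",
    },
]


def imported_modules(tree: ast.AST):
    """Yield (line, module_name) for every import statement in the tree."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield node.lineno, alias.name
        elif isinstance(node, ast.ImportFrom) and node.module:
            yield node.lineno, node.module


def run_rules(source: str, rules=CUSTOM_RULES) -> list[str]:
    """Apply each declarative rule to the source and return diagnostics."""
    found = []
    for line, module in imported_modules(ast.parse(source)):
        for rule in rules:
            banned = rule["forbid-import"]
            # Match the banned module itself and any of its submodules.
            if module == banned or module.startswith(banned + "."):
                found.append((line, rule["code"], rule["message"]))
    return [f"{line}: {code} {message}" for line, code, message in sorted(found)]


EXAMPLE = """\
import os
from billing.internal.ledger import post_entry
import billing.api
"""

for diagnostic in run_rules(EXAMPLE):
    print(diagnostic)
```

The point of the design is that the rule author only writes the data at the top; the traversal and matching stay inside the host tool, which is what would let a real implementation keep that part in Rust.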

Morgante Pell · Answer 54 · Wed Feb 21 2024 20:02:56 GMT+0800 (China Standard Time)

We've been working with Biome to integrate GritQL as an extension/plugin system and I'd love to offer the same for Ruff. The problem space is similar and I think GritQL provides a few advantages:

Preserve some of the best things about Ruff: no runtime Python dependency, pure Rust, no separate installation steps, etc.
Most custom/codebase-specific rules can be expressed as simple AST-based transforms.
Traversal still happens in pure Rust and, because declarative queries are used, they could be optimized to maintain Ruff's excellent performance

Here are a few examples of how @charliermarsh's earlier custom suggestions could be implemented directly:

Always late import TensorFlow - studio

`import $import` where {$import <: contains `tensorflow`, $import <: not within block()}

If you ever import module X, make sure the file also imported module Y - studio

`import $import` where {
    $import <: contains `moduleX`,
    $program <: not contains `import moduleY`
}

Imports to module Z should always use import from structure - studio

`import $import` where {$import <: contains `module`, $import <: not within `from $_ import $_`}
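
For readers less familiar with GritQL, the first query above roughly corresponds to the following check written against Python's standard `ast` module. This is only an approximation, assuming the query is meant to flag `tensorflow` imports that sit at module level rather than inside a function or other block. The function name `module_level_tensorflow_imports` is made up, the sketch also covers `from tensorflow import ...`, and it is not how GritQL or Ruff would implement the rule.

```python
import ast


def module_level_tensorflow_imports(source: str) -> list[int]:
    """Return line numbers of tensorflow imports placed at module level."""
    lines = []
    # Only look at statements directly in the module body; imports nested in
    # functions, classes, or if-blocks count as "late" and are left alone.
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        if any(name.split(".")[0] == "tensorflow" for name in names):
            lines.append(node.lineno)
    return lines


EXAMPLE = """\
import tensorflow as tf

def train():
    import tensorflow.keras as keras
    return keras
"""

print(module_level_tensorflow_imports(EXAMPLE))
```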

Joberto Diniz · Answer 55 · Tue May 14 2024 20:57:21 GMT+0800 (China Standard Time)

I'm looking to write a few fitness functions by extending ruff linters. I think that would be ideal, and I'd like to use python.

Since ruff does not support plugins, I'm writing these functions as tests that run on CI using pytest, but these specific fitness functions I'm writing are linters, so it would make sense to write them as part of ruff linters.
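
A minimal sketch of that workaround, assuming a fitness function that caps the number of parameters a function may take. The limit, the helper `functions_over_limit`, and the `src` directory are assumptions for illustration. The test function follows pytest's `test_*` naming convention so pytest would collect it, but nothing in the block needs to import pytest.

```python
import ast
from pathlib import Path

MAX_PARAMS = 5  # assumed threshold for this example


def functions_over_limit(source: str, limit: int = MAX_PARAMS) -> list[tuple[int, str, int]]:
    """Return (line, name, param_count) for functions exceeding the limit."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Count positional-only, regular, and keyword-only parameters.
            count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            if count > limit:
                offenders.append((node.lineno, node.name, count))
    return sorted(offenders)


def test_no_function_takes_too_many_parameters():
    """Fitness function: fail CI when any function under src/ has too many parameters."""
    root = Path("src")  # assumed project layout
    problems = []
    for path in root.rglob("*.py"):
        source = path.read_text(encoding="utf-8")
        for line, name, count in functions_over_limit(source):
            problems.append(f"{path}:{line}: {name} takes {count} parameters")
    assert not problems, "\n".join(problems)
```

Keeping the check itself (`functions_over_limit`) separate from the pytest wrapper means the same logic could later be ported to a plugin if ruff gains support for one.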

Jhosman Frias Bravo · Answer 56 · Thu May 23 2024 00:25:03 GMT+0800 (China Standard Time)

@charliermarsh hi! do you think that ruff will consider a plugin system in the short-medium term?

Micha Reiser · Answer 57 · Tue May 28 2024 21:40:03 GMT+0800 (China Standard Time)

Thank you, @morgante, for offering your support to help us build a GritQL-based plugin system.

GritQL is undoubtedly at the top of my mind when it comes to designing a plugin system for Ruff, and I'm following the work in the Biome repository from a distance (but I must admit, not very closely).

It will probably be a while before we evaluate solutions for a plugin system because we're currently in the middle of rewriting Ruff's compiler infrastructure to support multifile analysis (and more ;)). But I'll come back to your offer when we're ready to explore Ruff plugins.