Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ruff server - A new built-in LSP for Ruff, written in Rust #10158

Merged
merged 8 commits into from
Mar 9, 2024

Conversation

snowsignal
Copy link
Contributor

@snowsignal snowsignal commented Feb 29, 2024

Summary

This PR introduces the ruff_server crate and a new ruff server command. ruff_server is a re-implementation of ruff-lsp, written entirely in Rust. It brings significant performance improvements, much tighter integration with Ruff, a foundation for supporting entirely new language server features, and more!

This PR is an early version of ruff_lsp that we're calling the pre-release version. Anyone is more than welcome to use it and submit bug reports for any issues they encounter - we'll have some documentation on how to set it up with a few common editors, and we'll also provide a pre-release VSCode extension for those interested.

This pre-release version supports:

  • Diagnostics for .py files
  • Quick fixes
  • Full-file formatting
  • Range formatting
  • Multiple workspace folders
  • Automatic linter/formatter configuration - taken from any pyproject.toml files in the workspace. (this will be implemented in a follow-up pull request)

Many thanks to @MichaReiser for his proof-of-concept work, which was important groundwork for making this PR possible.

Architectural Decisions

I've made an executive choice to go with lsp-server as a base framework for the LSP, in favor of tower-lsp. There were several reasons for this:

  1. I would like to avoid async in our implementation. LSPs are mostly computationally bound rather than I/O bound, and async adds a lot of complexity to the API, while also making harder to reason about execution order. This leads into the second reason, which is...
  2. Any handlers that mutate state should be blocking and run in the event loop, and the state should be lock-free. This is the approach that rust-analyzer uses (also with the lsp-server/lsp-types crates as a framework), and it gives us assurances about data mutation and execution order. tower-lsp doesn't support this, which has caused some issues around data races and out-of-order handler execution.
  3. In general, I think it makes sense to have tight control over scheduling and the specifics of our implementation, in exchange for a slightly higher up-front cost of writing it ourselves. We'll be able to fine-tune it to our needs and support future LSP features without depending on an upstream maintainer.

Test Plan

The pre-release of ruff_server will have snapshot tests for common document editing scenarios. An expanded test suite is on the roadmap for future version of ruff_server.

@snowsignal snowsignal force-pushed the jane/lsp/server-mvp branch 5 times, most recently from 0cb76b9 to d1c23eb Compare March 1, 2024 05:16
@snowsignal snowsignal marked this pull request as ready for review March 1, 2024 05:17
@snowsignal snowsignal force-pushed the jane/lsp/server-mvp branch 2 times, most recently from a44b169 to 86419e2 Compare March 1, 2024 05:23
@MichaReiser
Copy link
Member

@snowsignal How would you recommend to review this PR? Which parts of the PR are in its final state? Are there parts that you plan to iterate on in future PRs?

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very exciting. I haven't managed to get through everything yet. I plan to continue my review on Monday.

I would appreciate a more in depth explanation of the architecture and why you decided to go with lsp-server over tower-lsp. Documenting this decision will be useful for the future.

Cargo.toml Outdated Show resolved Hide resolved
Cargo.toml Show resolved Hide resolved
crates/ruff/src/args.rs Show resolved Hide resolved
crates/ruff_server/src/edit/document.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/edit/document.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/lib.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/lint.rs Show resolved Hide resolved
crates/ruff_server/src/server.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/schedule/thread/priority.rs Outdated Show resolved Hide resolved
@MichaReiser MichaReiser added the preview Related to preview mode features label Mar 1, 2024
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said before. This is excellent work. I haven't reviewed everything in detail, the PR is too large for that.

I think this work deserves a more in depth explanation of the architecture decisions. I like the simplicity of the handlers and that mutating tasks run on a dedicated thread. However, it comes at the cost that cancellation, the entire scheduling, dispatching, are problems that we now need to solve. That's why I'm interested on hearing your perspective on how to balance these tradeoffs and the complexity of implementing said features (I'm especially interested in cancellation).

I would find some additional comments useful that explain some concepts, especially around scheduling, how requests/notifications work etc. I can see that you have put a lot of thought into it, let's make sure that other contributors are aware of your deliberate design choices.

It might be nice if some of the scheduling work could be tested too. Although I fear that this isn't going to be easy.

_notifier: Notifier,
params: types::DidChangeTextDocumentParams,
) -> Result<()> {
super::define_document_url!(params: &types::DidChangeTextDocumentParams);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand why using a macro is necessary and it isn't obvious what's happening here (or where document_url is coming from). I assume it is because all params have a different shape. How about we define a trait DocumentUrl with a single document_url method (I don't like the trait name, maybe Document?) that we could then pass to the document_controller

Or we remove the macro because the implementation is trivial?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This macro was created as a way to avoid the repeated boilerplate of writing a function that returned &params.text_document.uri from an arbitrary parameter type in lsp-types (a parameter type is a deserialized message payload for requests and notifications). Since these types don't have a shared function to get the Url, even though they all follow the same internal layout, it needs to be a concrete function for each parameter type.

The reason why we need to define this function for each background handler is because we need a generic way to extract the document URL from the parameters of a request/notification, in order to take a session snapshot. Here, though, it's just defined for convenience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. IMO, the macro here seems overkill. The code it generates is minimal and very easy to write by hand. However, it adds significant complexity to readers because they have to open the macro definition or read its documentation to understand what's happening inside.

That's why I recommend removing the macro and instead implementing the trait function by hand (I'm sure code pilot can autocomplete it for you)

crates/ruff_server/src/server/api/requests/code_action.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/api/requests/diagnostic.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/session.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/schedule/thread/pool.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/api.rs Show resolved Hide resolved
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! There is so much here! Really nice work. :-)

Other than the comments I left, I have some holistic feedback too:

  1. It feels like there is a lot of interesting documentation that could be written for a lot of your types that would help clarify some of their behavior. I found that the comments that did exist were more on the "inside baseball" side of the spectrum (e.g., the comments about not implementing a trait method or the comments about why a type isn't just a closure) rather than "here is the contract the caller needs to care about." I like the former. They belong. They help contextualize why code is the way it is. But I'd really like to see more of the latter.
  2. I found it somewhat difficult to review this PR without a "big picture" of how the code is structure. These are tricky docs to write, I'd suggest the first step in doing so is to pick a target audience. For example, for me, I know very little about LSP, but perhaps the architecture docs should assume that knowledge. But maybe not. I'm not sure.

Overall amazing work. I can't believe how fast you got this built!

crates/ruff_server/src/edit.rs Show resolved Hide resolved
crates/ruff_server/src/edit/document.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/edit/document.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/edit/document.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/edit/range.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/api.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/api/traits.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/schedule/thread/pool.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/schedule/thread/priority.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/schedule/thread/priority.rs Outdated Show resolved Hide resolved
libcst = { version = "1.1.0", default-features = false }
log = { version = "0.4.17" }
lsp-server = { version = "0.7.6" }
lsp-types = { version = "0.95.0", features = ["proposed"] }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to use the lsprotocol library (https://github.com/microsoft/lsprotocol/tree/main/packages/rust/lsprotocol) as it'll always be in sync with the latest protocol. The ruff-lsp package also uses the Python version of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll have to make this a follow-up issue - this would be a pretty fundamental re-write.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks great. Thanks for addressing all the feedback.

What's still missing, in my view, is a discussion of the architectural decisions as part of the PR summary. We've covered some of them in our sync meeting but I think it is beneficial to write those down to have a reference for the future. That's the reason why I'm requesting feedback. Once that's done, feel free to merge.

As @BurntSushi pointed out, I think it would be good to have some more conceptual documentation. Maybe the README would be a good start. It doesn't have to be extensive but a brief explanation of the core concept involved.

crates/ruff_server/src/edit/document.rs Outdated Show resolved Hide resolved
encoding,
),
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on this @snowsignal do you plan to follow up on this in a separate PR?

crates/ruff_server/src/server/api/requests/format.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/api/requests/format.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/session.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/session.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/schedule.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server/api/traits.rs Outdated Show resolved Hide resolved
crates/ruff_server/src/server.rs Outdated Show resolved Hide resolved
Copy link
Contributor

github-actions bot commented Mar 9, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+13 -0 violations, +0 -0 fixes in 1 projects; 42 projects unchanged)

pandas-dev/pandas (+13 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ pandas/io/parsers/python_parser.py:1039:47: PLR6201 Use a `set` literal when testing for membership
+ pandas/io/parsers/python_parser.py:1106:9: PLR1702 Too many nested blocks (6 > 5)
+ pandas/io/parsers/python_parser.py:1106:9: PLR1702 Too many nested blocks (6 > 5)
+ pandas/io/parsers/python_parser.py:1208:9: PLR0917 Too many positional arguments (6/5)
+ pandas/io/parsers/python_parser.py:1343:9: PLR6301 Method `_remove_empty_lines` could be a function, class method, or static method
+ pandas/io/parsers/python_parser.py:1358:40: PLC1901 `v == ""` can be simplified to `not v` as an empty string is falsey
+ pandas/io/parsers/python_parser.py:391:9: PLR0914 Too many local variables (25/15)
+ pandas/io/parsers/python_parser.py:399:9: PLR1702 Too many nested blocks (6 > 5)
+ pandas/io/parsers/python_parser.py:399:9: PLR1702 Too many nested blocks (7 > 5)
+ pandas/io/parsers/python_parser.py:449:29: PLC1901 `c == ""` can be simplified to `not c` as an empty string is falsey
+ pandas/io/parsers/python_parser.py:701:9: PLR6301 Method `_is_line_empty` could be a function, class method, or static method
+ pandas/io/parsers/python_parser.py:815:37: PLR6201 Use a `set` literal when testing for membership
+ pandas/io/parsers/python_parser.py:863:9: PLR6301 Method `_remove_empty_lines` could be a function, class method, or static method

Changes by rule (6 rules affected)

code total + violation - violation + fix - fix
PLR1702 4 4 0 0 0
PLR6301 3 3 0 0 0
PLR6201 2 2 0 0 0
PLC1901 2 2 0 0 0
PLR0917 1 1 0 0 0
PLR0914 1 1 0 0 0

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@snowsignal snowsignal merged commit 0c84fbb into main Mar 9, 2024
17 checks passed
@snowsignal snowsignal deleted the jane/lsp/server-mvp branch March 9, 2024 04:57
nkxxll pushed a commit to nkxxll/ruff that referenced this pull request Mar 10, 2024
…sh#10158)

<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR introduces the `ruff_server` crate and a new `ruff server`
command. `ruff_server` is a re-implementation of
[`ruff-lsp`](https://github.com/astral-sh/ruff-lsp), written entirely in
Rust. It brings significant performance improvements, much tighter
integration with Ruff, a foundation for supporting entirely new language
server features, and more!

This PR is an early version of `ruff_lsp` that we're calling the
**pre-release** version. Anyone is more than welcome to use it and
submit bug reports for any issues they encounter - we'll have some
documentation on how to set it up with a few common editors, and we'll
also provide a pre-release VSCode extension for those interested.

This pre-release version supports:
- **Diagnostics for `.py` files**
- **Quick fixes**
- **Full-file formatting**
- **Range formatting**
- **Multiple workspace folders**
- **Automatic linter/formatter configuration** - taken from any
`pyproject.toml` files in the workspace.

Many thanks to @MichaReiser for his [proof-of-concept
work](astral-sh#7262), which was important
groundwork for making this PR possible.

## Architectural Decisions

I've made an executive choice to go with `lsp-server` as a base
framework for the LSP, in favor of `tower-lsp`. There were several
reasons for this:

1. I would like to avoid `async` in our implementation. LSPs are mostly
computationally bound rather than I/O bound, and `async` adds a lot of
complexity to the API, while also making harder to reason about
execution order. This leads into the second reason, which is...
2. Any handlers that mutate state should be blocking and run in the
event loop, and the state should be lock-free. This is the approach that
`rust-analyzer` uses (also with the `lsp-server`/`lsp-types` crates as a
framework), and it gives us assurances about data mutation and execution
order. `tower-lsp` doesn't support this, which has caused some
[issues](ebkalderon/tower-lsp#284) around data
races and out-of-order handler execution.
3. In general, I think it makes sense to have tight control over
scheduling and the specifics of our implementation, in exchange for a
slightly higher up-front cost of writing it ourselves. We'll be able to
fine-tune it to our needs and support future LSP features without
depending on an upstream maintainer.

## Test Plan

The pre-release of `ruff_server` will have snapshot tests for common
document editing scenarios. An expanded test suite is on the roadmap for
future version of `ruff_server`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants