Add Language Server Protocol handler to JupyterLab

trungleduc opened this issue · comments


For now, LSP support for JupyterLab is provided by the jupyterlab-lsp extension. While this monorepo offers a complete package with a lot of features, it's not easy for JupyterLab core or external extensions to take advantage of its LSP features.

Proposed Solution

I'm thinking about adding handlers that allow the JupyterLab frontend to communicate with the language server in the backend. This could be done by upstreaming the jupyter_lsp package of jupyterlab-lsp.
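
As a rough illustration of what such a handler has to do when relaying messages to a stdio language server: the LSP base protocol frames each JSON-RPC message with a `Content-Length` header followed by a blank line. The sketch below is not the jupyter_lsp implementation; the function names `encode_lsp_frame` and `decode_lsp_frames` are made up for this example.

```python
import json
from typing import List, Tuple


def encode_lsp_frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message using LSP base-protocol framing."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_lsp_frames(buffer: bytes) -> Tuple[List[dict], bytes]:
    """Extract every complete message from buffer.

    Returns the decoded messages plus any trailing bytes that do not yet
    form a complete frame (to be prepended to the next read).
    """
    messages = []
    while True:
        head_end = buffer.find(b"\r\n\r\n")
        if head_end == -1:
            break  # header not complete yet
        length = None
        for line in buffer[:head_end].decode("ascii").split("\r\n"):
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("missing Content-Length header")
        body_start = head_end + 4
        body_end = body_start + length
        if len(buffer) < body_end:
            break  # body not complete yet
        messages.append(json.loads(buffer[body_start:body_end].decode("utf-8")))
        buffer = buffer[body_end:]
    return messages, buffer
```

A backend handler could call `encode_lsp_frame` on each message received from the frontend before writing it to the server's stdin, and feed the server's stdout through `decode_lsp_frames` before forwarding the results back.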

The second step is to create a frontend extension in JupyterLab core that can serve as an entry point for other extensions to request LSP features.

Thanks @bollwyvl for clarifying the situation! I'm quite new to the subject, so maybe I missed previous discussions about LSP and JupyterLab. Please correct me if I'm wrong.
The LSP kernel idea looks promising. I agree with most of your points, but I still have some questions about abandoning the WebSocket approach.

  • On multiple ws connections: by using another frontend extension as an entry point for the LSP service, I think we only need one ws connection for all language servers. So performance-wise, are the other potential advantages worth the added overhead of being a kernel?
  • Since it's a kernel, will it live outside the core of JupyterLab? Does it defeat the purpose of having an IDE-like experience out of the box for JupyterLab?
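
To make the single-connection idea concrete, here is a minimal sketch of routing messages for several language servers over one channel. It assumes a hypothetical envelope of the form `{"server": ..., "message": ...}`; neither the envelope nor the `LanguageServerRouter` class comes from jupyterlab-lsp.

```python
import json
from typing import Callable, Dict


class LanguageServerRouter:
    """Dispatch messages arriving on one connection to many language servers."""

    def __init__(self) -> None:
        # Maps a server id (e.g. "python") to a callable that delivers
        # an LSP message to that server.
        self._servers: Dict[str, Callable[[dict], None]] = {}

    def register(self, server_id: str, send_to_server: Callable[[dict], None]) -> None:
        self._servers[server_id] = send_to_server

    def route(self, raw: str) -> None:
        """Unwrap the (hypothetical) envelope and forward the inner message."""
        envelope = json.loads(raw)
        server_id = envelope["server"]
        try:
            send = self._servers[server_id]
        except KeyError:
            raise KeyError(f"no language server registered as {server_id!r}")
        send(envelope["message"])
```

With this shape, the frontend opens one WebSocket and tags each outgoing message with the server it is meant for, instead of holding one connection per language server.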

I really see the kernel and LSP protocol as orthogonal:

  • We always strived to make the kernel unaware of the "source" of execution requests. Exposing that it came from a console or a notebook or somewhere else was always considered as an abstraction leak. Kernels are mostly about execution, and not so much about source code.
  • On the other hand, LSP is almost entirely about the source code of your workspace. LSP servers exist completely independently of the ability to execute code (e.g. for compiled languages).

Hence, shoehorning the LSP protocol inside of the kernel protocol seems unintuitive, and it will be hard to reconcile the two, even from a UX perspective:

  • there can be many running Python kernels in one project, but it really makes sense to have a single language server running for Python
  • the special "LSP" kernels would appear in the list of running kernels while they are not really about executing code. I would strongly prefer having another category in the side panel with the currently running language servers.


outside the core of JupyterLab? Does it defeat the purpose

A huge amount of complexity in LSP is also in the language servers, much like the complexity of Jupyter kernel messaging is also in... the kernels. At this point, to parallel ipykernel, I can't even confidently name a Python language server I'd want everyone to have to use.

But, as I suggest, having in-browser language servers would allow us to ship no-foolin' features, without a new server dependency, for kernel-less things like Markdown, CSS, JSON Schema, etc. And if we just happened to get in-browser kernels, I wouldn't be sad either...

unaware of the "source" of execution requests

Be that as it may, the kernel still has knowledge of things very important to the user that might be outside the remit of the source document, e.g. dynamically-defined/side-effect variables, and some of those have useful LSP features associated with them, much as was already demonstrated with DAP (JEP47).

shoehorning the LSP protocol inside of the kernel protocol

As has been raised in a few places, continuing to evolve the existing jupyter kernel message spec to "catch up" with what is already defined in existing LSP features is a mug's game: if, instead, we embrace and extend LSP, we can get a lot of stuff for "free", but can define more of the integration on our terms.

Being able to plug into existing LSP features in this way would require maybe two JEPs:

  • formalizing a list of well-known comm targets
    • widgets, bokeh, etc. would get grandfathered in
  • reserving a comm target for LSP and defining its schema

This sounds better than nickel-and-diming JEPs for each new field/message, which no doubt is what it would take. And encouraging comm implementation would open up more kernels to other schema-constrained, language-agnostic components... like Jupyter Widgets, bokeh documents, etc.
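
For a sense of what a reserved comm target could carry, here is a sketch of the `content` portion of `comm_open` and `comm_msg` messages wrapping a plain LSP JSON-RPC request. The `comm_id`, `target_name`, and `data` fields follow the Jupyter messaging spec; the target name `jupyter.lsp` and both helper functions are hypothetical, since no such target or schema has been defined.

```python
import uuid

# Hypothetical target name; reserving a real one would be the job of a JEP.
LSP_COMM_TARGET = "jupyter.lsp"


def make_comm_open(target_name: str = LSP_COMM_TARGET) -> dict:
    """Build the content of a comm_open message for the LSP target."""
    return {
        "comm_id": uuid.uuid4().hex,
        "target_name": target_name,
        "data": {},
    }


def wrap_lsp_request(comm_id: str, request_id: int, method: str, params: dict) -> dict:
    """Build the content of a comm_msg whose data is a JSON-RPC 2.0 request."""
    return {
        "comm_id": comm_id,
        "data": {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": method,
            "params": params,
        },
    }


# Example: ask for completions at line 0, character 3 of a (made-up) document.
open_content = make_comm_open()
completion_msg = wrap_lsp_request(
    open_content["comm_id"],
    1,
    "textDocument/completion",
    {
        "textDocument": {"uri": "file:///example.py"},
        "position": {"line": 0, "character": 3},
    },
)
```

The point of the sketch is that the LSP payload travels unmodified inside `data`, so new LSP features would not need matching changes to the kernel message spec.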

not really about executing code

I see interactive control of a language server as an extension of get_ipython().set_hook, etc. We don't even know what people could build on top of this. Over on jupyterlab-lsp, as soon as we support TCP servers, I'm for sure excited to try live-coding a language server with e.g. pygls. But I am already steeped in all this stuff. What would an even more interactive flavor of this be like?

Not having to implement a JSON-RPC wire protocol from first principles is a big win, as we already have a life-cycled data object we can own. Indeed, this was what lintotype did: don't like your black line length? Move it with a slider.

And if we did do that once we could imagine going the other way, and offering LSP+Jupyter with a single bridge in the opposite direction.

reconcile the two, even from a UX perspective:

Yep, there is certainly work to be done. We're already having to reconcile data from multiple sources on e.g. completion, and it's harsh. Basically every Jupyter document is polyglot on multiple axes (code/narrative, input/output, natural languages, semantic types), something that a traditional source code document doesn't have to deal with.

But the high road is being able to bring in as many sources as a user wants, such that annotations of code, and eventually outputs, could come from:

  • the kernel
    • a vestigial language server inside the host kernel
  • a dedicated language server (whether inside a proxy kernel or not)
  • static LSIF data
  • software forge (gitlab/github) annotations (e.g. jupyterlab/pull-requests)
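
As a toy illustration of reconciling results from several of these sources, the sketch below merges completion labels and records which sources proposed each one. The function name `merge_completions` and the source names are invented for the example; real completion items carry far more than a label, which is where the hard part lies.

```python
from typing import Dict, List


def merge_completions(sources: Dict[str, List[str]]) -> List[dict]:
    """Merge completion labels from several sources, de-duplicating by label.

    Each merged entry keeps the list of sources that proposed it, in the
    order the sources were given.
    """
    merged: Dict[str, dict] = {}
    for source, labels in sources.items():
        for label in labels:
            entry = merged.setdefault(label, {"label": label, "sources": []})
            if source not in entry["sources"]:
                entry["sources"].append(source)
    return list(merged.values())


suggestions = merge_completions(
    {
        "kernel": ["df", "np"],
        "language_server": ["np", "os"],
    }
)
# "np" appears once, attributed to both the kernel and the language server.
```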