nvimtools / none-ls.nvim

null-ls.nvim reloaded / Use Neovim as a language server to inject LSP diagnostics, code actions, and more via Lua.

UTF-8 characters break diagnostics' underline position

eshepelyuk opened this issue · comments

I recently ran into this while working with cspell.nvim (davidmh/cspell.nvim#32), so I would like to revive this issue from null-ls:

jose-elias-alvarez/null-ls.nvim#1630

Possibly related?

-- assume 1-indexed ranges
local convert_range = function(diagnostic)
    local row = tonumber(diagnostic.row or 1)
    local col = tonumber(diagnostic.col or 1)
    local end_row = tonumber(diagnostic.end_row or row)
    local end_col = tonumber(diagnostic.end_col or 1)
    -- wrap to next line
    if end_row == row and end_col <= col then
        end_row = end_row + 1
        end_col = 1
    end
    return u.range.to_lsp({ row = row, col = col, end_row = end_row, end_col = end_col })
end

local postprocess = function(diagnostic, _, generator)
    local range = convert_range(diagnostic)
    diagnostic.lnum = range["start"].line
    diagnostic.end_lnum = range["end"].line
    diagnostic.col = range["start"].character
    diagnostic.end_col = range["end"].character
    diagnostic.severity = diagnostic.severity or c.get().fallback_severity
    diagnostic.source = diagnostic.source or generator.opts.name or generator.opts.command or "null-ls"
    if diagnostic.filename and not diagnostic.bufnr then
        local bufnr = vim.fn.bufadd(diagnostic.filename)
        diagnostic.bufnr = bufnr
    end
    local user_postprocess = generator.opts.diagnostics_postprocess
    if user_postprocess then
        user_postprocess(diagnostic)
        return
    end
    local formatted = generator and generator.opts.diagnostics_format or c.get().diagnostics_format
    -- avoid unnecessary gsub if using default
    if formatted == "#{m}" then
        return
    end
    formatted = formatted:gsub("#{m}", diagnostic.message)
    formatted = formatted:gsub("#{s}", diagnostic.source)
    formatted = formatted:gsub("#{c}", diagnostic.code or "")
    diagnostic.message = formatted
end

to_lsp = function(range)
    local lsp_range = {
        ["start"] = {
            line = range.row >= 1 and range.row - 1 or 0,
            character = range.col >= 1 and range.col - 1 or 0,
        },
        ["end"] = {
            line = range.end_row >= 1 and range.end_row - 1 or 0,
            character = range.end_col >= 1 and range.end_col - 1 or 0,
        },
    }
    return lsp_range
end

TBH not sure, I haven't dug that deep into the code.
But the issue still exists in none-ls.

I've been looking into this issue. I think the problem comes from a conflict between the way Lua interprets strings with multi-byte characters and the way we pass the col field through the patterns.

For example, counted in characters, the string "· example typox" has a length of 15, but Lua counts the bytes in the string, not the number of printable characters. The "·" takes two bytes in UTF-8, so for the same string Lua returns 16 as the length.
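To make the byte/character difference concrete, here is a minimal sketch in plain Lua (no Neovim APIs needed). The utf8_len helper is hypothetical and written only for this illustration; it counts every byte that is not a UTF-8 continuation byte.

```lua
-- Hypothetical helper (not part of none-ls or cspell.nvim):
-- count UTF-8 characters by skipping continuation bytes (0x80-0xBF).
local function utf8_len(s)
  local count = 0
  for i = 1, #s do
    local b = s:byte(i)
    if b < 0x80 or b >= 0xC0 then
      count = count + 1
    end
  end
  return count
end

-- "· example typox": U+00B7 is encoded as the two bytes 0xC2 0xB7
local line = "\194\183 example typox"

print(#line)          -- 16 (bytes)
print(utf8_len(line)) -- 15 (characters)
```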

The report coming from CSpell also counts only printable characters, so for a file like this:

test.md

* example typox
· example typox

The report will be:

npx cspell --show-suggestions -c cspell.json lint --language-id markdown test.md

1/1 ./test.md 163.45ms X
./test.md:1:11 - Unknown word (typox) Suggestions: [typo, typos, type, typw, tyro]
./test.md:2:11 - Unknown word (typox) Suggestions: [typo, typos, type, typw, tyro]

Both lines have the same column as the start of the unknown word, because CSpell doesn't count bytes when reporting the position of the error.

So when we read the column from the report:

https://github.com/davidmh/cspell.nvim/blob/4a9843fdfc75e26d5518ac750c021850a4ca1098/lua/cspell/diagnostics/parser.lua#L26

We just forward whatever we got from the CSpell report.
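As a rough illustration of the mismatch, the sketch below converts a 1-indexed character column (what CSpell reports) into a 1-indexed byte column (what Lua string functions expect). The char_col_to_byte_col function is a hypothetical name, not something that exists in none-ls or cspell.nvim. Neovim also has built-in helpers such as vim.str_byteindex, but their signatures vary across versions, so this sticks to plain Lua.

```lua
-- Hypothetical helper: map a 1-indexed character column to the 1-indexed
-- byte column where that character starts. Assumes valid UTF-8 input.
local function char_col_to_byte_col(line, char_col)
  local chars = 0
  for i = 1, #line do
    local b = line:byte(i)
    -- every byte outside 0x80-0xBF starts a new character
    if b < 0x80 or b >= 0xC0 then
      chars = chars + 1
      if chars == char_col then
        return i
      end
    end
  end
  -- column lies past the end of the line
  return #line + 1
end

print(char_col_to_byte_col("* example typox", 11))        -- 11
print(char_col_to_byte_col("\194\183 example typox", 11)) -- 12
```

On the ASCII-only line the two columns agree, while on the line starting with "·" the byte column is one higher, which matches the shifted underline.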

The end_col ends up with the correct position because we calculate it with the custom from_quote adapter, which finds the end column programmatically.

To counter that discrepancy, I plan on using the column reported by CSpell only as an index to start looking for the word reported as an error in the end_col function, and mutating the entries table to define the col property in the same function.
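The idea can be sketched roughly as follows. This is not the actual proof of concept; locate_word is a hypothetical name, and the sketch assumes the reported column is never larger than the real byte column, which holds because a character never takes less than one byte. One known limitation: if the same word also occurs between the reported column and its real byte position, the search would pick up that earlier occurrence.

```lua
-- Hypothetical helper: use the column reported by the linter only as the
-- position to start searching from, and return the 1-indexed, inclusive
-- byte range of the word (or nil when it is not found).
local function locate_word(line, word, reported_col)
  -- plain = true disables Lua patterns, so the word is matched literally
  local start_byte, end_byte = line:find(word, reported_col, true)
  return start_byte, end_byte
end

local plain = "* example typox"
local multibyte = "\194\183 example typox" -- "· example typox"

print(locate_word(plain, "typox", 11))     -- 11  15
print(locate_word(multibyte, "typox", 11)) -- 12  16
```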

I have a proof of concept that seems to work as expected. I'll test a few scenarios before I push anything.

IMO, that feels a bit too hacky to keep as a long-term solution. We should look into validating the col property in none-ls, maybe here:

local make_diagnostic = function(entries, defaults, attr_adapters, params, offsets)
    if not entries["message"] then
        return nil
    end
    local content_line = params.content and params.content[tonumber(entries["row"])] or nil
    for attr, adapter in pairs(attr_adapters) do
        entries[attr] = adapter(entries, content_line)
    end
    -- Unset private attributes
    for k, _ in pairs(entries) do
        if k:find("^_") then
            entries[k] = nil
        end
    end
    local diagnostic = vim.tbl_extend("keep", defaults, entries)
    for k, offset in pairs(offsets) do
        diagnostic[k] = diagnostic[k] and diagnostic[k] + offset
    end
    return diagnostic
end

This issue was fixed in #36