Search dialogue remains at "Initializing search" when search.lang=ja and with tags in md file
xingzipss opened this issue · comments
Contribution guidelines
- I've read the contribution guidelines and wholeheartedly agree
I've found a bug and checked that ...
- ... the problem doesn't occur with the mkdocs or readthedocs themes
- ... the problem persists when all overrides are removed, i.e. custom_dir, extra_javascript and extra_css
- ... the documentation does not mention anything about my problem
- ... there are no open or closed issues that are related to my problem
Description
The search dialogue remains at "Initializing search" when search.lang is set to ja and a Markdown file has tags in its metadata.
Expected behaviour
Search works as usual.
Actual behaviour
Search dialogue remains at "Initializing search"
Steps to reproduce
- Install mkdocs and mkdocs-material, and create a project with "mkdocs new ...".
- Edit docs/index.md as shown here, and mkdocs.yml as shown under Configuration below:
  $ cat docs/index.md
  ---
  tags:
    - t
  ---
  # Welcome to MkDocs
- Start the site with "mkdocs serve" and click the search dialogue in the browser. It remains at "Initializing search", and the browser reports an error.
Package versions
mkdocs: version 1.3.0 from .... Python 3.7
mkdocs-material: Version: 8.3.8
Configuration
site_name: My Docs
theme:
  name: material
  languange: en
plugins:
  - tags
  - search:
      lang: ja
System information
- Operating system:
  $ lsb_release -a
  No LSB modules are available.
  Distributor ID: Debian
  Description: Debian GNU/Linux 10 (buster)
  Release: 10
  Codename: buster
- Browser: firefox 101.0.1
Thanks for reporting. Please provide a minimal reproducible example and attach it here as a zip file.
mkdocs.zip
thanks~
I tried the minimal reproduction steps this morning:
- Make a new mkdocs project.
- Change docs/index.md (add tags at the top) and mkdocs.yml (set search.lang=ja and enable tags) as the zip file shows.
- Run "mkdocs serve", then switch to the browser and search.
If there is no tags information in the md file, or if search.lang=en, the search dialogue still works properly.
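To check which pages of a larger site meet the first condition, a small script can list the Markdown files whose front matter declares tags. This is a minimal sketch, not part of the attached reproduction: the helper names `has_tags` and `pages_with_tags` are hypothetical, and it assumes front matter is a block delimited by `---` lines at the very top of the file.

```python
from pathlib import Path


def has_tags(markdown_text):
    """Return True if the front matter declares a top-level `tags` key."""
    lines = markdown_text.splitlines()
    # Assumption: front matter starts on the first line with '---'.
    if not lines or lines[0].strip() != "---":
        return False
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        if line.startswith("tags:"):
            return True
    return False


def pages_with_tags(docs_dir):
    """List Markdown files under docs_dir whose front matter has tags."""
    docs = Path(docs_dir)
    return sorted(
        str(path.relative_to(docs))
        for path in docs.rglob("*.md")
        if has_tags(path.read_text(encoding="utf-8"))
    )
```

Calling `pages_with_tags("docs")` from the project root would then return the pages that combine with search.lang=ja to trigger the problem described here.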
Thanks for providing the reproduction! Indeed, this seems to be a problem with the Japanese segmenter.
Thanks a lot. Is it possible to fix this here, or do we need to wait for lunr to make some changes?
I'll see if we can work around it.
I got a similar error when using tags: in some .md documents. After removing those tags settings (it's not something that we are using) it works, but probably not everybody can remove them.
My mistake, I found the cause: we migrated from Pelican, so the tags were written separated by commas (,).
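For anyone in the same situation after a migration, the comma-separated value can be converted into a list before it is written back to the page metadata. The following is a rough sketch under stated assumptions: `normalize_tags` and `to_front_matter` are hypothetical helper names, the tags are plain words that need no YAML quoting, and the front matter contains nothing but tags.

```python
def normalize_tags(value):
    """Return tags as a list of clean strings.

    Accepts a list (already in the expected shape) or a single
    comma-separated string such as "python, docs".
    """
    if isinstance(value, str):
        parts = value.split(",")
    else:
        parts = [str(item) for item in value]
    # Drop surrounding whitespace and empty entries (e.g. a trailing comma).
    return [part.strip() for part in parts if part.strip()]


def to_front_matter(tags):
    """Render the tags as a YAML list inside a front matter block."""
    lines = ["---", "tags:"]
    lines += [f"  - {tag}" for tag in tags]
    lines.append("---")
    return "\n".join(lines)


print(to_front_matter(normalize_tags("python, docs, ")))
```

Tags that contain characters with special meaning in YAML (such as `:` or `#`) would need quoting, which this sketch does not handle.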
Okay, so I've taken some time to investigate the issue, and I'm confident that it originates from the tokenizer shipped with the Japanese language support in lunr-languages (our upstream dependency). It ships a custom implementation that seems to be incompatible with lunr's own implementation (also see MihaiValentin/lunr-languages#45 (comment)).
I think you have two options here:
- Create a minimal reproducible example and file the issue upstream
- Switch to Insiders, since we completely rewrote the search, including segmentation for Japanese and Chinese
I rewrote the tokenizer completely from scratch, since lunr's default tokenizer has several problems. If you're interested in how it works, I wrote a blog article about it. I'm afraid this is not back-portable without pulling in the entire source code of the new tokenizer, which is bound to the Carolina Reaper funding goal.
With that said, I'm closing this issue. I understand that switching to Insiders might not be an option for some users, but it's all I can offer right now. Other than that, reporting this upstream might be a good idea nonetheless.
Got it. Thank you for taking the time to investigate this. You're doing an awesome job, and I wish I could afford to switch to Insiders someday.