FaFre / fts5_expose

SQLite extension that exposes internal FTS5 components

fts5_expose

This is a small extension to expose internal FTS5 components of SQLite.

fts5_tokenize()

fts5_tokenize becomes available after loading the library and exposes the internal tokenizer API. All FTS5 tokenizers are supported. The default is the unicode tokenizer (unicode61). The tokenizer can be configured via the second parameter, which is equivalent to the tokenize= option used when creating an FTS5 virtual table.

For more information see https://www.sqlite.org/fts5.html#tokenizers

The return value is a JSON array of the processed tokens. Make sure SQLite is compiled with JSON1 support.

Example:

sqlite> .load 'fts5_expose.so'
sqlite> select fts5_tokenize('hello please tokenize meee');
["hello","please","tokenize","meee"]
Run Time: real 0.001 user 0.000000 sys 0.000284
sqlite> select fts5_tokenize('unicode61', 'remove_diacritics 2', 'ö ü ä ß');
["o","u","a","ß"]
Run Time: real 0.000 user 0.000000 sys 0.000234
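
Because the result is a JSON array, it can be decoded directly by a host language. The following is a minimal Python sketch, not part of the extension itself: the helper names parse_tokens, tokenize, and open_with_extension are hypothetical, the path ./fts5_expose.so is assumed from the example above, and loading only works if Python's sqlite3 module was built with extension loading enabled.

```python
import json
import sqlite3


def parse_tokens(raw):
    """Decode the JSON array returned by fts5_tokenize() into a list of strings."""
    tokens = json.loads(raw)
    if not isinstance(tokens, list):
        raise ValueError("expected a JSON array of tokens")
    return [str(t) for t in tokens]


def open_with_extension(path="./fts5_expose.so"):
    """Open an in-memory database and load the extension (path is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.enable_load_extension(True)   # loading is disabled by default
    conn.load_extension(path)
    conn.enable_load_extension(False)  # switch it off again after loading
    return conn


def tokenize(conn, text):
    """Call fts5_tokenize() with the default tokenizer on a prepared connection."""
    (raw,) = conn.execute("SELECT fts5_tokenize(?)", (text,)).fetchone()
    return parse_tokens(raw)


# Decoding the output shown in the example above; this needs no extension.
print(parse_tokens('["hello","please","tokenize","meee"]'))
# prints ['hello', 'please', 'tokenize', 'meee']
```

With the compiled library available, tokenize(open_with_extension(), 'hello please tokenize meee') would be expected to return the same list.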

License: MIT License

