czqhurricnae / SilverDict

Web-Based Alternative to GoldenDict

Home Page: https://crissium.github.io/SilverDict/

SilverDict – Web-Based Alternative to GoldenDict

Documentation and Guides (At least read the general notes before using.)

This project is intended to be a modern, from-the-ground-up, maintainable alternative to GoldenDict(-ng), developed with Flask and React.

You can access the live demo here (the button to delete dictionaries is removed). It lives inside a free Okteto container, which sleeps after 24 hours of inactivity, so please bear with its slowness and refresh the page a few times if you are seeing a 404 error, and remember that it may be (terribly) out of sync with the latest code changes.

Screenshots

[Screenshots: Light 1, Light 2, Dark, Mobile]

The dark theme is not built in, but rendered with the Dark Reader Firefox extension.

Some Peculiarities

  • The wildcard characters are ^ and + (instead of % and _ of SQL or the more traditional * and ?) for technical reasons. Hint: imagine % and _ are shifted one key to the right on an American keyboard.
  • This project creates a backup of DSL dictionaries, overhauls[1] them and silently overwrites the original files. So after adding a DSL dictionary to SilverDict, it may no longer work with GoldenDict.
  • During the indexing process of DSL dictionaries, the memory usage could reach as high as 1.5 GiB (tested with the largest DSL ever seen, the Encyclopædia Britannica), and even after that the memory used remains at around 500 MiB. Restart the server process and the memory usage will drop to a few MiB. (The base server with no dictionaries loaded uses around 50 MiB of memory.)
  • Both-sides suggestion matching is implemented with an $n$-gram based method, where $n = 4$, meaning that it only starts working once the query is at least 4 characters long. This feature is disabled by default. To enable it, edit ~/.silverdict/preferences.yaml and then create the ngram table in the settings menu. Creating the table can be slow, and you have to repeat it manually each time a dictionary is added or deleted.
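
To make the wildcard and $n$-gram points above more concrete, here is a minimal Python sketch. It is not SilverDict's actual code: the function names are hypothetical, the escaping scheme assumes the SQL statement uses ESCAPE '\', and the in-memory index merely stands in for whatever table the server really builds.

```python
from collections import defaultdict


def wildcards_to_like(query: str) -> str:
    """Translate the ^ and + wildcards into an SQL LIKE pattern.

    Assumption: the SQL statement declares ESCAPE '\\', so literal
    % and _ typed by the user are escaped before the translation.
    """
    escaped = query.replace('\\', '\\\\').replace('%', '\\%').replace('_', '\\_')
    # ^ plays the role of % (any run of characters), + the role of _ (one character)
    return escaped.replace('^', '%').replace('+', '_')


def ngrams(text: str, n: int = 4) -> list[str]:
    """Return all overlapping n-character substrings of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_ngram_index(keys: list[str], n: int = 4) -> dict[str, set[str]]:
    """Map every n-gram to the set of headwords that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for key in keys:
        for gram in ngrams(key, n):
            index[gram].add(key)
    return index


def both_sides_match(index: dict[str, set[str]], query: str, n: int = 4) -> list[str]:
    """Find headwords containing query anywhere, not only as a prefix."""
    grams = ngrams(query, n)
    if not grams:
        # Queries shorter than n characters produce no n-grams at all
        return []
    # A headword must contain every n-gram of the query to be a candidate
    candidates = set.intersection(*(index.get(gram, set()) for gram in grams))
    # Sharing all n-grams is necessary but not sufficient, so verify the substring
    return sorted(key for key in candidates if query in key)
```

With an index built from the headwords silverdict, goldendict and dictionary, the query dict matches all three, while dic matches nothing because it is shorter than 4 characters. This also shows why the table has to be rebuilt whenever the set of dictionaries changes: the index is derived from the headwords.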

Features

  • Python-powered[2]
  • Cleaner code
  • Deployable both locally and on a self-hosted server
  • Fast enough
  • Minimalist web interface
  • Separable client and server components
  • Works as expected
  • DSL, StarDict, MDict supported
  • Cross-platform (Linux, Windows, macOS, Android, limited iOS)

Roadmap

  • Linux: RPM/Deb packaging
  • ?? Publish on PyPI
  • Windows: package everything into a single click-to-run executable (help wanted)

Server-side

  • Add support for Babylon BGL glossary format
  • Add support for StarDict format
  • Add support for ABBYY Lingvo DSL format
  • Reduce DSL parsing time
  • Reduce the memory footprint of the MDict Reader
  • Inline styles to prevent them from being applied to the whole page (The commented-out implementation in server/app/dicts/mdict/html_cleaner.py breaks richly-formatted dictionaries.)[3]
  • Reorganise APIs (to facilitate dictionary groups)
  • Ignore diacritics when searching (testing still wanted from speakers of Turkish and Asian languages other than CJK)
  • Ignore case when searching
  • GoldenDict-like morphology-awareness (walks -> walk) and spelling check (fuzzy-search, that is, malarky -> malady, Malaya, malarkey, Malay, Mala, Maalox, Malcolm)
  • Transliteration for the Cyrillic[4], Greek, Arabic, Hebrew and Devanagari scripts (done: Greek, one-way Arabic)
  • OpenCC Chinese conversion (please set your preference in ~/.silverdict/preferences.yaml and add zh to the group with Chinese dictionaries)
  • Add the ability to set sources for automatic indexing, i.e. dictionaries put into the specified directories will be automatically added
  • Recursive source scanning
  • Multithreaded article extraction (This project will benefit hugely from no-GIL Python)
  • Improve the performance of suggestions matching
  • Make the suggestion size customisable
  • Allow configuring the suggestion matching mode, listening address, running mode, etc. via a configuration file, without modifying code
  • Add a timestamp field to suggestions to avoid newer suggestions being overridden by older ones
  • Full-text search
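
As an illustration of the "ignore diacritics" and "ignore case" items above, here is a small sketch built on Python's standard unicodedata module. The function name simplify_key is hypothetical, and this is one common way to do it, not necessarily how SilverDict does.

```python
import unicodedata


def simplify_key(key: str) -> str:
    """Reduce a headword or query to a diacritic-free, case-insensitive form."""
    # NFKD splits precomposed characters into a base character plus combining marks
    decomposed = unicodedata.normalize('NFKD', key)
    # Drop the combining marks and keep the base characters
    without_marks = ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() intended for caseless comparison
    return without_marks.casefold()
```

Both the stored headwords and the incoming query would be passed through the same function before comparison. Letters without a Unicode decomposition, such as æ or the Turkish dotless ı, pass through unchanged, which is one reason why testing by speakers of such languages matters.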

Client-side

  • Use the Bootstrap framework
  • Allow zooming in/out of the definition area
  • Click to search for words in the definition
  • Make the strings translatable
  • GoldenDict-like dictionary group support
  • A mobile-friendly interface (retouch needed)
  • A real mobile app
  • A C++/Qt (or QML) desktop app[5]

Issue backlog

  • Make the dialogues children of the root element (How can I do this with nested dialogues?)

Usage

Dependencies

This project utilises some Python 3.10 features, such as the match syntax, and a minimal set of dependencies:

PyYAML # for better efficiency, please install libyaml before building the wheel
Flask # the web framework
Flask-Cors
waitress # the WSGI server
python-idzip # for dictzip
lxml # for XDXF-formatted StarDicts
python-lzo # for v1/v2 MDict
xxhash # for v3 MDict
dsl2html # for DSL

The package dsl2html is mine, and could be used by other projects.

To enable morphology analysis, install the Python package hunspell and place the Hunspell dictionaries into ~/.silverdict/hunspell.

To enable Chinese conversion, install the Python package opencc.

Local Deployment

The simplest way to use this app is to run it locally.

cd client
yarn install
yarn build
mv build ../server/

And then:

cd ../server
pip3.10 install -r requirements.txt
python3.10 server.py # working-directory-agnostic

Then access it at localhost:2628.

Or, if you do not wish to build the web app yourself or clone the whole repository, you can download a zip archive from the releases, which contains everything you need to run SilverDict.

For Windows users: A zip archive complete with a Python interpreter and all the dependencies is available in the releases. Download the archive, unzip it, and double-click setup.bat to generate a shortcut. You can then move the shortcut wherever you wish and click on it to run SilverDict. After launching the server, you can access it at localhost:2628.

For Termux users: run the bash script termux_setup.sh in the top-level directory, which will install all the dependencies, including hunspell. The script assumes you have enabled external storage access and will create a default source directory at /sdcard/Documents/Dictionaries.

Alternatively, you could use dedicated HTTP servers such as nginx to serve the static files and proxy API requests. Check out the sample config for more information.

Server Deployment

I recommend nginx if you plan to deploy SilverDict to a server. Run yarn build to generate the static files of the web app, or download a prebuilt copy from the releases (inside server/build), and then place them in the directory from which nginx serves static files. Remember to reverse-proxy all API requests and permit methods other than GET and POST.

Assuming your distribution uses systemd, you can refer to the provided sample systemd config and run the script as a service.

Docker Deployment

Docker is not recommended, as you have to tuck in all your dictionaries and their highly fragmented data files, which is not very practical. It is fine if you only run SilverDict locally, though.

Contributing

  • Start with an item in the roadmap, or open an issue to discuss your ideas. Please notify me if you are working on something to avoid duplicated efforts. I myself dislike enforcing a coding style, but please use descriptive, verbose variable names and UTF-8 encoding, LF line endings, and indent with tabs.
  • Help me with the transliteration feature.
  • Translate the guides into your language. You could edit them directly on GitHub.

Acknowledgements

The favicon is the icon for 'Dictionary' from the Papirus icon theme, licensed under GPLv3.

This project uses or has adapted code from the following projects:

| Name | Developer | Licence |
|---|---|---|
| mdict-analysis | Xiaoqiang Wang | GPLv3 |
| python-stardict | Su Xing | GPLv3 |
| dictionary-db (together with the $n$-gram method) | Jean-François Dockes | GPL 2.1 |
| pyglossary | Saeed Rasooli | GPLv3 |

I would also like to express my gratitude to Jiang Qian for his suggestions, encouragement and great help.

Similar projects


Footnotes

  1. What it does: (1) decompress the dictionary file if compressed; (2) remove the BOM, non-printing characters and strange symbols (only {·} currently) from the text; (3) normalize the initial whitespace characters of definition lines; (4) overwrite the .dsl file with UTF-8 encoding and re-compress with dictzip. After this process the file is smaller and easier to work with.
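
     Steps (1) to (3) can be sketched in Python as follows. This is an approximation under stated assumptions, not the project's real implementation: the function names are hypothetical, the input is assumed to be either UTF-16 with a BOM or UTF-8, leading whitespace is normalised to a single tab, and "non-printing characters" is taken to mean Unicode control characters. Step (4), re-compressing with dictzip, is left out.

```python
import gzip
import unicodedata
from pathlib import Path


def read_dsl_text(path: Path) -> str:
    """Step (1): decompress if needed, then decode to text."""
    if path.suffix == '.dz':
        # dictzip files are valid gzip files, so the gzip module can read them
        with gzip.open(path, 'rb') as compressed:
            raw = compressed.read()
    else:
        raw = path.read_bytes()
    if raw.startswith((b'\xff\xfe', b'\xfe\xff')):
        # The 'utf-16' codec reads the BOM to pick the byte order and drops it
        return raw.decode('utf-16')
    # 'utf-8-sig' drops a UTF-8 BOM if one is present
    return raw.decode('utf-8-sig')


def clean_dsl_text(text: str) -> str:
    """Steps (2) and (3): strip debris and normalise definition indentation."""
    text = text.replace('\ufeff', '').replace('{·}', '').replace('\r\n', '\n')
    cleaned_lines = []
    for line in text.split('\n'):
        # Remove control characters, keeping tabs because they carry indentation
        line = ''.join(ch for ch in line if ch == '\t' or unicodedata.category(ch) != 'Cc')
        # Definition lines start with whitespace; headword lines do not
        if line[:1] in (' ', '\t'):
            line = '\t' + line.lstrip(' \t')
        cleaned_lines.append(line)
    return '\n'.join(cleaned_lines)
```

     The cleaned text would then be written back as UTF-8 and compressed again, which is the point at which the original file is overwritten.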

  2. A note about type hinting in the code: I know for proper type hinting I should use the module typing, but the current way is a little easier to write and can be understood by VS Code.

  3. The use of a custom styling manager such as Dark Reader is recommended until I fix this, as styles for different dictionaries meddle with each other. Or better, if you know CSS, you could just edit the dictionaries' stylesheets to make them less intrusive and individualistic.

  4. A Russian-speaking friend told me that it is unusual to type Russian on an American keyboard, so whether this feature is useful is open to doubt.

  5. I have come up with a name: Kilvert (yeah, after the Welsh priest for its close resemblance to SilverDict, and the initial letter, of course, stands for KDE). (I'm on Xfce by the way.)

License:GNU General Public License v3.0
