pyGlue
is a Python library for splitting glued words, that is, words accidentally joined without a space. It supports both monolingual and bilingual corpora.
- Clone this repository
git clone https://github.com/Kawaeee/pyGlue.git
- Install required packages
pip install -r requirements.txt
pyGlue reads input from stdin and writes output to stdout, so you can adapt it to fit your own workflow. For a bilingual corpus, put the two sentences on one line separated by a tab.
# Input
echo -e "LinuxOperating systems such as UNIX only." | python pyGlue.py > mono.out
echo -e "Corgi.ai is an open-source app framework for Corgilover.\tCorgi.ai es un marco de aplicación de código abierto para los amantes de Corgi." | python pyGlue.py > bi.out
# Output
Linux Operating systems such as UNIX only.
Corgi.ai is an open-source app framework for Corgi lover. Corgi.ai es un marco de aplicación de código abierto para los amantes de Corgi.
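Because the tool only talks over stdin and stdout, it can also be driven from another Python program. The following is a minimal sketch, not part of pyGlue itself: the helper names `make_bilingual_line` and `run_pyglue` are hypothetical, and it assumes that `pyGlue.py` is in the current directory, that bilingual pairs are tab-separated as in the echo example above, and that the script emits one output line per input line.

```python
import subprocess
import sys


def make_bilingual_line(source: str, target: str) -> str:
    """Join a sentence pair with a tab, mirroring the \\t in the echo example."""
    for sentence in (source, target):
        if "\t" in sentence or "\n" in sentence:
            raise ValueError("sentences must not contain tabs or newlines")
    return f"{source}\t{target}"


def run_pyglue(lines, script="pyGlue.py"):
    """Pipe lines to the script over stdin and return its stdout as a list of lines.

    Assumption: `script` points at a local copy of pyGlue.py.
    """
    result = subprocess.run(
        [sys.executable, script],
        input="\n".join(lines) + "\n",
        capture_output=True,
        text=True,
        encoding="utf-8",  # keep accented characters intact on any platform
        check=True,        # raise CalledProcessError if the script fails
    )
    return result.stdout.splitlines()
```

With the repository cloned and the requirements installed, `run_pyglue(["LinuxOperating systems such as UNIX only."])` would be the programmatic equivalent of the first echo command, and `run_pyglue([make_bilingual_line(en, es)])` of the second.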
- Add a custom dictionary to limit aggressive word segmentation