Michael95-m / myanmar_names

Burmese name conversion with rule-based method (Burmese to English and English to Burmese)

Myanmar Names

This is a simple repository for converting Myanmar names written in Burmese to English and vice versa.

Currently, only a dictionary-based conversion method is used. A name written in Burmese is first segmented into syllables, and each syllable is then converted into its English equivalent using a dictionary built from the dataset for Burmese-to-English conversion (see the EDA notebook for how the dictionary is built). English-to-Burmese syllable conversion uses the same method.
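The two steps described above (syllable segmentation, then dictionary lookup) can be sketched roughly as follows. This is a minimal illustration, not the code in mm_names: the names segment, mm2en_sketch, en2mm_sketch and the two-entry sample dictionary are hypothetical, the regular expression is a simplified pattern in the spirit of the referenced syllable segmentation script rather than a copy of it, and splitting English input on whitespace is an assumption.

```python
import re

# Start a new syllable before a Burmese consonant (U+1000..U+1021) unless it is
# stacked (preceded by virama U+1039) or closes the previous syllable
# (followed by asat U+103A or virama U+1039). Simplified pattern (assumption).
SYLLABLE_START = re.compile(r"(?<!\u1039)[\u1000-\u1021](?![\u103A\u1039])")

# Hypothetical two-entry dictionary; the real one is built from the dataset.
SAMPLE_MM_TO_EN = {
    "\u1019\u1004\u103a\u1038": "min",          # မင်း
    "\u1019\u1031\u102c\u1004\u103a": "maung",  # မောင်
}
SAMPLE_EN_TO_MM = {en: mm for mm, en in SAMPLE_MM_TO_EN.items()}


def segment(name: str) -> list[str]:
    """Split a Burmese name into syllables."""
    compact = "".join(name.split())  # drop any whitespace first
    marked = SYLLABLE_START.sub(lambda m: "|" + m.group(0), compact)
    return [syl for syl in marked.split("|") if syl]


def mm2en_sketch(name: str) -> str:
    """Look up each syllable; unknown syllables are passed through unchanged."""
    return " ".join(SAMPLE_MM_TO_EN.get(syl, syl) for syl in segment(name))


def en2mm_sketch(name: str) -> str:
    """Assumes English syllables are separated by whitespace."""
    return " ".join(SAMPLE_EN_TO_MM.get(w, w) for w in name.lower().split())


print(mm2en_sketch("\u1019\u1004\u103a\u1038\u1019\u1031\u102c\u1004\u103a"))  # min maung
print(en2mm_sketch("Min Maung"))  # မင်း မောင်
```

Passing unknown syllables through unchanged is a design choice made for this sketch only; it makes gaps in the dictionary visible in the output instead of raising an error.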

Currently, the dictionary-based conversion has several weaknesses and the results are not satisfactory. The JSON file needs manual checking for best accuracy. I will do some research and add more methods to improve the results.
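As one possible aid for the manual check, the sketch below lists English spellings that more than one Burmese syllable maps to. Such collisions are an assumed source of error here, not a weakness the author names: when the mapping is inverted for English-to-Burmese conversion, only one of the colliding syllables can be kept. The function name find_ambiguous and the sample entries are hypothetical, and the dictionary is passed in as a plain dict because the layout of the JSON file is not described here.

```python
from collections import defaultdict


def find_ambiguous(mm_to_en: dict[str, str]) -> dict[str, list[str]]:
    """Return English spellings shared by more than one Burmese syllable."""
    groups: dict[str, list[str]] = defaultdict(list)
    for mm, en in mm_to_en.items():
        groups[en.lower()].append(mm)
    return {en: sorted(mms) for en, mms in groups.items() if len(mms) > 1}


# Hypothetical sample entries, for illustration only.
sample = {
    "\u1019\u1004\u103a\u1038": "min",          # မင်း
    "\u1019\u1004\u103a": "min",                # မင်
    "\u1019\u1031\u102c\u1004\u103a": "maung",  # မောင်
}
for en, mms in find_ambiguous(sample).items():
    print(en, mms)  # min ['မင်', 'မင်း']
```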

Usage

from mm_names.convert import Converter

converter = Converter()

print(converter.mm2en("မင်းခန့်မောင်မောင်")) 
## 'min khant maung maung'

print(converter.en2mm("Min Khant Maung Maung"))
## 'မင်း ခန့် မောင် မောင်'

Makefile

You can set up the environment and run the quality checks and test cases with make.

## environment setup
make setup

## quality check
make quality_check

## unit test
make test

Acknowledgment

I use the data from Sayar Ye Kyaw Thu's repo, and this work was inspired by Ko Htain Linn Shwe's repo. I also use the Burmese syllable segmentation code from Sayar Ye Kyaw Thu.

I did some EDA and want to improve the name conversion method.

References

Ko Htain Linn Shwe's repo, https://github.com/saturngod/myanmar_names

The dataset from Sayar Ye Kyaw Thu, https://github.com/ye-kyaw-thu/myRoman/blob/main/person-name/person-name.ver1.0.txt

Syllable segmentation, https://github.com/ye-kyaw-thu/myWord/blob/main/syl_segment.py
