Bahasa-NLP-Tensorflow gathers TensorFlow deep learning models for Bahasa Malaysia NLP problems. The code is simplified and kept entirely inside Jupyter Notebooks, datasets included.
Table of contents
- Augmentation
- Sparse classification
- Normal-text classification
- Long-text classification
- Dependency Parsing
- Entity Tagging
- Abstractive Summarization
- Extractive Summarization
- POS Tagging
- Optical Character Recognition
- Question-Answer
- Speech to Text
- Stemming
- Topic Generator
- Text to Speech
- Topic Modeling
- Word Vector
Augmentation
- word2vec Malaya
Sparse classification
- Fast-text Ngrams
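
As a rough illustration of the idea behind n-gram features for sparse classification, the sketch below builds word bigrams and hashes them into a fixed number of buckets. It is not taken from the notebook; the function names `word_ngrams` and `hashed_features`, the bucket count, and the use of CRC32 as the hash are all assumptions made for this example.

```python
import zlib


def word_ngrams(tokens, n=2):
    """Return the contiguous word n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hashed_features(text, n=2, num_buckets=1000):
    """Map unigrams and n-grams of `text` to sorted, de-duplicated bucket ids.

    CRC32 is used (an assumption for this sketch) because it is stable
    across runs, unlike Python's built-in hash() for strings.
    """
    tokens = text.lower().split()
    features = tokens + word_ngrams(tokens, n)
    return sorted({zlib.crc32(f.encode("utf-8")) % num_buckets for f in features})


if __name__ == "__main__":
    sentence = "saya suka makan nasi lemak"
    print(word_ngrams(sentence.split()))
    print(hashed_features(sentence))
```

The resulting bucket ids could index a sparse embedding table whose rows are averaged before a linear classifier. The actual notebook may build its features differently.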
Normal-text classification
- Fast-text
- Only Attention
70+ more models can be found here.
Long-text classification
- Dilated CNN
- Wavenet
Dependency Parsing
- Bidirectional LSTM + CRF
- Bidirectional LSTM + CRF + Bahdanau
- Bidirectional LSTM + CRF + Luong
Entity Tagging
- Bidirectional LSTM + CRF
- Bidirectional LSTM + CRF + Bahdanau
- Bidirectional LSTM + CRF + Luong
POS Tagging
- Bidirectional LSTM + CRF
- Bidirectional LSTM + CRF + Bahdanau
- Bidirectional LSTM + CRF + Luong
Abstractive Summarization
- Dilated Seq2Seq
- Pointer Generator + Bahdanau Attention
- Pointer Generator + Luong Attention
Extractive Summarization
- Skip-thought
- Residual Network + Bahdanau Attention
Optical Character Recognition
- CNN + LSTM RNN
Question-Answer
- End-to-End + GRU
- Dynamic Memory + GRU
Speech to Text
- BiRNN + LSTM + CTC Greedy
- Wavenet
- Deep Speech 2
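
For readers unfamiliar with the "CTC Greedy" step named above, here is a minimal sketch of greedy CTC decoding in plain Python: pick the best label per frame, collapse consecutive repeats, then drop blanks. The function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the toy scores are assumptions for illustration. The notebooks themselves presumably rely on TensorFlow's own decoder rather than code like this.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Greedy CTC decoding over a list of per-frame score lists.

    frame_scores: one list of class scores per time step.
    blank: index of the CTC blank label (assumed to be 0 here).
    """
    # Step 1: take the highest-scoring label at every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # Step 2: collapse consecutive repeats, then remove blanks.
    decoded = []
    previous = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


if __name__ == "__main__":
    # Toy scores for 3 classes (0 = blank) over 7 frames.
    frames = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.6, 0.3],
        [0.1, 0.2, 0.7],
        [0.0, 0.1, 0.9],
        [0.8, 0.1, 0.1],
    ]
    print(ctc_greedy_decode(frames))
```

A blank between two identical labels keeps both of them, which is how CTC represents repeated characters.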
Text to Speech
- Tacotron
- Seq2Seq + Bahdanau Attention
- Deep CNN + Monotonic Attention + Dilated CNN vocoder
Stemming
- Seq2seq + Beam decoder
- Seq2seq + Bahdanau Attention + Beam decoder
- Seq2seq + Luong Attention + Beam decoder
Topic Generator
- TAT-LSTM
- TAV-LSTM
- MTA-LSTM
Topic Modeling
- Lda2Vec
Word Vector
- word2vec
- ELMo
- Fast-text