# Neural Machine Translation (attention and zero-shot)

This project was developed during Prof. Andrew Ng's deep learning boot camp, where we implemented and explored two state-of-the-art machine translation models: attention-based NMT and GNMT. Details of the models can be found in the original papers cited below or in my reviews in the project reports.
The architecture of the repo (not of the model) borrows heavily from the MemN2N implementation: we separate the model from the data, logging, and preprocessing modules, and use a single interface for ML models of similar tasks.
## Repository structure

- `main.py`: Configuration interface. It parses parameter configs, builds the model, and runs training/testing/sampling experiments (a hypothetical invocation is sketched below).
- `data/`: Training data and the wrapping data iterator.
- `checkpoints/`: Checkpoints of trained weights.
- `logs/`: Logs of loss, accuracy, etc.
- `model/`: Deep learning models. Here we have `attention.py` and `zero.py`, both of which inherit from the `Model` class (see the interface sketch below):
  - `class Model`:
    - `build_variables()`: Prepare the variable placeholders
    - `build_model()`: Prepare the model
    - `train()`: Train the model
    - `test()`: Evaluate the test error
    - `sample()`: Sample/translate certain sentences
    - `countParameters()`: Count the parameters in the model
    - `save()`: Save the model to checkpoints
    - `load()`: Load the model from checkpoints
  - `attention.py`: Attention-based model; see Luong et al. (2015) and the attention sketch below.
  - `zero.py`: Google's GNMT model; see Wu et al. (2016).
- `bleu/`: Scripts to calculate the BLEU score (a simplified reference implementation is sketched below).
- `subword-nmt/`: The word-piece model's token generator, used in zero-shot translation (see the zero-shot sketch below).
- `experiments/`: Training experiments (parameter configurations).
- `report/`: Summaries/notes of training experiments.
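
For orientation, here is a minimal sketch of how a `main.py`-style configuration interface might be driven. All flag names and defaults are hypothetical illustrations, not the repo's actual arguments:

```python
# Hypothetical sketch of a main.py-style configuration interface.
# Flag names and defaults here are illustrative assumptions.
import argparse

def parse_config():
    parser = argparse.ArgumentParser(description="NMT experiments")
    parser.add_argument("--model", choices=["attention", "zero"],
                        default="attention", help="which model in model/ to build")
    parser.add_argument("--mode", choices=["train", "test", "sample"],
                        default="train", help="experiment to run")
    parser.add_argument("--checkpoint_dir", default="checkpoints/",
                        help="where weights are saved/loaded")
    parser.add_argument("--log_dir", default="logs/",
                        help="where loss/accuracy logs are written")
    return parser.parse_args()

if __name__ == "__main__":
    config = parse_config()
    print("Would build '{}' and run '{}'".format(config.model, config.mode))
```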
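
The shared interface listed above could look like the following skeleton. This is a sketch of the contract the two models implement, not the repo's actual code:

```python
# Sketch of the shared Model interface that attention.py and zero.py
# inherit from. Method names follow the list above; bodies are stubs
# for the concrete models to override.
class Model(object):
    def __init__(self, config):
        self.config = config
        self.build_variables()
        self.build_model()

    def build_variables(self):
        """Prepare the variable placeholders (inputs, targets, masks)."""
        raise NotImplementedError

    def build_model(self):
        """Assemble the computation graph for the concrete model."""
        raise NotImplementedError

    def train(self):
        """Run the training loop over the data iterator."""
        raise NotImplementedError

    def test(self):
        """Evaluate the test error."""
        raise NotImplementedError

    def sample(self, sentences):
        """Sample/translate the given sentences."""
        raise NotImplementedError

    def countParameters(self):
        """Count the trainable parameters in the model."""
        raise NotImplementedError

    def save(self):
        """Save the weights to checkpoints/."""
        raise NotImplementedError

    def load(self):
        """Restore the weights from checkpoints/."""
        raise NotImplementedError
```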
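
To make the attention model concrete, here is a minimal NumPy sketch of Luong et al. (2015) global attention with the "general" score function, score(h_t, h̄_s) = h_tᵀ W_a h̄_s. Shapes and variable names are illustrative, not taken from `attention.py`:

```python
# Minimal NumPy sketch of Luong-style global attention ("general" scoring).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def luong_attention(h_t, enc_states, W_a, W_c):
    """h_t: (d,) decoder state; enc_states: (S, d) encoder states."""
    # Alignment scores: score(h_t, h_s) = h_t^T W_a h_s for each source step.
    scores = enc_states @ (W_a @ h_t)            # (S,)
    a_t = softmax(scores)                        # attention weights, (S,)
    c_t = a_t @ enc_states                       # context vector, (d,)
    # Attentional hidden state: h~_t = tanh(W_c [c_t; h_t]).
    h_tilde = np.tanh(W_c @ np.concatenate([c_t, h_t]))
    return h_tilde, a_t

d, S = 4, 6
rng = np.random.default_rng(0)
h_tilde, a_t = luong_attention(rng.normal(size=d),
                               rng.normal(size=(S, d)),
                               rng.normal(size=(d, d)),
                               rng.normal(size=(d, 2 * d)))
```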
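
The zero-shot setup in GNMT-style multilingual translation works by prepending an artificial token to each (sub-worded) source sentence telling the shared model which target language to produce; at test time the same token can request a direction never seen in training. The token spelling below is illustrative:

```python
# Sketch of the multilingual/zero-shot data-preparation trick:
# prepend a target-language token so one shared model learns many
# translation directions. The "<2xx>" spelling is an assumption.
def add_target_token(source_tokens, target_lang):
    return ["<2{}>".format(target_lang)] + source_tokens

# An English->Spanish example pair.
print(add_target_token("how are you".split(), "es"))
# ['<2es>', 'how', 'are', 'you']
```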
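
For reference, here is a simplified sentence-level BLEU in plain Python (single reference, up to 4-grams, with brevity penalty). The actual scripts in `bleu/` presumably compute the standard corpus-level score, so this sketch is only illustrative:

```python
# Simplified single-reference, sentence-level BLEU.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, candidate, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped matches: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = (1.0 if len(candidate) >= len(reference)
          else math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_mean)

ref = "the cat sat on the mat".split()
hyp = "the cat sat on a mat".split()
print(round(bleu(ref, hyp), 3))  # ~0.537
```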
## Acknowledgements

- Our mentors: Ziang, Awni, and Anand
- Terrific cohort: James, Jeremy, Joseph, and Dillon