StDario / ctx_zero_domain


Context-aware Sockeye

This repository contains the code for the paper Addressing Zero-Resource Domains Using Document-Level Context in Neural Machine Translation [arXiv], presented at the Adapt-NLP 2021 workshop at EACL 2021.

The code expects the data to be prepared in the following format: CONTEXT_SENTENCE <SEP> MAIN_SENTENCE
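As a minimal sketch of producing that format, the snippet below joins each sentence with its preceding sentence as context. The helper name and the choice of the previous sentence as context are illustrative assumptions, not taken from the repository's data-preparation scripts.

```python
# Sketch: build CONTEXT_SENTENCE <SEP> MAIN_SENTENCE lines from a document.
# The helper name and the one-previous-sentence context are assumptions.

def make_context_line(context: str, sentence: str, sep: str = "<SEP>") -> str:
    """Join a context sentence and the main sentence with the separator token."""
    return f"{context} {sep} {sentence}"

doc = [
    "the patient was admitted .",
    "she received treatment .",
    "she recovered .",
]

# Pair each sentence (from the second onward) with its predecessor as context.
lines = [make_context_line(prev, cur) for prev, cur in zip(doc, doc[1:])]
for line in lines:
    print(line)
# the patient was admitted . <SEP> she received treatment .
# she received treatment . <SEP> she recovered .
```

How the first sentence of a document (which has no preceding context) is handled would depend on the actual preprocessing used.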

The main parameter needed to use the different models is

--model-type

Possible values are avg_emb_add and max_emb_add for the DomEmb model, and ctx_dec for CtxPool. To use the pooling option, three additional parameters must be set: --use-doc-pool, --doc-pool-window (int), and --doc-pool-stride (int). For a full overview of all parameters, see sockeye/arguments.py
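To get a feel for what --doc-pool-window and --doc-pool-stride control, the following illustrates standard 1-D pooling arithmetic over a token sequence. This is generic pooling shown on scalars for readability, not the repository's CtxPool implementation, which pools over context token embeddings.

```python
# Illustration of window/stride pooling arithmetic (not the repo's code).

def pooled_length(n_tokens: int, window: int, stride: int) -> int:
    # Output length of 1-D pooling with no padding.
    return (n_tokens - window) // stride + 1

def avg_pool(values, window, stride):
    # Average-pool a 1-D sequence; with embeddings, vectors would be
    # averaged component-wise instead of scalars.
    return [sum(values[i:i + window]) / window
            for i in range(0, len(values) - window + 1, stride)]

print(pooled_length(10, 4, 2))              # 4
print(avg_pool([1, 2, 3, 4, 5, 6], 2, 2))   # [1.5, 3.5, 5.5]
```

Larger windows and strides shrink the context representation more aggressively, trading detail for a shorter sequence for the decoder to attend over.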

The datasets are available at link

Sockeye


This package contains the Sockeye project, a sequence-to-sequence framework for Neural Machine Translation based on Apache MXNet (Incubating). It implements state-of-the-art encoder-decoder architectures.

In addition, it provides an experimental image-to-description module that can be used for image captioning. Recent developments and changes are tracked in our CHANGELOG.

If you have any questions or discover problems, please file an issue. You can also send questions to sockeye-dev-at-amazon-dot-com.

Documentation

For information on how to use Sockeye, please visit our documentation. Developers may be interested in our developer guidelines.

Citation

For technical information about Sockeye, see our paper on the arXiv (BibTeX):

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post. 2017. Sockeye: A Toolkit for Neural Machine Translation. ArXiv e-prints.

About

License: Apache License 2.0


Languages

Python 99.7%, JavaScript 0.2%, Shell 0.1%, Dockerfile 0.0%, CSS 0.0%