ml-lab/LRE

This repository contains the source code for the experiments in the paper "A Semisupervised Approach for Language Identification based on Ladder Networks".

In 2015, NIST conducted an LRE i-vector challenge. The task was to identify the language spoken in a speech sample, given that the language is either one of 50 given languages or an out-of-set language. The speech samples were already processed into i-vectors and duration information. The data was split into training, dev and test sets. The training data included labeled samples from the 50 given languages. The dev data included unlabeled samples from both the 50 given languages and the out-of-set languages. The test data was similar to the dev data, but it could only be used for making submissions to the competition.
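The task described above can be pictured as a 51-way decision: one class per given language plus one extra class for out-of-set speech. The snippet below is a minimal sketch of that framing only, not the model from the paper or the code in this repository. The i-vector dimensionality, the random weights, the synthetic inputs, and the choice of index 50 for the out-of-set class are all assumptions made for illustration.

```python
import numpy as np

NUM_TARGET_LANGUAGES = 50           # from the challenge description
OUT_OF_SET = NUM_TARGET_LANGUAGES   # hypothetical: index 50 reserved for out-of-set
IVECTOR_DIM = 400                   # assumed dimensionality, for illustration only


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def predict(ivectors, weights, bias):
    """Return the most likely class index per sample and the class probabilities."""
    probs = softmax(ivectors @ weights + bias)
    return probs.argmax(axis=1), probs


# Synthetic stand-ins for real i-vectors and trained parameters.
rng = np.random.default_rng(0)
ivectors = rng.normal(size=(8, IVECTOR_DIM))
weights = rng.normal(scale=0.01, size=(IVECTOR_DIM, NUM_TARGET_LANGUAGES + 1))
bias = np.zeros(NUM_TARGET_LANGUAGES + 1)

labels, probs = predict(ivectors, weights, bias)
for label in labels:
    print("out-of-set" if label == OUT_OF_SET else f"language {label}")
```

Because the training data has no labeled out-of-set samples, a classifier like this cannot learn the extra class from the training set alone, which is why the unlabeled dev data matters for a semi-supervised approach.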

Our solution used a modification of the Ladder Network and its published code.
The dark knowledge of tongues, fun with the i-vector dataset supplied by the challenge.

About

NIST Language i-vector Machine Learning Challenge

MIT License
