zhhao1 / fcgcl

FCGCL: Fine- and Coarse-Granularity Contrastive Learning for Speech Translation

This is the PyTorch implementation of the paper "FCGCL: Fine- and Coarse-Granularity Contrastive Learning for Speech Translation".

Environment Configuration

Our code is based on ESPnet and uses PyTorch-Lightning to organize the training code. Please install ESPnet and PyTorch-Lightning following their official installation guides.

Data Preparation

  1. Download the wav2vec 2.0 model published on Hugging Face.
  2. We extract features with wav2vec 2.0 before training. The extraction scripts are saved in ./scripts/.
  3. Save the result to a JSON file, in the same format ESPnet uses. We upload dev.json and the corresponding features for reference, so the code can be debugged quickly.
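
As an illustration of step 3, the sketch below builds one utterance entry in the layout commonly used by ESPnet 1 data.json files (a top-level "utts" map with "input" and "output" lists). The utterance ID, feature path, token IDs, vocabulary size, and the 768-dimensional feature size are all hypothetical placeholders; check the uploaded dev.json for the exact fields this repository expects.

```python
import json


def make_utterance(feat_path, num_frames, feat_dim, text, tokens, token_ids, vocab_size):
    """Build one utterance entry in an ESPnet-1 style data.json layout (assumed)."""
    return {
        "input": [
            {
                "name": "input1",
                "feat": feat_path,  # where the pre-extracted features are stored
                "shape": [num_frames, feat_dim],
            }
        ],
        "output": [
            {
                "name": "target1",
                "text": text,
                "token": " ".join(tokens),
                "tokenid": " ".join(str(i) for i in token_ids),
                "shape": [len(token_ids), vocab_size],
            }
        ],
    }


# All values below are made up for illustration only.
manifest = {
    "utts": {
        "utt_0001": make_utterance(
            feat_path="dump/dev/feats.1.ark:13",  # hypothetical path
            num_frames=120,
            feat_dim=768,  # assumption: hidden size of a base wav2vec 2.0 model
            text="guten morgen",
            tokens=["▁guten", "▁mor", "gen"],
            token_ids=[12, 7, 5],
            vocab_size=8000,
        )
    }
}

serialized = json.dumps(manifest, indent=2, ensure_ascii=False)
```

Writing `serialized` to a file such as dev.json gives a manifest that can be compared field by field against the reference file shipped with the repository.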

Model Training

. ./run.sh

The training process is defined in ./src/bins/plModule.py. The contrastive module is defined in ./src/bins/cl_loss.py.
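
For readers unfamiliar with contrastive objectives, the following is a generic InfoNCE-style sketch, not the code in cl_loss.py, which may differ in every detail. It assumes the coarse-granularity case means one mean-pooled vector per utterance for speech and for text, with the other utterances in the batch serving as negatives. The function names and the temperature of 0.1 are hypothetical, and plain Python is used so the arithmetic is visible without any framework.

```python
import math


def mean_pool(frames):
    """Average a sequence of equal-length vectors into a single vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def cosine(u, v):
    """Cosine similarity of two vectors (both assumed to be non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(speech_vecs, text_vecs, temperature=0.1):
    """Mean InfoNCE loss where text_vecs[i] is the positive for speech_vecs[i]."""
    losses = []
    for i, speech in enumerate(speech_vecs):
        logits = [cosine(speech, text) / temperature for text in text_vecs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two utterances with 2-dimensional features.
speech_batch = [mean_pool([[1.0, 0.0], [1.0, 0.0]]), mean_pool([[0.0, 1.0], [0.0, 1.0]])]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
shuffled_text = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce(speech_batch, aligned_text)
shuffled_loss = info_nce(speech_batch, shuffled_text)
```

In this toy batch `aligned_loss` is close to zero while `shuffled_loss` is large, which is the behaviour a contrastive objective is meant to reward: matched speech and text representations are pulled together relative to the mismatched pairs in the batch.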
