speech-corpus

The following repositories are listed under the speech-corpus topic.

clovaai / ClovaCall
ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020)
call-based-speech-corpus goal-oriented-dialog interspeech2020 korean-speech speech-corpus speech-recognition
Language:Python 217
yc9701 / pansori
Tools for ASR Corpus Generation from Online Video
corpus data-pipeline dataset-generation online-video speech-corpus speech-recognition
Language:Python 139
kan-bayashi / LibriTTSLabel
Alignment files of LibriTTS.
speech-synthesis speech-corpus
56
lennes / spect
SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/
praat speech analysis annotation corpus-linguistics corpus-tools speech-analysis conversational-speech transcription transcript spoken-language spect speech-corpus
Language:HTML 55
khiajohnson / SpiCE-Corpus
An open-access corpus of conversational bilingual speech in Cantonese and English
corpus speech-corpus bilingual-corpora cantonese-language english-language spice-corpus
Language:JavaScript 40
ruslan-corpus / ruslan-corpus.github.io
speech-corpus speech-dataset text-to-speech russian tts
Language:HTML 19
AsoSoft / AsoSoft-Speech-Corpus
AsoSoft Speech Corpus can be used for spoken language processing tasks in Central Kurdish such as speech recognition, speaker recognition, gender identification, and phonetic analysis.
central-kurdish speech-corpus
10
dcavar / ELAN2split
Split ELAN Annotation Files and corresponding speech files into a corpus format for common ASR and Forced Aligners
speech-recognition forced-alignment elan speech-corpus sox xerxes xml cpp11
Language:C++ 10
kevobt / speech-to-text-voxforge
Downloader for the voxforge corpus
voxforge downloader generator speech-corpus
Language:Python 8
ubaleht / SiberianIngrianFinnish
This project is devoted to the Siberian Ingrian Finnish language. Siberian Ingrian Finnish is a language (dialect) used by the descendants of settlers who spoke Lower Luga Ingrian Finnish varieties and Lower Luga Ingrian (Izhorian); the community has lived in Omsk Oblast (and formerly in other regions of Siberia) for more than 200 years. The ancestors of its speakers came from the Lower Luga area, specifically the Rosona river area, in the early 19th century. This region is also called Estonian Ingria. Siberian Ingrian Finnish (Russian: Сибирский ингерманландский идиом) is the term introduced by D. V. Sidorkevich.
speech-corpus finnish finnish-language ingrian-finnish izhorian
Language:C# 6
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
audio-dataset audio-segmentation audiovisual-dataset benchmark corpus gender gender-bias gender-prediction gender-representation radio speaker-gender speech-activity-detection speech-corpus speech-dataset tv voice-activity-detection acoustic-diversity dataset
Language:Jupyter Notebook 5
joneavila / DRAL
Code for Dialogs Re-enacted Across Languages (DRAL)
prosody speech-corpus
Language:Python 4
ubaleht / SiberianTatar
This project is devoted to the dialects of the Siberian Tatars. Around 100,000 people speak these dialects. The language of the Siberian Tatars consists of three dialects: Tobolo-Irtysh, Tom, and Baraba.
speech-corpus tatar
4
vectominist / Switchboard-WSJ-Utils
Utilities for preprocessing the Switchboard and WSJ corpora in Python3
wsj python speech-corpus switchboard wtimit torchaudio
Language:Python 4
mborsdorf / TargetLanguageExtraction
python pytorch matlab deep-learning audio speech-separation speaker-extraction auditory-attention speech-processing audio-processing source-separation multilingual speech-database speech-dataset speech-corpus
3
mllpresearch / Europarl-ASR
A 1300-hour English speech and text corpus of parliamentary debates for streaming ASR training and benchmarking, speech data filtering and speech data verbatimization.
automatic-speech-recognition speech-corpus streaming-asr speech-data-filtering speech-data-verbatimization
2
mbar0075 / Speech-Technology
Deliverables relating to the Speech Technology University Unit (notes courtesy of Dr. Andrea De Marco)
deep-learning mel-spectrogram mfcc speaker-identification speech-technology speech-corpus cnn-classification keras cnn-architecture
Language:Jupyter Notebook 1
mllpresearch / ESO-dataset
ESO speech dataset: an English-language speech corpus of the oncology domain for ASR training and benchmarking and MT benchmarking.
automatic-speech-recognition domain-adaptation large-language-models llm machine-translation oncology speech-corpus speech-translation
1