shekofteh

Shekofteh's repositories

itsp

Introduction to Speech Processing

CC-BY-SA-4.0

E2PCast-Final

A Dataset for English to Persian Voice Casting

SGR_AFM

The code of the paper: "Exploiting auditory filter models as interpretable convolutional frontends to obtain optimal architectures for speaker gender recognition".

IIRI-Net

The code of the paper: "IIRI-Net: An interpretable convolutional front-end inspired by IIR filters for speaker identification".

Spoken-Language-Identification

E2PCast

E2PCast: An English to Persian Voice Casting Dataset

Bachelors-Project-Allosaurus

Extra files used for a bachelor's project

Audio-Classification

Code for YouTube series: Deep Learning for Audio Classification

MIT

InterpretableCNN

An extended version of SincNet in which some general auditory filter models are added for the Speaker Identification task

nn-zero-to-hero

Neural Networks: Zero to Hero

MIT

ShEMO-Modification

A modification of the ShEMO database

Language: Jupyter Notebook

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

MIT

SpeechTransProgress

Tracking the progress in end-to-end speech translation

CC0-1.0

speech-data-gatherer-mobile

SampleDataWakeWordDetection

MOSNet

Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"

NOASSERTION

parstwiner

Named Entity Recognition (NER) on the Persian Twitter dataset.

MIT

Classification-of-Heart-Sound-Signal-Using-Multiple-Features-

Data plus code for Classification of Heart Sound Signal Using Multiple Features

math-tools-nyu

DS-GA 1013 Mathematical Tools for Data Science

allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

GPL-3.0

PAVID-CVs

Persian Audio-Visual Database

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

NOASSERTION

asr_assignment

Code for the first assignment of the ASR course for 2020

DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

MPL-2.0