Maoshuiyang

symao's starred repositories

visdom

A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

Language: Python, License: Apache-2.0, 1000800

machine-learning-systems-design

A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems"

Language: HTML, 900100

hypertunity

A toolset for black-box hyperparameter optimisation.

Language: Python, License: Apache-2.0, 13600

Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

Language: Python, License: MIT, 284300

UrbanSound8K-audio-classification-with-ResNet

Language: Python, 5200

DWT-DCT-Digital-Image-Watermarking

A digital image watermarking algorithm that combines two transforms: DWT and DCT.

Language: Python, 8000

TensorFlow-Tutorials

TensorFlow Tutorials with YouTube Videos

Language: Jupyter Notebook, License: MIT, 927100

Data-analysis-and-visuliastion

Analyze and visualize data from an audio file in .wav format (speech signal), extract features, and communicate the findings.

Language: Jupyter Notebook, 100

Predicting emotions from speech audio samples in American English, German, and British English using Support Vector Machine, K-Nearest Neighbor, Random Forest, and Recurrent Neural Network models, and analyzing the performance of each model on the dataset.

Language: Jupyter Notebook, 1800

librosa

Python library for audio and music analysis

Language: Python, License: ISC, 710400

neat-vision

Neat (Neural Attention) Vision is a framework-agnostic visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks.

Language: Vue, License: MIT, 25100

deep-learning-drizzle

Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

Language: HTML, 1221300

efficientdensenet_crnn

Memory-efficient DenseNet + LSTM + CTC implementation for Chinese text recognition

Language: Python, License: MIT, 3100

SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Language: Python, License: MIT, 197900

espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Language: Python, License: NOASSERTION, 94200

ctc_tensorflow_example

CTC + TensorFlow Example for ASR

Language: Python, License: MIT, 31300

CRNN_Tensorflow

Convolutional Recurrent Neural Networks (CRNN) for Scene Text Recognition

Language: Python, License: MIT, 103200

ml-tutorial

Machine learning algorithms and implementations

Language: Jupyter Notebook, License: MIT, 11400

transformer-tensorflow

Implementation of the Transformer model in TensorFlow

Language: Python, 44500

emotion_recognition

CTC for emotion recognition

Language: Python, 6000

pase

Problem Agnostic Speech Encoder

Language: Python, License: MIT, 43900

keras-sincnet

Keras (TensorFlow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

Language: Python, 7200

SMHA

My master's thesis: Siamese multi-hop attention for cross-modal retrieval.

Language: Python, License: MIT, 500

Multimodal-Transformer

[ACL'19] [PyTorch] Multimodal Transformer

Language: Python, License: MIT, 81100

multimodal-speech-emotion

TensorFlow implementation of "Multimodal Speech Emotion Recognition using Audio and Text," IEEE SLT-18

Language: Jupyter Notebook, License: MIT, 25800

lihang_book_algorithm

Aims to implement every algorithm in Dr. Li Hang's book "Statistical Learning Methods" (《统计学习方法》)

Language: Python, 570000

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell, License: NOASSERTION, 1422000

generative-models

Collection of generative models, e.g. GAN and VAE, in PyTorch and TensorFlow.

Language: Python, License: Unlicense, 732500

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

Language: Python, License: Apache-2.0, 585300

DeepSpeech

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

Language: C++, License: MPL-2.0, 2524300