rosengaga (undeadyequ)



Company: @JiaoTong University

Location: Japan


rosengaga's repositories

CRAFT-pytorch

Official implementation of Character Region Awareness for Text Detection (CRAFT)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 2 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

kaldi

This is now the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Shell | Stargazers: 0 | Issues: 0 | Issues: 0

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that provides data annotation and synthesis tools and supports training and deployment across server, mobile, embedded, and IoT devices)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

protest-detection-violence-estimation

Implementation of the model used in the paper "Protest Activity Detection and Perceived Violence Estimation from Social Media Images" (ACM Multimedia 2017)

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

protest_issue_classification

A tool for training your own protest issue classification model.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

ser_model

Lightweight and interpretable ML model for speech emotion recognition and ambiguity resolution (trained on the IEMOCAP dataset)

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

tango

Code and model for the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model"

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

vim_config

Recommended Vim configuration

Language: Vim script | Stargazers: 0 | Issues: 0 | Issues: 0