mediagarden's starred repositories

ClipCap-Chinese

An image captioning model based on ClipCap.

Language: Python | Stargazers: 243 | Issues: 0

OpenGlass

Turn any glasses into AI-powered smart glasses

Language: C | License: MIT | Stargazers: 2195 | Issues: 0

mobile-ffmpeg

FFmpeg for Android, iOS and tvOS. Not maintained anymore. Superseded by FFmpegKit.

Language: C | License: GPL-3.0 | Stargazers: 3805 | Issues: 0

Wave-U-Net-for-Speech-Enhancement

A PyTorch implementation of Wave-U-Net, adapted for speech enhancement.

Language: Python | License: MIT | Stargazers: 310 | Issues: 0
Language: C++ | Stargazers: 1 | Issues: 0

MossFormer

This repo provides the processed samples of the manuscript "MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-head Transformer with Convolution-augmented Joint Self-Attentions", which was submitted to ICASSP 2023.

License: Apache-2.0 | Stargazers: 70 | Issues: 0

hello_driver

HelloWorld for Linux Device Driver

Language: C | Stargazers: 2 | Issues: 0

AudioClassification-Pytorch

A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.

Language: Python | License: Apache-2.0 | Stargazers: 305 | Issues: 0

tpu-perf

A model benchmark tool

Language: Python | License: MIT | Stargazers: 10 | Issues: 0

vddswitcher

A Windows console tool that uses parsec-vdd to switch to a virtual display.

Language: C++ | License: GPL-3.0 | Stargazers: 86 | Issues: 0

person_search_demo

Combines YOLOv3 with a person re-identification model to detect and identify pedestrians and to search for a specific person.

Language: Python | License: MIT | Stargazers: 523 | Issues: 0

ustreamer

µStreamer - Lightweight and fast MJPEG-HTTP streamer

Language: C | License: GPL-3.0 | Stargazers: 1584 | Issues: 0
Language: C++ | License: GPL-2.0 | Stargazers: 136 | Issues: 0

gpt-image

Tool to create GPT disk image files

Language: Python | License: MIT | Stargazers: 9 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 4 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 21206 | Issues: 0

tspi-linux-sdk

[Not the official LCEDA version] Linux SDK for the LCEDA Tai-Shang Pi development board.

Stargazers: 32 | Issues: 0

zhvoice

Chinese voice corpus with clearer, more natural speech. It combines 8 open-source datasets, covering 3,200 speakers, 900 hours of audio, and 13 million characters.

Stargazers: 515 | Issues: 0
Language: C++ | Stargazers: 1 | Issues: 0

uwe5621ds-aml

uwe5621ds driver for the Amlogic platform

Language: C | Stargazers: 9 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 180 | Issues: 0

alsa-examples

Generic ALSA samples.

Language: C | Stargazers: 2 | Issues: 0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python | License: MIT | Stargazers: 6376 | Issues: 0

sherpa-ncnn

Real-time speech recognition using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Raspberry Pi, VisionFive2, LicheePi4A, etc.

Language: C++ | License: Apache-2.0 | Stargazers: 847 | Issues: 0

sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, and Swift.

Language: C++ | License: Apache-2.0 | Stargazers: 979 | Issues: 0

sherpa

Speech-to-text server framework with next-gen Kaldi

Language: C++ | License: Apache-2.0 | Stargazers: 452 | Issues: 0

kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Language: Shell | License: NOASSERTION | Stargazers: 13825 | Issues: 0

snowboy

Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

Language: C++ | License: NOASSERTION | Stargazers: 3002 | Issues: 0

RapidASR

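A commercial-grade, open-source automatic speech recognition library that works out of the box, supports all platforms, and recognizes mixed Chinese and English speech. It is a cross-platform implementation of ASR inference based on ONNXRuntime and FunASR, and provides a set of easier APIs for calling ASR models.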

Language: C++ | License: MIT | Stargazers: 438 | Issues: 0