
mediagarden's starred repositories

ClipCap-Chinese

An image captioning model based on ClipCap

Language: Python | 24300

OpenGlass

Turn any glasses into AI-powered smart glasses

Language: C | MIT | 219500

mobile-ffmpeg

FFmpeg for Android, iOS and tvOS. Not maintained anymore. Superseded by FFmpegKit.

Language: C | GPL-3.0 | 380500

Wave-U-Net-for-Speech-Enhancement

A PyTorch implementation of Wave-U-Net, adapted for speech enhancement.

Language: Python | MIT | 31000

This repo provides the processed samples of the manuscript "MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-head Transformer with Convolution-augmented Joint Self-Attentions", which was submitted to ICASSP 2023.

Apache-2.0 | 7000

hello_driver

HelloWorld for Linux Device Driver

Language: C | 200

AudioClassification-Pytorch

A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.

Language: Python | Apache-2.0 | 30500

tpu-perf

A model benchmark tool

Language: Python | MIT | 1000

vddswitcher

A Windows console tool that uses parsec-vdd to switch to a virtual display.

Language: C++ | GPL-3.0 | 8600

person_search_demo

Uses YOLOv3 together with a person re-identification model to detect and identify pedestrians and to search for a specific person.

Language: Python | MIT | 52300

ustreamer

µStreamer - Lightweight and fast MJPEG-HTTP streamer

Language: C | GPL-3.0 | 158400

rkdeveloptool

Language: C++ | GPL-2.0 | 13600

gpt-image

Tool to create GPT disk image files

Language: Python | MIT | 900

FunASR-bm

Language: Python | NOASSERTION | 400

llama3

The official Meta Llama 3 GitHub site

Language: Python | NOASSERTION | 2120600

tspi-linux-sdk

[Not an official LCEDA release] LCEDA Tai-Shang Pi Linux SDK: a Linux SDK for the LCEDA Tai-Shang Pi development board.

3200

zhvoice

Chinese voice corpus with clearer, more natural speech. It comprises 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.

51500

pulse_audio_demo

Language: C++ | 100

uwe5621ds-aml

uwe5621ds driver for the Amlogic platform

Language: C | 900

rknn-llm

Language: Python | NOASSERTION | 18000

alsa-examples

Generic ALSA samples.

Language: C | 200

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language: Python | MIT | 637600

sherpa-ncnn

Real-time speech recognition using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Raspberry Pi, VisionFive2, LicheePi4A, etc.

Language: C++ | Apache-2.0 | 84700

sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, and Swift.

Language: C++ | Apache-2.0 | 97900
