ainisa20

Madison Smith's repositories

Automatic-Speech-Recognition-from-Scratch

A minimal Seq2Seq example of Automatic Speech Recognition (ASR) based on the Transformer

Language: Python, License: MIT

CcClip

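Pure front-end audio and video editing built with Vue (Vue 3) + ffmpeg + wasm. Features include video clipping, audio clipping, audio mixing and cropping, waveform display, video frame extraction, GIF frame extraction, a frame player, subtitles, stickers, a timeline, and material tracks.

Language: Vue, License: NOASSERTION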

ChatGPT-Next-Web

One-click deployment of a well-designed ChatGPT web UI on Vercel. Get your own ChatGPT web service with one click.

Language: TypeScript, License: NOASSERTION

CogVLM

A state-of-the-art open visual language model | multimodal pre-trained model

Language: Python, License: NOASSERTION

Detect-and-read-meters

This is the first released system for detecting and reading complex meters, implemented with computer vision techniques.

Language: Python, License: MIT

facefusion

Next generation face swapper and enhancer

Language: Python, License: NOASSERTION

GPT-SoVITS

Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

License: MIT

Grounded-Segment-Anything

Grounded-SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

License: Apache-2.0

langchain-ChatGLM

langchain-ChatGLM: ChatGLM question answering over a local knowledge base, built with langchain

License: Apache-2.0

Linly-Talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies such as Whisper, Linly, Microsoft Speech Services, and the SadTalker talking head generation system. 🌟🔬

License: MIT

LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

License: Apache-2.0

MiniGemini

Official implementation for Mini-Gemini

License: Apache-2.0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

License: Apache-2.0

MuseV

MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising

License: MIT

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

License: NOASSERTION

PaddlePaddle-DeepSpeech

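Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is complete and recognition quality is good. Supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.

License: Apache-2.0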

parler-tts

Inference and training library for high-quality TTS models.

Language: Python, License: Apache-2.0

ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines

License: Apache-2.0

recognize-anything

Code for the Recognize Anything Model (RAM) and Tag2Text Model

License: Apache-2.0

ReplaceAnything

roop

one-click face swap

License: GPL-3.0

scrcpy

Display and control your Android device

License: Apache-2.0

self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Language: Python, License: MIT

tmagic-editor

Language: TypeScript, License: NOASSERTION

VITS-Pytorch

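This is a PyTorch-based speech synthesis project that uses VITS, a speech synthesis method. This end-to-end model is very simple to use: it needs no complicated steps such as text alignment, and training and generation each take a single step, which greatly lowers the learning barrier.

License: Apache-2.0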

Whisper-Finetune

Fine-tune the Whisper speech recognition model, with support for training without timestamp data, training with timestamp data, and training without speech data. Also accelerates inference and supports web deployment, Windows desktop deployment, and Android deployment.

License: Apache-2.0

wvp-GB28181-pro

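WEB VIDEO PLATFORM is a network video platform based on the GB28181-2016 standard. It supports NAT traversal and access for IPC, NVR, and DVR devices from brands such as Hikvision, Dahua, and Uniview. It supports GB28181 cascading, forwarding rtsp/rtmp and other video streams to a GB28181 platform, and forwarding rtsp/rtmp and other pushed streams to a GB28181 platform.

Language: Java, License: MIT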

ZLMediaKit

WebRTC/RTSP/RTMP/HTTP/HLS/HTTP-FLV/WebSocket-FLV/HTTP-TS/HTTP-fMP4/WebSocket-TS/WebSocket-fMP4/GB28181/SRT server and client framework based on C++11

License: NOASSERTION

Automatic-Speech-Recognition-from-Scratch

CcClip

ChatGPT-Next-Web

CogVLM

CPM-Bee

Detect-and-read-meters

facefusion

FastSAM

GPT-SoVITS

Grounded-Segment-Anything

langchain-ChatGLM

Linly-Talker

LLaMA-Factory

MiniGemini

mPLUG-DocOwl

MuseV

OOTDiffusion

PaddlePaddle-DeepSpeech

parler-tts

ragas

recognize-anything

ReplaceAnything

roop

scrcpy

self-rag

tmagic-editor

VITS-Pytorch

Whisper-Finetune

wvp-GB28181-pro

ZLMediaKit