Madison Smith (ainisa20)

0 followers, 0 following

Location: Xinjiang University, No. 666 Hetian Street, Shayibake District, Urumqi City, Xinjiang Uygur Autonomous Region, China

Madison Smith's repositories

MiniGemini

Official implementation for Mini-Gemini

License: Apache-2.0, Stargazers: 0, Issues: 0

parler-tts

Inference and training library for high-quality TTS models.

License: Apache-2.0, Stargazers: 0, Issues: 0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

License: Apache-2.0, Stargazers: 0, Issues: 0

ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines

License: Apache-2.0, Stargazers: 0, Issues: 0

self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

License: MIT, Stargazers: 0, Issues: 0

LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

License: Apache-2.0, Stargazers: 0, Issues: 0

Automatic-Speech-Recognition-from-Scratch

A minimal Seq2Seq example of Automatic Speech Recognition (ASR) based on Transformer

License: MIT, Stargazers: 0, Issues: 0

MuseV

MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising

License: MIT, Stargazers: 0, Issues: 0

Linly-Talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬

License: MIT, Stargazers: 0, Issues: 0

OOTDiffusion

Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

License: NOASSERTION, Stargazers: 0, Issues: 0

Whisper-Finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

License: Apache-2.0, Stargazers: 0, Issues: 0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

License: MIT, Stargazers: 0, Issues: 0

facefusion

Next generation face swapper and enhancer

License: NOASSERTION, Stargazers: 0, Issues: 0

CogVLM

A state-of-the-art open visual language model | multimodal pre-trained model

License: NOASSERTION, Stargazers: 0, Issues: 0

roop

one-click face swap

License: GPL-3.0, Stargazers: 0, Issues: 0

wvp-GB28181-pro

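WEB VIDEO PLATFORM is a network video platform implemented on the GB28181-2016 standard. It supports NAT traversal and access for IPC, NVR, and DVR devices from brands such as Hikvision, Dahua, and Uniview. It also supports national-standard cascading, and forwarding of rtsp/rtmp video streams, including pushed streams, to national-standard platforms.

Language: Java, License: MIT, Stargazers: 0, Issues: 0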

ZLMediaKit

WebRTC/RTSP/RTMP/HTTP/HLS/HTTP-FLV/WebSocket-FLV/HTTP-TS/HTTP-fMP4/WebSocket-TS/WebSocket-fMP4/GB28181/SRT server and client framework based on C++11

License: NOASSERTION, Stargazers: 0, Issues: 0

VITS-Pytorch

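A PyTorch-based speech synthesis project that uses VITS, a speech synthesis method. This end-to-end model is very simple to use: it needs no complicated steps such as text alignment and supports one-click training and generation, which greatly lowers the learning barrier.

License: Apache-2.0, Stargazers: 0, Issues: 0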

PaddlePaddle-DeepSpeech

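Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is complete and recognition quality is good. It supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.

License: Apache-2.0, Stargazers: 0, Issues: 0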

Grounded-Segment-Anything

Grounded-SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

License: Apache-2.0, Stargazers: 0, Issues: 0

recognize-anything

Code for the Recognize Anything Model (RAM) and Tag2Text Model

License: Apache-2.0, Stargazers: 0, Issues: 0

Detect-and-read-meters

This is the first released system for the detection and recognition of complex meters, implemented with computer vision techniques.

License: MIT, Stargazers: 0, Issues: 0

FastSAM

Fast Segment Anything

License: Apache-2.0, Stargazers: 0, Issues: 0

langchain-ChatGLM

langchain-ChatGLM, local knowledge based ChatGLM with langchain | ChatGLM question answering based on a local knowledge base

License: Apache-2.0, Stargazers: 0, Issues: 0

CPM-Bee

A Chinese-English bilingual foundation model with ten billion parameters

Stargazers: 0, Issues: 0

CcClip

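Pure front-end audio and video editing implemented with vue (vue3) + ffmpeg + wasm. Features include video clipping, audio clipping, audio mixing and trimming, waveform display, video frame extraction, GIF frame extraction, a frame player, subtitles, stickers, a timeline, and asset tracks.

License: NOASSERTION, Stargazers: 0, Issues: 0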

scrcpy

Display and control your Android device

License: Apache-2.0, Stargazers: 0, Issues: 0

ChatGPT-Next-Web

One-click deployment of a well-designed ChatGPT web UI on Vercel. Get your own ChatGPT web service with one click.

License: NOASSERTION, Stargazers: 0, Issues: 0