Kunkka's starred repositories
QuantumSpeech-QCNN
IEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing and Automatic Speech Recognition
llama2-webui
Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.
ControlVideo
[ICLR 2024] Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
haystack
🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models (LLMs), with local CPU/GPU training and deployment
S.A.T.U.R.D.A.Y
A toolbox for working with WebRTC, Audio and AI
detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
LxgwNeoZhiSong
A Chinese serif font derived from IPAmj Mincho.
Douyin_TikTok_Download_API
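🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.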
qtransformer
Quantum-enhanced transformer neural network
LLM-in-Vision
Recent LLM-based CV and related works. Welcome to comment/contribute!
Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
groove2groove
Code for "Groove2Groove: One-Shot Music Style Transfer with Supervision from Synthetic Data"
MusicTransformer-Pytorch
MusicTransformer written for MaestroV2 using the PyTorch framework for music generation
Multimodal-GPT
Multimodal-GPT
chineseocr
YOLOv3 + OCR