Siri-2001

Siri-2001's starred repositories

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python | License: MIT | Stargazers: 1224 | Issues: 0

DRSformer

Learning A Sparse Transformer Network for Effective Image Deraining (CVPR 2023)

Language: Python | Stargazers: 231 | Issues: 0

AMBER

An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation

Language: Python | License: Apache-2.0 | Stargazers: 61 | Issues: 0

Fay

Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.

License: GPL-3.0 | Stargazers: 8374 | Issues: 0

anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.

Language: JavaScript | License: MIT | Stargazers: 15934 | Issues: 0

audioWhisper

Listen to any audio stream on your machine and print out the transcribed or translated audio.

Language: Python | License: MIT | Stargazers: 109 | Issues: 0

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python | License: MIT | Stargazers: 9717 | Issues: 0

VoiceTyping

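Real-time text input by voice (speaking). Built as a secondary development on the PaddleSpeech project; supports offline, air-gapped deployment and GPU inference. The client currently supports Windows only.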

Language: Python | Stargazers: 24 | Issues: 0

Glance-Focus

This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)

Language: Python | License: MIT | Stargazers: 17 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Jupyter Notebook | License: MIT | Stargazers: 49779 | Issues: 0

VecFloorSeg

Source code repo for VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation

Language: Python | Stargazers: 30 | Issues: 0

Room-Segmentation

Automatic Room Segmentation

Language: Python | Stargazers: 9 | Issues: 0
Language: Python | License: MIT | Stargazers: 7 | Issues: 0
Language: Python | Stargazers: 49 | Issues: 0
Stargazers: 67 | Issues: 0

llm-action

This project aims to share the technical principles behind large models along with hands-on experience.

Language: HTML | License: Apache-2.0 | Stargazers: 7106 | Issues: 0

py_floor_plan_segmenter

A Python package to segment cluttered 2D floor plans based on down-sampling.

Language: Python | License: BSD-3-Clause | Stargazers: 22 | Issues: 0

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 17422 | Issues: 0

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models.

Language: Python | License: MIT | Stargazers: 4383 | Issues: 0

RetNet

Huggingface compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent, and chunkwise forward.

Language: Jupyter Notebook | License: MIT | Stargazers: 222 | Issues: 0

RetNet

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"

Language: Python | License: MIT | Stargazers: 1138 | Issues: 0

MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Language: Python | License: BSD-3-Clause | Stargazers: 25074 | Issues: 0

floor-sp

Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path, ICCV 2019

Language: Python | License: MIT | Stargazers: 129 | Issues: 0

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python | License: BSD-3-Clause | Stargazers: 2536 | Issues: 0

COCO-WholeBody

ECCV2020 paper "Whole-Body Human Pose Estimation in the Wild"

Language: Python | Stargazers: 720 | Issues: 0

wholebody3d

Official repository of Human3.6M 3D WholeBody (H3WB) dataset

Language: Python | License: MIT | Stargazers: 223 | Issues: 0

INR-V-VideoGenerationSpace

The Official Implementation for INR-V: A Continuous Representation Space for Video-based Generative Tasks

Language: Python | Stargazers: 13 | Issues: 0

slt_how2sign_wicv2023

Sign Language Translation for Instructional Videos - CVPR WiCV 2023

Language: Python | License: MIT | Stargazers: 32 | Issues: 0

Chinese-LLaMA-Alpaca-2

Phase two of the Chinese LLaMA-2 & Alpaca-2 large language model project, plus 64K long-context models.

Language: Python | License: Apache-2.0 | Stargazers: 6964 | Issues: 0

stable-diffusion-webui-extension-templates

A template for a stable-diffusion-webui extension

Language: Python | License: Apache-2.0 | Stargazers: 66 | Issues: 0