yumianhuli (yumianhuli1)

yumianhuli1

Company: China

Location: Shanghai

yumianhuli's repositories

adetailer

Automatic detection, masking, and inpainting with a detection model.

Language: Python, License: AGPL-3.0, Stargazers: 0, Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python, License: MIT, Stargazers: 0, Issues: 0

Awesome-Interaction-Aware-Trajectory-Prediction

A selection of state-of-the-art research materials on trajectory prediction

Language: TeX, License: MIT, Stargazers: 0, Issues: 0

ChatDev-QingHua

Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)

License: Apache-2.0, Stargazers: 0, Issues: 0

crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

License: MIT, Stargazers: 0, Issues: 0

Fay

Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.

License: GPL-3.0, Stargazers: 0, Issues: 0

FollowYourClick

[arXiv 2024] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"

Stargazers: 0, Issues: 0

FollowYourPose

[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"

License: MIT, Stargazers: 0, Issues: 0

HybrIK

Official code of "HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation", CVPR 2021

License: MIT, Stargazers: 0, Issues: 0

InternVideo

InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)

License: Apache-2.0, Stargazers: 0, Issues: 0

Latte

Latte: Latent Diffusion Transformer for Video Generation.

License: Apache-2.0, Stargazers: 0, Issues: 0

LLMUnity

Integrate LLM models in Unity!

License: MIT, Stargazers: 0, Issues: 0

MaterialSearch

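AI semantic search for local media: search by image, find local files, match frames to a text description, search video frames, and find videos from a scene description. Search local photos and videos through natural language.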

License: GPL-3.0, Stargazers: 0, Issues: 0

momask-codes

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"

Language: Python, License: MIT, Stargazers: 0, Issues: 0

MoneyPrinterV2

Automate the process of making money online.

License: AGPL-3.0, Stargazers: 0, Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.

License: NOASSERTION, Stargazers: 0, Issues: 0

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by the Qwen team, Alibaba Cloud.

Stargazers: 0, Issues: 0

RapidVideOCR

Extract video hard subtitles and automatically generate corresponding srt files.

License: Apache-2.0, Stargazers: 0, Issues: 0

Sakura-13B-Galgame

A large language model for Japanese-to-Chinese translation, adapted to light novels and Galgames.

Stargazers: 0, Issues: 0

sd-webui-negpip

Extension for Stable Diffusion web UI that enables negative prompts within the prompt.

License: AGPL-3.0, Stargazers: 0, Issues: 0

smanga

A simple manga browser that installs directly with Docker.

Stargazers: 0, Issues: 0

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

License: MIT, Stargazers: 0, Issues: 0

UniEdit

UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing

Stargazers: 0, Issues: 0

VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

Stargazers: 0, Issues: 0

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

License: Apache-2.0, Stargazers: 0, Issues: 0

YOLO-World

Real-Time Open-Vocabulary Object Detection

License: GPL-3.0, Stargazers: 0, Issues: 0