nifeng (nemonameless)



Company: Baidu

Location: Beijing


nifeng's repositories

PaddleDetection

Object detection and instance segmentation toolkit based on PaddlePaddle.

Language: Python | License: Apache-2.0 | Stargazers: 3 | Issues: 1 | Issues: 0

PaddleYOLO

🚀🚀🚀 YOLO series implementations based on PaddleDetection: PPYOLOE, YOLOX, YOLOv5, YOLOv6, YOLOv7, and so on. 🚀🚀🚀

Language: Python | License: GPL-3.0 | Stargazers: 2 | Issues: 1 | Issues: 0

AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

Stargazers: 0 | Issues: 0 | Issues: 0

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

cobra

Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

DiS

Scalable Diffusion Models with State Space Backbone

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

Emu

Emu: An Open Multimodal Generalist

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

InternVL

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks —— An Open-Source Alternative to ViT-22B

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

LLaVA

Visual Instruction Tuning: Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

MetaTransformer

Meta-Transformer for Unified Multimodal Learning

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OmDet

Fast and accurate open-vocabulary end-to-end object detection

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Paddle

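PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the core PaddlePaddle framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)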

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

PaddleClas

A treasure chest for visual recognition powered by PaddlePaddle

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

PixArt-sigma

New PixArt Model, Faster, Stronger, Better

License: AGPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

StreamingT2V

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction"

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

VMamba

VMamba: Visual State Space Models

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

xtuner

An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM, Llama, Baichuan, Qwen, ChatGLM)

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

zigma

The official implementation of "ZigMa: A DiT-Style Mamba-based Diffusion Model"

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0