2132660698's repositories
AGI-Samantha
AGI has been achieved externally
Awesome-BEV-Perception-Multi-Cameras
Awesome papers about Multi-Camera 3D Object Detection and Segmentation in Bird's-Eye-View, such as DETR3D, BEVDet, BEVFormer, BEVDepth, UniAD
Bert-VITS2-ext
Facial expression and animation testing based on Bert-VITS2.
Bumblebee
A simple MLLM that surpasses QwenVL-Max using only open-source data, built on a 14B LLM.
ChatLM-mini-Chinese
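A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources the code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks.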
DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
InstantMesh
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
InternVL
[CVPR 2024 Oral] InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks —— An Open-Source Alternative to ViT-22B
life2vec-light
Basic implementation of the life2vec model with the dummy data.
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
MGMap
[CVPR2024] The code for "MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction"
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
OccWorld
[ECCV 2024] 3D World Model for Autonomous Driving
OneChart
Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
open-parse
Improved file parsing for LLMs
Perplexica
Perplexica is an AI-powered search engine and an open-source alternative to Perplexity AI.
QAnything
Question and Answer based on Anything.
Qwen1.5
Qwen1.5 is the improved version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud.
ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
screenshot-to-code
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
search2ai
Help your LLMs get online
shap-e
Generate 3D objects conditioned on text or images
Stirling-PDF
#1 Locally hosted web application that allows you to perform various operations on PDF files
StoryDiffusion
Create Magic Story!
StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Time-LLM
[ICLR 2024] Official implementation of "Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
UniTS
A unified time series model.
Vi-internlm2-xcomposer
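As the Belt and Road Initiative continues to advance, economic and cultural exchanges between ** and ASEAN countries have grown increasingly close, driving demand for natural language processing research on ASEAN languages, Vietnamese in particular. Large language models remain limited in multilingual support and fall short on ASEAN languages such as Vietnamese. This study aims to develop and optimize a multimodal large language model for Vietnamese to improve its efficiency and accuracy in applications, thereby supporting broader cross-cultural exchange and cooperation.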
VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks