xyzjin

xyzjin's starred repositories

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python | License: AGPL-3.0 | 749900

everyone-can-use-english

Everyone can use English

Language: TypeScript | License: MPL-2.0 | 2359900

Visual-Instruction-Tuning

SVIT: Scaling up Visual Instruction Tuning

Language: Python | License: MIT | 15700

ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Language: Python | License: Apache-2.0 | 23200

coyo-dataset

COYO-700M: Large-scale Image-Text Pair Dataset

Language: Python | 112500

HanLP

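Natural language processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification, clustering, pinyin conversion, and Simplified-Traditional Chinese conversion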

Language: Python | License: Apache-2.0 | 3326900

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | 5500800

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | License: BSD-3-Clause | 544200

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | 965100

UFE-AVS

Official code for the CVPR 2024 paper "Audio-Visual Segmentation via Unlabeled Frame Exploitation"

Language: Python | 900

X-Decoder

[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language

Language: Python | License: Apache-2.0 | 127600

developer2gwy

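Civil service, from getting started to landing the job: the best practical guide to the civil service exam for programmers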

License: NOASSERTION | 690000

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook | License: Apache-2.0 | 210500

cuda_learning

learning how CUDA works

Language: Cuda | 12500

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language: Python | License: BSD-3-Clause | 147100

modelscope-classroom

Language: Jupyter Notebook | License: Apache-2.0 | 26100

shikra

Language: Python | License: NOASSERTION | 71800

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | 1006300

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 788700

ms-swift

Use PEFT or Full-parameter to finetune 300+ LLMs or 60+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python | License: Apache-2.0 | 282100

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | 112700

CUDA-Learn-Notes

🎉CUDA/C++ notes / hand-written CUDA kernels for large models / technical blog, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Language: Cuda | License: GPL-3.0 | 99500

Tianji

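Tianji is a large language model system focused on social etiquette and worldly wisdom. You can use it for tasks involving traditional social conventions, such as how to pay a compliment or how to handle social situations tactfully, to improve your "emotional intelligence" and "core competitiveness".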

Language: Python | License: Apache-2.0 | 30100

ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Language: Python | License: Apache-2.0 | 26400

dataset

The Open Images dataset

Language: Python | License: Apache-2.0 | 423900

fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".

Language: Jupyter Notebook | License: Apache-2.0 | 47000

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 452900

Uniaa

Unified Multi-modal IAA Baseline and Benchmark

6800

LLaVA-UHD-Better

A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo

Language: Python | License: Apache-2.0 | 2700

conv-llava

Language: Python | License: Apache-2.0 | 9200