Zheng Bowen (SLTK1)

Location: Shenzhen, China

Zheng Bowen's starred repositories

langflow

Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.

Language: Python | License: MIT | Stargazers: 26072 | Issues: 202 | Issues: 1238

generative_agents

Generative Agents: Interactive Simulacra of Human Behavior

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 15663 | Issues: 101 | Issues: 988

qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Language: Jupyter Notebook | License: MIT | Stargazers: 9861 | Issues: 84 | Issues: 247

fiftyone

The open-source tool for building high-quality datasets and computer vision models

Language: Python | License: Apache-2.0 | Stargazers: 8024 | Issues: 55 | Issues: 1496

ARC-AGI

The Abstraction and Reasoning Corpus

Language: JavaScript | License: Apache-2.0 | Stargazers: 3245 | Issues: 96 | Issues: 65

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | Stargazers: 1673 | Issues: 23 | Issues: 62

torchtitan

A native PyTorch Library for large model training

Language: Python | License: BSD-3-Clause | Stargazers: 1492 | Issues: 35 | Issues: 125

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 1413 | Issues: 23 | Issues: 60

License-Plate-Detector

License plate detection based on YOLOv5: faster and more accurate.

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python | License: MIT | Stargazers: 1164 | Issues: 21 | Issues: 52

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

camera_calibration

Accurate geometric camera calibration with generic camera models

Language: C++ | License: BSD-3-Clause | Stargazers: 690 | Issues: 29 | Issues: 67

libSGM

Stereo Semi-Global Matching with CUDA

Language: C++ | License: Apache-2.0 | Stargazers: 609 | Issues: 31 | Issues: 69

Segmentation-Pytorch

Semantic segmentation in PyTorch. Networks include: FCN, FCN_ResNet, SegNet, UNet, BiSeNet, BiSeNetV2, PSPNet, DeepLabv3_plus, HRNet, DDRNet

chat_templates

Chat Templates for 🤗 HuggingFace Large Language Models

Language: Jinja | License: MIT | Stargazers: 426 | Issues: 6 | Issues: 10

Rotated_IoU

Differentiable IoU of rotated bounding boxes using Pytorch

Language: Python | License: MIT | Stargazers: 411 | Issues: 9 | Issues: 54

Hi-SAM

[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

Language: Python | License: Apache-2.0 | Stargazers: 183 | Issues: 12 | Issues: 18

DocScanner

The official repo for “DocScanner: Robust Document Image Rectification with Progressive Learning”.

Language: Python | License: NOASSERTION | Stargazers: 148 | Issues: 18 | Issues: 9

OneChart

[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"

Language: Python | License: Apache-2.0 | Stargazers: 134 | Issues: 1 | Issues: 16

Awesome-Chart-Understanding

A curated list of recent and past chart understanding work based on our survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models.

Table-LLaVA

Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tabular MLLM named Table-LLaVA.

Language: Python | License: Apache-2.0 | Stargazers: 121 | Issues: 6 | Issues: 8

Ant-Multi-Modal-Framework

Research Code for Multimodal-Cognition Team in Ant Group

Language: Python | License: CC-BY-4.0 | Stargazers: 101 | Issues: 4 | Issues: 18

awesome-table-structure-recognition

A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating.

DocGenome

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 85 | Issues: 4 | Issues: 4

RapidLayout

Layout analysis for Chinese and English documents

Language: Python | License: Apache-2.0 | Stargazers: 81 | Issues: 4 | Issues: 0

vllm-fork

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 27 | Issues: 2 | Issues: 0

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0