Chongyu-Liu (lcy0604)

Company: SCUT

Location: Guangzhou

Chongyu-Liu's starred repositories

MegaHan97K

MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories

Language: Python | Stargazers: 4 | Issues: 0

anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Language: Python | Stargazers: 528 | Issues: 0

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 1352 | Issues: 0

synthtiger

Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021

Language: Python | License: MIT | Stargazers: 454 | Issues: 0

One-DM

Official implementation of One-DM (ECCV 2024).

License: MIT | Stargazers: 5 | Issues: 0

Document-AI-Recommendations

Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.

Stargazers: 137 | Issues: 0

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Language: Python | Stargazers: 685 | Issues: 0

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | Stargazers: 2624 | Issues: 0

VisionLLM

VisionLLM Series

Language: Python | License: Apache-2.0 | Stargazers: 748 | Issues: 0

DiffMatch

Official implementation of "Diffusion Model for Dense Matching" (ICLR'24 Oral)

Language: Python | Stargazers: 130 | Issues: 0

DocRes

[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Language: Python | License: MIT | Stargazers: 221 | Issues: 0

RFUND

Official release of RFUND introduced in the paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction" (arXiv:2401.03472).

Stargazers: 16 | Issues: 0

UPOCR

Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024)

Language: Python | Stargazers: 27 | Issues: 0

geektime-books

Geek Time (极客时间) e-books

Stargazers: 9640 | Issues: 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 3102 | Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5722 | Issues: 0

Awesome-LLMs-Datasets

Summarize existing representative LLMs text datasets.

License: Apache-2.0 | Stargazers: 737 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 7021 | Issues: 0

ChartAst

ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.

Language: Python | License: NOASSERTION | Stargazers: 66 | Issues: 0

imagen-pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

Language: Python | License: MIT | Stargazers: 7898 | Issues: 0

FontDiffuser

[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Language: Python | Stargazers: 230 | Issues: 0

EEG-Transformer

(i) A practical application of the Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it could also be tried with EMG, EOG, ECG, etc. (ii) Includes attention over the spatial dimension (channel attention) and the temporal dimension. (iii) Common spatial pattern (CSP), an efficient feature enhancement method, implemented in Python.

Language: Python | License: GPL-3.0 | Stargazers: 234 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 102 | Issues: 0

T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Language: Python | License: NOASSERTION | Stargazers: 2013 | Issues: 0

LLM-in-Vision

Recent LLM-based CV and related works. Welcome to comment/contribute!

Stargazers: 800 | Issues: 0

awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers.

License: MIT | Stargazers: 1094 | Issues: 0

GPT-4V_OCR

Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)

Language: Python | Stargazers: 110 | Issues: 0

LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

Stargazers: 5834 | Issues: 0

OWTTT

[ICCV 2023 Oral] Official repository for “On the Robustness of Open-World Test-Time Training: Self-Training with Dynamic Prototype Expansion”

Language: Python | License: MIT | Stargazers: 39 | Issues: 0

Text2Tex

[ICCV 2023] Text2Tex: Text-driven Texture Synthesis via Diffusion Models

Language: Python | License: NOASSERTION | Stargazers: 530 | Issues: 0