WeiHaoran (Ucas-HaoranWei)

Company: University of Chinese Academy of Sciences

Location: Beijing

WeiHaoran's starred repositories

Hi-SAM

[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

Language: Python | License: Apache-2.0 | Stargazers: 180 | Issues: 0

tikzjax

TikZJax is TikZ running under WebAssembly in the browser

Language: JavaScript | License: LPPL-1.3c | Stargazers: 436 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 66 | Issues: 0
Language: Python | Stargazers: 32 | Issues: 0

SMT-plusplus

Official implementation of the Sheet Music Transformer ++

Language: Python | License: MIT | Stargazers: 10 | Issues: 0

Fox

Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"

Stargazers: 3 | Issues: 0

MoVA

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Language: Python | License: Apache-2.0 | Stargazers: 107 | Issues: 0

olimpic-icdar24

Practical End-to-End Optical Music Recognition for Pianoform Music

Language: Python | License: MIT | Stargazers: 9 | Issues: 0

Fox

Official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"

Language: Python | Stargazers: 87 | Issues: 0

ChatSpot

Official implementation of the IJCAI2024 accepted paper "ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning"

License: NOASSERTION | Stargazers: 7 | Issues: 0

merlin

[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds

Language: Python | License: NOASSERTION | Stargazers: 79 | Issues: 0

UnrealText

Synthetic Scene Text from 3D Engines

Language: C++ | License: MIT | Stargazers: 240 | Issues: 0

Vary-tiny-600k

Vary-tiny codebase built upon LAVIS (for training from scratch) and a dataset of PDF image-text pairs (about 600k, including English and Chinese)

Language: Python | Stargazers: 28 | Issues: 0

OneChart

[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"

Language: Python | License: Apache-2.0 | Stargazers: 133 | Issues: 0

DreamLLM

[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation

Language: Python | License: Apache-2.0 | Stargazers: 363 | Issues: 0

Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Language: Python | Stargazers: 2 | Issues: 0

dessurt

Official implementation for Dessurt

Language: Python | License: MIT | Stargazers: 55 | Issues: 0

imp

A family of highly capable yet efficient large multimodal models

Language: Python | License: Apache-2.0 | Stargazers: 152 | Issues: 0
Stargazers: 51 | Issues: 0

Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Language: Python | Stargazers: 579 | Issues: 0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1194 | Issues: 0

Vary

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Language: Python | Stargazers: 1687 | Issues: 0

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4554 | Issues: 0

CornerAffinity

[IJCAI2022] Corner Affinity: A Robust Grouping Algorithm to Make Corner-guided Detector Great Again

Language: Python | License: BSD-3-Clause | Stargazers: 5 | Issues: 0

HumanLiker

[NeurIPS2022 spotlight] HumanLiker: A Human-like Object Detector to Model the Manual Labeling Process

Language: Python | Stargazers: 6 | Issues: 0

CCNet-Pure-Pytorch

Criss-Cross Attention (2d&3d) for Semantic Segmentation in pure Pytorch with a faster and more precise implementation.

Language: Python | License: MIT | Stargazers: 182 | Issues: 0

Aircraft-KP

Keypoint dataset for airplane

Language: Python | Stargazers: 10 | Issues: 0

Object-Detection-Metrics

Most popular metrics used to evaluate object detection algorithms.

Language: Python | License: MIT | Stargazers: 4916 | Issues: 0