Dezhi Peng (shannanyinxiang)

Company: South China University of Technology

Location: Guangzhou, China

Dezhi Peng's starred repositories

DocRes

[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Language: Python | License: MIT | Stargazers: 213 | Issues: 0

RFUND

Official release of RFUND introduced in the paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction" (arXiv:2401.03472).

Stargazers: 15 | Issues: 0

UPOCR

Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024)

Language: Python | Stargazers: 24 | Issues: 0

Decoupled-attention-network

PyTorch implementation for "Decoupled attention network for text recognition".

Language: Python | License: MIT | Stargazers: 310 | Issues: 0

Awesome-LLMs-Datasets

A summary of existing representative LLM text datasets.

License: Apache-2.0 | Stargazers: 725 | Issues: 0

FontDiffuser

[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

Language: Python | Stargazers: 225 | Issues: 0

EEG-Transformer

i. A practical application of Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it could also be tried with EMG, EOG, ECG, etc. ii. Includes attention over the spatial dimension (channel attention) and the temporal dimension. iii. Common spatial pattern (CSP), an efficient feature enhancement method, implemented in Python.

Language: Python | License: GPL-3.0 | Stargazers: 231 | Issues: 0

Monkey

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Language: Python | License: MIT | Stargazers: 1556 | Issues: 0

HST

This is an official implementation for "Hierarchical Side Tuning for Vision Transformers".

Language: Python | Stargazers: 18 | Issues: 0

SCUT-EnsExam

SCUT-EnsExam is a real-world handwritten text erasure dataset for examination paper scenarios, consisting of 545 examination paper images. The dataset is randomly divided into a training set of 430 images and a test set of 115 images.

Stargazers: 7 | Issues: 0

CG-GAN

Official PyTorch implementation of the CVPR 2022 paper: "Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator"

Language: Python | Stargazers: 84 | Issues: 0

ESTextSpotter

(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

Language: Python | Stargazers: 70 | Issues: 0

GPT-4V_OCR

Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)

Language: Python | Stargazers: 108 | Issues: 0

Real-CE

Real-CE: A Benchmark for Chinese-English Scene Text Image Super-resolution (ICCV2023)

Language: Python | Stargazers: 66 | Issues: 0

LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

Stargazers: 5769 | Issues: 0

DocTamper

[CVPR2023] Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution

Language: Python | Stargazers: 96 | Issues: 0

a2m_chineseNMT

Dataset for TALLIP2019 paper "Ancient-Modern Chinese Translation with a New Large Training Dataset"

Stargazers: 21 | Issues: 0

Adaptor

Global Adaptive Transformer for Cross-Subject EEG Classification.

Language: Python | License: GPL-3.0 | Stargazers: 20 | Issues: 0

CS-GAN

Common Spatial Generative Adversarial Networks based Data Augmentation for Cross-Subject Brain-Computer Interface

Language: Python | License: GPL-3.0 | Stargazers: 7 | Issues: 0

EEG-Toolbox

A toolbox for EEG signals processing. Welcome to join and build!

Language: Python | Stargazers: 10 | Issues: 0

EEG-Conformer

EEG Transformer 2.0. i. Convolutional Transformer for EEG Decoding. ii. Novel visualization - Class Activation Topography.

Language: Python | License: GPL-3.0 | Stargazers: 364 | Issues: 0

Recommendations-Diffusion-Text-Image

A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super-resolution, text editing, handwriting generation, scene text recognition, and scene text detection.

Stargazers: 157 | Issues: 0
License: GPL-3.0 | Stargazers: 21 | Issues: 0
Language: Jupyter Notebook | Stargazers: 12 | Issues: 0

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Language: Python | License: MIT | Stargazers: 163822 | Issues: 0

Union14M

[ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective

Language: Python | License: MIT | Stargazers: 150 | Issues: 0
Language: Python | Stargazers: 30 | Issues: 0

SMT

This is an official implementation for "Scale-Aware Modulation Meet Transformer".

Language: Python | License: MIT | Stargazers: 176 | Issues: 0

Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models

Stargazers: 10568 | Issues: 0

ViTEraser

Official implementation of ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining (AAAI 2024)

Language: Python | License: MIT | Stargazers: 26 | Issues: 0