rkshuai's starred repositories

llama.cpp

LLM inference in C/C++

Language: Python | License: NOASSERTION | Stargazers: 34535 | Issues: 305 | Issues: 350

yapf

A formatter for Python files

Language: Python | License: Apache-2.0 | Stargazers: 13669 | Issues: 211 | Issues: 840

awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

Language: Python | License: NOASSERTION | Stargazers: 3333 | Issues: 84 | Issues: 543
Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2902 | Issues: 24 | Issues: 74

SuperCLUE

SuperCLUE: A comprehensive benchmark for general-purpose Chinese foundation models

DB

A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".

BlueLM

BlueLM (蓝心大模型): Open large language models developed by vivo AI Lab

Language: Python | License: NOASSERTION | Stargazers: 797 | Issues: 13 | Issues: 26

deocclusion

Code for our CVPR 2020 work.

Language: Python | License: Apache-2.0 | Stargazers: 773 | Issues: 16 | Issues: 52

minigpt4.cpp

Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)

Language: C++ | License: MIT | Stargazers: 547 | Issues: 8 | Issues: 13

LaTeX_OCR

Math formula recognition (Math Formula OCR)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 476 | Issues: 15 | Issues: 15

DocTr

The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.

Language: Python | License: MIT | Stargazers: 335 | Issues: 17 | Issues: 30

DocProj

Document Rectification and Illumination Correction using a Patch-based CNN

Language: Python | License: MIT | Stargazers: 320 | Issues: 13 | Issues: 28

E2E-MLT

E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text

Language: C++ | License: MIT | Stargazers: 290 | Issues: 16 | Issues: 75

Table-OCR

Recognize tables in images and restore them as Word documents.

Language: C++ | License: GPL-3.0 | Stargazers: 270 | Issues: 13 | Issues: 20

CapsFusion

[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale

Dewarping-Document-Image-By-Displacement-Flow-Estimation

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

Language: Python | License: MIT | Stargazers: 153 | Issues: 6 | Issues: 13

seq2seq-layout-analysis

End-to-end layout analysis based on seq2seq

DBnet-lite.pytorch

A PyTorch re-implementation of "Real-time Scene Text Detection with Differentiable Binarization"

MMBench

Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"

movenet

Unofficial implementation of MoveNet from Google

Language: Python | License: MIT | Stargazers: 98 | Issues: 6 | Issues: 22

TreeDecoder

A Tree-Structured Decoder for Image-to-Markup Generation

waveCorrection

Deformation correction for OCR document images. A reproduction of Alibaba's OCR method for correcting deformation in crumpled document images.

qaida

Large-scale, font-independent printed Urdu text dataset

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 49 | Issues: 5 | Issues: 3

cddod

Project page for "Cross-Domain Document Object Detection: Benchmark Suite and Method, CVPR 2020"

Language: Python | License: MIT | Stargazers: 44 | Issues: 8 | Issues: 8

GPT-4V_Social_Media

GPT-4V(ision) as A Social Media Analysis Engine