Phan Hoang's repositories
Key information extraction from invoice documents with a Graph Convolutional Network
TensorFlow Serving with Docker / Docker Compose
MNIST embedding visualisation using TensorFlow Projector; blog link:
Let ChatGPT teach your own chatbot in hours with a single GPU!
Official PyTorch implementation of 6DRepNet: 6D rotation representation for unconstrained head pose estimation.
Official repository for the ChatIE paper and a tool for IE using ChatGPT. Note: we set a default OpenAI key. See the issues for a workaround for the gpt-3.5-turbo request limit. Response speed depends on OpenAI (sometimes the official API is too crowded, and the model will be slow or overloaded).
Official PyTorch implementation of "Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition"
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
Source code of "EdgeFormer: Improving Light-weight ConvNets by Learning from Vision Transformers"
An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP.
Official repo for FEAR: Fast, Efficient, Accurate and Robust Visual Tracker (ECCV 2022)
Official PyTorch implementation of Global Context Vision Transformers
Marrying Grounding DINO with Segment Anything - Detect and Segment Anything with Text Inputs
Scene text recognition
Holistically-Attracted Wireframe Parsing
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
The code and the DIW dataset for "Learning From Documents in the Wild to Improve Document Unwarping" (SIGGRAPH 2022)
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
A toolbox for skeleton-based action recognition.
Official PyTorch implementation of "SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation" (NeurIPS 2022)
Source code for table generation
A method to increase the speed and lower the memory footprint of existing vision transformers.