Repository from GitHub: https://github.com/jobinkv/DocFigure

DocFigure

A dataset for scientific document figure classification

How to get the dataset

The scientific document images come from articles published in CVPR, ECCV and ICCV. We do not hold any copyright on these figure images. We provide a Python script for downloading the PDF files from IEEE and CVF. Please make sure that you have access to these websites.

Convert all the PDF files to image files. To do this, download PDFBox:

git clone https://github.com/jobinkv/DocFigure.git
cd DocFigure
wget http://mirrors.estointernet.in/apache/pdfbox/2.0.14/pdfbox-app-2.0.14.jar
python readAnotation.py

This creates a sub-images folder inside the images folder.
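For readers who want to see what the PDF-to-image step looks like on its own, the following is a minimal sketch that calls the PDFBox command-line tool `PDFToImage` from Python. It is not taken from `readAnotation.py`, which may perform the conversion differently. The function names, the `png` format and the 200 dpi value are assumptions chosen for illustration; the annotations may expect a different resolution.

```python
import subprocess
from pathlib import Path

# Assumption: the jar fetched by the wget step sits in the current directory.
PDFBOX_JAR = "pdfbox-app-2.0.14.jar"


def pdf_to_image_command(pdf_path, jar=PDFBOX_JAR, image_format="png", dpi=200):
    """Build the PDFBox PDFToImage command line for a single PDF.

    The format and dpi defaults are illustrative assumptions, not values
    confirmed by the DocFigure scripts.
    """
    return [
        "java", "-jar", str(jar), "PDFToImage",
        "-format", image_format,
        "-dpi", str(dpi),
        str(pdf_path),
    ]


def convert_all(pdf_dir, jar=PDFBOX_JAR):
    """Render every PDF found directly inside pdf_dir (hypothetical helper)."""
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True stops at the first PDF that fails to convert.
        subprocess.run(pdf_to_image_command(pdf, jar), check=True)
```

Calling `convert_all("pdfs")` would render each page of every PDF in a hypothetical `pdfs` folder. It requires a Java runtime on the PATH.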

Trained Models

Trained model link

To test the trained model, run:

python testTrainedModel.py --trainedFigClassModel '/downloaded/path/to/epoch_9_loss_0.04706_testAcc_0.96867_X_resnext101_docSeg.pth' --inputImage '/path/of/inputimage/for/testing'
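The command above classifies one image at a time. To run it over a whole folder, a small wrapper along the following lines may help. This is a sketch, not part of the repository: the helper names and the list of image suffixes are assumptions, and only the two flags shown in the command above are used.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: which file extensions count as input images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_test_command(model_path, image_path, script="testTrainedModel.py"):
    """Build the testTrainedModel.py invocation for one image."""
    return [
        sys.executable, script,
        "--trainedFigClassModel", str(model_path),
        "--inputImage", str(image_path),
    ]


def classify_folder(model_path, image_dir):
    """Run the test script once per image in image_dir (hypothetical helper)."""
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            subprocess.run(build_test_command(model_path, image), check=True)
```

Because each call starts a new Python process and reloads the model, this is convenient for a handful of images but slow for large folders.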
