project-deepform/deepform
Experimental form data extraction for journalism
Stargazers: 75
Watchers: 2
Issues: 31
Forks: 10
project-deepform/deepform Issues
How are non-entity tokens handled?
Updated
2 years ago
Access to OCR outputs
Updated
3 years ago
Issue writing the dataset as parquet in add_features
Updated
4 years ago
Comments count
1
About how to obtain original pdf files for 2012, 2014
Closed
4 years ago
Comments count
3
No token file for ...
Updated
4 years ago
Update wand version 0.9 -> 0.10
Closed
4 years ago
train.py crashes on save if passed a custom model name
Closed
4 years ago
Hand check 2020 sample data, all fields
Closed
4 years ago
Enable data conversion to run without huge memory allocation
Closed
4 years ago
Fix 2012 duplicate data problems
Closed
4 years ago
Create infer.py
Closed
4 years ago
Load 1000 random 2020 documents into Overview
Closed
4 years ago
Comments count
1
Train on combined 2012 and 2014 data
Updated
4 years ago
Run complete model on 2020 sample documents and upload to Overview
Updated
4 years ago
Run totals model on 2020 data
Closed
4 years ago
Comments count
1
Modify create_training_data.py to create labels for advertiser and contract number
Updated
4 years ago
Merge fuzzy-matching code into infer.py
Updated
4 years ago
Continuous 2020 downloading and inference
Updated
4 years ago
Hand-check 2020 test totals
Updated
4 years ago
Run totals model on 2020 sample documents and upload to Overview
Updated
4 years ago
Merge 2012 and 2014 training data
Updated
4 years ago
Create 2014 tokens.csv
Updated
4 years ago
Generate start and end date labels from 2014 data
Updated
4 years ago
Train model on advertiser, contract number in 2012 data
Updated
4 years ago
Match output token more intelligently
Closed
4 years ago
Comments count
1
Stop logging password as a config variable
Closed
4 years ago
Comments count
1
Add script to automate retrieving training data
Closed
4 years ago
Comments count
2
Make docker container available as a development environment
Closed
4 years ago
Pull PDFs on demand for annotation
Closed
4 years ago
Add license
Closed
4 years ago
Create test version of sweep
Updated
4 years ago