NikkitaNgl / oxml-comptetiton

About

OxML 2023 Financial Machine Learning competition notebooks with code.

Intro

Task 1: Given a set of PDF documents, build an ESG document classifier that takes a document as input and classifies each page as E (environmental), S (social), or G (governance) related.

Task 2: Given a set of PDFs provided as images, build a table detector that precisely locates the position of each table on a document page.

Approach

Task 1: Fine-tune DistilBERT with a train/validation/test split.
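
A minimal sketch of what fine-tuning DistilBERT for three-way page classification could look like. It is not taken from the notebooks: the checkpoint name `distilbert-base-uncased`, the hyperparameters, and the helper names `encode_labels` and `fine_tune` are all assumptions made for illustration, and the loop omits shuffling, validation, and GPU handling.

```python
# Label set for the three ESG classes; the ordering is an arbitrary assumption.
LABELS = ["E", "S", "G"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: i for i, label in ID2LABEL.items()}


def encode_labels(labels):
    """Map string labels ("E", "S", "G") to integer class ids."""
    return [LABEL2ID[label] for label in labels]


def fine_tune(train_texts, train_labels, epochs=3, lr=5e-5, batch_size=16):
    """Fine-tune a DistilBERT sequence classifier on lists of page texts and labels."""
    # Heavy dependencies are imported here so the helpers above stay lightweight.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    checkpoint = "distilbert-base-uncased"  # assumed checkpoint, not confirmed by the repo
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=len(LABELS), id2label=ID2LABEL, label2id=LABEL2ID
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    targets = encode_labels(train_labels)

    model.train()
    for _ in range(epochs):
        for start in range(0, len(train_texts), batch_size):
            batch_texts = train_texts[start:start + batch_size]
            # Truncate long pages to the model's maximum input length.
            batch = tokenizer(batch_texts, truncation=True, padding=True, return_tensors="pt")
            labels = torch.tensor(targets[start:start + batch_size])
            loss = model(**batch, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return tokenizer, model
```

Calling `fine_tune(["page text ...", "another page ..."], ["E", "G"])` would return the tokenizer and the updated model; the validation and test splits would then be used for model selection and final scoring.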

Task 2: Zero-shot DETR-style Table Transformer.
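
A hedged sketch of zero-shot table detection with a pretrained Table Transformer, used as-is without any training on the competition data. The checkpoint `microsoft/table-transformer-detection`, the 0.9 score threshold, and the helper names `cxcywh_to_xyxy` and `detect_tables` are assumptions for illustration; the notebooks may use a different checkpoint or post-processing.

```python
def cxcywh_to_xyxy(box, width, height):
    """Convert a normalized (cx, cy, w, h) box, as predicted by DETR-style
    models, to pixel coordinates (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )


def detect_tables(image_path, threshold=0.9):
    """Return detections above `threshold` for one page image."""
    # Heavy dependencies are imported here so the box helper stays lightweight.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, TableTransformerForObjectDetection

    checkpoint = "microsoft/table-transformer-detection"  # assumed checkpoint
    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = TableTransformerForObjectDetection.from_pretrained(checkpoint)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Per-query class probabilities; the last class is "no object" and is dropped.
    probs = outputs.logits.softmax(-1)[0, :, :-1]
    scores, labels = probs.max(-1)
    width, height = image.size

    detections = []
    for score, label, box in zip(scores.tolist(), labels.tolist(), outputs.pred_boxes[0].tolist()):
        if score >= threshold:
            detections.append({
                "label": model.config.id2label[label],
                "score": score,
                "box": cxcywh_to_xyxy(box, width, height),
            })
    return detections
```

The manual post-processing is written out to show the box format; the image processor's `post_process_object_detection` method performs the equivalent thresholding and rescaling.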

Main dependencies:

  • huggingface
  • PyMuPDF
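
As a rough illustration of how these dependencies could fit together for Task 1, the sketch below extracts text page by page with PyMuPDF and passes each page to a classifier. The function names `extract_page_texts` and `classify_pages`, and the `classify_fn` callable, are hypothetical; in practice `classify_fn` would wrap the fine-tuned model. Pages without a text layer (for example scanned images) yield empty text and are given `empty_label` here, which is an assumption about how such pages might be handled.

```python
def extract_page_texts(pdf_path):
    """Return the plain text of every page in a PDF, in page order."""
    import fitz  # PyMuPDF; imported here so classify_pages works without it

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def classify_pages(page_texts, classify_fn, empty_label=None):
    """Label each page with classify_fn; returns (page_number, label) pairs.

    Page numbers start at 1. Pages with no extractable text get empty_label.
    """
    results = []
    for number, text in enumerate(page_texts, start=1):
        text = text.strip()
        label = classify_fn(text) if text else empty_label
        results.append((number, label))
    return results
```

For example, `classify_pages(extract_page_texts("report.pdf"), classify_fn)` would produce one label per page of a hypothetical `report.pdf`.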

Languages

Language:Jupyter Notebook 100.0%