I wished for a free website with an ever-growing list of math problems; teachyourselfmath is that website.
If a document containing math problems exists, we'd like to extract every problem from it and store it in a database. LaTeX can be understood by both computers and humans. Hence, the problem boils down to converting a PDF into LaTeX, removing the irrelevant parts, and storing what remains.
Meta came up with a model to parse academic PDF documents and find the LaTeX math in them.
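As a rough illustration of the "removing the irrelevant parts" step, here is a minimal TypeScript sketch. It assumes the converted text arrives as chunks separated by blank lines and that math is delimited by `\(`, `\[`, or `$...$`; both are assumptions, and `extractMathChunks` is a hypothetical helper, not the project's actual filter.

```typescript
// Hypothetical helper: keep only the chunks of converted text that contain LaTeX math.
// The delimiters below are an assumption about the converter's output format.
const MATH_PATTERN = /\\\(|\\\[|\$[^$]+\$/;

function extractMathChunks(converted: string): string[] {
  return converted
    .split(/\n\s*\n/) // treat blank lines as chunk boundaries
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0 && MATH_PATTERN.test(chunk));
}

// Made-up sample input: two chunks contain math, two are page furniture.
const sample = [
  "Chapter 3: Exercises",
  "Prove that \\(a^2 + b^2 \\geq 2ab\\) for all real \\(a, b\\).",
  "Page 41",
  "Evaluate \\[\\int_0^1 x^2 \\, dx\\]",
].join("\n\n");

console.log(extractMathChunks(sample));
```

A filter like this is deliberately crude: it drops headings and page furniture that contain no math, but it would also drop word problems written without any LaTeX, so a real pipeline would likely need something smarter.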
Currently, I run this model's server locally on my computer and feed it every PDF I can get my hands on. The main server has a queue-based system that interacts with the model's server and processes all the problems. Here is a visual illustration of how it works:
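In code terms, the queue-based flow described above could be sketched roughly as follows. This is an in-memory stand-in, not the project's implementation: `Job`, `ConvertFn`, and `ProblemQueue` are hypothetical names, a plain array takes the place of Redis, another array takes the place of the database, and the call to the model's server is replaced by a stub function.

```typescript
// All names here (Job, ConvertFn, ProblemQueue) are hypothetical, for illustration only.
type Job = { id: number; pdfPath: string };
type ConvertFn = (pdfPath: string) => Promise<string>;

class ProblemQueue {
  private pending: Job[] = [];
  private nextId = 1;
  private convert: ConvertFn;
  // Stands in for the database in this sketch.
  readonly stored: { jobId: number; text: string }[] = [];

  constructor(convert: ConvertFn) {
    this.convert = convert;
  }

  // Add a PDF to the back of the queue and return its job id.
  enqueue(pdfPath: string): number {
    const id = this.nextId++;
    this.pending.push({ id, pdfPath });
    return id;
  }

  // Process jobs one at a time, in arrival order, until the queue is empty.
  async drain(): Promise<void> {
    let job: Job | undefined;
    while ((job = this.pending.shift()) !== undefined) {
      const text = await this.convert(job.pdfPath);
      this.stored.push({ jobId: job.id, text });
    }
  }
}

// Stub converter so the sketch runs without a model server.
const fakeConvert: ConvertFn = async (pdfPath) => `converted:${pdfPath}`;

const problemQueue = new ProblemQueue(fakeConvert);
problemQueue.enqueue("algebra.pdf");
problemQueue.enqueue("calculus.pdf");
problemQueue.drain().then(() => console.log(problemQueue.stored));
```

Processing one job at a time keeps the load on a locally running model server predictable; a persistent queue would additionally let jobs survive a restart, which this in-memory version does not.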
- Get `nougat` from here. Run it as a server.
- You will need PostgreSQL and Redis to run this.
- `yarn`
- `yarn build`
- Set up the `.env` file using the `.env.example` file.
- `yarn start`
created by Vivek Nathani (@viveknathani_), licensed under the MIT License.