allenai / science-parse

Science Parse parses scientific papers (in PDF form) and returns them in structured form.

Memory usage of the Docker container grows to ~10GB

amandalmia14 opened this issue

Hi,
I am using Science Parse (the Docker image) for my work, and I found that every time I start it, memory consumption increases after it has parsed some PDFs.

Is the Docker container saving any data when a parsing (pdf -> json) request is made?

Thanks

It does not save anything between documents, but since it is a Java program, it will eventually grow to use all the space that we allow it to take. The Docker container starts the server with -Xmx8g, which means the Java heap is allowed to grow to 8GB. With some overhead on top of that, I am not surprised that it grows to 10GB. In fact, 8GB for the Java process is small. It would run better with 16GB, but then many people couldn't run it at all.
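
If you want to see the heap limit for yourself, here is a minimal sketch using the standard `java.lang.Runtime` API. The `HeapReport` class and its `toGiB` helper are hypothetical names made up for this illustration and are not part of science-parse. It only prints the limits of the JVM it runs in, so it shows how `-Xmx` caps the heap rather than measuring the science-parse server itself.

```java
// Hypothetical helper for illustration only; not part of science-parse.
class HeapReport {
    // Convert a byte count to GiB.
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most heap the JVM will try to use (roughly the -Xmx value).
        System.out.printf("max heap:       %.2f GiB%n", toGiB(rt.maxMemory()));
        // totalMemory(): heap the JVM has currently claimed from the OS.
        System.out.printf("committed heap: %.2f GiB%n", toGiB(rt.totalMemory()));
        // totalMemory() - freeMemory(): heap currently occupied by objects.
        System.out.printf("used heap:      %.2f GiB%n",
                toGiB(rt.totalMemory() - rt.freeMemory()));
    }
}
```

On Java 11 or later this can be run directly with `java -Xmx8g HeapReport.java`. The committed heap usually starts well below the maximum and grows toward it under load, which matches the behavior described above. Memory outside the heap (metaspace, thread stacks, native buffers) is not counted here, which is why the container can use more than the `-Xmx` value.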

Sorry, science-parse is a big project. Most of the space is used for the model.