usc-isi-i2 / Web-Karma

Information Integration Tool

Home Page: http://www.isi.edu/integration/karma/

Batch Mode: OutOfMemoryError

megankatsumi opened this issue

Hello,
I am having an issue running Karma in batch mode to generate RDF files from .csv files. I was able to apply the mappings successfully to a sample of the data via the workspace; however, when I attempt to map the full files I encounter the error messages appended below (2 samples). The CSV files that I am attempting to map range from 22 MB to 767 MB. I can provide samples of the files if needed. Note that some other error messages are also printed; I'm not sure whether or not these are related.

22mbFileErrorOutput.txt
330mbFileErrorOutput.txt

Can you upload that 22 MB file and the model file? Meanwhile, please also try increasing the JVM's heap size. Thanks.

Hello,
Example files are attached FYI. Increasing the heap size seems to have done the trick (though the other error messages are still shown).
Thanks!

example_files.zip

Hi,
I'm just following up on this. I'm still encountering some issues using batch mode. The CSV file that I am trying to transform is 148 MB. I've attached a sample of this file as well as the model file for your reference. I'm receiving timeout errors even with the heap size set to 7000m. Any suggestions on what might be causing this, or how to improve Karma's performance? I've been able to transform larger files in batch mode in the past.
Thanks!
examples.zip

Hi,
I apologize for the delay in the response.
Try adding these options to your command: java -Xmx7000m -XX:-UseGCOverheadLimit
This increases the Java heap space and turns off the "GC overhead limit exceeded" error. I was able to execute the command successfully on the example file.
If you still face any other error, post the error file.
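For reference, a minimal sketch of how those JVM options might be combined with a Karma batch-mode run. This assumes the offline RDF generator is invoked through Maven (as described on the Karma wiki); in that case the heap options are passed via MAVEN_OPTS rather than directly to java. All file paths and the exact argument names here are placeholders, so check the batch-mode documentation for your Karma version before copying:

```shell
# Assumption: running from the karma-offline module of the Web-Karma checkout.
# Pass the larger heap and the GC-overhead-limit flag to the forked JVM.
export MAVEN_OPTS="-Xmx7000m -XX:-UseGCOverheadLimit"

# Hypothetical invocation: the CSV source, model, and output paths below
# are placeholders for your own files.
mvn exec:java \
  -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" \
  -Dexec.args="--sourcetype CSV \
               --filepath /path/to/data.csv \
               --modelfilepath /path/to/model-file.ttl \
               --outputfile /path/to/output.ttl"
```

If you instead run Karma from a standalone jar, the same options go directly on the java command line, e.g. `java -Xmx7000m -XX:-UseGCOverheadLimit -jar ...`.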

Please reopen this issue if you face the above problem.
Thanks!