Problem in verifying the RocksDB StateStore
Nayaamar opened this issue
Hi,
I wanted to test the RocksDB StateStore implementation and check if it really resists the OOM exception.
Before I explain my issue: I built Spark 3.2.0 from source, I'm running on Manjaro Linux, the test application is written in Java and launched with spark-submit, and my machine has 2 physical cores (4 logical cores) and 16 GB of RAM.
My test application is a simple word count taken from the Structured Streaming Programming Guide. I also wrote a server in Python that sends many random words to the client (the Spark application) connected to its listening socket.
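For reference, the Python word server I mentioned looks roughly like this; the word list, port number, and line count below are placeholders, not my exact script:

```python
import random
import socket

# Placeholder vocabulary; my real script uses a larger random word set.
WORDS = ["apple", "banana", "cherry", "date", "elderberry", "fig", "grape"]

def random_line(n_words=5, rng=random):
    """Return one newline-terminated line of space-separated random words."""
    return " ".join(rng.choice(WORDS) for _ in range(n_words)) + "\n"

def serve(host="localhost", port=9999, lines=100_000):
    """Accept one client and stream `lines` lines of random words to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            for _ in range(lines):
                conn.sendall(random_line().encode("utf-8"))

if __name__ == "__main__":
    serve()
```

The Spark application then reads from this socket with the `socket` source (`spark.readStream.format("socket").option("host", ...).option("port", ...)`), as in the programming guide example.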
My problem is that when I send the words to the Spark application (my word count app), it throws an OutOfMemoryError with both the RocksDB state store and the default HDFS-backed state store. What is the problem? Am I making a mistake in how I run the application?
Config of SparkSession
SparkSession spark = SparkSession
.builder()
.appName("JavaStructuredNetworkWordCount")
.config("spark.sql.streaming.stateStore.providerClass",
"org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
.config("spark.local.dir", "/home/username/sparkTemp")
.config("spark.executor.memory", "15g")
.config("spark.driver.memory", "15g")
.config("spark.memory.offHeap.enabled", true)
.config("spark.memory.offHeap.size", "50g")
.config("spark.executor.memoryOverhead", "50g")
.config("spark.sql.shuffle.partitions", 8)
.config("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", false)
.getOrCreate();
Execution command
/path/to/spark-submit --master local[*] --deploy-mode client --class org.example.Test4 --name Run --driver-memory 15g --executor-memory 15g /path/to/Test4-1.0-SNAPSHOT.jar
Thanks for the help.