itsvikramagr / spark-benchmark

Structured streaming benchmark utils

Problem in verifying the RocksDB StateStore

Nayaamar opened this issue

Hi,
I wanted to test the RocksDB StateStore implementation and check whether it really avoids OOM exceptions.
Before I explain my issue, here is my setup: I compiled Spark 3.2.0 and am running it on Manjaro Linux. I wrote the test application in Java and ran it with spark-submit. My system has 2 physical cores (4 logical cores) and 16 GB of RAM.

My test application is the simple word count sample from the Structured Streaming Programming Guide. I also wrote a server in Python that listens on a socket and sends a large number of random words to whichever client connects to it.
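A minimal sketch of the kind of word-sending server described above, written in Java to match the test application. The actual server was written in Python and is not shown in this issue, so the class name `RandomWordServer`, the port (9999), and the 8-letter word length are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical stand-in for the Python word server; not the original code.
final class RandomWordServer {

    // Builds a word of `length` random lowercase ASCII letters.
    static String randomWord(Random rng, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rng.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        int port = 9999;        // assumed port; must match the socket source in the Spark app
        int wordLength = 8;     // assumed word length
        Random rng = new Random();

        // Accept a single client and stream one word per line to it.
        try (ServerSocket server = new ServerSocket(port);
             Socket client = server.accept();
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8),
                     true)) {
            // checkError() flushes and reports true once a write has failed,
            // for example after the client disconnects.
            while (!out.checkError()) {
                out.println(randomWord(rng, wordLength));
            }
        }
    }
}
```

Each word goes out on its own line because the socket source reads newline-delimited UTF-8 text. With words generated this way, nearly every word is a new key for the word count aggregation, so the amount of state the query has to keep grows for as long as the server is sending.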

My problem is that when I send the words to the Spark application (my word count app), it throws an OOM exception with both RocksDBStateStore and HDFSStateStore. What is the problem? Am I making a mistake in how I run the application?

Config of SparkSession

SparkSession spark = SparkSession
        .builder()
        .appName("JavaStructuredNetworkWordCount")
        .config("spark.sql.streaming.stateStore.providerClass",
                "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
        .config("spark.local.dir", "/home/username/sparkTemp")
        .config("spark.executor.memory", "15g")
        .config("spark.driver.memory", "15g")
        .config("spark.memory.offHeap.enabled", true)
        .config("spark.memory.offHeap.use", true)
        .config("spark.memory.offHeap.size", "50g")
        .config("spark.executor.memoryOverhead", "50g")
        .config("spark.sql.shuffle.partitions", 8)
        .config("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", false)
        .getOrCreate();

Execution command
/path/to/spark-submit --master local[*] --deploy-mode client --class org.example.Test4 --name Run /path/to/Test4-1.0-SNAPSHOT.jar --driver-memory 15g --executor-memory 15g

Thanks for your help.