rchain / rchain

Blockchain (smart contract) platform using CBC-Casper proof of stake + Rholang for concurrent execution.


Testing v0.13.0-alpha release with different LFS chunk size

hilltracer opened this issue

Overview

This document presents the results of the LFS boot memory test for the v0.13.0-alpha release.

Rnode parameters

Two nodes are considered: source (has the full state) and destination (downloads the LFS from src). The nodes run in Docker with the following settings.
For the src-node:

  obs-src-lfs:
    << : *default-rnode
    container_name: obs-src-lfs
    ports:
      - 45400:45400
      - 45404:45404
      - 127.0.0.1:9000:9000
    volumes:
      - /rchain/testnet/obs-src-lfs/rnode/:/var/lib/rnode/
    command:
      -Dcom.sun.management.jmxremote.port=9000
      -Dcom.sun.management.jmxremote.rmi.port=9000
      -Dcom.sun.management.jmxremote.local.only=false
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false
      -Dsun.rmi.dgc.client.gcInterval=10000
      -Dsun.rmi.dgc.server.gcInterval=10000
      -Djava.rmi.server.hostname=localhost
      -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/rnode/heapdump_OOM.hprof -XX:+ExitOnOutOfMemoryError -XX:ErrorFile=/var/lib/rnode/hs_err.log 
      -Dlogback.configurationFile=/var/lib/rnode/logback.xml 
      -XX:MaxDirectMemorySize=200m -J-Xmx4g
      run -c /var/lib/rnode/rnode.conf
      --bootstrap rnode://8c7b1834f78f11e640ce58a899f0c1dc7605d712@node0.testnet.rchain.coop?protocol=40400&discovery=40404
      --network-id testnet220309 --shard-name testnet4
      --fault-tolerance-threshold -1 --synchrony-constraint-threshold 0.99 --finalization-rate 1  --max-number-of-parents 1 --no-upnp         --api-max-blocks-limit=100 --api-enable-reporting
      --host obs-src-lfs.testnet.dev.rchain.coop   --protocol-port 45400 --discovery-port 45404

For the dest-node:

  obs-dest-lfs:
    << : *default-rnode
    container_name: obs-dest-lfs
    ports:
      - 46400:46400
      - 46404:46404
      - 127.0.0.1:9001:9001 
    volumes:
      - /rchain/testnet/obs-dest-lfs/rnode/:/var/lib/rnode/
    command:
      -Dcom.sun.management.jmxremote.port=9001
      -Dcom.sun.management.jmxremote.rmi.port=9001
      -Dcom.sun.management.jmxremote.local.only=false
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false
      -Dsun.rmi.dgc.client.gcInterval=10000
      -Dsun.rmi.dgc.server.gcInterval=10000
      -Djava.rmi.server.hostname=localhost
      -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/rnode/heapdump_OOM.hprof -XX:+ExitOnOutOfMemoryError -XX:ErrorFile=/var/lib/rnode/hs_err.log 
      -Dlogback.configurationFile=/var/lib/rnode/logback.xml 
      -XX:MaxDirectMemorySize=200m -J-Xmx4g
      run -c /var/lib/rnode/rnode.conf 
      --network-id testnet220309 --shard-name testnet4
      --bootstrap rnode://61e393ee2491c2b6cc15907ccc7f62c12cfb7217@obs-src-lfs.testnet.dev.rchain.coop?protocol=45400&discovery=45404
      --fault-tolerance-threshold -1 --synchrony-constraint-threshold 0.99 --finalization-rate 1  --max-number-of-parents 1 --no-upnp         --api-max-blocks-limit=100 --api-enable-reporting
      --host obs-dest-lfs.testnet.dev.rchain.coop   --protocol-port 46400 --discovery-port 46404

Before performing the experiments, the LFS state was loaded onto the src-node from the testnet.
After that, I changed the network-id from testnet220309 to testnet2203091, an invalid network-id, so that the src-node would not load new blocks during the experiments.

During testing, the pageSize parameter was varied:
https://github.com/tgrospic/rchain/blob/edab27d2ec3cf1a695b1474c2dba3d0af58735cb/casper/src/main/scala/coop/rchain/casper/engine/LfsTupleSpaceRequester.scala#L38-L40
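The pageSize value caps how many state items are requested per LFS message, which is the chunk size varied in the tables below. As a standalone illustration (not the actual rchain code; the class and method names here are hypothetical), chunking a key set into pages trades round trips against how many items are held in memory at once:

```java
import java.util.ArrayList;
import java.util.List;

public class Paginate {
    // Split `items` into consecutive pages of at most `pageSize` elements.
    // A larger pageSize means fewer requests but more items resident in
    // memory per request, which is the trade-off measured in this issue.
    static <T> List<List<T>> paginate(List<T> items, int pageSize) {
        List<List<T>> pages = new ArrayList<>();
        for (int i = 0; i < items.size(); i += pageSize) {
            pages.add(items.subList(i, Math.min(i + pageSize, items.size())));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 2000; i++) keys.add(i);
        // With pageSize = 750, 2000 keys are fetched in 3 requests.
        System.out.println(Paginate.paginate(keys, 750).size()); // prints 3
    }
}
```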

Garbage collector control

The garbage collector (GC) strongly affects the results of the experiments. To correctly determine how much memory is actually in use, a GC run must be forced. For this purpose, a Runtime.getRuntime().gc() call was added to the code after each chunk import and export:
hilltracer@2294fbd
Below are the results with GC and without GC.
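The effect of the forced collection can be sketched as follows (a standalone Java illustration, not the rchain patch itself): without an explicit gc() call, the heap reading includes garbage that has not yet been collected, so "Heap (Usual)" would be overstated.

```java
public class ForcedGc {
    // Currently used heap in bytes, as reported by the runtime.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Allocate and drop ~50 MB of garbage, as a chunk import would.
        for (int i = 0; i < 50; i++) {
            byte[] chunk = new byte[1024 * 1024];
            chunk[0] = 1; // touch the array so the allocation is kept
        }
        long before = usedHeap();
        // The same call the patch inserts after each chunk import/export.
        Runtime.getRuntime().gc();
        long after = usedHeap();
        System.out.println("before GC: " + before + " B, after GC: " + after + " B");
    }
}
```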

src results (forced GC on)

| Chunk size | Heap (Max) | Heap (Usual) | CPU (Usual) | Direct Buffer (max) | Time |
| --- | --- | --- | --- | --- | --- |
| 500 | 450 MB | 250 MB | 35% | 21 MB | 28 min |
| 750 | 500 MB | 275 MB | 35% | 21 MB | 28 min |
| 1000 | 600 MB | 330 MB | 32% | 21 MB | 21 min |
| 2000 | 1050 MB | 450 MB | 30% | 22 MB | 25 min |
| 4000 | 1650 MB | 1150 MB | 30% | 22 MB | 23 min |
| 8000 | 1900 MB | 1100 MB | 23(5)% | 23 MB | Error |

dest results (forced GC on)

| Chunk size | Heap (Max) | Heap (Usual) | CPU (Usual) | Direct Buffer (max) | Time |
| --- | --- | --- | --- | --- | --- |
| 500 | 550 MB | 300 MB | 45% | 46 MB | 28 min |
| 750 | 700 MB | 300 MB | 50% | 56 MB | 28 min |
| 1000 | 1000 MB | 350 MB | 45% | 67 MB | 21 min |
| 2000 | 1100 MB | 500 MB | 50% | 110 MB | 25 min |
| 4000 | 1900 MB | 750 MB | 45% | 190 MB | 23 min |
| 8000 | 4000 MB | 3500 MB | 40(5)% | 200 MB | Error |

src results (forced GC off)

| Chunk size | Heap (Max) | Heap (Usual) | CPU (Usual) | Direct Buffer (max) | Time |
| --- | --- | --- | --- | --- | --- |
| 750 | 1200 MB | 550 MB | 25% | 23 MB | 25 min |
| 1000 | 1550 MB | 700 MB | 25% | 24 MB | 23 min |

dest results (forced GC off)

| Chunk size | Heap (Max) | Heap (Usual) | CPU (Usual) | Direct Buffer (max) | Time |
| --- | --- | --- | --- | --- | --- |
| 750 | 3250 MB | 1500 MB | 40% | 135 MB | 25 min |
| 1000 | 3250 MB | 2750(1000) MB | 40% | 200 MB | 23 min |
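For reference, the heap and direct-buffer figures reported above can be sampled in-process through the standard JMX beans, the same data the nodes expose on ports 9000/9001 via the jmxremote flags. A minimal Java sketch:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class MemStats {
    public static void main(String[] args) {
        // Heap usage: the "Heap (Max)" / "Heap (Usual)" columns above.
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long heapUsed = mem.getHeapMemoryUsage().getUsed();
        System.out.println("heap used: " + heapUsed / (1024 * 1024) + " MB");

        // Direct buffer pool: the "Direct Buffer (max)" column, bounded
        // in these runs by -XX:MaxDirectMemorySize=200m.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                System.out.println("direct used: " + pool.getMemoryUsed() + " B");
            }
        }
    }
}
```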

Conclusion

The optimal chunk size is 750.

Graphs (forced GC on)

500 (GC on)
*(graphs attached)*

750 (GC on)
*(graphs attached)*

1000 (GC on)
*(graphs attached)*

2000 (GC on)
*(graphs attached)*

4000 (GC on)
*(graphs attached)*

8000 (GC on)
*(graphs attached)*

Graphs (forced GC off)

750 (GC off)
*(graphs attached)*

1000 (GC off)
*(graphs attached)*