Chain Index Uses Excessive Memory on Mainnet.
Brian W Bush commented
Summary
Syncing the chain index on mainnet requires excessive amounts of memory when it needs to catch up.
For example, it used approximately 100GB of memory when the last 4% of mainnet
was synced recently. (It also uses all available processor cores for extended periods.)
Steps to reproduce the behavior
Run the chain-index on mainnet:
`which time` --verbose plutus-chain-index start-index --network-id 764824073 \
--db-path chain-index.db/ci.sqlite \
--socket-path node.socket \
--port 9083
Actual Result
AppConfig {acLogConfigPath = Nothing, acMinLogLevel = Nothing, acConfigPath = Nothing, acCLIConfigOverrides = CLIConfigOverrides {ccSocketPath = Just "node.socket", ccDbPath = Just "chain-index.db/ci.sqlite", ccPort = Just 9083, ccNetworkId = Just 764824073}, acCommand = StartChainIndex}
Logging config:
Representation {minSeverity = Info, rotation = Nothing, setupScribes = [ScribeDefinition {scKind = StdoutSK, scFormat = ScText, scName = "stdout", scPrivacy = ScPublic, scRotation = Nothing, scMinSev = Debug, scMaxSev = Emergency}], defaultScribes = [(StdoutSK,"stdout")], setupBackends = [KatipBK,AggregationBK,MonitoringBK,EKGViewBK], defaultBackends = [KatipBK,AggregationBK,EKGViewBK], hasEKG = Just (Endpoint ("localhost",12790)), hasGraylog = Nothing, hasPrometheus = Nothing, hasGUI = Nothing, traceForwardTo = Nothing, forwardDelay = Nothing, traceAcceptAt = Nothing, options = fromList []}
Chain Index config:
Socket: node.socket
Db: chain-index.db/ci.sqlite
Port: 29083
Network Id: Testnet (NetworkMagic {unNetworkMagic = 764824073})
Security Param: 2160
Store from: BlockNo 0
The tip of the local node: SlotNo 52553102
Connecting to the node using socket: node.socket
Starting webserver on port 29083
A Swagger UI for the endpoints are available at http://localhost:29083/swagger/swagger-ui
Syncing (96%)
Syncing (97%)
Syncing (98%)
Syncing (99%)
Syncing (100%)
^C Interrupt
Command terminated by signal 2
Command being timed: "plutus-chain-index"
User time (seconds): 489738.46
System time (seconds): 159428.96
Percent of CPU this job got: 1556%
Elapsed (wall clock) time (h:mm:ss or m:ss): 11:34:53
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 98701420
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 45605
Minor (reclaiming a frame) page faults: 25476185
Voluntary context switches: 448477241
Involuntary context switches: 793936231
Swaps: 0
File system inputs: 8
File system outputs: 5032137816
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Expected Result
It's unreasonable to require 100GB+ of memory to run the chain index. Ideally, its memory footprint should be under 5 GB.
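For reference, the roughly 100GB figure can be checked against the "Maximum resident set size (kbytes)" line in the timing output above. This is a small sketch that assumes the value is reported in units of 1024 bytes, which is how Linux reports peak RSS as far as I know; the variable names are only illustrative.

```python
# Peak RSS as printed in the "Maximum resident set size (kbytes)" line.
max_rss_kib = 98_701_420

# Assumption: the reported unit is 1024 bytes (KiB), not 1000 bytes.
max_rss_bytes = max_rss_kib * 1024

max_rss_gb = max_rss_bytes / 1e9     # decimal gigabytes
max_rss_gib = max_rss_bytes / 2**30  # binary gibibytes

print(f"{max_rss_gb:.1f} GB, {max_rss_gib:.1f} GiB")  # prints "101.1 GB, 94.1 GiB"
```

Either way, the peak is about twenty times the 5 GB target.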
Describe the approach you would take to fix this
- Experiment with the use of --RTS options.
- Profile the memory usage of the Haskell code.
- Break database transactions into smaller units.
- Replace SQLite3 with a more performant persistent store.
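To illustrate what "smaller units" could mean for the database transactions, here is a minimal sketch in Python using the standard sqlite3 module. It is not the chain index's actual code (which is Haskell), and the table `tx_outputs`, its columns, and the function `insert_in_batches` are hypothetical names made up for the example. The idea is only that committing after a bounded number of rows keeps each transaction, and the data buffered for it, small.

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows, committing after every `batch_size` rows.

    Returns the number of transactions committed.
    """
    it = iter(rows)
    commits = 0
    while True:
        # Pull at most batch_size rows; only this batch is held in memory.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        # `with conn` commits on success and rolls back on an exception.
        with conn:
            conn.executemany(
                "INSERT INTO tx_outputs (tx_id, payload) VALUES (?, ?)",
                batch,
            )
        commits += 1
    return commits


# Hypothetical schema, in-memory database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx_outputs (tx_id INTEGER, payload TEXT)")

# A generator stands in for a stream of rows arriving during sync.
n_commits = insert_in_batches(
    conn, ((i, f"out-{i}") for i in range(2500)), batch_size=1000
)
row_count = conn.execute("SELECT COUNT(*) FROM tx_outputs").fetchone()[0]
```

With 2500 rows and a batch size of 1000 this performs three commits. The right batch size for the chain index would need measuring, since very small transactions trade memory for more frequent disk syncs.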
System info
plutus-apps at commit ce8282d
ak3n commented
Could you please check if the situation is still the same? There were several PRs to improve it.