google / trillian

A transparent, highly scalable and cryptographically verifiable data store.


Any way to scale up log signer?

Ruide opened this issue

commented

Hi,

Our log signer currently sequences leaves at around 60 QPS, and we are not able to scale it up with more workers or more instances. Is there any way to scale it up to around 2k QPS?

Thanks a lot!

commented

Two easy ways:
1). Move the database closer (reduce network latency).
2). Spread requests across multiple trees, so the log signer can run multiple sequencers and process them in parallel (see the sketch below). Otherwise it is I/O bound.
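As a rough illustration of 2), here is a client-side sketch in Go (not Trillian's own tooling) that spreads leaves across several trees; the log server address and tree IDs are placeholders:

```go
// A sketch of spreading QueueLeaf traffic across several tree IDs so the log
// signer can sequence each tree with its own sequencer in parallel.
// The endpoint address and tree IDs below are placeholders.
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/google/trillian"
	"google.golang.org/grpc"
)

func main() {
	conn, err := grpc.Dial("logserver:8090", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("dial log server: %v", err)
	}
	defer conn.Close()
	client := trillian.NewTrillianLogClient(conn)

	// Hypothetical tree IDs; in practice these come from createtree / the admin API.
	treeIDs := []int64{1001, 1002, 1003, 1004}

	ctx := context.Background()
	for i := 0; i < 100; i++ {
		// Round-robin across trees so each tree's sequencer shares the load.
		treeID := treeIDs[i%len(treeIDs)]
		_, err := client.QueueLeaf(ctx, &trillian.QueueLeafRequest{
			LogId: treeID,
			Leaf:  &trillian.LogLeaf{LeafValue: []byte(fmt.Sprintf("entry-%d", i))},
		})
		if err != nil {
			log.Printf("QueueLeaf to tree %d: %v", treeID, err)
		}
	}
}
```

Within the signer, a given tree is sequenced by one sequencer at a time, so the extra parallelism comes from having multiple trees to work on, which is why the split helps.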

Hey @Ruide - to follow up on this, we would expect to see performance higher than 60 qps. What backend are you using for this?

commented

Good morning @paulmattei

I am using a 2-core, 4 GB memory instance for the log signer. My log signer config is batch_size = 1000 and num_sequencers = 10, and I'm using a RocksDB-backed MySQL database instance. The log signer is now serving 4 log trees at a total of 100 QPS. After splitting the requests across different log trees, everything runs smoothly now.
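For reference, a signer invocation along those lines would look roughly like this (a sketch only: the MySQL URI and the RPC/HTTP endpoints are placeholders, and exact flag names may differ between Trillian versions):

```sh
trillian_log_signer \
  --mysql_uri="trillian:password@tcp(db-host:3306)/trillian" \
  --rpc_endpoint="localhost:8092" \
  --http_endpoint="localhost:8093" \
  --batch_size=1000 \
  --num_sequencers=10 \
  --force_master
```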

Monitoring shows the log signer instance at about 4% CPU and 1% memory usage. From the Prometheus metrics I have, the detailed average latencies are as follows:

mysql_queue_leaves_latency: 40ms
mysql_dequeue_leaves_latency: 150ms
sequencer_merge_delay: 2.25s
sequencer_latency: 1.4s
sequencer_latency_init_tree: 1.2s
sequencer_latency_dequeue: 180ms
sequencer_latency_update_leaves: 250ms

From OpenCensus tracing, I see that the ct/v1/add-chain handler latency for the CT front end is 135 ms, and the QueueLeaves handler on the log server takes 28 ms.