akash-network / cosmos-omnibus

Add pigz to accelerate .tar.gz by 14x!

andy108369 opened this issue

We should add the pigz compression tool and probably make it the default.

The results speak for themselves:

  • tar (no compression) - 600 MiB/s
  • tar.gz (gzip -1) - 19.7 MiB/s
  • tar.zst (zstd, default threads) - 46.6 MiB/s
  • tar.gz (pigz --fast) - 275 MiB/s !
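
For reference, the "14x" in the title is the ratio of the pigz rate to the gzip rate listed above. A minimal sketch of that arithmetic, taking the averaged pv rates at face value (the `speedup` helper is a hypothetical name, not part of cosmos-omnibus):

```shell
# speedup RATE BASELINE: print RATE / BASELINE rounded to one decimal place.
# Both arguments are throughput figures in the same unit (MiB/s here).
speedup() {
    awk -v r="$1" -v b="$2" 'BEGIN { printf "%.1f\n", r / b }'
}

speedup 275 19.7     # pigz --fast relative to gzip -1
speedup 46.6 19.7    # zstd (default threads) relative to gzip -1
```

The first call gives roughly 14, the second roughly 2.4.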

The secret is simple: gzip is constrained to a single thread while pigz can use multiple threads to perform the compression.

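pigz does the same job as gzip and produces gzip-compatible output, but it splits the input into chunks and compresses them across multiple processors and cores, which speeds up compression considerably. Decompression gains much less, because the deflate stream is still decoded by a single thread.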
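
Because pigz writes ordinary gzip streams, it can be swapped in wherever gzip is piped today. A minimal sketch, assuming a POSIX shell; `pick_gzip` and `make_snapshot` are hypothetical helper names for illustration, not functions from cosmos-omnibus:

```shell
# Print the name of a gzip-compatible compressor:
# pigz when it is installed, plain gzip otherwise.
pick_gzip() {
    if command -v pigz >/dev/null 2>&1; then
        echo "pigz"
    else
        echo "gzip"
    fi
}

# make_snapshot SRC_DIR OUT_FILE
# Archive SRC_DIR and compress it with whichever tool pick_gzip found.
make_snapshot() {
    src_dir=$1
    out_file=$2
    compressor=$(pick_gzip)
    # -1 is the fastest level in both tools (same as pigz --fast).
    tar -c -C "$src_dir" . | "$compressor" -1 > "$out_file"
}
```

An archive produced this way can be read back with `gzip -dc` or `tar -xzf` regardless of which compressor wrote it.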

# TAR (no compression)
root@rpc:~# tar c -C $SNAPSHOT_DIR . 2>/dev/null | pv -petrafb -i 1 -s $SNAPSHOT_SIZE > /dev/null
8.20GiB 0:00:07 [ 988MiB/s] [1.17GiB/s] [>                              ]  1% ETA 0:10:26
19.0GiB 0:00:28 [ 403MiB/s] [ 693MiB/s] [==>                            ]  2% ETA 0:17:48
22.4GiB 0:00:37 [ 383MiB/s] [ 620MiB/s] [====>                          ]  3% ETA 0:19:49
24.4GiB 0:00:42 [ 405MiB/s] [ 594MiB/s] [====>                          ]  3% ETA 0:20:37
^C.8GiB 0:00:43 [ 385MiB/s] [ 589MiB/s] [====>                          ]  3% ETA 0:20:46

# TAR.GZ (`gzip -1`)
root@rpc:~# tar c -C $SNAPSHOT_DIR . 2>/dev/null | gzip -1 | pv -petrafb -i 1 -s $SNAPSHOT_SIZE >/dev/null
 137MiB 0:00:07 [19.4MiB/s] [19.7MiB/s] [>                             ]  0% ETA 10:43:41
 451MiB 0:00:23 [20.3MiB/s] [19.6MiB/s] [>                             ]  0% ETA 10:45:54
 550MiB 0:00:28 [20.1MiB/s] [19.7MiB/s] [>                             ]  0% ETA 10:43:50
 610MiB 0:00:31 [20.5MiB/s] [19.7MiB/s] [>                             ]  0% ETA 10:42:45
^C31MiB 0:00:32 [20.2MiB/s] [19.7MiB/s] [>                             ]  0% ETA 10:42:11

# TAR.ZST (`zstd`, v1.4.8)
root@rpc:~# tar c -C $SNAPSHOT_DIR . 2>/dev/null | zstd -c $zstd_extra_arg | pv -petrafb -i 1 -s $SNAPSHOT_SIZE > /dev/null
 452MiB 0:00:10 [47.1MiB/s] [45.3MiB/s] [>                              ]  0% ETA 4:39:39
 691MiB 0:00:15 [48.2MiB/s] [46.1MiB/s] [>                              ]  0% ETA 4:34:46
1.18GiB 0:00:26 [47.0MiB/s] [46.4MiB/s] [>                              ]  0% ETA 4:32:28
1.46GiB 0:00:32 [45.5MiB/s] [46.6MiB/s] [>                              ]  0% ETA 4:31:35
^C50GiB 0:00:33 [47.6MiB/s] [46.6MiB/s] [>                              ]  0% ETA 4:31:22

# TAR.GZ with PIGZ
root@rpc:~# tar c -C $SNAPSHOT_DIR . 2>/dev/null | pigz --fast | pv -petrafb -i 1 -s $SNAPSHOT_SIZE > /dev/null
 861MiB 0:00:03 [ 294MiB/s] [ 287MiB/s] [>                              ]  0% ETA 0:44:05
2.15GiB 0:00:08 [ 276MiB/s] [ 275MiB/s] [>                              ]  0% ETA 0:45:55
4.35GiB 0:00:16 [ 278MiB/s] [ 278MiB/s] [>                              ]  0% ETA 0:45:18
7.07GiB 0:00:26 [ 287MiB/s] [ 278MiB/s] [>                              ]  0% ETA 0:45:06
8.60GiB 0:00:32 [ 267MiB/s] [ 275MiB/s] [>                              ]  1% ETA 0:45:30
^C

Use the latest zstd and set the thread count to 0 to use all cores.

Very good point!
With -T0 (or with the ZSTD_NBTHREADS=0 environment variable exported), zstd runs about as fast as pigz:

root@rpc:~# tar c -C $SNAPSHOT_DIR . 2>/dev/null | zstd -c -T0 | pv -petrafb -i 1 -s $SNAPSHOT_SIZE > /dev/null
 158MiB 0:00:01 [ 156MiB/s] [ 156MiB/s] [>                              ]  0% ETA 3:33:51
 981MiB 0:00:03 [ 388MiB/s] [ 327MiB/s] [>                              ]  0% ETA 1:43:31
1.51GiB 0:00:05 [ 255MiB/s] [ 308MiB/s] [>                              ]  0% ETA 1:49:49
1.98GiB 0:00:07 [ 227MiB/s] [ 289MiB/s] [>                              ]  0% ETA 1:56:54
2.45GiB 0:00:09 [ 240MiB/s] [ 278MiB/s] [>                              ]  0% ETA 2:01:32
3.17GiB 0:00:12 [ 253MiB/s] [ 270MiB/s] [>                              ]  0% ETA 2:05:07
3.93GiB 0:00:15 [ 267MiB/s] [ 268MiB/s] [>                              ]  0% ETA 2:06:09
4.92GiB 0:00:19 [ 248MiB/s] [ 264MiB/s] [>                              ]  0% ETA 2:07:35
^C68GiB 0:00:22 [ 258MiB/s] [ 264MiB/s] [>                              ]  0% ETA 2:07:45

Closing as one can export ZSTD_NBTHREADS=0.
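
For anyone landing here later, a minimal sketch of that workaround, assuming a zstd build that honours `ZSTD_NBTHREADS` (the v1.4.8 used above did). `zstd_snapshot` is a hypothetical helper name for illustration only:

```shell
# Ask zstd to use every available core in all later invocations
# from this shell; equivalent to passing -T0 on each command line.
export ZSTD_NBTHREADS=0

# zstd_snapshot SRC_DIR OUT_FILE
# Archive SRC_DIR and compress it with zstd; returns 127 if zstd is missing.
zstd_snapshot() {
    if ! command -v zstd >/dev/null 2>&1; then
        echo "zstd is not installed" >&2
        return 127
    fi
    # -q silences the progress line, -c writes the result to stdout.
    tar -c -C "$1" . | zstd -q -c > "$2"
}
```

The resulting file can be unpacked with `zstd -dc snapshot.tar.zst | tar -x`.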