xcp-ng / xcp

Entry point for issues and wiki. Also contains some scripts and sources.

Home Page: https://xcp-ng.org


Changing zstd defaults (was: Update to Zstd 1.5.2)

sapcode opened this issue · comments

Hi Olivier,

zstd got a fantastic speed upgrade in release 1.5.2. Is there a chance to get this version into the XCP-ng 8.2 repo?

https://github.com/facebook/zstd/releases
https://github.com/facebook/zstd/releases/download/v1.5.2/zstd-1.5.2.tar.gz

Or can we simply replace the version shipped with the OS using this package: https://rpmfind.net/linux/epel/7/x86_64/Packages/z/zstd-1.5.2-1.el7.x86_64.rpm ?

Best regards

zstd 1.5.2 is already available in the xcp-ng-testing repository. I did not push it as an update for everyone because we favour stability over component upgrades. Also, it has not been benchmarked in the use cases where XCP-ng uses it.
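For anyone who wants to try the package from that repository, here is a hedged sketch of a single-package update. The repo name is taken from the comment above; the exact yum invocation should be double-checked against the XCP-ng documentation before use, so this script only prints the commands it would run:

```shell
# Dry-run sketch: print the commands that would fetch zstd 1.5.2
# from the xcp-ng-testing repo on an XCP-ng 8.2 host.
# Drop the "echo" inside run() to actually execute them.
set -eu

run() {
    echo "$@"
}

run yum clean metadata --enablerepo=xcp-ng-testing
run yum update zstd --enablerepo=xcp-ng-testing
run zstd --version   # confirm the new version afterwards
```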

Indeed, actual benchmarks in XCP-ng are welcome 👍 This could serve a dual purpose:

  • validate it works well
  • make the update worth it

Could you do that @sapcode ? That would be a good way to contribute!

Running off to activate the xcp-ng-testing repo ;)
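Before the results below, here is a hedged sketch of how one such benchmark run could be scripted so it is repeatable. It times one compression setting and records the compressed size; all paths are placeholders, and it uses a regular input file instead of a live `xe vm-export` stream (swap `SRC` for the export pipe on a real host). It also falls back to `cat` when `zstdmt` is not installed, so the harness itself can be tested anywhere:

```shell
# Hedged benchmark sketch: time one compression setting and record
# the compressed output size. Paths and sizes are placeholders.
set -eu

SRC="${SRC:-/tmp/bench-input.bin}"
OUT="${OUT:-/tmp/bench-output.zst}"

# Create a small deterministic input if none exists (demo only;
# on a real host you would pipe "xe vm-export ... filename=" instead).
[ -s "$SRC" ] || head -c 1048576 /dev/zero > "$SRC"

start=$(date +%s)
if command -v zstdmt >/dev/null 2>&1; then
    zstdmt -T0 -6 --long=27 < "$SRC" > "$OUT"
else
    cat "$SRC" > "$OUT"    # fallback so the harness runs without zstd
fi
end=$(date +%s)

size=$(wc -c < "$OUT")
echo "elapsed=$((end - start))s size=${size}B"
```

Running each candidate option set through the same harness keeps the timings comparable.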

@olivierlambert here are the benchmark results comparing zstd 1.4.1 vs. 1.5.2

zstd 1.4.1 Performance

22 min, 61 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -1 --long=29 > /usb/backup_perm/xsa1.local.com.xva.zst
19 min, 88 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 --fast=20 > /usb/backup_perm/xsa1.local.com.f20.xva.zst
17 min, 77 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 --fast=3 > /usb/backup_perm/xsa1.local.com.f3.xva.zst

zstd 1.5.2 Performance

15.50 min, 57 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -2 --long=25 > /usb/backup_perm/xsa1.local.com.2-25.xva.zst
22 min, 61 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -1 --long=29 > /usb/backup_perm/xsa1.local.com.29.v2.xva.zst
15.19 min, 48 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -3 --long=29 -M2308 > /usb/backup_perm/xsa1.local.com.3-29.xva.zst
16 min, 67 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt --fast=1 > /usb/backup_perm/xsa1.local.com.f1.xva.zst
16 min, 71 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt --fast=3 > /usb/backup_perm/xsa1.local.com.f3.xva.zst
16 min, 71 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt --fast=4 > /usb/backup_perm/xsa1.local.com.f4.xva.zst
21 min, 59 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac compress=zstd filename=/usb/backup_perm/xsa1.local.com.152.xva.zst
16 min, 40 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -9 --long=31 -M2308 > /usb/backup_perm/xsa1.local.com.9-31.xva.zst
26 min, 39 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -15 --long=31 -M2308 > /usb/backup_perm/xsa1.local.com.15-31.xva.zst
15.43 min, 41 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -3 --long=31 -M2308 > /usb/backup_perm/xsa1.local.com.3-31.xva.zst
15.26 min, 42 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -1 --long=31 -M2308 > /usb/backup_perm/xsa1.local.com.1-31.xva.zst

The winner, with the best combination of time and compression ratio, is zstdmt -T0 -6 --long=31 -M2308 !!!

15.36 min, 40 GB: xe vm-export vm=eb74b41c-bc64-46d7-94b5-e5408b3cfaac filename= | zstdmt -T0 -6 --long=31 -M2308 > /usb/backup_perm/xsa1.local.com.6-31.xva.zst

Nice! However, for -T0 -6 --long=31 -M2308: how can we be sure the result isn't specific to your hardware? Do you have more info on those arguments passed to zstd?

We could probably make the update in 8.3, what do you think @stormi ?

I executed all commands on the same hardware and on the same VM, which has 310 GB of disk space reserved: 1 x boot disk, RHEL 8.3 OS, 60 GB (12 GB used); 1 x data disk, 250 GB (70 GB used). Total to be compressed: 82 GB. The winner produced 40 GB in 15.36 minutes.

In total, backing up all VMs took 8+ hours with the old zstd 1.4.1, which is now down to 3.5 hours with zstd 1.5.2.

  1. CPU Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz - Speed: 2299 MHz - 80 CPU ( 2 x Sockets x 2 CPU )
  2. 768 GB RAM
  3. Motherboard Supermicro CSE-819U X10DRU-i+
  4. Dom0 CPU's grub.conf ( dom0_mem=8192M,max:8192M, dom0_max_vcpus=4-8 )

With regard to the zstd parameters, check these:
https://manpages.ubuntu.com/manpages/focal/man1/zstd.1.html
https://github.com/facebook/zstd

-T#, --threads=# > Compress using # working threads (default: 1). If # is 0, attempt to detect and use the number of physical CPU cores. In all cases, the number of threads is capped at ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if zstd is compiled without multithread support.

-6 > compression level [1-19] (default: 3)

--long=31 > enables long distance matching with the given windowLog; if no value is given, it defaults to 27. This increases the window size (windowLog) and memory usage for both the compressor and decompressor. This setting is designed to improve the compression ratio for files with long matches at a large distance. If windowLog is set larger than 27, --long=windowLog or --memory=windowSize needs to be passed to the decompressor.

-M2308 > -M#, --memory=#: sets a memory usage limit. By default, Zstandard uses 128 MB as the maximum amount of memory the decompressor is allowed to use, but you can override this manually if need be, in either direction (i.e. you can increase or decrease it). This is also used during compression with --patch-from=; in that case, this parameter overrides the maximum size allowed for a dictionary (128 MB). Additionally, it can be used to limit memory for dictionary training, overriding the default limit of 2 GB; zstd will load training samples up to the memory limit and ignore the rest.
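The parameter descriptions above explain why -M2308 accompanies --long=31: windowLog=31 implies a 2^31-byte window, i.e. 2048 MiB, which is far above zstd's default 128 MiB decompression memory limit, so the limit has to be raised explicitly. A quick arithmetic check (interpreting -M2308 as 2308 MiB is an assumption here):

```shell
# windowLog=31 means a 2^31-byte matching window; express it in MiB
# and compare it to zstd's default 128 MiB decompression limit.
window_bytes=$((1 << 31))
window_mib=$((window_bytes / 1024 / 1024))
echo "window=${window_mib} MiB, default limit=128 MiB, -M2308 => 2308 MiB"
```

Since 2048 MiB exceeds the 128 MiB default but fits under 2308 MiB, a decompressor invoked with -M2308 (or --long=31) can handle such archives.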

Thanks for the explanation of the parameters. However, I meant more: how could those parameters be made universal for all our users? We'll obviously have to make some default choices :)

IMHO a good approach would be to check how many CPU cores dom0 has and set the thread count dynamically in a script, e.g.:
--threads=$(($(nproc) - 2))
so that 2 cores are left for dom0 to keep other things running.
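That "leave 2 cores for dom0" idea could be sketched like this (a hedged illustration, not tested on an actual dom0; it adds a floor of 1 thread so small hosts still work):

```shell
# Pick the zstd thread count dynamically: all cores minus 2,
# but never fewer than 1 on small hosts.
cores=$(nproc)
threads=$((cores - 2))
[ "$threads" -ge 1 ] || threads=1
echo "would run: zstdmt --threads=${threads} ..."
```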

What about the other settings? Should we always use --long=31, the same compression level and the same memory limit?

If we want this upstream, we need to make it simple, however I'm not sure XAPI will accept a contribution that could be too complex.

Hi both,
with that config (Dom0 grub.conf: dom0_mem=8192M,max:8192M, dom0_max_vcpus=4-8), all other VMs were able to run without any issues during backup times; there was no visible impact.
You could also add a XAPI compression config file where every customer can configure additional zstd parameters on demand, or a per-VM custom field ;) This would give everyone maximum flexibility...
Best regards

We could probably make the update in 8.3, what do you think @stormi ?

Same as you: if we can find better defaults than the current ones without causing regressions in specific situations (which is the hard part to define), then we could make the change.

This would likely require a variety of benchmarks: various VM sizes, various host configurations...

I doubt the exact combination of options defined above would play the role of such a default, but maybe I'm wrong. More tests may tell. My hunch is that we can define something that is possibly faster than the current defaults while remaining safe, but that the best performance is something that is defined on a case by case basis.


As Olivier mentioned above: it should be simple and probably not diverge too much from upstream. Your 'per VM' idea is thus pretty much over the top: that custom field would add a lot of complexity that no real admin wants, as it's basically unmaintainable (RAS!).
The problem is that your grub.conf might differ from others'. Systems with fewer resources (like a Raspberry Pi, etc.) will not have that much memory or that many CPUs. For comparison: mine have 12 GB and 8 cores for dom0.

Ideally it would choose the resources flexibly, depending on dom0 properties.
E.g. leave 2 CPU cores free if possible, and don't use more than 50% of memory, with 512 MB as a safe fixed fallback value. That's still way more than 128 MB, but should be non-critical in every environment.
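The heuristic described above (2 cores free, at most 50% of memory, 512 MB floor) could be sketched as follows. This is only an illustration: the /proc/meminfo parsing assumes a Linux dom0, and the thresholds are the ones proposed in the comment, not tested defaults.

```shell
# Sketch of the proposed heuristic: leave 2 cores free, cap zstd's
# memory at 50% of total RAM, but never go below a 512 MiB floor.
cores=$(nproc)
threads=$((cores > 2 ? cores - 2 : 1))

mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
half_mib=$((mem_kib / 2 / 1024))
limit_mib=$((half_mib > 512 ? half_mib : 512))

echo "threads=${threads} memory_limit=${limit_mib}MiB"
```

The resulting values could then be passed to zstdmt as --threads=${threads} and -M${limit_mib}.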

You can use just "zstdmt -T0 -6" as the default, since all other values fall back to their defaults. -T0 selects as many CPUs as are available for compression, and -6 is simply a higher compression level than the default 3.