joshuarobinson / s5cmd_benchmarking

Benchmarking object storage with s5cmd and ansible

s5cmd_benchmarking

Benchmarking object storage with s5cmd and ansible.

Use the following Ansible playbook for simple S3 benchmarking. For a more detailed explanation, see the accompanying blog post.

Required:

  • Install Ansible, configure a host group, and update the playbook with that host group.
  • Install Docker and docker-py on all hosts.
  • Create the access keys and bucket to be used in testing.
  • Edit the variables in the YAML file: S3 endpoint, bucket, and key prefix.
  • Create an s3credentials.yaml file containing the S3 access and secret keys.
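
As a rough illustration of the first requirement, a host group in an INI-style Ansible inventory might look like the sketch below. The group name s3bench and the host names are hypothetical placeholders, not values taken from this repository; whichever group name you choose is the one to reference in the playbook.

```ini
# Hypothetical inventory file; group and host names are placeholders.
[s3bench]
node01.example.com
node02.example.com
node03.example.com
node04.example.com
```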

An example credentials file is included in the repository. Add your keys there, and do NOT add them to source control.

This benchmark is designed for storage systems that may perform inline compression but NOT deduplication. To change the compression factor, set the compressibility_ratio variable to N, where the generated data should compress N:1. The dataset generator is based on the lzdatagen utility.

This playbook is designed for the number of forks to be as large as the host group, so that each task is executed in parallel on all hosts. It is also recommended to set the following option in your ansible.cfg to get timing information per step: callback_whitelist = profile_tasks.
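
A minimal ansible.cfg sketch combining both settings could look like this. The forks value of 4 is only an assumption for a four-host group; set it to the size of your own host group. The callback option is written as named above; newer Ansible releases use the name callbacks_enabled for the same setting, so check the documentation for your version.

```ini
[defaults]
# Assumed value: match this to the number of hosts in your host group.
forks = 4
# Prints per-task timing information after the run.
callback_whitelist = profile_tasks
```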
