grafana / k6-jslib-aws

JavaScript library for interacting with AWS resources from k6 scripts


Explosion of summary when uploading many objects with different names

iyuroch opened this issue · comments

As part of the summary, k6 uses the HTTP URL. I'm running a benchmark that uploads many small objects to S3, and it yields huge summaries that are hard to analyse when output to Prometheus or other backends (high cardinality). Is there anything I can do on my side to make this work?

Hi @iyuroch, thanks for reporting this.

I am a bit confused though on what exactly you are using.

As part of summary k6 uses http url

This isn't really true for the end-of-test summary, which is only somewhat related to the jslib. It is true for outputs, as they get all the data unaggregated and can then aggregate it themselves.

Can you please confirm how you are using k6, and do you mean the Prometheus remote write output when you say "output prometheus"?

@mstoykov thanks for getting back to me. Here is a sample default function; when running it with a constant arrival rate, k6 goes OOM after a couple of minutes:

// assuming imports and metric declarations along these lines (versions illustrative):
// import { S3Client } from 'https://jslib.k6.io/aws/0.11.0/s3.js';
// import { randomString } from 'https://jslib.k6.io/k6-utils/1.2.0/index.js';
// import { Counter } from 'k6/metrics';

export default async function (data) {
  if (s3client === null) {
    s3client = new S3Client(data.awsConfig);
  }
  // every iteration generates a new key and puts the same data
  const object_name = randomString(8, `aeioubcdfghijpqrstuv`);
  const obj_key = `${object_name}`;
  try {
    await s3client.putObject(S3_BUCKET_NAME, obj_key, OBJECT);
    succeed_s3_requests.add(1);
  } catch (e) {
    failed_s3_requests.add(1);
  }
}

From my understanding, we get a different URL for every S3 request, and that is what explodes memory usage.
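To illustrate the cardinality problem in isolation (a standalone sketch in plain JavaScript, not a k6 script; the bucket URL and the inlined `randomString` are just stand-ins): if each request's URL becomes a metric tag value, N uploads produce roughly N distinct time series.

```javascript
// Stand-in for the jslib randomString helper used in the script above.
function randomString(len, charset) {
  let s = '';
  for (let i = 0; i < len; i++) s += charset[Math.floor(Math.random() * charset.length)];
  return s;
}

// Each iteration uploads under a fresh key, so each request has a unique URL.
// If the URL is a tag on the metrics, every upload creates its own series.
const series = new Set();
for (let i = 0; i < 10000; i++) {
  const key = randomString(8, 'aeioubcdfghijpqrstuv');
  series.add(`https://my-bucket.s3.amazonaws.com/${key}`); // unique URL => unique series
}
console.log(series.size); // ~10000 distinct series from 10000 uploads
```

This is exactly the kind of unbounded label cardinality that makes Prometheus-style backends (and k6's own in-memory bookkeeping) blow up.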

I got it.

This is exactly the same problem as doing http.get(somethingThatHasHighCardinality), and consequently it has the same solution: grouping the requests under the same name tag.

https://k6.io/docs/using-k6/http-requests/#url-grouping

Looking at the docs, in your case you need to do:

    await s3client.putObject(S3_BUCKET_NAME, obj_key, OBJECT, {tags: {name: "putObjectInS3"}});

I am not particularly certain it is a good idea to do this by default in the S3Client, but maybe it is not a bad idea for this particular case. Although if more of the generated URLs have such high cardinality, it might be better to do something more library-wide.

cc @oleiade

@mstoykov I would expand the interface to accept custom grouping internally in the client. As for the tags suggestion: I can't find tags in the API description. Can you please help me understand where they come from? I don't see any notion of tags in `putObject`.

Hmm, sorry @iyuroch, I got tripped up by my memories of a much earlier version of this library, and by the fact that `params` across k6 usually lets you set tags and headers on top of what you usually have.

In the past, a very old WIP version allowed this (or just proxied it) for the one specific method there was at the time.

This seems to have been dropped across the library, which I would consider a regression and likely a thing we can improve, although in some cases it may be a bit too complicated.

I don't know if @oleiade had some specific reasons to do this.

As a workaround for your case, so you do not have to wait on this, I would recommend using the k6/execution module:

import exec from 'k6/execution';
....
    exec.vu.metrics.tags["name"] = "thenameyouwantfortherequests";
    let p = s3client.putObject(S3_BUCKET_NAME, obj_key, OBJECT);
    delete exec.vu.metrics.tags["name"];
    await p;

Notes:

  1. it is highly recommended that you set and unset the tag before using await, as otherwise other async code may run with the same name tag
  2. you should probably pull this into a helper function if this happens in a lot of places
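Note 2 above could be sketched as a small helper, for example (hypothetical, not part of the jslib; it takes the tags object as a parameter, and the set/unset happens synchronously around starting the request, per note 1):

```javascript
// Hypothetical helper: starts an async operation with a metric name tag set on
// the given tags object, and removes the tag synchronously (before awaiting)
// so that other concurrently running async code does not inherit it.
function withNameTag(tags, name, startFn) {
  tags['name'] = name;   // tag applies to requests started inside startFn
  const p = startFn();   // start the request; this returns a promise
  delete tags['name'];   // unset before any await happens
  return p;              // the caller awaits the promise
}

// Usage inside a k6 iteration would look something like:
// import exec from 'k6/execution';
// await withNameTag(exec.vu.metrics.tags, 'putObjectInS3', () =>
//   s3client.putObject(S3_BUCKET_NAME, obj_key, OBJECT));
```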

Hope this helps you; @oleiade and I will try to figure out a better solution for this.

@mstoykov thanks a lot for the tip. If I can help during the discussion, please let me know.