logstash-plugins / logstash-integration-aws

s3 output -- Set Content-Type header feature

tzfx opened this issue

TLDR: S3 output users should be able to set the Content-Type header. It would also be great to be able to set the file suffix (or overall filename pattern?) to better support other codecs. Since this plugin outputs text/plain or application/gzip, it may also be nice to set the content-type by default.


More context:

Hi there! Recently got started using Logstash and the s3 output type. However, I immediately ran into an issue where a bucket (that I didn't have control over) required the Content-Type header to be set for any upload. This caused both the root write permission check and any subsequent uploads to fail for pretty obvious reasons.

Digging into the plugin, I noticed that while the s3 output by default emits either a txt or a gz file, the content-type header is never set during the PUT request. Additionally, users don't seem to be able to set it, despite the s3 sdk uploader having support for it (content_type option).

To get around the limitation, I ended up patching this line in the Ruby files that are distributed with Logstash on my machine to add content_type: "text/plain", which felt absolutely awful and I hate myself for doing it.

I think it would be a great feature to have the plugin automatically set the Content-Type header to either text/plain or application/gzip depending on the encoding option. Thinking about this opens up a few other possibilities as well: allowing users to set the content_type arbitrarily, in conjunction with the file suffix or overall file name pattern. Being able to set these would make the plugin work better with Logstash codecs other than the default.
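
A minimal sketch of the default described above, assuming the plugin's `encoding` option takes the values "none" and "gzip". The helper names (`content_type_for`, `upload_options`), the optional user override, and the `application/octet-stream` fallback are hypothetical and not part of the plugin today:

```ruby
# Hypothetical mapping from the s3 output's `encoding` option to a Content-Type.
DEFAULT_CONTENT_TYPES = {
  "none" => "text/plain",
  "gzip" => "application/gzip"
}.freeze

# Returns the user-supplied content type if one is given; otherwise falls back
# to a default derived from the encoding. The octet-stream fallback for unknown
# encodings is an assumption of this sketch.
def content_type_for(encoding, override = nil)
  return override unless override.nil? || override.empty?
  DEFAULT_CONTENT_TYPES.fetch(encoding, "application/octet-stream")
end

# Builds the options hash that would be handed to the S3 SDK uploader,
# which accepts a :content_type option.
def upload_options(encoding, override = nil)
  { content_type: content_type_for(encoding, override) }
end
```

With something like this in place, the uploader call could merge `upload_options(encoding, content_type)` into whatever options it already passes, so a bucket that requires the header would accept both the write permission check and the regular uploads.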

It's also worth noting that the AWS CLI by default attempts to guess the MIME type and adds the content_type client option based on the file: https://github.com/aws/aws-cli/blob/97f87797c920f87d471d570f032478cd7fd45192/awscli/customizations/s3/fileinfo.py#L281
So there's definitely precedent for doing something like this when interacting with S3.
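
For comparison, a rough Ruby equivalent of that kind of guessing could key off the file suffix. This is only a sketch: the suffix table and the `guess_content_type` name are hypothetical, and the table is deliberately tiny rather than a full MIME database:

```ruby
# Hypothetical suffix-to-Content-Type table; extend as needed for other codecs.
SUFFIX_CONTENT_TYPES = {
  ".txt"  => "text/plain",
  ".gz"   => "application/gzip",
  ".json" => "application/json",
  ".csv"  => "text/csv"
}.freeze

# Guesses a Content-Type from the last extension of the file name.
# Returns nil when the suffix is unknown, so the caller can decide
# whether to omit the header or use a configured value instead.
def guess_content_type(filename)
  SUFFIX_CONTENT_TYPES[File.extname(filename).downcase]
end
```

Returning nil for unknown suffixes keeps the guess optional, so an explicitly configured content type can always take priority.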

Hi @tzfx,
Thanks for your interest.
It seems like this would be a useful feature for users. We welcome community contributions, and if you could open a draft Pull Request with your findings/changes, we could work together to make this happen. Thanks again!