Unable to read S3 file that has + character in its name
venkateshganapathy opened this issue · comments
One of my team members is facing the following issue. Has anyone come across something similar before?
Airflow writes log files to Cleversafe S3 buckets using the default path template provided by Airflow ({dag_id}/{task_id}/{execution_date}/{try_number}.log).
dag_id, task_id, execution_date, and try_number are all dynamic and keep changing at runtime.
So the log paths come out looking something like this:
s3://XXXX/airflow/logs/XXX_YYY//2020-04-26T16:01:00+00:00/1.log
When trying to read the logs via the Logstash S3 input, it cannot read this location: the execution_date contains a + sign, which gets replaced with a space. As a result, Logstash cannot find the location and never reads these log files.
Any ideas or solutions to overcome this issue are appreciated. Thanks.
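The symptom (a `+` turning into a space) is characteristic of form-style URL decoding, where `+` is treated as an encoded space. A minimal Python sketch of the two decoding behaviors, to illustrate the likely mechanism (this is an assumption about what the reading client does, not confirmed from the Logstash source):

```python
from urllib.parse import unquote, unquote_plus

key = "2020-04-26T16:01:00+00:00"

# Plain percent-decoding leaves "+" untouched:
print(unquote(key))       # 2020-04-26T16:01:00+00:00

# Form-style decoding treats "+" as an encoded space:
print(unquote_plus(key))  # 2020-04-26T16:01:00 00:00
```

If the S3 client (or a proxy in front of the Cleversafe endpoint) applies `unquote_plus`-style decoding to object keys, the decoded key no longer matches the stored one, and the object appears missing.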
Hey, this seems very weird. I've just tried reading files from a bucket with special characters in both the prefix and the file name:
input {
  s3 {
    prefix => "nested/sailor/logs/A_TEST/2020-04-21T:12:22:22+00:00/"
    # contains file: +test2+
    # s3://kares-test1/nested/sailor/logs/A_TEST/2020-04-21T:12:22:22+00:00/+test2+
    aws_credentials_file => "../aws_credentials.yml"
    bucket => "kares-test1"
    type => "s3"
    interval => 10
    additional_settings => { force_path_style => true }
  }
}
... content from +test2+
The file was read and its content printed.
I tested against AWS S3, so I believe this is a compatibility issue with the IBM "Cleversafe S3 Buckets" product.
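If the Cleversafe endpoint (or something in front of it) really does form-decode keys, one pragmatic workaround sometimes used is to avoid a literal `+` in the object key on the writer side. A hypothetical sketch, assuming you control the key before upload (the helper name and the choice of safe characters are illustrative, not from Airflow or Logstash):

```python
from urllib.parse import quote

def encode_s3_key(key: str) -> str:
    # Percent-encode characters like "+" that form-decoding clients
    # misinterpret, while keeping "/" and ":" readable in the path.
    # NOTE: this produces a *different* object key; it only helps if
    # both the writer and every reader agree on the encoded form.
    return quote(key, safe="/:")

print(encode_s3_key("airflow/logs/2020-04-26T16:01:00+00:00/1.log"))
# airflow/logs/2020-04-26T16:01:00%2B00:00/1.log
```

Alternatively, Airflow's log filename template is configurable, so the execution_date could be rendered without `+` entirely; that sidesteps the decoding question rather than working around it.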