logstash-plugins / logstash-input-s3

Logstash S3 input plugin assume role not working

christiangda opened this issue · comments

Hi,

I'm trying to use the assume-role functionality of the Logstash S3 input plugin, but I get the following error:

NOTE: It looks like the plugin is not assuming the role; I can't see any trace of an AssumeRole call in the logs.

[2020-07-20T07:18:46,508][ERROR][logstash.inputs.s3       ][main][790d495ae7a1e587d317915855ea5c21d64f412fed2b6c1bb7abb425f681f82f] 
Unable to list objects in bucket {:exception=>Aws::S3::Errors::AccessDenied, :message=>"Access Denied", 
:backtrace=>["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/plugins/raise_response_errors.rb:15:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/s3_sse_cpk.rb:19:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/s3_dualstack.rb:24:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/s3_accelerate.rb:34:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/param_converter.rb:20:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/aws-sdk-core/plugins/response_paging.rb:26:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/plugins/response_target.rb:21:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/request.rb:70:in 
`send_request'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.501/lib/seahorse/client/base.rb:207:in 
`block in define_operation_methods'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.501/lib/aws-sdk-resources/request.rb:24:in 
`call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.501/lib/aws-sdk-resources/operations.rb:139:in 
`all_batches'", "org/jruby/RubyEnumerator.java:396:in 
`each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.501/lib/aws-sdk-resources/collection.rb:18:in 
`each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:132:in 
`list_new_files'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:172:in 
`process_files'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:123:in 
`block in run'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:20:in 
`interval'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:122:in 
`run'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:345:in `inputworker'", 
"/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:336:in `block in start_input'"], :prefix=>nil}

I have two AWS accounts: the first one only contains the IAM users and credentials, and the second one has the S3 buckets.

Account A

Here I have an IAM programmatic user, which is inside a group with a policy allowing it to assume a role in account B:

access_key_id
secret_access_key

Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeRoleProd",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": [
        "arn:aws:I am::<my accounted removed, account b id>:role/<removed role name>"
      ]
    }
  ]
}

Account B

Here I have one bucket with log data, and a role to be assumed that has access to this bucket.

s3://mybucket

Role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::mybucket/*",
        "arn:aws:s3:::mybucket"
      ]
    }
  ]
}
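
One detail not shown in the policies above: for cross-account AssumeRole to succeed, the role in account B must also carry a trust policy that names account A as a permitted principal; the identity-based policy in account A alone is not enough. A typical trust policy (the account ID is a placeholder) looks like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::<account-a-id>:root" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

Without it, sts:AssumeRole itself fails. The AccessDenied in this issue comes from S3, however, which points at the role never being assumed at all.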

As I mentioned before, it looks like the plugin is not assuming the role.

NOTE: If I create credentials directly in account B, the plugin works fine. In other words, it works when there is no need to assume a role, with a conf like:

input {
  s3 {
    access_key_id => "account B credentials"
    secret_access_key => "account B credentials"
    #role_arn => "arn:aws:iam::<account id removed>:role/<removed role name>"
    #role_session_name => "logstash_from_<removed information here>"
    bucket => "aleaplay.events.dynamo.i.player"
    #prefix => "2020/07/17/16" # not necessary; without it, everything is read
    region => "eu-west-1"
    interval => 60
    gzip_pattern => "\.gz(ip)?$"
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}

Please let me know if I'm doing something wrong with this plugin, or if I have left some configuration out.

Environment information

  • Version:
# inside the container
bash-4.2$ logstash --version
logstash 7.8.0

bash-4.2$ logstash-plugin list --verbose --installed
# inside the container
...
logstash-input-s3 (3.5.0)
...
  • Operating System:
    • CentOS 8 (podman container, docker.elastic.co/logstash/logstash-oss:7.8.0)
# inside the container
bash-4.2$ cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
  • Config File (if you have sensitive info, please remove it):
bash-4.2$ cat config/logstash.yml
---
node.name: logstash-01
http.host: 0.0.0.0
path.config: "/usr/share/logstash/pipeline"
path.logs: "/usr/share/logstash/logs"

log.level: debug
bash-4.2$ cat pipeline/logstash.conf
# Ansible managed
input {
  s3 {
    access_key_id => "removed"
    secret_access_key => "removed"
    role_arn => "arn:aws:iam::<account id removed>:role/<removed role name>"
    role_session_name => "logstash_from_<removed information here>"
    bucket => "<removed information here>"
    #prefix => "2020/07/17/16" # not necessary; without it, everything is read
    region => "eu-west-1"
    interval => 60
    gzip_pattern => "\.gz(ip)?$"
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}

output {
  elasticsearch {
    ilm_enabled => false
    hosts => ["https://<removed information here>:9200"]
    index => "<removed information here>-%{+YYYY.MM.dd}"
    user => "<removed information here>"
    password => "<removed information here>"
    ssl => true
    ssl_certificate_verification => false
    cacert => "/usr/share/logstash/config/root-ca.pem"
  }
}
  • Sample Data:
    NA
  • Steps to Reproduce:
podman run -d \
    --name=logstash-01 \
    --net=odfe \
    --hostname=logstash-01 \
    --privileged \
    --ulimit=host \
    --security-opt label=disable \
    --volume {{ logstash_host_volume_conf_path }}:/usr/share/logstash/config:ro \
    --volume {{ logstash_host_volume_pipeline_path }}:/usr/share/logstash/pipeline:ro \
    --volume {{ logstash_host_volume_data_path }}:/usr/share/logstash/data:rw \
    --volume {{ logstash_host_volume_logs_path }}:/usr/share/logstash/logs:rw \
    --cpus 1 \
    --memory 1g \
    --memory-reservation 512m \
    --memory-swap 1g \
  docker.elastic.co/logstash/logstash-oss:7.8.0 bash -c "bin/logstash-plugin install logstash-input-s3 && bin/logstash"

It would appear from the source code that the assumed role only works if Logstash is running on an AWS EC2 instance using the identity assigned to the instance, and only when the access key and secret options are not populated, i.e. when only a role ARN and session name are provided.

The code would need changes to use a different identity for the assumed role, which would then also work on a server not hosted on AWS.

Hi @cabberley, thanks for your comment.

When you are working on an EC2 instance, you don't need to assume the instance's role; that is handled automatically by the AWS SDK.

The common behaviour is to use assume-role when you are operating cross-account, where you use your actual credentials to call the STS API and create temporary credentials for the second account.

We can see it in the docs: Creating an AWS STS Access Token

Of course, there could be a case where you are operating on an EC2 instance and also need to operate cross-account.

To me, if you pass access_key_id, secret_access_key and role_arn, it is because you intend to use the first two to call the AWS STS API, assume the role role_arn, and generate new credentials, as shown in Creating an AWS STS Access Token.
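
For reference, the flow described above is a single sts:AssumeRole call made with the static keys. A minimal sketch in Ruby (the backtrace shows aws-sdk-core v2); the helper below is plain Ruby, and the ARN and session name are placeholders, not values from this issue:

```ruby
# Build the parameter hash for an STS AssumeRole call.
# external_id is optional; it is only sent when the role requires it.
def assume_role_params(role_arn, session_name, external_id = nil)
  params = { role_arn: role_arn, role_session_name: session_name }
  params[:external_id] = external_id if external_id
  params
end

# With the aws-sdk-core v2 gem, this hash feeds Aws::AssumeRoleCredentials,
# which calls sts:AssumeRole using the static keys and refreshes the
# temporary credentials automatically:
#
#   require 'aws-sdk-core'
#   sts = Aws::STS::Client.new(
#     access_key_id: '<account A key id>',       # placeholders
#     secret_access_key: '<account A secret>',
#     region: 'eu-west-1'
#   )
#   creds = Aws::AssumeRoleCredentials.new(
#     { client: sts }.merge(
#       assume_role_params('arn:aws:iam::<account-b-id>:role/<role>', 'logstash')
#     )
#   )
#   s3 = Aws::S3::Client.new(region: 'eu-west-1', credentials: creds)
```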

Hi @christiangda, I may not have explained it very well. Your comments are correct.

What I am trying to say is that, the way the S3 plugin code has been written, if you supply access_key_id and secret_access_key in the .conf file, the code will never do the AssumeRole with the role_arn you provide. It will only use role_arn and execute the AssumeRole if the .conf file contains only role_arn.

The code that is the problem for us is actually part of logstash-mixin-aws, not this plugin.

The logic in the code says:

IF access_key_id and secret_access_key are provided, use them to authenticate
ELSIF credentials are in a YAML file, read access_key_id and secret_access_key from it and authenticate
ELSIF role_arn is provided, do AssumeRole using the EC2 identity as the identity authorized to assume role_arn
END

Which means that if, as you want, you provide all three values, it will never do the assume role.

I made my own version of the plugin that changes the logic to cater for your scenario:

IF access_key_id and secret_access_key are provided, use them to authenticate
ELSIF credentials are in a YAML file, read access_key_id and secret_access_key from it and authenticate
END
IF role_arn is provided, do AssumeRole using the access_key_id and secret_access_key provided in the .conf file
END

Mine also caters for external_id, a parameter that AssumeRole sometimes requires depending on how the identity has been set up. Adding external_id does require a few other code changes to logstash-mixin-aws, but it doesn't require changes to the plugins that rely on logstash-mixin-aws. I use the S3 input plugin and the CloudWatch plugin, which rely on this code.
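
The changed branching described above can be sketched in plain Ruby. This is a simplified model of the selection logic, not the actual logstash-mixin-aws code, and all names are illustrative:

```ruby
# Simplified model of credential selection: first resolve static keys
# (inline settings or a credentials file), then, if a role_arn is set,
# layer AssumeRole on top of whatever was resolved, instead of treating
# role_arn as a mutually exclusive last resort.
def resolve_credentials(opts)
  base =
    if opts[:access_key_id] && opts[:secret_access_key]
      { type: :static, key: opts[:access_key_id] }
    elsif opts[:credentials_file]
      { type: :file, path: opts[:credentials_file] }
    else
      { type: :instance_profile }  # fall back to the EC2 instance identity
    end

  if opts[:role_arn]
    assume = { type: :assume_role, role_arn: opts[:role_arn], source: base }
    assume[:external_id] = opts[:external_id] if opts[:external_id]
    assume
  else
    base
  end
end
```

With this shape, supplying all three of access_key_id, secret_access_key and role_arn yields an AssumeRole backed by the static keys, which is the behaviour the reporter expected.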

I am also facing the same issue. Can you please let me know how to solve this problem?
I have Logstash installed on an on-prem server.

If you don't need to use an external ID with your assume-role ARN, you can install the AWS CLI on the server and use your AWS credentials to set up the default AWS profile for the primary account. In your Logstash config, leave the access key and secret key out entirely; just put the role ARN and a session name in the config. The plugin will then use the default profile you set up with the CLI to present to AWS and get the assumed role in return.
The code for this is actually in logstash-mixin-aws, not this plugin; this plugin has a dependency on it. If you do need to use an external ID, I have an open pull request for an enhanced logstash-mixin-aws that extends the assume-role function to use an external ID as well.
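
The default-profile setup described above amounts to a standard AWS credentials file on the Logstash host, typically written by `aws configure` (the values here are placeholders):

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = <account A key id>
aws_secret_access_key = <account A secret key>
```

The Logstash config then keeps only role_arn and role_session_name, and the SDK's default provider chain picks up the profile.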

Thanks a lot brother.... It worked!!!

@cabberley, can you help me with this error? I just wanted to know when it occurs: is it a configuration issue, or permissions on the S3 bucket? Also, I was able to list the objects through the AWS CLI.

We are still facing this issue. Any updates on this?

One of the ways I tried to fix this issue was to export the access keys as environment variables and then start Logstash.
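
For the record, that environment-variable workaround looks roughly like this; the SDK's default provider chain checks these variables before config files and the instance profile (the key values are placeholders, not real credentials):

```shell
# Export static credentials so the AWS SDK default provider chain finds
# them before falling back to a credentials file or the instance profile.
export AWS_ACCESS_KEY_ID="AKIAEXAMPLE"
export AWS_SECRET_ACCESS_KEY="example-secret"
export AWS_REGION="eu-west-1"

# then, from /usr/share/logstash:
# bin/logstash -f pipeline/logstash.conf
```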