logstash-plugins / logstash-input-s3

Only reading first line of JSON

Sjaak01 opened this issue · comments

Hi,

I have an extremely frustrating issue. All of a sudden the S3 input started reading only the first line of JSON from each file. I made no changes to my configs or to Logstash.

I've tried updating Logstash and all plugins to the latest versions, but nothing changed. There are no errors in the logs either.

Output:
output {
  s3 {
    codec => "json"
    access_key_id => "secret"
    secret_access_key => "secret"
    region => "eu-west-2"
    bucket => "secret"
    time_file => 1
  }
}

Input:

input {
  s3 {
    access_key_id => "secret"
    secret_access_key => "secret"
    region => "eu-west-2"
    bucket => "secret"
    sincedb_path => "/dev/null"
    codec => "json"
  }
}

JSON example:
{"@timestamp":"2018-01-19T10:51:05.000Z","error":"error1"}{"@timestamp":"2018-01-19T10:51:06.000Z","hdg":"139.93"}{"lon":"test,W","speed":"16.06","@timestamp":"2018-01-19T10:51:11.000Z","trck_angl":"138.01","lat":"test,N"}{"@timestamp":"2018-01-19T10:51:19.000Z","error":"error2"}{"@timestamp":"2018-01-19T10:51:23.000Z","EbNo":"1.2"}

The result is that only the first JSON object gets processed. I have no idea why; it was working earlier and I made no changes anywhere.
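For what it's worth, a strict JSON decoder behaves exactly this way on concatenated objects: it accepts the first document and stops. A minimal Python sketch of the behavior, using a shortened stand-in for the file contents above:

import json

# Two JSON objects back to back on one line, as in the file above.
data = '{"@timestamp":"2018-01-19T10:51:05.000Z","error":"error1"}{"@timestamp":"2018-01-19T10:51:06.000Z","hdg":"139.93"}'

# json.loads() rejects the input outright ("Extra data" after the first object).
try:
    json.loads(data)
except json.JSONDecodeError as e:
    print("json.loads:", e)

# raw_decode() parses the first object and reports where it stopped,
# silently leaving the rest of the buffer unconsumed.
obj, end = json.JSONDecoder().raw_decode(data)
print("first object:", obj)
print("unconsumed:", data[end:])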

It seems I am having the same issue. The number of events imported matches exactly the number of files processed. An important detail is that each file contains thousands of events.

I am using version 3.3.7 of the plugin and logstash 6.3.2.

I have done further testing and finally got it to work.

I had to change the codec on the s3 output to json_lines. The s3 output with the json codec does not produce valid JSON: it neither comma-separates the entries nor wraps them in opening and closing square brackets.

Changing the output to json_lines and, strangely, keeping json on the S3 input got it working correctly.
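To make the difference concrete, here is a sketch of the two file formats (illustrative values, not actual output from my bucket). With codec => "json" the output writes events back to back in one unbroken stream:

{"a":1}{"a":2}{"a":3}

With codec => "json_lines" it writes one event per line:

{"a":1}
{"a":2}
{"a":3}

As far as I can tell, the S3 input reads files line by line and hands each line to its codec, so the json codec on the input side gets exactly one complete object per line, which would explain why the json/json_lines pairing works.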

My config for reference:

input {
  s3 {
    codec => "json"
    region => "eu-west-1"
    bucket => "s3-bucket-name"
    prefix => "backups/"
    backup_add_prefix => "restored/"
    backup_to_bucket => "s3-bucket-name"
    interval => 120
    delete => true
    sincedb_path => "/tmp/last-s3-file-s3-access-logs-eu-west-1"
  }
}

and

output {
  s3 {
    codec => "json_lines"
    region => "eu-west-1"
    bucket => "s3-bucket-name"
    size_file => 104857600
    time_file => 5
    restore => "true"
    canned_acl => "private"
    encoding => "gzip"
    server_side_encryption_algorithm => "AES256"
    prefix => "backups/"
  }
}
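In case it helps anyone verify the fix, here is a quick sanity check I would run against one of the uploaded objects (the filename is hypothetical; gzip.open matters because of the encoding => "gzip" setting above):

import gzip
import json

# codec => "json_lines" should yield one JSON document per line.
with gzip.open("sample-object-from-bucket.gz", "rt") as f:
    events = [json.loads(line) for line in f if line.strip()]

# Should report thousands of events per file, not exactly one.
print(f"parsed {len(events)} events")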

@gdowmont Thank you so much for posting that response! I have been fighting with this issue for about 2 days and that fixed it. I hope it didn't take you 2 weeks.

This is still not fixed, even now that the plugin ships with Logstash itself. I have wasted several hours on this myself.