flume-avro-serializer
https://github.com/codepfleger/flume-avro-serializer
This library lets you store Avro files in HDFS using a specific but dynamically derived Avro schema. You do not need to specify the Avro schema yourself, because it is deduced. There are four predefined serializers, covering Windows log events, Unix syslog events, and dynamic events:
- de.codepfleger.flume.avro.serializer.serializer.WindowsLogSerializer
- de.codepfleger.flume.avro.serializer.serializer.SyslogSerializer
- de.codepfleger.flume.avro.serializer.serializer.DynamicJsonSerializer
- de.codepfleger.flume.avro.serializer.serializer.DynamicSyslogSerializer
These serializers can be used out of the box by specifying the corresponding builder in your Flume configuration:
- de.codepfleger.flume.avro.serializer.serializer.WindowsLogSerializer$Builder
- de.codepfleger.flume.avro.serializer.serializer.SyslogSerializer$Builder
- de.codepfleger.flume.avro.serializer.serializer.DynamicJsonSerializer$Builder
- de.codepfleger.flume.avro.serializer.serializer.DynamicSyslogSerializer$Builder
Messages are written either in a half-static, half-dynamic format or in a fully dynamic format.
Half dynamic
Windows Event
- Static: String EventTime, String Hostname, String EventType, String Severity, String SourceModuleName, String UserID, Integer ProcessID, String Domain, String EventReceivedTime, String Path, String Message
- Dynamic (all other fields): Map<String, Object> dynamic = new HashMap<>()
Syslog Event
- Static: Integer Severity, Integer Facility, String host, String timestamp
- Dynamic (all other fields): Map<String, Object> dynamic = new HashMap<>()
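The half-dynamic layout can be pictured with a small, dependency-free Java sketch. It is not taken from the library: the class name `HalfDynamicSyslogSketch`, the `fixed` map, and the `from` method are hypothetical and only illustrate how the four static syslog fields could be separated from everything else, which ends up in the `dynamic` map.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this class is NOT part of flume-avro-serializer.
final class HalfDynamicSyslogSketch {
    // The four static syslog fields named in this README.
    static final Set<String> STATIC_FIELDS =
            new HashSet<>(Arrays.asList("Severity", "Facility", "host", "timestamp"));

    // Fields with a fixed place in the record.
    final Map<String, Object> fixed = new LinkedHashMap<>();
    // All other fields, keyed by name.
    final Map<String, Object> dynamic = new LinkedHashMap<>();

    // Splits parsed message fields into the static part and the dynamic part.
    static HalfDynamicSyslogSketch from(Map<String, Object> fields) {
        HalfDynamicSyslogSketch record = new HalfDynamicSyslogSketch();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (STATIC_FIELDS.contains(entry.getKey())) {
                record.fixed.put(entry.getKey(), entry.getValue());
            } else {
                record.dynamic.put(entry.getKey(), entry.getValue());
            }
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("Severity", 3);
        fields.put("Facility", 4);
        fields.put("host", "web01");
        fields.put("appname", "sshd"); // not a static field, so it goes to "dynamic"
        HalfDynamicSyslogSketch record = from(fields);
        System.out.println("static:  " + record.fixed);
        System.out.println("dynamic: " + record.dynamic);
    }
}
```

The same split applies to Windows events, only with the longer list of static fields.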
Fully dynamic
The dynamic serializers DynamicJsonSerializer and DynamicSyslogSerializer create a schema based on the data in the message. The property types are deduced from the data sent.
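As a rough idea of what deducing property types from data can look like, the following sketch maps Java values to Avro primitive type names. The mapping rules, the class `TypeDeductionSketch`, and both method names are assumptions made for illustration; the rules the library actually applies are not documented here and may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; the library's real deduction rules may differ.
final class TypeDeductionSketch {
    // Returns the Avro primitive type name that matches a Java value.
    static String avroTypeOf(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer) return "int";
        if (value instanceof Long) return "long";
        if (value instanceof Float) return "float";
        if (value instanceof Double) return "double";
        return "string"; // assumed fallback: anything else is stored as text
    }

    // Builds a field-name to type-name mapping for one message.
    static Map<String, String> deduceFieldTypes(Map<String, Object> message) {
        Map<String, String> types = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : message.entrySet()) {
            types.put(entry.getKey(), avroTypeOf(entry.getValue()));
        }
        return types;
    }

    public static void main(String[] args) {
        Map<String, Object> message = new LinkedHashMap<>();
        message.put("pid", 4711);
        message.put("message", "session opened");
        message.put("success", true);
        System.out.println(deduceFieldTypes(message));
    }
}
```

A real implementation would also have to decide how to handle nested objects, arrays, and fields whose type changes between messages; the sketch deliberately leaves those cases out.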
Custom serializer
There are two abstract classes that can be used to create custom serializers for custom events. In both cases there is no need to specify a schema, as it will be deduced.
- AbstractReflectionAvroEventSerializer -> the schema will be deduced from the generic type parameter
- AbstractDynamicAvroSerializer -> the schema will be deduced from the data
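To show how a base class can learn its generic type parameter at runtime, here is a minimal sketch of the standard Java reflection technique. It does not use or reproduce AbstractReflectionAvroEventSerializer; `TypedSerializerSketch`, `LoginEventSketch`, and `LoginSerializerSketch` are hypothetical names, and whether the library uses this exact mechanism is an assumption.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical base class; NOT the library's AbstractReflectionAvroEventSerializer.
abstract class TypedSerializerSketch<T> {
    final Class<?> eventType;

    TypedSerializerSketch() {
        // Works for direct subclasses that bind T to a concrete class.
        Type superType = getClass().getGenericSuperclass();
        Type argument = ((ParameterizedType) superType).getActualTypeArguments()[0];
        this.eventType = (Class<?>) argument;
    }

    // Lists the fields a reflection-based schema could be built from.
    String describeFields() {
        StringBuilder description = new StringBuilder();
        for (Field field : eventType.getDeclaredFields()) {
            description.append(field.getName())
                       .append(": ")
                       .append(field.getType().getSimpleName())
                       .append('\n');
        }
        return description.toString();
    }
}

// Hypothetical custom event type.
final class LoginEventSketch {
    String user;
    int attempts;
}

// The subclass only names its event type; no schema is written by hand.
final class LoginSerializerSketch extends TypedSerializerSketch<LoginEventSketch> {
    public static void main(String[] args) {
        LoginSerializerSketch serializer = new LoginSerializerSketch();
        System.out.println(serializer.eventType.getSimpleName());
        System.out.print(serializer.describeFields());
    }
}
```

Once the event class is known, a reflection-based schema generator can inspect its fields, which is why the subclass does not need to declare a schema itself.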