AbsaOSS / cobrix

A COBOL parser and Mainframe/EBCDIC data source for Apache Spark

Add 'record_bytes' field that contains raw data of each input record

yruslan opened this issue

Background

The original record bytes used to decode a record's fields are useful for validation and audit, as well as for writing filtered data back in EBCDIC format.

Feature

Add the ability to generate a 'record_bytes' field that contains the raw bytes of the decoded record.

Example

.option("generate_record_bytes", "true")
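
For context, here is a minimal sketch of how the proposed option might look in a full read. The copybook and data paths are hypothetical placeholders, and the sketch assumes the generated column is named 'record_bytes' (as in the issue title) and has a binary type. Neither assumption is confirmed by this issue.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hex}

object RecordBytesExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("record-bytes-example")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical paths: replace with a real copybook and data file.
    val df = spark.read
      .format("cobol")
      .option("copybook", "/path/to/copybook.cpy")
      .option("generate_record_bytes", "true") // option proposed in this issue
      .load("/path/to/data")

    // Assumption: the generated column is called 'record_bytes' and is binary.
    // Rendering it as hex makes the raw record easy to compare with the source file.
    df.select(hex(col("record_bytes")).as("record_hex"))
      .show(5, truncate = false)

    spark.stop()
  }
}
```

Keeping the raw bytes next to the decoded fields would allow a decoded value to be checked against the exact input it came from, which is the validation and audit use case described in the background.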

We are migrating an ETL process from DataStage to Scala using your libraries. The file has fields that are handled as "Raw" and converted to hexadecimal with the iconv function, but we are losing bits and cannot solve the problem. These fields contain packed bits.

We read them as follows:
.withColumn("field", hex(col("field")))

Is this feature request relevant to the topic? #624

Please create a separate issue for the problem you are having, and provide more context.

What do you mean by losing bits? Could you please provide an example?

Sorry, we have created issue #627.