Read hive default generated deflate compression file failed
MisterRaindrop opened this issue
Hive SQL
My Hive version is apache-hive-3.1.3.
set hive.exec.compress.output=true;
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.DefaultCodec;
DROP TABLE IF EXISTS hive_example;
CREATE TABLE hive_example
(
id int,
name string
)
STORED AS TEXTFILE;
INSERT INTO TABLE hive_example values(1, "aaaaabbbb");
This creates a deflate file in HDFS:
/usr/hive/hive_example/000000_0.deflate
I can read this file with zlib, but reading it with libarchive fails.
My example code
#include <archive.h>
#include <archive_entry.h>
#include <stdio.h>

int main() {
    struct archive *a;
    struct archive_entry *entry;
    int r;

    a = archive_read_new();
    archive_read_support_filter_all(a);
    archive_read_support_format_all(a);
    r = archive_read_open_filename(a, "/opt/share/000000_0.deflate", 10240);
    if (r != ARCHIVE_OK) {
        printf("Failed to open archive.\n");
        archive_read_free(a);
        return 1;
    }
    while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
        printf("File name: %s\n", archive_entry_pathname(entry));
        const void *buff;
        size_t size;
        la_int64_t offset;  /* type expected by archive_read_data_block */
        while (archive_read_data_block(a, &buff, &size, &offset) == ARCHIVE_OK) {
            /* buff is not NUL-terminated, so print exactly size bytes. */
            printf("Data: %.*s", (int)size, (const char *)buff);
        }
    }
    archive_read_close(a);
    archive_read_free(a);
    return 0;
}
Build command
g++ example_archive_read.cpp -o example_archive_read -g -O0 -larchive
My environment
I built libarchive 3.7.4 from the release tarball on CentOS 7 and linked it against zlib.
ldd -r libarchive.so.13
linux-vdso.so.1 => (0x00007ffdf45ab000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007fc6edb8f000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fc6ed969000)
libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fc6ed6ae000)
liblz4.so.1 => /lib64/liblz4.so.1 (0x00007fc6ed49f000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007fc6ed28f000)
libz.so.1 => /lib64/libz.so.1 (0x00007fc6ed079000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00007fc6ecd0f000)
libc.so.6 => /lib64/libc.so.6 (0x00007fc6ec941000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fc6ec73d000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fc6ec521000)
libm.so.6 => /lib64/libm.so.6 (0x00007fc6ec21f000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc6edff2000)
The zlib shared library libz.so.1 is already linked.
But reading the deflate-compressed file that Hive generates by default still fails. Does anybody know why? Does libarchive not support reading zlib’s deflate format?
No, libarchive does not and cannot support the zlib deflate format. Libarchive requires that any format it supports have a distinctive way to identify the file format. Zlib deflate format does not have a "magic value" identifying the format, so there is no reliable way for libarchive to identify this format.
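To illustrate the point about the missing magic value, here is a minimal sketch. It assumes the .deflate file is a zlib stream as described in RFC 1950, which is an assumption about Hadoop's DefaultCodec rather than something stated in this thread. The helper names looks_like_zlib_header and count_plausible_prefixes are hypothetical and are not part of zlib or libarchive. The sketch shows that such a stream begins with only a two-byte plausibility check instead of a fixed signature, so plain text can pass it too.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the first two bytes form a plausible
 * RFC 1950 header (CMF, FLG), 0 otherwise. Passing this check does not
 * prove the data is a zlib stream. */
static int looks_like_zlib_header(const unsigned char *p, size_t n) {
    if (n < 2) return 0;
    unsigned cmf = p[0];
    unsigned flg = p[1];
    if ((cmf & 0x0F) != 8) return 0;      /* CM: 8 means deflate */
    if ((cmf >> 4) > 7) return 0;         /* CINFO: window size at most 32 KiB */
    return ((cmf << 8) | flg) % 31 == 0;  /* FCHECK makes the pair a multiple of 31 */
}

/* Hypothetical helper: counts how many of the 65536 possible two-byte
 * prefixes pass the check above. */
static int count_plausible_prefixes(void) {
    int count = 0;
    for (unsigned hi = 0; hi < 256; hi++) {
        for (unsigned lo = 0; lo < 256; lo++) {
            unsigned char pair[2] = { (unsigned char)hi, (unsigned char)lo };
            count += looks_like_zlib_header(pair, 2);
        }
    }
    return count;
}
```

For example, the common header bytes 0x78 0x9C pass, but so does a text file that starts with the two ASCII characters "x^" (0x78 0x5E). A gzip file, by contrast, always starts with the fixed bytes 0x1F 0x8B, which a format detector can match on directly.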