facebookincubator / velox

A composable and fully extensible C++ execution engine library for data management systems.

Home Page: https://velox-lib.io/

[Parquet] TimestampPrecision / TimestampUnit mismatch when reading / writing files, particularly in unit tests

zuyu opened this issue

commented

Bug description

enum class TimestampPrecision : int8_t {
  kMilliseconds = 3, // 10^3 milliseconds are equal to one second.
  kMicroseconds = 6, // 10^6 microseconds are equal to one second.
  kNanoseconds = 9, // 10^9 nanoseconds are equal to one second.
};

enum class TimestampUnit : uint8_t {
  kSecond = 0 /*10^0 second is equal to 1 second*/,
  kMilli = 3 /*10^3 milliseconds are equal to 1 second*/,
  kMicro = 6 /*10^6 microseconds are equal to 1 second*/,
  kNano = 9 /*10^9 nanoseconds are equal to 1 second*/
};

Proposed Fixes

  • Introduce a new kNotSet as the default value, and require that both TimestampPrecision and TimestampUnit be set when reading or writing a timestamp column. Otherwise, a VELOX_UNREACHABLE() assertion would trigger.
  • For timestamp-related unit tests, align the values of TimestampPrecision and TimestampUnit.
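A minimal sketch of what "aligned" could mean in a unit test. Both enums use the power-of-ten exponent as their underlying value, so comparing those values is enough. The helpers isAligned and requireAligned are hypothetical names used for illustration and are not part of Velox; the enums are copied from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Copies of the two enums from the bug description.
enum class TimestampPrecision : int8_t {
  kMilliseconds = 3,
  kMicroseconds = 6,
  kNanoseconds = 9,
};

enum class TimestampUnit : uint8_t {
  kSecond = 0,
  kMilli = 3,
  kMicro = 6,
  kNano = 9,
};

// Hypothetical helper: true when both settings name the same exponent.
// TimestampUnit::kSecond (0) has no TimestampPrecision counterpart,
// so it never compares as aligned.
constexpr bool isAligned(TimestampUnit unit, TimestampPrecision precision) {
  return static_cast<int>(unit) == static_cast<int>(precision);
}

// Hypothetical test-side guard: aborts in debug builds on a mismatch.
inline void requireAligned(TimestampUnit unit, TimestampPrecision precision) {
  assert(isAligned(unit, precision));
}
```

A test that sets TimestampUnit::kMicro on one side would then be expected to set TimestampPrecision::kMicroseconds on the other.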

I would say just align them. Adding kNotSet would make things unnecessarily complicated.

commented

@Yuhta What about TimestampUnit::kSecond: should we remove it, or add TimestampPrecision::kSecond? I prefer removing it, since it is equivalent to nanos being 0.

@zuyu We can remove it if neither Presto nor Spark supports it.
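One reading of the remark about kSecond, sketched under assumptions: if a timestamp is stored as whole seconds plus a sub-second nanos field, truncating to exponent 0 always leaves nanos at 0. The functions powerOfTen and truncateNanos are hypothetical and only illustrate the arithmetic; they are not Velox APIs.

```cpp
#include <cstdint>

// Hypothetical helper: 10^n for small non-negative n.
constexpr uint64_t powerOfTen(int n) {
  uint64_t result = 1;
  for (int i = 0; i < n; ++i) {
    result *= 10;
  }
  return result;
}

// Hypothetical helper: drops the sub-second digits that a given
// exponent (0, 3, 6 or 9) cannot represent. Assumes nanos < 10^9.
constexpr uint64_t truncateNanos(uint64_t nanos, int exponent) {
  const uint64_t step = powerOfTen(9 - exponent);
  return nanos / step * step;
}
```

With exponent 0 the step is 10^9, which is larger than any valid nanos value, so the result is always 0.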