[SUPPORT]

Question

zaminhassnain06 opened this issue 2 months ago · comments

Hi
Our organization is migrating from Hudi 0.6.0 to Hudi 0.12.1 and is also updating the required Spark and EMR versions. Our existing datasets (hundreds of TBs of data on S3) were written with Hudi 0.6.0.

Hudi has come a long way since 0.6.0, and we are not sure how to move to 0.12.1 directly.

Could someone provide the steps for upgrading from 0.6.0 to 0.12.1?

Do we have to rebuild our tables? This is our main concern, as the tables hold billions of records.

Should we expect the following improvements after the upgrade:
– faster upserts
– adding/modifying columns (schema evolution)
– clustering
– a possible solution for storing the history of updates performed on records

Thanks,
Zamin Hassnain

Danny Chan · Answer 1 · Thu Jun 06 2024 10:03:36 GMT+0800 (China Standard Time)

I would suggest using 0.12.3 or 0.14.1; 0.12.1 still has some stability issues.
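
As a rough sketch of what pinning one of those versions could look like when you supply the Hudi bundle yourself rather than relying on the one shipped with your platform: the helper names below are hypothetical, and the Spark 3.3 / Scala 2.12 defaults are assumptions that are not stated in this thread, so adjust them to match your cluster.

```python
from typing import List


def hudi_bundle_coordinate(hudi_version: str,
                           spark_line: str = "3.3",
                           scala_version: str = "2.12") -> str:
    """Build the Maven coordinate of a Hudi Spark bundle.

    spark_line and scala_version are assumed defaults, not values from the thread.
    """
    return (f"org.apache.hudi:hudi-spark{spark_line}-bundle_"
            f"{scala_version}:{hudi_version}")


def spark_submit_args(hudi_version: str) -> List[str]:
    """Arguments that pin the Hudi bundle version for a spark-submit call."""
    return [
        "--packages", hudi_bundle_coordinate(hudi_version),
        # Kryo serialization is the setting Hudi's Spark guides recommend.
        "--conf", "spark.serializer=org.apache.spark.serializer.KryoSerializer",
        # Enables Hudi's Spark SQL extensions.
        "--conf", "spark.sql.extensions="
                  "org.apache.spark.sql.hudi.HoodieSparkSessionExtension",
    ]


if __name__ == "__main__":
    # Print a spark-submit prefix pinned to the suggested 0.12.3 release.
    print("spark-submit " + " ".join(spark_submit_args("0.12.3")))
```

Keeping the version in one place makes it easy to try 0.12.3 against a copy of a table first and switch to 0.14.1 later without touching the rest of the job.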