trinodb / charts

[DeltaLake] File System Cache

heitorrbarros opened this issue · comments

Hi everyone!

I'm currently trying to enable the File System Cache in my Trino cluster for the Delta Lake catalog. However, after going through the documentation, I'm struggling to find a straightforward or semantic method to enable it.

Enabling the file system cache seems to require mounting an emptyDir and configuring the options outlined in the documentation, which appears to be quite complex.

Additionally, emptyDirs are ephemeral and utilize node storage. To circumvent potential storage issues, I'm exploring workarounds such as mounting persistent volumes.

Has anyone encountered similar challenges or successfully activated the Delta Lake cache in Trino using this method?

The whole point of the cache is to use fast storage on the worker, so emptyDirs on local SSDs are what you want. You can also use persistent volumes, but they must be high performance, and keep in mind that you don't really gain much. The cache fills up and is then managed based on TTL and related settings.
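For illustration, a minimal sketch of what this could look like in the chart's values file. It assumes a chart version that exposes `catalogs`, `worker.additionalVolumes`, and `worker.additionalVolumeMounts` (older versions may use different key names), and the catalog name, metastore URI, mount path, and size limit are placeholders rather than recommendations. The `fs.cache.*` names follow the Trino file system cache documentation; check them against the Trino version you run.

```yaml
# Hypothetical values.yaml fragment; verify key names against your chart version.
catalogs:
  delta: |
    connector.name=delta_lake
    # Placeholder metastore address, replace with your own.
    hive.metastore.uri=thrift://example-metastore:9083
    fs.cache.enabled=true
    # Must match the mountPath of the volume defined below.
    fs.cache.directories=/data/trino-cache
    fs.cache.max-disk-usage-percentages=80
    fs.cache.ttl=7d

worker:
  additionalVolumes:
    - name: fs-cache
      emptyDir:
        # Placeholder limit; size it to the local SSD capacity of the node.
        sizeLimit: 100Gi
  additionalVolumeMounts:
    - name: fs-cache
      mountPath: /data/trino-cache

# The catalog file is shared by all nodes, so the cache directory
# likely has to exist on the coordinator as well.
coordinator:
  additionalVolumes:
    - name: fs-cache
      emptyDir: {}
  additionalVolumeMounts:
    - name: fs-cache
      mountPath: /data/trino-cache
```

Whether the emptyDir actually lands on a local SSD depends on how the node's ephemeral storage is provisioned, which is outside the chart's control.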

We know that the system is used in production at various places as it stands now, and in fact it was in use before it was even merged into Trino itself.

I am closing this since there is no specific question unanswered. Feel free to reopen or ask a specific question as another issue.