[optimization] caching requests, etc.
louis030195 opened this issue · comments
https://github.com/zilliztech/GPTCache
GPTCache only caches the retrieval part
in assistants we could cache:
- function calls
- retrieval
- actions
- code interpreter
in Redis, for example
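A rough sketch of what that could look like for function calls (the same shape would apply to retrieval, actions, or code interpreter results). Everything here is an assumption for illustration, not existing code in this repo: `InMemoryStore`, `cache_key`, `cached_call`, the `assistants:cache:` key prefix, and the 300 second TTL are all made up. The in-memory store only mimics the `get` / `set(..., ex=...)` calls of a Redis client so the example runs without a server; a real Redis client exposing those two calls could probably be dropped in instead.

```python
import hashlib
import json
import time


class InMemoryStore:
    """Hypothetical stand-in for a Redis client: only get/set with a TTL in seconds."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        # Drop the entry lazily once its TTL has passed.
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ex=None):
        expires_at = time.monotonic() + ex if ex is not None else None
        self._data[key] = (value, expires_at)


def cache_key(kind, name, arguments):
    """Build a stable key from the kind of work, its name, and its arguments."""
    # sort_keys makes the key independent of argument order.
    payload = json.dumps({"kind": kind, "name": name, "args": arguments}, sort_keys=True)
    return "assistants:cache:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(store, kind, name, arguments, compute, ttl_seconds=300):
    """Return the cached result if present, otherwise compute and store it."""
    key = cache_key(kind, name, arguments)
    hit = store.get(key)
    if hit is not None:
        return json.loads(hit)
    result = compute(**arguments)
    # Results must be JSON-serializable in this sketch.
    store.set(key, json.dumps(result), ex=ttl_seconds)
    return result
```

Usage would be something like `cached_call(store, "function_call", "get_weather", {"city": "Paris"}, get_weather)`, where `get_weather` is a hypothetical function. This only makes sense for calls that are deterministic and have no side effects; anything that sends an email or mutates state should skip the cache.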
I mean, there are thousands of ways to slash latency and cost; it's not a very difficult problem