Remote Storage Files - Meta/Content Search
- Postgres
- ElasticSearch
- Kafka
External dependency documentation
- Postgres: stores the file URL and last processed time of each file (CDC)
- ElasticSearch:
  - Maintains inverted indices of file content tokens (single term)
  - How does it work?
- Kafka:
  - When file processing of an S3 bucket or any other source is triggered, all files from the source are read and produced to Kafka.
  - The file URL is the message key, so every message for a given file lands on the same partition. This maintains strong ordering guarantees per file.
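
The keying behaviour described for Kafka can be illustrated with a small in-memory sketch. It is not the service's code and does not use a Kafka client: `NUM_PARTITIONS`, `partition_for`, and `produce` are hypothetical names, and CRC32 is only a stand-in for whatever hash the real partitioner applies. The point it shows is that a stable hash of the file URL sends every event for a file to one partition, where events keep their produce order.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition using a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events, num_partitions: int = NUM_PARTITIONS):
    """Append each (file_url, payload) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for file_url, payload in events:
        partitions[partition_for(file_url, num_partitions)].append((file_url, payload))
    return partitions


# Two updates for a.txt interleaved with an update for b.txt.
events = [
    ("s3://bucket/a.txt", "v1"),
    ("s3://bucket/b.txt", "v1"),
    ("s3://bucket/a.txt", "v2"),
]
partitions = produce(events)

# Both a.txt events sit in one partition, in the order they were produced.
a_partition = partitions[partition_for("s3://bucket/a.txt")]
a_versions = [payload for url, payload in a_partition if url == "s3://bucket/a.txt"]
```

Ordering here holds only within a partition, so events for different file URLs may still be processed in any relative order.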
http://localhost:8085/fs/swagger-ui/index.html
PRD Notion
PRD MD