[FEA] GFQL indexing support
lmeyerov opened this issue · comments
lmeyerov commented
Is your feature request related to a problem? Please describe.
One of the easiest ways to speed up pandas code is to use indexes, which GFQL does not currently support.
Describe the solution you'd like
GFQL has a few interesting scenarios here:
- Indexing on node/edge IDs wrt global lookups
- Indexing on additional columns, especially text
- Indexes being passed in
- Indexing being requested
- Indexes being built on the fly, at the start of a traversal or mid-traversal
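As a rough illustration of the first scenario (node ID lookups), here is a minimal pandas-only sketch. It does not use any GFQL API; the `nodes` frame, its column names, and the `nodes_by_id` name are made up for the example.

```python
import pandas as pd

# Hypothetical node table; column names are assumptions for this sketch.
nodes = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "type": ["user", "user", "host", "host"],
})

# Unindexed: each lookup scans the whole "id" column.
scan_hit = nodes[nodes["id"].isin(["b", "d"])]

# Indexed: pay the build cost once, then reuse it for label-based lookups.
nodes_by_id = nodes.set_index("id")
index_hit = nodes_by_id.loc[["b", "d"]]
```

Both paths return the same two rows; the indexed one only pays off when the index is reused across many lookups, which is the case being proposed here.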
It's unclear what's most important. I'm guessing:
- node/edge ID indexing <-- may give near-parity w/ naive non-DF-based graph traversals
- str indexing, esp for initial searches
- some sort of triple support
- Multicol indexing with some attribs of interest like node/edge type
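For the multicolumn case, one possible shape is a pandas `MultiIndex` over edge type plus source ID. This is only a sketch: the `edges` frame, its column names, and the choice of `(type, src)` as the key order are assumptions, not an existing GFQL feature.

```python
import pandas as pd

# Hypothetical edge table; column names are assumptions for this sketch.
edges = pd.DataFrame({
    "src": ["a", "a", "b", "c"],
    "dst": ["b", "c", "c", "d"],
    "type": ["follows", "owns", "follows", "owns"],
})

# Multicolumn index on (edge type, source ID), sorted for efficient slicing.
edges_by_type_src = edges.set_index(["type", "src"]).sort_index()

# All "follows" edges leaving the current frontier of source nodes.
frontier = ["a", "b"]
hits = edges_by_type_src.loc[("follows", frontier), :]
```

Putting the type first suits queries that always filter on edge type; a `(src, type)` order would suit ID-first lookups instead, which is part of the planning question.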
A tricky aspect here is global vs dynamic indexing. Ex:
- ahead of time or at query start, generate indexes on the 'global' node + edge DFs, and use them mid-traversal when sufficiently small, such as during enrichment
- dynamic reindexing mid-traversal
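One way to picture the global option is a hop that probes a prebuilt index only when the frontier is small and falls back to a plain join otherwise. This is a sketch under assumptions: `hop`, `edges_by_src`, and the `SMALL_FRONTIER` cutoff are hypothetical names, and reading "sufficiently small" as frontier size is one interpretation.

```python
import pandas as pd

# Hypothetical edge table; column names are assumptions for this sketch.
edges = pd.DataFrame({
    "src": ["a", "a", "b", "c"],
    "dst": ["b", "c", "c", "d"],
})

# Global index, built once ahead of time or at query start.
edges_by_src = edges.set_index("src").sort_index()

SMALL_FRONTIER = 2  # hypothetical cutoff; a real planner would tune this


def hop(frontier):
    """Return the sorted destination IDs one hop out from the frontier."""
    frontier = pd.Index(frontier).unique()
    if len(frontier) <= SMALL_FRONTIER:
        # Small frontier: probe the prebuilt global index,
        # skipping IDs that have no outgoing edges.
        present = frontier[frontier.isin(edges_by_src.index)]
        out = edges_by_src.loc[present, "dst"]
    else:
        # Large frontier: a plain join over the unindexed frame.
        out = edges.merge(pd.DataFrame({"src": frontier}), on="src")["dst"]
    return sorted(out.unique())
```

Dynamic reindexing would instead call `set_index` on the filtered intermediate frame mid-traversal, trading a rebuild cost for a smaller index.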
A lot of this gets into query planning, so another consideration is to identify something very simple now and defer the rest to a more structured planning system.