phylum-dev / vuln-reach

A library for building tools to determine if vulnerabilities are reachable in a code base.

Home Page: https://phylum.io

Address `ts_node_parent` performance issues

andreaphylum opened this issue

When traversing a tree upwards, we currently use the `ts_node_parent` function rather than the Cursor API. This is inefficient because every invocation traverses the whole tree again, and the resulting performance issues are apparent even on relatively small test fixtures.

We should investigate whether the Cursor API provides a significant advantage. It is likely that at least one full traversal will still be necessary, as there is no API for positioning a cursor on a given node in O(1) (see this).

There is no apparent efficient way of traversing a syntax tree upwards, as the tree-sitter Cursor API is missing some crucial functionality (e.g. the ability to retrieve a nearby node without moving the cursor).

An alternative, more efficient approach would be to visit the tree downwards, annotating nodes and caching along the way. That, however, requires rethinking the algorithms and giving up an important layer of abstraction, as being able to refer to a node's parent is very useful when reasoning about the code.
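For reference, here is a rough sketch of what the downward approach could look like. It is not code from this repository and does not use the tree-sitter API: `Node`, `ParentIndex`, and `sample_tree` are hypothetical names, and the toy `Node` type only stands in for a real syntax node. The assumption is that each node has a stable identifier that can serve as a cache key. A single top-down pass records every child-to-parent link, after which parent lookups are hash map reads instead of repeated tree traversals.

```rust
use std::collections::HashMap;

/// Toy syntax node (hypothetical; a stand-in for a real tree-sitter node).
/// `id` is assumed to be a stable, unique key for the node.
#[derive(Debug)]
pub struct Node {
    pub id: usize,
    pub kind: &'static str,
    pub children: Vec<Node>,
}

/// Parent links and node kinds cached during one downward traversal.
pub struct ParentIndex {
    parents: HashMap<usize, usize>,
    kinds: HashMap<usize, &'static str>,
}

impl ParentIndex {
    /// Visit every node exactly once, top-down, recording each child's parent.
    pub fn build(root: &Node) -> Self {
        let mut index = ParentIndex {
            parents: HashMap::new(),
            kinds: HashMap::new(),
        };
        // Explicit stack instead of recursion, so deep trees cannot overflow.
        let mut stack = vec![root];
        while let Some(node) = stack.pop() {
            index.kinds.insert(node.id, node.kind);
            for child in &node.children {
                index.parents.insert(child.id, node.id);
                stack.push(child);
            }
        }
        index
    }

    /// Cached parent lookup; returns `None` for the root or an unknown id.
    pub fn parent(&self, id: usize) -> Option<usize> {
        self.parents.get(&id).copied()
    }

    /// Walk upwards through the cache to the nearest ancestor of a given kind.
    pub fn nearest_ancestor(&self, id: usize, kind: &str) -> Option<usize> {
        let mut current = self.parent(id);
        while let Some(ancestor) = current {
            if self.kinds.get(&ancestor).map_or(false, |k| *k == kind) {
                return Some(ancestor);
            }
            current = self.parent(ancestor);
        }
        None
    }
}

/// Small example tree: program > function_declaration > statement_block > ...
pub fn sample_tree() -> Node {
    let leaf = |id, kind| Node { id, kind, children: vec![] };
    Node {
        id: 0,
        kind: "program",
        children: vec![Node {
            id: 1,
            kind: "function_declaration",
            children: vec![
                leaf(2, "identifier"),
                Node {
                    id: 3,
                    kind: "statement_block",
                    children: vec![Node {
                        id: 4,
                        kind: "call_expression",
                        children: vec![leaf(5, "identifier")],
                    }],
                },
            ],
        }],
    }
}
```

With `ParentIndex::build(&sample_tree())`, asking for the nearest `function_declaration` ancestor of node 5 walks 5 → 4 → 3 → 1 through the cache alone. The cost is the one mentioned above: the index has to be built up front and threaded through every algorithm that currently just asks a node for its parent.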

I suggest we postpone pursuing this optimization in the interest of feature progress, but I think it is important to eventually get to it.