Query to match all declarations (def) and identifiers (uses) in the code
Symbolk opened this issue · comments
I am using node-tree-sitter to query all def-uses in the code, in the following way:
import * as TreeSitter from 'tree-sitter'
import { Identifier } from './Identifier'
const TypeScript = require('tree-sitter-typescript').typescript
const treeSitter = new TreeSitter()
treeSitter.setLanguage(TypeScript)
private static analyzeCode(codeLines: string[]): Identifier[] {
const sourceCode = codeLines.join('\n')
const tree = treeSitter.parse(sourceCode)
const query = new TreeSitter.Query(TypeScript, `(identifier) @element`)
let identifiers: Identifier[] = []
const matches: TreeSitter.QueryMatch[] = query.matches(tree.rootNode)
for (let match of matches) {
const captures: TreeSitter.QueryCapture[] = match.captures
for (let capture of captures) {
identifiers.push(new Identifier(capture.name, tree.getText(capture.node)))
}
}
return identifiers
}
However, it returns keywords that I do not want. For example,
For code:
private orgLines: string[] = [];
It returns:
['public', 'string', '']
After reading the query syntax (http://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax) and the test code (https://github.com/tree-sitter/node-tree-sitter/blob/master/test/query_test.js), I am wondering:
- Is it possible to query for all the subtypes of declarations with a wildcard?
(_declaration: name (identifier))
- Is it correct to filter the language-specific keywords from matches like this?
(_) name: (identifier)
Could you provide a complete snippet of code and a query that work in Tree-sitter playground?
Here they are:
Code in TypeScript:
export class Conflict {
public hasOriginal: boolean = false;
private textAfterMarkerOurs: string | undefined = undefined;
private textAfterMarkerOriginal: string | undefined = undefined;
private textAfterMarkerTheirs: string | undefined = undefined;
private textAfterMarkerEnd: string | undefined = undefined;
public static parse(text: string): ISection[] {
const sections: ISection[] = getSections();
const lines: string[] = Parser.getLines(text);
let state: ParserState = ParserState.OutsideConflict;
let currentConflict: Conflict | undefined = undefined;
let currentTextLines: string[] = [];
}
}
Query to get some def-uses:
(method_definition name: (property_identifier) @fn-def)
(class_declaration name: (type_identifier) @class-def)
(public_field_definition name: (property_identifier) @field-def)
(variable_declarator name: (identifier) @var-def)
(call_expression
function: [
(identifier) @function
(member_expression
property: (property_identifier) @method)
])
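Once a query like the one above has run, the capture names can serve as the key for separating definitions from uses. A minimal sketch, assuming each capture has already been reduced to a plain `{ name, text }` pair; the `CaptureLike` type and the `splitDefUses` helper are hypothetical and not part of node-tree-sitter:

```typescript
// Hypothetical, simplified stand-in for a query capture: the capture name
// from the query (e.g. "var-def") and the source text of the captured node.
interface CaptureLike {
  name: string;
  text: string;
}

interface DefUses {
  defs: CaptureLike[];
  uses: CaptureLike[];
}

// Captures whose name ends in "-def" are treated as definitions; everything
// else (here "function" and "method") is treated as a use.
function splitDefUses(captures: CaptureLike[]): DefUses {
  const result: DefUses = { defs: [], uses: [] };
  for (const capture of captures) {
    if (capture.name.endsWith("-def")) {
      result.defs.push(capture);
    } else {
      result.uses.push(capture);
    }
  }
  return result;
}

// Sample captures written by hand for illustration (not real parser output).
const sample: CaptureLike[] = [
  { name: "class-def", text: "Conflict" },
  { name: "fn-def", text: "parse" },
  { name: "var-def", text: "sections" },
  { name: "function", text: "getSections" },
  { name: "method", text: "getLines" },
];

const split = splitDefUses(sample);
console.log(split.defs.map((c) => c.text));
console.log(split.uses.map((c) => c.text));
```

This relies only on the naming convention of the captures (`@fn-def`, `@class-def`, `@field-def`, `@var-def` versus `@function`, `@method`), so the same helper would keep working if further patterns following that convention were added.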
I have noticed that for each language, the query is different. For TypeScript here, is it possible to have a simpler query to match all defs and uses? (Maybe it is not a good idea; I feel that writing such queries is tedious, but they are clear to read!)
I have noticed that for each language, the query is different
I saw somewhere in the issues the author's thoughts that in the future there may be work on standardization to make queries portable. For now, each language grammar defines its own node and field names, which is why different queries are required.
is it possible to have a simpler query to match all defs and uses?
It's better to define a series of small queries organized in a batch than to try to put everything into one big query. Tree-sitter's query engine executes all queries in a batch concurrently, and this also opens the possibility of combining small queries in different ways.
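The batching idea can be sketched by keeping each small pattern under its own key and joining only the ones needed into a single query source. The names `patterns` and `buildQuerySource` are hypothetical; the resulting string is what would be passed as the second argument to `new TreeSitter.Query(TypeScript, ...)` as used earlier in this thread.

```typescript
// Small, single-purpose patterns taken from the query earlier in the thread.
const patterns = {
  fnDef: "(method_definition name: (property_identifier) @fn-def)",
  classDef: "(class_declaration name: (type_identifier) @class-def)",
  fieldDef: "(public_field_definition name: (property_identifier) @field-def)",
  varDef: "(variable_declarator name: (identifier) @var-def)",
  callUse:
    "(call_expression function: [(identifier) @function " +
    "(member_expression property: (property_identifier) @method)])",
} as const;

type PatternName = keyof typeof patterns;

// Join the selected patterns into one query source, one pattern per line.
function buildQuerySource(names: PatternName[]): string {
  return names.map((name) => patterns[name]).join("\n");
}

// Two different batches built from the same small pieces.
const defsOnly = buildQuerySource(["fnDef", "classDef", "fieldDef", "varDef"]);
const everything = buildQuerySource(Object.keys(patterns) as PatternName[]);

console.log(defsOnly);
```

Keeping the patterns separate means a definitions-only query and a full def-use query share the same source of truth, and a pattern can be fixed or replaced without touching the others.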