SourceCodex
A very simple platform for searching calls/callers (and terms) in code projects
Features
- Searches for callers/callees and patterns
- Fast enough to use for incremental completion
- Simple to use (command line tool)
- Simple design (only two source files: a shell script and a CoffeeScript program)
- Supports JavaScript and CoffeeScript
- Easy to extend for additional languages
- Zero admin (command line option to monitor project changes means no service needed)
- Small number of dependencies (POSIX/cygwin, node.js, and several node.js projects)
USAGE
sourcecodex [ -h | -i PATTERN | -s PATTERN | -p PATTERN | -l LIM OFF | -t PATTERN | -v | -q | -n | -d | -r | -u | -m | -c | -j | --callers F | --calls F | --defs ] [DIR]
DIR is the directory containing the files to index.
If DIR is given, a new ".sourcecodex" database is created
in DIR if it does not exist
OPTIONS:
-h Print this message
-i PATTERN Ignore files matching pattern
-s PATTERN Search for token pattern in files and return matched lines in grep format
-p PATTERN Search for prefix token pattern in files and return matched lines in grep format
-l LIM OFF Limit the number of search results to LIM and discard the first OFF
-t PATTERN Find all tokens matching pattern
-v Inccrease verbosity
-q Turn off progress notifications
-n Notify of changes (implies -q): 'DELETE'|'CREATE'|'MODIFY' FILE
-d Delete the .sourcecodex database
-r Delete and rebuild database
-u Update and then exit
-m Update and continue monitoring for changes
-c Index CoffeeScript callers and callees
-j Index JavaScript callers and callees
--callers F List call sites of F: FILE:LINE:COL:FUNC:CONTENTS
--calls F List F's call sites: FILE:LINE:COL:FUNC:CONTENTS
--defs List defs: FILE:LINE:COL:FUNC:CONTENTS
Architecture
- built on SQLite
- allows background updating
- other programs can access/change the database
- extensible
- update/monitor command can output events like inotifywatch (CREATE/MODIFY/DELETE file)
- other programs can query/change the database
Schema
There are 5 main tables: files, lines, lines_terms, defs, and calls.
create table files (path primary key, stamp);
create virtual table lines using fts4 (path, line_number, line);
create virtual table lines_terms using fts4aux(lines);
create table defs(file, code, line_number, col);
create table calls(file, code, call, line_number, col);
Files tracks the indexed files.
Lines stores all of the lines of the tracked files and maintains a full-text index.
Lines_terms lets you access the internals of the index like the terms (tokens).
Defs tracks the definitions in the files.
Calls tracks the calls to the definitions in the files.
Todo
- index referenced variables
- add column to defs and calls to specify type of def or call (func or var)
- support full regexp search
- maybe just narrow down the lines with a pattern and then pump them through grep
- maybe based on Russ Cox’s work
- remove POSIX deps (add bat file for search command)
References
- Smalltalk: On the Smalltalk Browser (80s), http://onsmalltalk.com/on-the-smalltalk-browser
- Tern (sourcecodex uses Tern’s parser)
- Russ Cox’s work
- codesearch: https://code.google.com/p/codesearch/
- trigram analysis: http://swtch.com/~rsc/regexp/regexp4.html
- beagrep: http://baohaojun.github.io/beagrep.html
- beagrep looks very similar, but the author says it’s very difficult to install on windows
- some similar tools do not do callers/callees
- (opengrok, gnu global): https://github.com/OpenGrok/OpenGrok/wiki/Comparison-with-Similar-Tools
- Cscope: http://en.wikipedia.org/wiki/Cscope http://cscope.sourceforge.net/history.html
- Other code search tools: http://beyondgrep.com/more-tools/