Feature request: JSON/YML multi-pattern input and result output
LaurensBrinker opened this issue
I'm quite new to Weggli, but as far as I can tell, it currently does not support pattern input files or result output files, and each pattern check requires a separate execution of Weggli.
For context - I've been playing around with Semgrep, which allows you to specify a YAML file with multiple patterns to check, and can output the findings to a .json file for easy parsing. Keen to hear thoughts, but it would be nice if Weggli could support something like this:
Provide a patterns.yml file containing multiple patterns like this:
- id: double-free
  metadata:
    references:
      - https://cwe.mitre.org/data/definitions/415
      - https://github.com/struct/mms
      - https://www.sei.cmu.edu/downloads/sei-cert-c-coding-standard-2016-v01.pdf
      - https://docs.microsoft.com/en-us/cpp/sanitizers/asan-error-examples
      - https://dustri.org/b/playing-with-weggli.html
    confidence: MEDIUM
  message: >-
    The software calls free() twice on the same memory address,
    potentially leading to modification of unexpected memory locations.
  severity: ERROR
  languages:
    - c
    - cpp
  pattern: "{free($a); NOT: goto _; NOT: break; NOT: continue; NOT: $a = _; free($a);}"
  extra_args:
    - "--unique"
- id: uninit-pointers
  .....
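A rule file like this would probably need some basic checks before any pattern is run. The following is a minimal sketch of such a check, not part of Weggli: `validate_rules` and `REQUIRED_KEYS` are hypothetical names, the choice of `id` and `pattern` as the required keys is an assumption, and the rules are taken as already parsed into a list of dicts (for example with PyYAML's `yaml.safe_load`).

```python
from __future__ import annotations

from typing import Any

# Assumption: a rule is only usable if it has an id and a pattern.
REQUIRED_KEYS = ("id", "pattern")


def validate_rules(rules: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means the rules look usable."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for index, rule in enumerate(rules):
        label = rule.get("id", f"<rule #{index}>")
        for key in REQUIRED_KEYS:
            if not rule.get(key):
                problems.append(f"{label}: missing required key '{key}'")
        if "id" in rule:
            if rule["id"] in seen_ids:
                problems.append(f"{label}: duplicate id")
            seen_ids.add(rule["id"])
        extra_args = rule.get("extra_args", [])
        if not isinstance(extra_args, list) or not all(
            isinstance(arg, str) for arg in extra_args
        ):
            problems.append(f"{label}: extra_args must be a list of strings")
    return problems


# In-memory equivalent of the double-free rule above (metadata omitted).
EXAMPLE_RULES = [
    {
        "id": "double-free",
        "severity": "ERROR",
        "languages": ["c", "cpp"],
        "pattern": "{free($a); NOT: goto _; NOT: break; NOT: continue; NOT: $a = _; free($a);}",
        "extra_args": ["--unique"],
    }
]
```

Calling `validate_rules(EXAMPLE_RULES)` returns an empty list, while a rule without an `id` or a repeated `id` produces one message per problem.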
Run something like weggli --input /path/to/patterns.yml --output /path/to/results.json /path/to/codebase
Weggli would then run all patterns against the specified codebase (if possible) and generate, for example, a JSON output file which looks something like this:
{
  "errors": [],
  "results": [{
    "id": "double-free",
    "start": { "col": 10, "line": 42, "offset": 701 },
    "end": { "col": 25, "line": 42, "offset": 716 },
    "extra": {
      "fingerprint": "79965871385669e43",
      "is_ignored": false,
      "lines": "...\nint alloc_and_free2()\n{\n    char *ptr = (char *)malloc(MEMSIZE);\n    free(ptr);\n    ptr = NULL;\n    free(ptr);\n}\n....",
      "message": "The software calls free() twice on the same memory address, potentially leading to modification of unexpected memory locations.",
      "metadata": {
        "confidence": "MEDIUM",
        "references": [
          "https://cwe.mitre.org/data/definitions/415",
          "https://github.com/struct/mms",
          "https://www.sei.cmu.edu/downloads/sei-cert-c-coding-standard-2016-v01.pdf",
          "https://docs.microsoft.com/en-us/cpp/sanitizers/asan-error-examples",
          "https://dustri.org/b/playing-with-weggli.html"
        ]
      },
      "metavars": {},
      "severity": "ERROR"
    },
    "path": "test-data/sample_inputs/c-and-cpp/double-free.c"
  }]
}
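To illustrate the "easy parsing" side, here is a small sketch of how a script might consume a report shaped like the one above. The report layout is the proposed one from this issue, not something Weggli emits today, and `summarize` and `format_finding` are hypothetical helper names.

```python
import json
from collections import Counter


def summarize(report: dict) -> Counter:
    """Count findings per (severity, rule id) in a proposed-format report."""
    counts = Counter()
    for finding in report.get("results", []):
        severity = finding.get("extra", {}).get("severity", "UNKNOWN")
        counts[(severity, finding["id"])] += 1
    return counts


def format_finding(finding: dict) -> str:
    """Render one finding as 'path:line:col: [id] message'."""
    start = finding["start"]
    message = finding["extra"]["message"]
    return f'{finding["path"]}:{start["line"]}:{start["col"]}: [{finding["id"]}] {message}'


# Trimmed copy of the example report, used here instead of reading results.json.
SAMPLE_REPORT = json.loads("""
{
  "errors": [],
  "results": [{
    "id": "double-free",
    "start": { "col": 10, "line": 42, "offset": 701 },
    "end": { "col": 25, "line": 42, "offset": 716 },
    "extra": {
      "message": "The software calls free() twice on the same memory address.",
      "severity": "ERROR"
    },
    "path": "test-data/sample_inputs/c-and-cpp/double-free.c"
  }]
}
""")
```

With a real file, the report would be loaded with `json.load` on the path passed to the proposed `--output` option; the helpers themselves only depend on the dict structure.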
Again - I know that Weggli doesn't support this kind of behavior atm and that it has to be run once per pattern (afaik, specifying additional patterns with -p is an "AND", rather than an "OR"). I just wanted to see if this is something that has been considered already?
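Until something like this exists natively, the "OR" behaviour can be approximated from outside by starting one Weggli process per rule and merging the results. The sketch below assumes the usual `weggli [OPTIONS] <PATTERN> <PATH>` invocation and that a non-zero exit code means the pattern could not be run. `run_rules` and `run_weggli` are hypothetical names, and Weggli's text output is stored as-is instead of being parsed into start/end positions.

```python
from __future__ import annotations

import subprocess
from typing import Callable


def run_weggli(argv: list[str]) -> tuple[int, str, str]:
    """Run one weggli process and return (exit code, stdout, stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def run_rules(
    rules: list[dict],
    target: str,
    runner: Callable[[list[str]], tuple[int, str, str]] = run_weggli,
) -> dict:
    """Run every rule separately against target and merge the outcomes."""
    report: dict = {"errors": [], "results": []}
    for rule in rules:
        # One process per rule: rules are independent, so a hit on any
        # of them is reported (the "OR" behaviour asked for above).
        argv = ["weggli", *rule.get("extra_args", []), rule["pattern"], target]
        code, stdout, stderr = runner(argv)
        if code != 0:
            # Assumption: non-zero exit means the pattern failed to run.
            report["errors"].append({"id": rule["id"], "message": stderr.strip()})
            continue
        if stdout.strip():
            report["results"].append(
                {
                    "id": rule["id"],
                    "severity": rule.get("severity", "UNKNOWN"),
                    "message": rule.get("message", ""),
                    "raw_output": stdout,  # unparsed weggli text output
                }
            )
    return report
```

The `runner` parameter exists so the merging logic can be exercised without a Weggli binary; passing the returned dict to `json.dump` would give a file with the same top-level `errors`/`results` keys as the example report.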
I added this feature in my fork; maybe you want to try it.
weggli-enhance