SecretScrub
SecretScrub will clean away all your dirty hard-coded secrets. This script processes files in SARIF, a format generated by various static analysis tools. The tool scans a directory within the file system and writes a new copy of that directory in which every secret detected in the SARIF files has been redacted. Because secrets often reside in Git histories, the .git directory is omitted from the resultant copy.
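As an illustration of the copy step, the following is a minimal sketch of duplicating a source tree while leaving out the .git directory. It is not taken from secretscrub.py; the function name `copy_without_git` is hypothetical, and the real tool may perform the copy differently.

```python
import shutil
from pathlib import Path


def copy_without_git(srcdir: str, outdir: str) -> Path:
    """Copy srcdir to a new directory outdir, skipping anything named .git.

    Hypothetical helper for illustration; not part of SecretScrub.
    """
    # ignore_patterns builds a callable that copytree consults for every
    # directory it visits; entries named ".git" are left out of the copy.
    shutil.copytree(srcdir, outdir, ignore=shutil.ignore_patterns(".git"))
    return Path(outdir)
```

Note that `shutil.copytree` expects the destination not to exist yet, which matches the idea of producing a fresh redacted copy.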
Although SARIF is an industry-standard format, the SARIF output produced by different tools varies. As a result, only the following tools are currently supported.
Tool | Version | Comments |
---|---|---|
Trivy | 0.37.0 | |
Gitleaks | 8.15.2 | Only works with reports generated using the --no-git option to scan a file system but not the underlying Git repository if there is one. |
ccs | 1d055c542dbdb6e7b96279d4df03ea9b556eb27a | Ccs output must be pre-processed into SARIF format using the accompanying ccs2sarif.py script. |
cq | 011697a9e371e37a6ac9f714b3980672bc6108e7 | CQ output must be pre-processed into SARIF format using the accompanying cq2sarif.py script. Because CQ output may be very noisy, it is recommended to perform the redaction operation separately or to name the files such that the CQ file appears last in the directory. |
BinDetect | | A built-in tool that detects binary files that are typically missed by other, text-oriented tools. |
Due to the variations in SARIF output, it is possible that output generated by future versions of these tools will not be processed correctly.
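To make the shape of the input concrete, here is a minimal sketch that walks a SARIF 2.1.0 log and yields the tool name, rule ID, file URI and start line of each result. It relies only on standard SARIF property names (`runs`, `tool.driver.name`, `results`, `ruleId`, `locations`, `physicalLocation`); the `Finding` type and `iter_findings` function are hypothetical names, and this is not how SecretScrub itself parses reports. Every lookup uses a default because tools differ in which optional fields they populate.

```python
import json
from typing import Iterator, NamedTuple


class Finding(NamedTuple):
    """Hypothetical record for one reported location."""
    tool: str
    rule: str
    uri: str
    start_line: int


def iter_findings(sarif_text: str) -> Iterator[Finding]:
    """Yield one Finding per location in a SARIF log given as JSON text."""
    log = json.loads(sarif_text)
    for run in log.get("runs", []):
        # The producing tool is named under runs[].tool.driver.name.
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            rule = result.get("ruleId", "unknown")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "")
                # startLine is optional in SARIF; 0 marks "not reported" here.
                line = physical.get("region", {}).get("startLine", 0)
                yield Finding(tool, rule, uri, line)
```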
Dependencies
Dependencies are listed in the requirements.txt file. In short, the following packages are required:
Package | Version |
---|---|
asn1 | 2.7.0 |
filetype | 1.2.0 |
py7zr | 0.20.5 |
pyzipper | 0.3.6 |
regex | 2023.5.5 |
sarif-tools | 1.0.0 |
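If it is unclear whether an environment matches these pins, a small check along the following lines may help. This is a sketch using the standard library's `importlib.metadata`; the `PINNED` mapping simply restates the table above, and `find_mismatches` is a hypothetical helper, not something shipped with SecretScrub.

```python
from importlib import metadata

# Versions copied from the table above.
PINNED = {
    "asn1": "2.7.0",
    "filetype": "1.2.0",
    "py7zr": "0.20.5",
    "pyzipper": "0.3.6",
    "regex": "2023.5.5",
    "sarif-tools": "1.0.0",
}


def find_mismatches(pinned, lookup=metadata.version):
    """Return {package: installed_version_or_None} for every pin not met."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems
```

Calling `find_mismatches(PINNED)` returns an empty dictionary when every package is installed at the listed version.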
Usage
Command Line Usage
Using previously-generated tool output
$ python secretscrub.py --input <path> --srcdir <path> --outdir <path> [--report <path>]
Parameter | Definition |
---|---|
input | The location of the SARIF results that are to be processed. This may contain multiple SARIF files which may be generated using different supported tools. |
srcdir | The location of the original source code that was scanned to produce the SARIF results that are to be processed. |
outdir | The location where the redacted source files are to be stored. |
placeholder | The placeholder to insert in place of all detected secrets. This can accept the following substitutions: ${tool} (the name of the tool used to detect the secret); ${rule} (the name of the rule used to detect the secret); ${regex} (the regular expression associated with the rule used to detect the secret); ${yaml} (a YAML flow style structure containing, if known, only the names of the tool and the rule used to detect the secret); ${yaml_regex} (a YAML flow style structure containing, if known, the names of the tool and rule and the regular expression used to detect the secret). |
process-archives | A switch to indicate |
report | The location and name of a CSV report that is to be produced containing details of scrubbed secrets. |
report-encryption | If a report file is generated, the encryption method to use. Possible values: none, zip-aes256. Default: zip-aes256 |
log-level | The logging level used for the tool's output. Possible values: critical, fatal, error, warning, info, debug. Default: info |
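The ${...} substitutions accepted by the placeholder parameter use the same syntax as Python's `string.Template`. The sketch below shows how such a template could be expanded for ${tool}, ${rule} and ${regex}. It is an assumption about the mechanism, not SecretScrub's actual code; `render_placeholder` is a hypothetical name, and the ${yaml} and ${yaml_regex} forms are deliberately left out because their exact layout is not specified here.

```python
from string import Template


def render_placeholder(template: str, tool: str, rule: str, regex: str = "") -> str:
    """Expand ${tool}, ${rule} and ${regex} in a placeholder template.

    Hypothetical helper for illustration only.
    """
    values = {"tool": tool, "rule": rule, "regex": regex}
    # safe_substitute leaves unknown names such as ${yaml} untouched
    # instead of raising KeyError.
    return Template(template).safe_substitute(values)
```

For example, `render_placeholder("<REDACTED ${tool}/${rule}>", "example-tool", "example-rule")` gives `<REDACTED example-tool/example-rule>`. When such a template is passed on a POSIX shell command line, single quotes keep the shell from expanding ${tool} itself.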
Invoking the tools
$ python secretscrub.py --analyse-with <analysis-tool-list> --srcdir <path> --outdir <path> [--report <path>]
Parameter | Definition |
---|---|
analyse-with | A comma-separated list of tools to invoke. This may include any of the following: trivy, gitleaks, ccs, cq, bindetect |
srcdir | The location of the original source code that is to be scanned. |
outdir | The location where the redacted source files are to be stored. |
placeholder | The placeholder to insert in place of all detected secrets. This can accept the following substitutions: ${tool} (the name of the tool used to detect the secret); ${rule} (the name of the rule used to detect the secret); ${regex} (the regular expression associated with the rule used to detect the secret); ${yaml} (a YAML flow style structure containing, if known, only the names of the tool and the rule used to detect the secret); ${yaml_regex} (a YAML flow style structure containing, if known, the names of the tool and rule and the regular expression used to detect the secret). |
report | The location and name of a CSV report that is to be produced containing details of scrubbed secrets. |
report-encryption | If a report file is generated, the encryption method to use. Possible values: none, zip-aes256. Default: zip-aes256 |
log-level | The logging level used for the tool's output. Possible values: critical, fatal, error, warning, info, debug. Default: info |
NOTE: Each tool named in the list must be installed on the current system:
Tool | Comments |
---|---|
Trivy | The trivy command must be installed and accessible in the path. |
Gitleaks | The gitleaks command must be installed and accessible in the path. |
ccs | The ccs.py file must be located within a subdirectory named ccs within the directory containing the secretscrub.py file. |
cq | The cq.py and fn.py files must be located within a subdirectory named cq within the directory containing the secretscrub.py file. |
bindetect | This is currently included with secretscrub and no further installation is necessary. |
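The requirements in the table above can be checked before a run with a short script such as the following sketch. It uses `shutil.which` for the commands that must be on the path and plain file checks for the ccs and cq scripts. The helper name `missing_prerequisites` and the message wording are hypothetical, and `base_dir` is assumed to be the directory containing secretscrub.py.

```python
import shutil
from pathlib import Path

# Tools that must be reachable as commands on the PATH.
PATH_COMMANDS = {"trivy", "gitleaks"}

# Tools that must exist as files relative to the secretscrub.py directory.
REQUIRED_FILES = {
    "ccs": ["ccs/ccs.py"],
    "cq": ["cq/cq.py", "cq/fn.py"],
}


def missing_prerequisites(tools, base_dir, which=shutil.which):
    """Return a list of human-readable problems for the requested tools."""
    problems = []
    for tool in tools:
        if tool in PATH_COMMANDS and which(tool) is None:
            problems.append(f"{tool} (command not on PATH)")
        for rel in REQUIRED_FILES.get(tool, ()):
            if not (Path(base_dir) / rel).is_file():
                problems.append(f"{tool} (missing {rel})")
    # bindetect is built in, so it never produces an entry.
    return problems
```

An empty list means every requested tool appears to be in place.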
Docker Container
The Docker image built from this source tree includes the SecretScrub tool and copies of the Trivy, GitLeaks, ccs and cq tools.
If it has been built with a tag of secretscrub:latest, the following command will initiate a full analysis and redaction operation with an accompanying report.
sudo docker run -it -v <source-path>:/src:ro -v <output-path>:/out secretscrub:latest --analyse-with trivy,gitleaks,bindetect
Note that this involves mapping two volumes within the container. The first, <source-path>, should contain the original, unredacted source code. The second, <output-path>, will receive the redacted output and report.
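When the container is launched from a script, the same argument list can be assembled programmatically. The sketch below builds the command shown above as a Python list suitable for `subprocess.run`. The function name `docker_command` is hypothetical, and `sudo` is left out on the assumption that the caller adds it where the local Docker setup requires it. Paths are made absolute because Docker treats a relative name given to -v as a named volume rather than a host directory.

```python
from pathlib import Path


def docker_command(source_path, output_path, tools, image="secretscrub:latest"):
    """Build the docker run argument list for a full analysis and redaction.

    Hypothetical helper for illustration; it only assembles the arguments
    and does not start the container.
    """
    return [
        "docker", "run", "-it",
        # Read-only mount of the original, unredacted source.
        "-v", f"{Path(source_path).resolve()}:/src:ro",
        # Writable mount that receives the redacted output and report.
        "-v", f"{Path(output_path).resolve()}:/out",
        image,
        "--analyse-with", ",".join(tools),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from a terminal session.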
Suggested Workflow
The following examples assume that the source is stored within a subdirectory named src, and that SecretScrub is installed within a subdirectory named tools.
Using the Command Line
$ python tools/secretscrub.py --analyse-with trivy,gitleaks,bindetect --srcdir src --outdir src-redacted
Using the Docker Container
sudo docker run -it -v <source-path>:/src:ro -v <output-path>:/out secretscrub:latest --analyse-with trivy,gitleaks,bindetect