nccgroup / SecretScrub

SecretScrub

SecretScrub will clean away all your dirty hard-coded secrets. This script processes files in the SARIF format, which is generated by various static analysis tools. It scans a directory within the file system and writes a new copy of that directory in which every secret reported in the SARIF files has been redacted. Because secrets often reside in Git histories, the .git directory is omitted from the resulting copy.
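The copy step described above can be sketched with the Python standard library. This is only an illustration of the idea, not SecretScrub's actual code; the function name `copy_without_git` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_without_git(srcdir: str, outdir: str) -> Path:
    """Copy srcdir to a new directory outdir, leaving out anything named .git.

    Hypothetical helper for illustration; outdir must not exist yet.
    """
    # ignore_patterns() returns a callable that copytree consults for every
    # directory it visits, so nested .git directories are skipped as well.
    shutil.copytree(srcdir, outdir, ignore=shutil.ignore_patterns(".git"))
    return Path(outdir)
```

Redaction would then be applied to the files in the copy, so the original tree is never modified.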

Although SARIF is an industry-standard format, the SARIF output produced by different tools varies. As a result, only the following tools are currently supported.

| Tool | Version | Comments |
| --- | --- | --- |
| Trivy | 0.37.0 | |
| Gitleaks | 8.15.2 | Only works with reports generated using the --no-git option, which scans the file system rather than the underlying Git repository (if there is one). |
| ccs | 1d055c542dbdb6e7b96279d4df03ea9b556eb27a | Ccs output must be pre-processed into SARIF format using the accompanying ccs2sarif.py script. |
| cq | 011697a9e371e37a6ac9f714b3980672bc6108e7 | CQ output must be pre-processed into SARIF format using the accompanying cq2sarif.py script. Because CQ output may be very noisy, it is recommended to perform the redaction operation separately or to name the files such that the CQ file appears last in the directory. |
| BinDetect | | A built-in tool that detects binary files that are typically missed by other, text-oriented tools. |

Due to the variations in SARIF output, it is possible that output generated by future versions of these tools will not work correctly.
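To make the SARIF dependency concrete, the following sketch pulls the tool name, rule and file location out of a SARIF log. It relies only on property names defined by the SARIF 2.1.0 specification (`runs`, `tool.driver.name`, `results`, `ruleId`, `locations`, `physicalLocation`). The `Finding` type and `iter_findings` function are hypothetical and are not taken from SecretScrub, which may well read its input differently (for example through the sarif-tools package).

```python
import json
from typing import Iterator, NamedTuple, Optional


class Finding(NamedTuple):
    """One reported location (hypothetical type, for illustration only)."""
    tool: str
    rule: str
    uri: str
    start_line: Optional[int]


def iter_findings(sarif_text: str) -> Iterator[Finding]:
    """Yield one Finding per physical location in a SARIF log."""
    log = json.loads(sarif_text)
    for run in log.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        for result in run.get("results", []):
            rule = result.get("ruleId", "unknown")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri")
                if uri is None:
                    # Without a file reference there is nothing to redact.
                    continue
                line = physical.get("region", {}).get("startLine")
                yield Finding(tool, rule, uri, line)
```

Most of these properties are optional in the specification, which is one reason output from different tools, or from different versions of the same tool, can need different handling.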

Dependencies

Dependencies are listed in the requirements.txt file. In short, the following packages are required:

| Package | Version |
| --- | --- |
| asn1 | 2.7.0 |
| filetype | 1.2.0 |
| py7zr | 0.20.5 |
| pyzipper | 0.3.6 |
| regex | 2023.5.5 |
| sarif-tools | 1.0.0 |

Usage

Command Line Usage

Using previously-generated tool output

$ python secretscrub.py --input <path> --srcdir <path> --outdir <path> [--report <path>]
| Parameter | Definition |
| --- | --- |
| input | The location of the SARIF results that are to be processed. This may contain multiple SARIF files, which may have been generated by different supported tools. |
| srcdir | The location of the original source code that was scanned to produce the results that are to be processed. |
| outdir | The location where the redacted source files are to be stored. |
| placeholder | The placeholder to insert in place of all detected secrets. This can accept the following substitutions: |
| | ${tool}: The name of the tool used to detect the secret |
| | ${rule}: The name of the rule used to detect the secret |
| | ${regex}: The regular expression associated with the rule used to detect the secret |
| | ${yaml}: A YAML flow style structure containing (if known) only the names of the tool and the rule used to detect the secret |
| | ${yaml_regex}: A YAML flow style structure containing (if known) the names of the tool and rule and the regular expression used to detect the secret |
| process-archives | A switch to indicate |
| report | The location and name of a CSV report that is to be produced containing details of scrubbed secrets. |
| report-encryption | If a report file is generated, the encryption method to use. Possible values: none, zip-aes256. Default: zip-aes256 |
| log-level | The logging level used for the tool's output. Possible values: critical, fatal, error, warning, info, debug. Default: info |
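The placeholder substitutions use the same `${name}` syntax as Python's `string.Template`, so their effect can be sketched as follows. This is an assumption-laden illustration rather than SecretScrub's implementation: the function `render_placeholder` is hypothetical, and the exact layout of the YAML flow structures is a guess.

```python
import json
from string import Template


def render_placeholder(template: str, tool: str, rule: str, regex: str) -> str:
    """Expand ${tool}, ${rule}, ${regex}, ${yaml} and ${yaml_regex}.

    Hypothetical helper; the YAML layout below is assumed, not documented.
    """
    # JSON string literals are also valid YAML flow scalars, so json.dumps
    # is used to quote the values safely.
    q_tool, q_rule, q_regex = (json.dumps(v) for v in (tool, rule, regex))
    fields = {
        "tool": tool,
        "rule": rule,
        "regex": regex,
        "yaml": "{tool: %s, rule: %s}" % (q_tool, q_rule),
        "yaml_regex": "{tool: %s, rule: %s, regex: %s}" % (q_tool, q_rule, q_regex),
    }
    # safe_substitute leaves unknown ${...} names untouched instead of raising.
    return Template(template).safe_substitute(fields)
```

When a placeholder such as `REDACTED-${tool}` is passed on the command line from a POSIX shell, it should be single-quoted so that the shell does not try to expand `${tool}` itself.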

Invoking the tools

$ python secretscrub.py --analyse-with <analysis-tool-list> --srcdir <path> --outdir <path> [--report <path>]
| Parameter | Definition |
| --- | --- |
| analyse-with | A comma-separated list of tools to invoke. This may include any of the following: trivy, gitleaks, ccs, cq, bindetect |
| srcdir | The location of the original source code that is to be scanned. |
| outdir | The location where the redacted source files are to be stored. |
| placeholder | The placeholder to insert in place of all detected secrets. This can accept the following substitutions: |
| | ${tool}: The name of the tool used to detect the secret |
| | ${rule}: The name of the rule used to detect the secret |
| | ${regex}: The regular expression associated with the rule used to detect the secret |
| | ${yaml}: A YAML flow style structure containing (if known) only the names of the tool and the rule used to detect the secret |
| | ${yaml_regex}: A YAML flow style structure containing (if known) the names of the tool and rule and the regular expression used to detect the secret |
| report | The location and name of a CSV report that is to be produced containing details of scrubbed secrets. |
| report-encryption | If a report file is generated, the encryption method to use. Possible values: none, zip-aes256. Default: zip-aes256 |
| log-level | The logging level used for the tool's output. Possible values: critical, fatal, error, warning, info, debug. Default: info |
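As an illustration of how a comma-separated tool list can be validated, the sketch below wires a converter into `argparse`. The tool names come from the parameter description above; everything else (the `parse_tool_list` and `build_parser` names, and the tolerance of whitespace and letter case) is an assumption and may not match how SecretScrub itself parses the option.

```python
import argparse
from typing import List

SUPPORTED_TOOLS = ("trivy", "gitleaks", "ccs", "cq", "bindetect")


def parse_tool_list(value: str) -> List[str]:
    """Split a comma-separated tool list and reject unknown names."""
    tools: List[str] = []
    for item in value.split(","):
        name = item.strip().lower()
        if not name:
            continue  # tolerate stray commas such as "trivy,,cq"
        if name not in SUPPORTED_TOOLS:
            raise ValueError(f"unsupported analysis tool: {name!r}")
        if name not in tools:
            tools.append(name)  # keep first-seen order, drop duplicates
    return tools


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the --analyse-with option."""
    parser = argparse.ArgumentParser()
    # argparse reports a ValueError raised by the converter as a usage error.
    parser.add_argument("--analyse-with", type=parse_tool_list, default=[])
    return parser
```

For example, `build_parser().parse_args(["--analyse-with", "trivy,gitleaks,bindetect"]).analyse_with` yields the three tool names as a list.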

NOTE: In order to work, the tools must be present and installed on the current system:

| Tool | Comments |
| --- | --- |
| Trivy | The trivy command must be installed and accessible in the path. |
| Gitleaks | The gitleaks command must be installed and accessible in the path. |
| ccs | The ccs.py file must be located within a subdirectory named ccs within the directory containing the secretscrub.py file. |
| cq | The cq.py and fn.py files must be located within a subdirectory named cq within the directory containing the secretscrub.py file. |
| bindetect | This is currently included with SecretScrub and no further installation is necessary. |
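Before starting a run, it can be useful to confirm that the external commands listed above are actually reachable. The following pre-flight check is a sketch using `shutil.which` from the standard library; the helper names are hypothetical and the check is not part of SecretScrub.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Tools that must be reachable through PATH, mapped to their command names.
# ccs, cq and bindetect are located differently and are not checked here.
PATH_COMMANDS = {"trivy": "trivy", "gitleaks": "gitleaks"}


def locate_commands(tools: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each PATH-based tool to its resolved executable, or None."""
    return {
        tool: shutil.which(PATH_COMMANDS[tool])
        for tool in tools
        if tool in PATH_COMMANDS
    }


def missing_commands(tools: Iterable[str]) -> List[str]:
    """Return the PATH-based tools whose command could not be found."""
    return [tool for tool, path in locate_commands(tools).items() if path is None]
```

Calling `missing_commands(["trivy", "gitleaks", "bindetect"])` returns an empty list when both commands are installed, and otherwise names the ones that are missing.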

Docker Container

The Docker image built from this source tree includes the SecretScrub tool and copies of the Trivy, Gitleaks, ccs and cq tools.

If it has been built with a tag of secretscrub:latest, the following command will initiate a full analysis and redaction operation with an accompanying report.

sudo docker run -it -v <source-path>:/src:ro -v <output-path>:/out secretscrub:latest --analyse-with trivy,gitleaks,bindetect

Note that this involves mapping two volumes within the container. The first -- <source-path> -- should contain the original, unredacted source code. The second -- <output-path> -- will receive the redacted output and report.

Suggested Workflow

The following examples assume that the source is stored within a subdirectory named src, and that SecretScrub is installed within a subdirectory named tools.

Using the Command Line

$ python tools/secretscrub.py --analyse-with trivy,gitleaks,bindetect --srcdir src --outdir src-redacted

Using the Docker Container

sudo docker run -it -v <source-path>:/src:ro -v <output-path>:/out secretscrub:latest --analyse-with trivy,gitleaks,bindetect

About

License: GNU Affero General Public License v3.0

