Samsung / CredSweeper

CredSweeper is a tool to detect credentials in any directories or files. CredSweeper could help users to detect unwanted exposure of credentials (such as token, passwords, api keys etc.) in advance. By scanning lines, filtering, and using AI model as option, CredSweeper reports lines with possible credentials, where the line is, and expected type o

regular expression error

gy741 opened this issue · comments

Hello,

I am adding an AWS S3 Bucket detection rule.

But the rule doesn't work: no credential is reported for my test strings.

The regex itself seems to work. The log prints "Valid line for pattern" for each line, so the pattern appears to be matched normally.

What is the cause?

rule:

- name: AWS S3 Bucket
  severity: low
  type: pattern
  values:
   - (?P<value>[a-z0-9.-]+\.s3\.amazonaws\.com|[a-z0-9.-]+\.s3-website[.-](eu|ap|us|ca|sa|cn))
  filter_type: GeneralPattern
  use_ml: false
  validations: []

testcase:

$ cat strings.txt 
storage.example.com.s3.amazonaws.com
storage.example.com.s3-website.ap-south-1.amazonaws.com

log:

2021-11-12 06:38:37,682 | INFO | __main__ | Init CredSweeper object with arguments:Namespace(api_validation=False, jobs=None, json_filename=None, log='debug', ml_batch_size=16, ml_validation=False, path=['./strings.txt'], rule_path='/home/hack/CredSweeper/credsweeper/rules/config.yaml', skip_ignored=False)
2021-11-12 06:38:37,738 | INFO | __main__ | Run analyzer on path :['./strings.txt']
2021-11-12 06:38:37,739 | INFO | app | Start Scanner
2021-11-12 06:38:37,739 | DEBUG | app | List of file paths to scan:['./strings.txt']
2021-11-12 06:38:38,002 | DEBUG | app | Start scan file: ./strings.txt
2021-11-12 06:38:38,003 | DEBUG | scan_type | Valid line for pattern: regex.Regex('(?P<value>[a-z0-9.-]+\\.s3\\.amazonaws\\.com|[a-z0-9.-]+\\.s3-website[.-](eu|ap|us|ca|sa|cn))', flags=regex.V0) in file: ./strings.txt:1 in line: storage.example.com.s3.amazonaws.com
2021-11-12 06:38:38,003 | DEBUG | scan_type | Filtered line with filter: LineSpecificKeyCheck in file: ./strings.txt:1 in line: storage.example.com.s3.amazonaws.com
2021-11-12 06:38:38,003 | DEBUG | scan_type | Valid line for pattern: regex.Regex('(?P<value>[a-z0-9.-]+\\.s3\\.amazonaws\\.com|[a-z0-9.-]+\\.s3-website[.-](eu|ap|us|ca|sa|cn))', flags=regex.V0) in file: ./strings.txt:2 in line: storage.example.com.s3-website.ap-south-1.amazonaws.com
2021-11-12 06:38:38,003 | DEBUG | scan_type | Filtered line with filter: LineSpecificKeyCheck in file: ./strings.txt:2 in line: storage.example.com.s3-website.ap-south-1.amazonaws.com

Regex:

image

Ref: https://regexr.com/

Thanks

@gy741

Hello! Thank you for your continued contributions.

We have many filters to reduce false positives (FP). The LineSpecificKeyCheck filter checks whether the candidate line contains any of a list of specific values, and filters the candidate out if it does.

class LineSpecificKeyCheck(Filter):
    """Check that values from list below is not in candidate line"""
    NOT_ALLOWED = ["example", "enc\\(", "enc\\[", "true", "false"]
    NOT_ALLOWED_PATTERN = regex.compile(Util.get_regex_combine_or(NOT_ALLOWED), flags=regex.IGNORECASE)
    ...

I think your new rule will work well if you change the word example to another one.
Thank you.
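
To illustrate the behaviour described above, here is a minimal standalone sketch of the two steps seen in the log: the rule pattern matches, then the line is dropped because it contains "example". It is not CredSweeper's actual code. It uses the standard `re` module instead of `regex`, it assumes that `Util.get_regex_combine_or` simply joins the alternatives with `|`, and `scan_line` is a hypothetical helper written for this example.

```python
import re

# Values copied from the LineSpecificKeyCheck snippet above.
NOT_ALLOWED = ["example", "enc\\(", "enc\\[", "true", "false"]

# Assumption: Util.get_regex_combine_or joins the alternatives with "|".
NOT_ALLOWED_PATTERN = re.compile(
    "(?:" + "|".join(NOT_ALLOWED) + ")", flags=re.IGNORECASE
)

# Pattern copied from the rule in the issue.
RULE_PATTERN = re.compile(
    r"(?P<value>[a-z0-9.-]+\.s3\.amazonaws\.com"
    r"|[a-z0-9.-]+\.s3-website[.-](eu|ap|us|ca|sa|cn))"
)


def scan_line(line):
    """Hypothetical helper: return the matched value, or None if filtered."""
    match = RULE_PATTERN.search(line)
    if match is None:
        return None  # the rule pattern did not match at all
    if NOT_ALLOWED_PATTERN.search(line):
        return None  # matched, but the line contains a not-allowed word
    return match.group("value")


# The test string from the issue matches the rule but is filtered out.
print(scan_line("storage.example.com.s3.amazonaws.com"))
# The same host without the word "example" passes the filter.
print(scan_line("storage.mybucket.com.s3.amazonaws.com"))
```

Under these assumptions, the first call returns None and the second returns the full host name. That matches the log, where each "Valid line for pattern" entry is followed by "Filtered line with filter: LineSpecificKeyCheck".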