SigmaHQ / pySigma

Python library to parse and convert Sigma rules into queries (and whatever else you could imagine)

Bug: SigmaCollection.from_yaml error when parsing sigma collections

asabellico opened this issue

Steps to reproduce:

Use this rule:

- action: global
  detection:
    condition: test
    test:
      field: value
  title: Test
- action: repeat
  logsource:
    category: test-1
- action: repeat
  logsource:
    category: test-2

and run:

from sigma.collection import SigmaCollection

collection = SigmaCollection.from_yaml('''
- action: global
  detection:
    condition: test
    test:
      field: value
  title: Test
- action: repeat
  logsource:
    category: test-1
- action: repeat
  logsource:
    category: test-2
''')

which produces this error:

  File "/opt/venv/lib/python3.9/site-packages/sigma/collection.py", line 84, in from_yaml
    return cls.from_dicts(list(yaml.safe_load_all(yaml_str)), collect_errors, source)
  File "/opt/venv/lib/python3.9/site-packages/sigma/collection.py", line 49, in from_dicts
    action = rule.get("action")
AttributeError: 'list' object has no attribute 'get'

The YAML above was produced by dumping the following dictionaries:

rule_collection_dicts = [
        {
            "action": "global",
            "title": "Test",
            "detection": {
                "test": {
                    "field": "value"
                },
                "condition": "test",
            }
        },
        {
            "action": "repeat",
            "logsource": {
                "category": "test-1"
            },
        },
        {
            "action": "repeat",
            "logsource": {
                "category": "test-2"
            },
        },
]

which are correctly processed by:

collection = SigmaCollection.from_dicts(rule_collection_dicts)
print(collection)

SigmaCollection(rules=[SigmaRule(applied_processing_items=set(), title='Test', logsource=SigmaLogSource(category='test-1', product=None, service=None, source=None), detection=SigmaDetections(detections={'test': SigmaDetection(parent=None, detection_items=[SigmaDetectionItem(parent=None, applied_processing_items=set(), field='field', modifiers=[], value=[('value',)], value_linking=<class 'sigma.conditions.ConditionOR'>, source=None)], source=None, item_linking=<class 'sigma.conditions.ConditionAND'>)}, condition=['test'], source=None), id=None, status=None, description=None, references=[], tags=[], author=None, date=None, fields=[], falsepositives=[], level=None, errors=[], source=None, custom_attributes={'action': 'repeat'}), SigmaRule(applied_processing_items=set(), title='Test', logsource=SigmaLogSource(category='test-2', product=None, service=None, source=None), detection=SigmaDetections(detections={'test': SigmaDetection(parent=None, detection_items=[SigmaDetectionItem(parent=None, applied_processing_items=set(), field='field', modifiers=[], value=[('value',)], value_linking=<class 'sigma.conditions.ConditionOR'>, source=None)], source=None, item_linking=<class 'sigma.conditions.ConditionAND'>)}, condition=['test'], source=None), id=None, status=None, description=None, references=[], tags=[], author=None, date=None, fields=[], falsepositives=[], level=None, errors=[], source=None, custom_attributes={'action': 'repeat'})], errors=[])

A rule collection is created from a set of YAML documents located in the same file. In your example you are trying to parse a single document containing a list of YAML maps, which is something different.
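To make the distinction concrete, here is a minimal sketch using only the standard library. It assumes, based on the traceback above, that the loader hands `from_dicts` one Python object per YAML document and that each is expected to be a dict. `flatten_documents` is a hypothetical helper, not part of pySigma; it only illustrates how list-style input could be normalized before calling `SigmaCollection.from_dicts`.

```python
from typing import Any, Dict, List


def flatten_documents(documents: List[Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn loaded YAML documents into a flat list of dicts.

    A document that is a map is kept as-is; a document that is a list of maps
    is expanded in place. Anything else is rejected.
    """
    flat: List[Dict[str, Any]] = []
    for doc in documents:
        if isinstance(doc, dict):
            flat.append(doc)
        elif isinstance(doc, list) and all(isinstance(item, dict) for item in doc):
            flat.extend(doc)
        else:
            raise TypeError(f"unsupported YAML document type: {type(doc).__name__}")
    return flat


# What a YAML loader yields for the list-style input: ONE document that is a list.
list_style = [[
    {"action": "global", "title": "Test"},
    {"action": "repeat", "logsource": {"category": "test-1"}},
    {"action": "repeat", "logsource": {"category": "test-2"}},
]]

# Calling .get on that list document reproduces the AttributeError from the traceback.
try:
    list_style[0].get("action")
except AttributeError as exc:
    error_message = str(exc)

# After flattening, every entry is a dict and supports .get("action").
rule_dicts = flatten_documents(list_style)
actions = [rule.get("action") for rule in rule_dicts]
```

The flattened `rule_dicts` has the same shape as the `rule_collection_dicts` list shown earlier, which `from_dicts` handles correctly.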

This input should work:

action: global
detection:
  condition: test
  test:
    field: value
title: Test
--
action: repeat
logsource:
  category: test-1
--
action: repeat
logsource:
  category: test-2

Hello Thomas,

thanks for your reply!

OK, now we get to the real point. I initially used the format you mention, which is also referenced in the Sigma specification v2.0. So I tried:

collection = SigmaCollection.from_yaml('''
action: global
detection:
  condition: test
  test:
    field: value
title: Test
--
action: repeat
logsource:
  category: test-1
--
action: repeat
logsource:
  category: test-2
''')

but got this other error:

  File "/opt/venv/lib/python3.9/site-packages/sigma/collection.py", line 84, in from_yaml
    return cls.from_dicts(list(yaml.safe_load_all(yaml_str)), collect_errors, source)
  File "/opt/venv/lib/python3.9/site-packages/yaml/__init__.py", line 93, in load_all
    yield loader.get_data()
  File "/opt/venv/lib/python3.9/site-packages/yaml/constructor.py", line 45, in get_data
    return self.construct_document(self.get_node())
  File "/opt/venv/lib/python3.9/site-packages/yaml/composer.py", line 27, in get_node
    return self.compose_document()
  File "/opt/venv/lib/python3.9/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/opt/venv/lib/python3.9/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/opt/venv/lib/python3.9/site-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
  File "/opt/venv/lib/python3.9/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/opt/venv/lib/python3.9/site-packages/yaml/parser.py", line 428, in parse_block_mapping_key
    if self.check_token(KeyToken):
  File "/opt/venv/lib/python3.9/site-packages/yaml/scanner.py", line 115, in check_token
    while self.need_more_tokens():
  File "/opt/venv/lib/python3.9/site-packages/yaml/scanner.py", line 152, in need_more_tokens
    self.stale_possible_simple_keys()
  File "/opt/venv/lib/python3.9/site-packages/yaml/scanner.py", line 291, in stale_possible_simple_keys
    raise ScannerError("while scanning a simple key", key.mark,
yaml.scanner.ScannerError: while scanning a simple key
  in "<unicode string>", line 8, column 1:
    --
    ^
could not find expected ':'
  in "<unicode string>", line 9, column 1:
    action: repeat
    ^

so I thought that pySigma did not support this format and tried to YAML-dump a list of dictionaries that is passed as input to SigmaCollection.from_dicts() in the pySigma tests, wrongly guessing that this could be the format pySigma had chosen for Sigma collections.
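For reference, the difference between dumping a list as one document and dumping it as several documents can be sketched with PyYAML, which the traceback shows pySigma already uses. This is an illustration under the assumption that PyYAML is installed; the dictionaries are the ones from the first comment.

```python
import yaml  # PyYAML

rule_collection_dicts = [
    {
        "action": "global",
        "title": "Test",
        "detection": {"test": {"field": "value"}, "condition": "test"},
    },
    {"action": "repeat", "logsource": {"category": "test-1"}},
    {"action": "repeat", "logsource": {"category": "test-2"}},
]

# safe_dump_all writes each dict as its own YAML document, separated by '---'.
multi_doc_yaml = yaml.safe_dump_all(rule_collection_dicts)

# safe_dump writes ONE document that contains a list of maps.
single_doc_yaml = yaml.safe_dump(rule_collection_dicts)

# Loading both back shows three documents versus a single list document.
multi_docs = list(yaml.safe_load_all(multi_doc_yaml))
single_docs = list(yaml.safe_load_all(single_doc_yaml))
```

Only the output of `safe_dump_all` loads back as three separate documents. The output of `safe_dump` loads back as a single list, the same shape as the input that triggered the `AttributeError`.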

Sorry, I got it working using 3 dashes (---) instead of 2! :)

This works:

collection = SigmaCollection.from_yaml('''
action: global
detection:
  condition: test
  test:
    field: value
title: Test
---
action: repeat
logsource:
  category: test-1
---
action: repeat
logsource:
  category: test-2
''')
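For anyone hitting the same ScannerError, a quick pre-check can flag lines that consist of only two dashes before the text reaches the YAML parser. The sketch below is a heuristic with a hypothetical helper name (`find_two_dash_lines`), not something pySigma provides. It does not understand YAML, so a `--` line inside a block scalar would also be reported.

```python
from typing import List


def find_two_dash_lines(yaml_text: str) -> List[int]:
    """Hypothetical pre-check: 1-based numbers of lines that are exactly '--'."""
    return [
        number
        for number, line in enumerate(yaml_text.split("\n"), start=1)
        if line.rstrip() == "--"
    ]


# The input from the failing attempt; the string starts with a newline,
# so the first '--' sits on line 8, the line reported by the ScannerError.
broken = '''
action: global
detection:
  condition: test
  test:
    field: value
title: Test
--
action: repeat
logsource:
  category: test-1
--
action: repeat
logsource:
  category: test-2
'''

suspicious = find_two_dash_lines(broken)  # -> [8, 12]

# The working input uses '---', so nothing is flagged.
fixed = broken.replace("\n--\n", "\n---\n")
clean = find_two_dash_lines(fixed)  # -> []
```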

Hi,
I am working on the V2 specification.
Don't use the draft as an example; many things still need to be rewritten.