elastic / detection-rules

Home Page:https://www.elastic.co/guide/en/security/current/detection-engine-overview.html

[FR] Make RuleCollection Initialization Faster

eric-forte-elastic opened this issue

Summary

One of the largest contributors to unit test run time is the rule loader. A significant share of that time is spent adding and validating rules in the RuleCollection class's initialization function.

This issue proposes that, prior to any potential refactor of the rule loader, we make a minor update to the RuleCollection class so that rules are added via the init using multiple threads. Although this is a minor change, it should provide noticeably faster load times, and thus faster unit tests.
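A minimal sketch of the general idea, assuming each rule file can be loaded independently of the others. The names `read_rule_text` and `load_rules_threaded` are hypothetical and are not part of the detection_rules API; the stand-in loader only reads the file text, whereas the real loader also parses and validates each rule.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def read_rule_text(path: Path) -> str:
    """Hypothetical stand-in for the per-file load step: read the raw text only."""
    return path.read_text(encoding="utf-8")


def load_rules_threaded(
    paths: Iterable[Path],
    load_one: Callable[[Path], str] = read_rule_text,
    max_workers: Optional[int] = None,
) -> Dict[Path, str]:
    """Load every path with a thread pool and return {path: loaded value}."""
    path_list = sorted(paths)  # a fixed order keeps the result deterministic
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, whatever order the threads finish in
        loaded = list(pool.map(load_one, path_list))
    return dict(zip(path_list, loaded))
```

Any shared state that the per-file step touches would need to be thread-safe for this pattern to be correct, which is worth checking before applying it to a real loader.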

Update 12/20/23

Upon further experimentation, we discovered that simply multi-threading the loading of the rule files and/or the init of the RuleLoader can have unintended consequences. Unit test speed may improve depending on configuration (see the PR for more details), but a basic instantiation of the RuleLoader takes longer with multi-threading than without it. Given this, much of the execution time for loading the rules is expected to be I/O bound. As such, I recommend closing this issue and deferring specific optimizations until we make broader updates to, or refactor, the RuleLoader class.
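One way to check where the load time goes is to compare wall-clock time with CPU time for the same call. The sketch below is a generic diagnostic, not something taken from the PR, and `wall_and_cpu_seconds` is a hypothetical helper name. It relies only on the standard library's `time.perf_counter()` and `time.process_time()`.

```python
import time
from typing import Callable, Tuple


def wall_and_cpu_seconds(work: Callable[[], object]) -> Tuple[float, float]:
    """Run work() once and return (wall-clock seconds, process CPU seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    # Waiting (here, sleeping) consumes wall-clock time but almost no CPU time
    wall, cpu = wall_and_cpu_seconds(lambda: time.sleep(0.2))
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

Passing `RuleCollection.default` as `work` would give both figures for the rule load. A CPU time close to the wall-clock time would suggest the work is mostly computation, in which case CPython threads are unlikely to help because of the global interpreter lock. A CPU time well below the wall-clock time would suggest the process spends most of its time waiting, for example on I/O.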

Test Script

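import time

from detection_rules.rule_loader import RuleCollection

# perf_counter() is a monotonic clock intended for measuring elapsed time
start_time = time.perf_counter()

# Load and validate the default rule set; this is the step being timed
rules = RuleCollection.default()

end_time = time.perf_counter()
execution_time = end_time - start_time

print(f"Execution time: {execution_time} seconds")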

Timing Details

Base execution time, no multi-threading.

detection-rules on  multi_thread_rule_loader [?] is  v0.1.0 via  v3.8.18 (venv) on  eric.forte took 56s 
❯ python test_rule_loader.py
Execution time: 76.79581332206726 seconds

Multi-threading only the file loading, which leads to errors when loading some rules.

detection-rules on  multi_thread_rule_loader [!?] is  v0.1.0 via  v3.8.18 (venv) on  eric.forte 
❯ python test_rule_loader.py
Error loading rule in /home/forteea1/Code/clean_mains/detection-rules/rules/integrations/azure/defense_evasion_azure_service_principal_addition.toml
Error loading rule in /home/forteea1/Code/clean_mains/detection-rules/rules/integrations/google_workspace/collection_google_drive_ownership_transferred_via_google_workspace.toml
Error loading rule in /home/forteea1/Code/clean_mains/detection-rules/rules/integrations/google_workspace/initial_access_external_user_added_to_google_workspace_group.toml
Execution time: 236.1852207183838 seconds

Multi-threading only the init.

detection-rules on  multi_thread_rule_loader [!?] is  v0.1.0 via  v3.8.18 (venv) on  eric.forte 
❯ python test_rule_loader.py
Execution time: 133.23614525794983 seconds
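To put the three runs side by side, the short calculation below expresses each multi-threaded run as a multiple of the baseline. The numbers are copied from the output above; the labels are only descriptive names chosen for this example.

```python
# Execution times in seconds, copied from the three runs reported above
baseline = 76.79581332206726
timings = {
    "multi-threaded file loading": 236.1852207183838,
    "multi-threaded init": 133.23614525794983,
}

# Slowdown factor relative to the single-threaded baseline
slowdowns = {name: seconds / baseline for name, seconds in timings.items()}

for name, factor in slowdowns.items():
    print(f"{name}: {factor:.2f}x the baseline time")
```

On these single runs, multi-threading the file loading took roughly 3.1 times as long as the baseline (and failed on three rules), while multi-threading the init took roughly 1.7 times as long.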

This has been moved to the Foundational Prep Meta and put back on deck.

In effect, this may be a duplicate of #2609.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This has been closed due to inactivity. If you feel this is an error, please re-open and include a justifying comment.