GaelTadh / stix-shifter

This project consists of an open source library allowing software to connect to data repositories using STIX Patterning, and return results as STIX Observations.


STIX-SHIFTER

Introduction

What is STIX?

Structured Threat Information eXpression (STIX™) is a language and serialization format that organizations can use to exchange cyber threat intelligence (CTI). CTI is represented with objects and descriptive relationships and stored as JSON for machine readability.

STIX delivers a consistent and machine-readable way to enable collaborative threat analysis, automated threat exchange, automated detection and response, and more.

To learn more about STIX, see the following references:

What is STIX-SHIFTER?

STIX-shifter is an open source Python library that lets software connect to products that house data repositories by using STIX Patterning, and returns results as STIX Observations.

STIX-Shifter is the heart of the Universal Data Service that is provided as part of IBM Security Connect.

What is STIX Patterning? What are STIX Observations?

STIX 2 Patterning is a part of STIX that deals with the "matching things" part of STIX, which is an integral component of STIX Indicators.

This library takes in STIX 2 Patterns as input, and "finds" data that matches the patterns inside various products that house repositories of cybersecurity data. Examples of such products include SIEM systems, endpoint management systems, threat intelligence platforms, orchestration platforms, network control points, data lakes, and more.

In addition to "finding" the data by using these patterns, STIX-Shifter also transforms the output into STIX 2 Observations. Why do that? Put simply, so that all of the security data, regardless of the source, mostly looks and behaves the same.

As anyone with experience in data science will tell you, cleansing and normalizing data across domains is one of the largest hurdles to overcome when attempting to build cross-platform security analytics. This is one of the barriers we are attempting to break down with STIX-Shifter.

This sounds like Sigma, I already have that

Sigma and STIX Patterning have related goals but slightly different scopes. While Sigma seeks to be "for log files what Snort is for network traffic and YARA is for files", STIX Patterning's goal is to encompass all three fundamental security data source types (network, file, and log) and to do so simultaneously, allowing you to create complex queries and analytics that span domains. STIX-Shifter shares that goal. It is critical to be able to create search patterns that span SIEM, Endpoint, Network, and File levels in order to detect the complex patterns used in modern campaigns.

What is a STIX-SHIFTER adapter?

A STIX-shifter adapter is the bridge that connects IBM Security Connect to a data source. Developing a new adapter expands the set of data sources that STIX-shifter can support.

The combination of translation and transmission functions allows a single STIX pattern to generate a native query for each supported data source. Each query is run, and the results are translated back into STIX objects, allowing for a uniform presentation of data.

The objective is for all security data, regardless of the data source, to look and behave the same.

Why would I want to use this?

You might want to use this library and contribute to its development if any of the following are true:

  • You are a vendor or project owner who wants to add some form of query or enrichment functions to your product capabilities
  • You are an end user and want to have a way to script searches and/or queries as part of your orchestration flow
  • You are a vendor or project owner who has data that can be made available, and you want to contribute an adapter
  • You just want to help make the world a safer place!

How to use

Stix-shifter handles two primary functions:

  1. Translation: Stix-shifter translates STIX patterns into data source queries (in whatever query language the data source uses), and translates data source results into bundled STIX observation objects (serialized as JSON).
  2. Transmission: Stix-shifter passes authentication credentials to a data source so that it can ping the data source, run a query, or fetch the query status and results.

Python 3.6 is required to use stix-shifter.

Translation

Translate a STIX 2 pattern to a native data source query

INPUT: STIX 2 pattern
# STIX Pattern:
"[url:value = 'http://www.testaddress.com'] OR [ipv4-addr:value = '192.168.122.84']"
OUTPUT: Native data source query
# Translated Query:
"SELECT * FROM tableName WHERE (Url = 'http://www.testaddress.com')
OR
((SourceIpV4 = '192.168.122.84' OR DestinationIpV4 = '192.168.122.84'))"
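
The field mapping behind this kind of translation can be pictured with a small sketch. This is not stix-shifter's implementation: `FIELD_MAP`, `comparison_to_sql`, and `translate` are hypothetical names, and the mapping is only inferred from the example above. The point it illustrates is that a STIX path that maps to several native columns expands into an OR group.

```python
# Hypothetical mapping from STIX object paths to native column names,
# inferred from the example above (not taken from a real stix-shifter module).
FIELD_MAP = {
    "url:value": ["Url"],
    "ipv4-addr:value": ["SourceIpV4", "DestinationIpV4"],
}


def comparison_to_sql(stix_path, value):
    """Turn one STIX comparison (path = 'value') into a native WHERE fragment."""
    columns = FIELD_MAP[stix_path]
    parts = ["{} = '{}'".format(column, value) for column in columns]
    fragment = " OR ".join(parts)
    # Several candidate columns are grouped so the OR stays self-contained.
    return "({})".format(fragment) if len(parts) > 1 else fragment


def translate(observations, table="tableName"):
    """Join observation expressions, given as (path, value) pairs, with OR."""
    clauses = [
        "({})".format(comparison_to_sql(path, value))
        for path, value in observations
    ]
    return "SELECT * FROM {} WHERE {}".format(table, " OR ".join(clauses))


query = translate([
    ("url:value", "http://www.testaddress.com"),
    ("ipv4-addr:value", "192.168.122.84"),
])
print(query)
```

The sketch does no value escaping and handles only the `=` operator; a real translator has to cover the full pattern grammar and the quoting rules of the target query language.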

Translate a JSON data source query result to a STIX bundle of observable objects

INPUT: JSON data source query result
# Datasource results:
[
    {
        "SourcePort": 567,
        "DestinationPort": 102,
        "SourceIpV4": "192.168.122.84",
        "DestinationIpV4": "127.0.0.1",
        "Url": "www.testaddress.com"
    }
]
OUTPUT: STIX bundle of observable objects
# STIX Observables
{
    "type": "bundle",
    "id": "bundle--2042a6e9-7f34-4a03-a745-502e358594c3",
    "objects": [
        {
            "type": "identity",
            "id": "identity--3532c56d-ea72-48be-a2ad-1a53f4c9c6d8",
            "name": "YourDataSource",
            "identity_class": "events"
        },
        {
            "id": "observed-data--cf2c58dc-200e-49e0-b6f7-e1997cccf707",
            "type": "observed-data",
            "created_by_ref": "identity--3532c56d-ea72-48be-a2ad-1a53f4c9c6d8",
            "objects": {
                "0": {
                    "type": "network-traffic",
                    "src_port": 567,
                    "dst_port": 102,
                    "src_ref": "1",
                    "dst_ref": "2"
                },
                "1": {
                    "type": "ipv4-addr",
                    "value": "192.168.122.84"
                },
                "2": {
                    "type": "ipv4-addr",
                    "value": "127.0.0.1"
                },
                "3": {
                    "type": "url",
                    "value": "www.testaddress.com"
                }
            }
        }
    ]
}
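
The shape of this output can be reproduced with a short sketch. It is an illustration under assumptions, not the library's code: `row_to_observed_data` and `results_to_bundle` are hypothetical helpers, and the field names are the ones from the example result above. It shows how each result row becomes one observed-data object whose numbered entries reference each other by key.

```python
import json
import uuid


def row_to_observed_data(row, identity_id):
    """Build one observed-data object from a result row (hypothetical helper)."""
    objects = {
        # The network-traffic object points at the two address objects by key.
        "0": {
            "type": "network-traffic",
            "src_port": row["SourcePort"],
            "dst_port": row["DestinationPort"],
            "src_ref": "1",
            "dst_ref": "2",
        },
        "1": {"type": "ipv4-addr", "value": row["SourceIpV4"]},
        "2": {"type": "ipv4-addr", "value": row["DestinationIpV4"]},
        "3": {"type": "url", "value": row["Url"]},
    }
    return {
        "id": "observed-data--" + str(uuid.uuid4()),
        "type": "observed-data",
        "created_by_ref": identity_id,
        "objects": objects,
    }


def results_to_bundle(rows, identity):
    """Wrap the identity and one observed-data object per row in a bundle."""
    observed = [row_to_observed_data(row, identity["id"]) for row in rows]
    return {
        "type": "bundle",
        "id": "bundle--" + str(uuid.uuid4()),
        "objects": [identity] + observed,
    }


identity = {
    "type": "identity",
    "id": "identity--3532c56d-ea72-48be-a2ad-1a53f4c9c6d8",
    "name": "YourDataSource",
    "identity_class": "events",
}
rows = [{
    "SourcePort": 567,
    "DestinationPort": 102,
    "SourceIpV4": "192.168.122.84",
    "DestinationIpV4": "127.0.0.1",
    "Url": "www.testaddress.com",
}]
bundle = results_to_bundle(rows, identity)
print(json.dumps(bundle, indent=4))
```

The bundle and observed-data identifiers are random UUIDs, so they differ on every run.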

CLI help message for translation

usage: main.py translate [-h] [-x] [-m DATA_MAPPER]
                         {qradar,dummy,car,cim,splunk,elastic,bigfix,csa,csa:at,csa:nf,aws_security_hub,carbonblack}
                         {results,query} data_source data [options]

positional arguments:
  {qradar,dummy,car,cim,splunk,elastic,bigfix,csa,csa:at,csa:nf,aws_security_hub,carbonblack}
                        The translation module to use
  {results,query}       The translation action to perform
  data_source           STIX identity object representing a datasource
  data                  The STIX pattern or JSON results to be translated
  options               Options dictionary

optional arguments:
  -h, --help            show this help message and exit
  -x, --stix-validator  Run the STIX 2 validator against the translated results
  -m DATA_MAPPER, --data-mapper DATA_MAPPER
                        optional module to use for Splunk or Elastic STIX-to-query mapping

Translation is called with the following ordered parameters

<data source (i.e. "qradar")> <"query" or "results"> <{} or STIX identity object> <STIX pattern or data source results> <options>

Data source: This is the name of the module used for translation.

Query or Results: This argument controls whether stix-shifter translates a STIX pattern into a data source query, or translates data source results into a STIX bundle of observation objects.

STIX Identity object: An Identity object is used by stix-shifter to represent a data source and is inserted at the top of a returned observation bundle. Each observation in the bundle references this identity. This parameter is only needed when converting data source results to a STIX bundle. When converting a STIX pattern to a query, pass in an empty hash.

STIX Pattern or data source results: The input getting translated by stix-shifter.

Options: Options arguments come in as:

  • "select_fields": string array of fields in the data source select statement
  • "mapping": mapping hash for either STIX pattern to data source or data results to STIX observation objects
  • "result_limit": integer that limits the number of results in the data source query
  • "time_range": time window (e.g. last 5 minutes) used in the data source query when START STOP qualifiers are absent
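
As a rough illustration, the options can be assembled in Python and serialized for the command line. Only the key names below come from the list above. The values are made up, and two things are assumptions: that the options argument is passed as a JSON string like the identity object, and that "time_range" is a number of minutes.

```python
import json

# Hypothetical option values; only the key names come from the list above.
options = {
    "select_fields": ["sourceip", "destinationip", "username"],
    "result_limit": 500,
    # Assumed here to be a number of minutes; check the module you use.
    "time_range": 5,
}

pattern = "[ipv4-addr:value = '192.168.122.84']"

# Ordered parameters: module, action, identity object, input, options.
command = [
    "python", "main.py", "translate", "qradar", "query",
    "{}",                 # empty identity object when translating a pattern
    pattern,
    json.dumps(options),  # serialized so the shell passes a single argument
]
print(command)
```

Building the argument list in code avoids hand-quoting nested JSON in the shell; the list can be handed to `subprocess.run` as-is.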

Example of converting a STIX pattern to an (AQL) query

Running the following:

python main.py translate qradar query \
'{}' \
"[network-traffic:src_port = 37020 AND network-traffic:dst_port = 635] START t'2016-06-01T00:00:00.123Z' STOP t'2016-06-01T01:11:11.123Z'"

Will return:

{
  "queries": [
    "SELECT QIDNAME(qid) as qidname, qid as qid, CATEGORYNAME(category) as categoryname, category as categoryid, CATEGORYNAME(highlevelcategory) as high_level_category_name, highlevelcategory as high_level_category_id, logsourceid as logsourceid, LOGSOURCETYPENAME(logsourceid) as logsourcename, starttime as starttime, endtime as endtime, devicetime as devicetime, sourceaddress as sourceip, sourceport as sourceport, sourcemac as sourcemac, destinationaddress as destinationip, destinationport as destinationport, destinationmac as destinationmac, username as username, eventdirection as direction, identityip as identityip, identityhostname as identity_host_name, eventcount as eventcount, PROTOCOLNAME(protocolid) as protocol, BASE64(payload) as payload, URL as url, magnitude as magnitude, Filename as filename, URL as domainname FROM events WHERE destinationport = '635' AND sourceport = '37020' limit 10000 START 1464739200123 STOP 1464743471123"
  ],
  "parsed_stix": [
    {
      "attribute": "network-traffic:dst_port",
      "comparison_operator": "=",
      "value": 635
    },
    {
      "attribute": "network-traffic:src_port",
      "comparison_operator": "=",
      "value": 37020
    }
  ]
}
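
The START and STOP values in the generated query are the pattern's timestamps expressed as milliseconds since the Unix epoch. The conversion can be checked with a few lines of standard-library Python; `stix_timestamp_to_epoch_ms` is a hypothetical helper written for this check, not a stix-shifter function.

```python
from datetime import datetime, timezone


def stix_timestamp_to_epoch_ms(timestamp):
    """Convert a UTC timestamp such as 2016-06-01T00:00:00.123Z to epoch milliseconds."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    delta = parsed - datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer arithmetic avoids the rounding that float timestamps can introduce.
    return (delta.days * 86400 + delta.seconds) * 1000 + delta.microseconds // 1000


start = stix_timestamp_to_epoch_ms("2016-06-01T00:00:00.123Z")
stop = stix_timestamp_to_epoch_ms("2016-06-01T01:11:11.123Z")
print(start, stop)  # 1464739200123 1464743471123
```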

Example of converting (QRadar) data to a STIX bundle

Running the following:

python main.py translate qradar results \
'{"type": "identity", "id": "identity--3532c56d-ea72-48be-a2ad-1a53f4c9c6d3", "name": "QRadar", "identity_class": "events"}' \
'[{"sourceip": "192.0.2.0", "filename": "someFile.exe", "sourceport": "0123", "username": "root"}]'

Will return:

{
    "type": "bundle",
    "id": "bundle--db4e0730-c5e3-4b72-9339-87ed7b1cf415",
    "objects": [
        {
            "type": "identity",
            "id": "identity--3532c56d-ea72-48be-a2ad-1a53f4c9c6d3",
            "name": "QRadar",
            "identity_class": "events"
        },
        {
            "id": "observed-data--4eec7558-2016-464a-9ab7-5f7e263f2942",
            "type": "observed-data",
            "created_by_ref": "identity--3532c56d-ea72-48be-a2ad-1a53f4c9c6d3",
            "objects": {
                "0": {
                    "type": "ipv4-addr",
                    "value": "192.0.2.0"
                },
                "1": {
                    "type": "network-traffic",
                    "src_ref": "0",
                    "src_port": "0123"
                },
                "2": {
                    "type": "file",
                    "name": "someFile.exe"
                },
                "3": {
                    "type": "user-account",
                    "user_id": "root"
                }
            }
        }
    ]
}

Transmission

With the transmission module, you can connect to any products that house repositories of cybersecurity data.

The module uses the data source APIs to:

  • Ping the data source
  • Send queries in the native language of the data source
  • Fetch query status (if supported by the APIs)
  • Fetch query results
  • Delete the query (if supported by the APIs)

CLI help message for transmission

usage: main.py transmit [-h]
                        {async_dummy,synchronous_dummy,qradar,splunk,bigfix,csa,aws_security_hub,carbonblack}
                        connection configuration
                        {ping,query,results,status,delete,is_async} ...

positional arguments:
  {async_dummy,synchronous_dummy,qradar,splunk,bigfix,csa,aws_security_hub,carbonblack}
                        Choose which connection module to use
  connection            Data source connection with host, port, and
                        certificate
  configuration         Data source authentication

optional arguments:
  -h, --help            show this help message and exit

operation:
  {ping,query,results,status,delete,is_async}
    ping                Pings the data source
    query               Executes a query on the data source
    results             Fetches the results of the data source query
    status              Gets the current status of the query
    delete              Delete a running query on the data source
    is_async            Checks if the query operation is asynchronous

Transmission is called with the following ordered parameters

<Data Source (i.e. "qradar")> <Connection Params: '{"host":"host ip address", "port":"port number", "cert":"certificate"}'> <'{"auth": {authentication}}'> <Transmission Operation: ping, query, status, results, delete or is_async> <Operation input>

Data source: This is the name of the module used for transmission.

Connection Parameters: Data source IP, port, and certificate

Data source authentication: Authentication hash

Transmission Operation: The transmission function being called. Transmission functions include:

  • Ping: Ping the data source.
  • Query: Execute a query on the data source. The input is the native data source query (after it has been translated from the STIX pattern).
  • Status: Check the status of the executed query on an asynchronous data source. The input is the query UUID.
  • Results: Fetch the results from the query. The input is the query UUID, offset, and length.
  • Delete: Delete a running query on the data source (if supported by the APIs).
  • Is Async: Returns a boolean indicating whether the data source is asynchronous.
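
The way these operations fit together for an asynchronous data source can be sketched as follows. Everything here is hypothetical: `FakeTransmission` is an in-memory stand-in rather than a stix-shifter class, and the "RUNNING" and "COMPLETED" status strings are placeholders for whatever a given data source reports. The sketch only shows the order of calls: query, poll status until done, then fetch a page of results by offset and length.

```python
import time


class FakeTransmission:
    """In-memory stand-in for a data source connection (hypothetical)."""

    def __init__(self, rows, polls_until_done=2):
        self._rows = rows
        self._polls_left = polls_until_done

    def is_async(self):
        return True

    def query(self, native_query):
        # A real data source would start a search and return its identifier.
        return "uuid-12345"

    def status(self, search_id):
        self._polls_left -= 1
        return "COMPLETED" if self._polls_left <= 0 else "RUNNING"

    def results(self, search_id, offset, length):
        return self._rows[offset:offset + length]


def run_search(transmission, native_query, offset=0, length=100, poll_interval=0.0):
    """Submit a query, wait for completion if needed, then fetch one page."""
    search_id = transmission.query(native_query)
    if transmission.is_async():
        while transmission.status(search_id) != "COMPLETED":
            time.sleep(poll_interval)
    return transmission.results(search_id, offset, length)


rows = [{"sourceip": "192.0.2.0"}, {"sourceip": "192.0.2.1"}, {"sourceip": "192.0.2.2"}]
page = run_search(FakeTransmission(rows), "select * from events limit 100", offset=1, length=2)
print(page)
```

A production loop would also need a timeout and handling for failed or cancelled searches, which are left out to keep the call order visible.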

Examples of using transmission from the CLI to connect to a (QRadar) data source.

Ping
python main.py transmit qradar '{"host":"<ip address>", "port":"<port>", "cert":"-----BEGIN CERTIFICATE-----\ncErTificateGoesHere=\n-----END CERTIFICATE-----"}' '{"auth": {"SEC":"1234..sec..uid..5678"}}' ping
Query
python main.py transmit qradar '{"host":"<ip address>", "port":"<port>", "cert":"-----BEGIN CERTIFICATE-----\ncErTificateGoesHere=\n-----END CERTIFICATE-----"}' '{"auth": {"SEC":"1234..sec..uid..5678"}}' query "select * from events limit 100"
Status
python main.py transmit qradar '{"host":"<ip address>", "port":"<port>", "cert":"-----BEGIN CERTIFICATE-----\ncErTificateGoesHere=\n-----END CERTIFICATE-----"}' '{"auth": {"SEC":"1234..sec..uid..5678"}}' status "uuid-12345"
Results
python main.py transmit qradar '{"host":"<ip address>", "port":"<port>", "cert":"-----BEGIN CERTIFICATE-----\ncErTificateGoesHere=\n-----END CERTIFICATE-----"}' '{"auth": {"SEC":"1234..sec..uid..5678"}}' results "uuid-12345" <offset> <length>
Is Async
python main.py transmit qradar '{"host":"<ip address>", "port":"<port>", "cert":"-----BEGIN CERTIFICATE-----\ncErTificateGoesHere=\n-----END CERTIFICATE-----"}' '{"auth": {"SEC":"1234..sec..uid..5678"}}' is_async
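
Because the connection and authentication arguments are JSON strings, quoting them by hand is error-prone. One possible approach is to build the argument list in Python and let `json.dumps` do the serialization. `build_transmit_command` is a hypothetical helper, the placeholder values are the ones used in the examples above, and the offset and length of 0 and 100 are arbitrary.

```python
import json
import shlex


def build_transmit_command(module, connection, configuration, operation, *operation_input):
    """Assemble the ordered transmit arguments as a list suitable for subprocess."""
    return [
        "python", "main.py", "transmit", module,
        json.dumps(connection),
        json.dumps(configuration),
        operation,
    ] + [str(item) for item in operation_input]


connection = {
    "host": "<ip address>",
    "port": "<port>",
    "cert": "-----BEGIN CERTIFICATE-----\ncErTificateGoesHere=\n-----END CERTIFICATE-----",
}
configuration = {"auth": {"SEC": "1234..sec..uid..5678"}}

ping = build_transmit_command("qradar", connection, configuration, "ping")
results = build_transmit_command(
    "qradar", connection, configuration, "results", "uuid-12345", 0, 100
)

# shlex.quote shows how the same command would be typed safely in a shell.
print(" ".join(shlex.quote(arg) for arg in results))
```

Passing the list directly to `subprocess.run` skips the shell entirely, so no quoting is needed at all.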

Glossary

Modules: Folders in the stix-shifter project that contain code specific to a data source.

STIX 2 patterns: Expressions that represent Cyber Observable objects within a STIX Indicator STIX Domain Object (SDO). They are helpful for modeling intelligence that indicates cyber activity.

STIX 2 objects: JSON objects that contain CTI data. In STIX, these objects are referred to as Cyber Observable Objects.

Data sources: Security products that house data repositories.

Data source queries: Queries written in the data source's native query language.

Data source query results: Data returned from a data source query.

Architecture Context

STIX SHIFTER CLASS DIAGRAM

Contributing

We are thrilled you are considering contributing! We welcome all contributors.

Please read our guidelines for contributing.

Guide for creating new adapters

If you want to create a new adapter for STIX-shifter, see the developer guide.

Licensing

©️ Copyright IBM Corp. 2018

All code contained within this project repository or any subdirectories is licensed according to the terms of the Apache v2.0 license, which can be viewed in the file LICENSE.

Open Source at IBM

Find more open source projects on the IBM GitHub Page
