Parse-and-Enrich - finding indicators in doc(x), xlsx(x), pdf, txt, and csv

This script searches for indicators in doc(x), xls(x), pdf, txt, and csv files. It also enriches IP addresses with additonal data from ipinfo.io.

Default indicators that it looks for:

URL's
E-mail addresses
mobile phone numbers
IP addresses
MD5 hashes
SHA1 hashes
SHA256 hashes
custom indicators

It outputs them to a csv file:

  $ Parse-and-Enrich.py -i Input/*.csv

  2022-08-11_132728_results.csv

  | Regex result  | Count | Type      | Found in file(s) | City          | Country | Organization       | Full | Error
  |---------------|-------|-----------|------------------|---------------|---------|--------------------|------|------
  | 8.8.8.8       | 4     | ipaddress | ['file1.txt']    | Mountain View | US      | AS15169 Google LLC | ...  |
  | j@mail.com    | 1     | email     | ['random.csv']   |               |         |                    |      |

How to enrich Office365 UAL logs (or other CSV's).

The script will take Office 365 UAL logs in the form of CSV files and enrich IP addresses with data from ipinfo.io.

Input (AuditRecords.csv):

timestamp, user, ip
2022-08-11 13:05:01, user1@company.nl, 8.8.8.8

Command:

  $ Parse-and-Enrich.py -i Input/AuditRecords.csv -csv_e

Output (AuditRecords.csv_enriched.csv):

timestamp, user, ip, ip_info
2022-08-11 13:05:01, user1@company.nl, 8.8.8.8, {"ip": "8.8.8.8", "hostname": "dns.google", "anycast": "True", "city": "Mountain View", "region": "California", "country": "US", "loc": "37.4056,-122.0775", "org": "AS15169 Google LLC", "postal": "94043", "timezone": "America/Los_Angeles", "country_name": "United States", "latitude": "37.4056", "longitude": "-122.0775"}

Requirements

If you would like to use the enrich feature, you need to put the API key (https://ipinfo.io/) in the file 'ip_info.key'. The file must be located in the same folder as the script.

Limitations

If you provide multiple csv files specified with the '-i' parameter (for example: -i input/*csv), they must be of the same encoding type. You can specify the encoding type that the csv's have: -csv_c UTF8 (see all encoding types: https://docs.python.org/3/library/codecs.html#standard-encodings)

DfirJos / Parse-and-Enrich

Parse-and-Enrich - finding indicators in doc(x), xlsx(x), pdf, txt, and csv

How to enrich Office365 UAL logs (or other CSV's).

Requirements

Limitations

About

Languages