9b / boxcar

Process the Fortune 1000 domains to identify live typo-sites. Save results into a database for later processing and analysis.

Home Page:http://www.riskiq.com/

Boxcar Processor

Boxcar Processor is a simple system that takes the Fortune 1000 company domains and puts them through a typo-permutation engine. The resulting permutations are stored locally to save time and are then resolved to identify which domains are active and which are not. Results are saved in a local MongoDB instance that can later be used to guide further processing or to generate an output report.

Purpose

The Fortune 1000 are prime targets for phishing attacks and brand-infringement events. Boxcar is meant to be a first pass over these companies' primary domains to determine whether any permutation of the domain is active and online. This data can be useful not only to the companies themselves, but also to security companies looking to understand how abuse can begin.

Usage

To run Boxcar, make sure you have a local MongoDB instance listening on port 27017, then install the requirements:

pip install -r requirements.txt

Then kick off the actual worker process (this will run for a while):

python run.py

Output is set to DEBUG by default and will let you know what's happening as data is being processed. Various configuration options exist at the top of the run.py file and can be adjusted to meet your needs.
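Before starting a long run, it can help to confirm that something is actually listening on the MongoDB port. The following is a minimal sketch using only the standard library; the function name is_port_open is hypothetical and is not part of Boxcar. It only checks that a TCP connection succeeds, not that the listener is really MongoDB.

```python
import socket


def is_port_open(host="127.0.0.1", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper, not part of Boxcar. A successful connection
    only shows that some service is listening on the port.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on 127.0.0.1:27017")
    else:
        print("Nothing is listening on 127.0.0.1:27017; start MongoDB first")
```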

Data

Records obtained from Boxcar are stored in MongoDB. The tool relies on two primary collections: perms and resolves. The perms collection stores the permutations generated by the misspelling library so they do not need to be regenerated on every run. The resolves collection stores the result of each resolution attempt.

A sample perm record:

{
    "perms" : [
        "gre.com", "ee.com", "gwe.com", "he.com", "gw.com", "fge.com",
    ],
    "seed" : "ge.com"
}
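To illustrate how a record of this shape could be produced, here is a rough sketch of a typo-permutation generator. It is not the misspelling library Boxcar uses. The functions typo_permutations and build_perm_record and the ADJACENT map are hypothetical, and the keyboard map is deliberately partial. The sketch covers character omission, neighbouring-character transposition, and replacement or insertion of keyboard-adjacent keys.

```python
# Hypothetical, partial QWERTY adjacency map (an assumption for this sketch);
# a real engine would cover every key and more typo classes.
ADJACENT = {
    "g": "fhtyvb",
    "e": "wrsd",
}


def typo_permutations(domain):
    """Return a sorted list of simple typo variants of `domain`."""
    # Split on the first dot so only the leading label is mutated.
    name, dot, tld = domain.partition(".")
    variants = set()
    for i, ch in enumerate(name):
        # Omission: drop one character (never produce an empty label).
        if len(name) > 1:
            variants.add(name[:i] + name[i + 1:])
        # Transposition: swap this character with the next one.
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + ch + name[i + 2:])
        # Keyboard-adjacent replacement and insertion.
        for near in ADJACENT.get(ch, ""):
            variants.add(name[:i] + near + name[i + 1:])
            variants.add(name[:i + 1] + near + name[i + 1:])
    # A swap of identical characters can reproduce the seed; drop it.
    variants.discard(name)
    return sorted(v + dot + tld for v in variants)


def build_perm_record(seed):
    """Build a dict shaped like the sample perm record."""
    return {"perms": typo_permutations(seed), "seed": seed}


if __name__ == "__main__":
    print(build_perm_record("ge.com"))
```

Caching the generated list alongside its seed, as the perms collection does, means the permutation step only has to run once per seed.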

A sample resolve record:

{
    "status" : "failed",
    "domain" : "tiiaaref.org",
    "ip" : null,
    "datetime" : "2016-11-26 19:02:12",
    "seed" : "tiaa-cref.org"
}
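A resolve record like the one above could be built roughly as follows. This is a sketch under stated assumptions, not Boxcar's actual resolver: the function resolve_domain and its lookup parameter are hypothetical, and the "resolved" status label for a successful lookup is assumed, since the sample only shows "failed". The lookup is injectable so the logic can be exercised without network access.

```python
import socket
from datetime import datetime


def resolve_domain(domain, seed, lookup=socket.gethostbyname):
    """Try to resolve `domain` and return a dict shaped like a resolve record.

    Hypothetical helper. `lookup` is any callable that maps a hostname to an
    IP string and raises OSError on failure (socket.gaierror is a subclass).
    """
    try:
        ip = lookup(domain)
        status = "resolved"  # assumed label; the sample only shows "failed"
    except (OSError, UnicodeError):
        # UnicodeError covers hostnames that cannot be encoded for lookup.
        ip = None
        status = "failed"
    return {
        "status": status,
        "domain": domain,
        "ip": ip,
        # Same timestamp layout as the sample record.
        "datetime": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "seed": seed,
    }


if __name__ == "__main__":
    print(resolve_domain("tiiaaref.org", "tiaa-cref.org"))
```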

Extras

Helper tools for processing can be found in app/tools. Additionally, report.py extracts the data from the MongoDB collections and writes it to a CSV report. A sample report has been placed in app/samples.
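The general idea of turning resolve records into a CSV report can be sketched as below. This is not the contents of report.py: the function write_report and the column order are assumptions, and the records are passed in as plain dicts rather than read from MongoDB so the example stays self-contained.

```python
import csv
import io

# Assumed column order for this sketch; the real report may differ.
FIELDS = ["seed", "domain", "status", "ip", "datetime"]


def write_report(records, handle):
    """Write resolve-style dicts to `handle` as CSV with a header row."""
    # extrasaction="ignore" skips keys outside FIELDS (e.g. MongoDB's _id).
    writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


if __name__ == "__main__":
    sample = [{
        "status": "failed",
        "domain": "tiiaaref.org",
        "ip": None,  # written as an empty CSV field
        "datetime": "2016-11-26 19:02:12",
        "seed": "tiaa-cref.org",
    }]
    buffer = io.StringIO()
    write_report(sample, buffer)
    print(buffer.getvalue())
```

To write to disk instead, open the file with newline="" as the csv module documentation recommends and pass the file object as handle.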

Languages

Language:Python 100.0%