ooni / historical-geoip

Generate historical IP to country + ASN databases for processing historical OONI data

Historical GeoIP databases

The purpose of this repo is to build historical IP to country + AS databases for use in the OONI data processing pipeline, and potentially by the probes as well.

The data sources used for IP to country mappings are:

  • Maxmind GeoIP2 Country
  • DB-IP IP2Country

For mapping IP ranges to ASNs we use the prefix2as mappings from CAIDA.

For mapping ASNs to metadata about the organization, we use the AS to Organization mappings from CAIDA.

The primary entry point for running the full workflow is the following:

./update_databases.sh

In order to upload the built artifacts to archive.org, you should have the IA_ACCESS_KEY and IA_SECRET_KEY environment variables set.
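
As a convenience, the two variables can be checked before starting a long run. The following is a minimal sketch and not part of the repository: the helper name `missing_credentials` is hypothetical, and only the two variable names come from the text above.

```python
import os

# The two variables named above; everything else in this sketch is illustrative.
REQUIRED_VARS = ("IA_ACCESS_KEY", "IA_SECRET_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Upload to archive.org would fail, missing:", ", ".join(missing))
    else:
        print("archive.org credentials are set")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.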

The workflow for generating the final artifacts (the IP to country + ASN mmdb files) is the following:

graph
    A[AS Organizations] --> E{{AS to ORG Map}}
    E --> D{Enrich country DB}
    B[Maxmind GeoIP2 Country] --> D
    C[DB-IP IP2Country] --> D
    F[prefix2AS] --> D
    D --> O{{IP to Country + ASN mmdb}}

Both the AS to ORG Map and the timestamped IP to Country + ASN mmdb files are published as artifacts on archive.org.
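
The enrichment step in the graph above can be pictured with a small Python sketch. It is not the repository's implementation. The function names, the sample rows, and the `country.iso_code` layout of the output record are assumptions made for illustration, and the prefix2as text is assumed to use a tab-separated prefix, length, ASN layout. The addresses and AS numbers are taken from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical prefix2as rows: prefix <TAB> length <TAB> origin ASN.
SAMPLE_PREFIX2AS = "203.0.113.0\t24\t64497\n203.0.113.128\t25\t64498\n"

# Hypothetical AS to ORG map entry, keyed by ASN.
SAMPLE_AS_ORG = {
    64498: {
        "autonomous_system_organization": "Example Org",
        "autonomous_system_country": "IT",
        "autonomous_system_name": "EXAMPLE-AS",
    }
}


def parse_prefix2as(text):
    """Parse tab-separated rows into (network, asn) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        prefix, length, asn = line.split("\t")
        # Rows whose origin is not a single number are skipped to keep this small.
        if not asn.isdigit():
            continue
        entries.append((ipaddress.ip_network(f"{prefix}/{length}"), int(asn)))
    return entries


def lookup_asn(entries, ip):
    """Return the ASN of the most specific prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, asn) for net, asn in entries if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def enrich(country_code, asn, as_org_map):
    """Combine a country code with ASN metadata into one record."""
    record = {"country": {"iso_code": country_code}}  # assumed layout
    if asn is not None:
        record["autonomous_system_number"] = asn
        record.update(as_org_map.get(asn, {}))
    return record


if __name__ == "__main__":
    entries = parse_prefix2as(SAMPLE_PREFIX2AS)
    asn = lookup_asn(entries, "203.0.113.200")
    print(enrich("IT", asn, SAMPLE_AS_ORG))
```

A linear scan over the prefixes is used for brevity. A build over full routing tables would likely want a radix tree or a search over sorted ranges instead.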

The IP to Country + ASN database is compatible with the mmdb file format. The difference is that country and ASN lookups are both answered by the same call.

The keys used in the result for returning metadata information are the following:

  • autonomous_system_number: an integer indicating the ASN. It's a standard key.
  • autonomous_system_organization: a string indicating the organization name for the given ASN. It's a standard key.
  • autonomous_system_country: the country of registration of the AS organization. This key is non-standard.
  • autonomous_system_name: the name of the AS, which in most cases is different from the organization name. This key is non-standard.
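
Reading these keys from a lookup result might look like the sketch below. It assumes the third-party `maxminddb` Python package is used as the reader. The helper names, the database path argument, and the sample record values are hypothetical. Only the four key names come from the list above.

```python
AS_KEYS = (
    "autonomous_system_number",
    "autonomous_system_organization",
    "autonomous_system_country",
    "autonomous_system_name",
)


def describe_record(record):
    """Pick the four AS keys out of a lookup result (None if a key is absent)."""
    record = record or {}
    return {key: record.get(key) for key in AS_KEYS}


def lookup(db_path, ip):
    """Look up ip in an mmdb file; returns the raw record or None."""
    # Imported lazily so the rest of the sketch runs without the package.
    import maxminddb

    with maxminddb.open_database(db_path) as reader:
        return reader.get(ip)


if __name__ == "__main__":
    # Hypothetical record with made-up values, standing in for lookup(...).
    sample = {
        "autonomous_system_number": 64496,
        "autonomous_system_organization": "Example Org",
        "autonomous_system_country": "IT",
        "autonomous_system_name": "EXAMPLE-AS",
    }
    for key, value in describe_record(sample).items():
        print(f"{key}: {value}")
```

Because country and ASN data share one record, a single `lookup` call is enough to obtain both. Any country fields would be read from the same result.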
