tobilg / public-cloud-provider-ip-ranges

Unified datasets for public cloud provider IP ranges. Providers include AWS, Azure, CloudFlare, DigitalOcean, Fastly, Google Cloud and Oracle Cloud.

Home Page: https://tobilg.com/gathering-and-analyzing-public-cloud-provider-ip-address-data-with-duckdb-observerable

public-cloud-provider-ip-ranges

This repository provides unified and cleaned datasets for public cloud provider IP ranges, as CSV and Parquet files.

Data sources

The following public cloud providers are covered by this repo: AWS, Azure, CloudFlare, DigitalOcean, Fastly, Google Cloud and Oracle Cloud.

Generated data

The generated data can be found in the data directory.

All providers combined

Single providers

Data format

Each dataset is available in three versions: a CSV and a Parquet version in columnar style, and a JSON version.
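As a rough illustration of working with the CSV version, the sketch below totals the number of addresses per provider. It assumes the CSV files carry a header row with the column names from the columnar schema in this README. The sample rows and the provider names `example-a` and `example-b` are hypothetical placeholders, not values taken from the real files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the columnar layout (header row assumed).
# The real files live in the repository's data directory.
SAMPLE_CSV = """cloud_provider,cidr_block,ip_address,ip_address_mask,ip_address_cnt,region
example-a,129.146.0.0/21,129.146.0.0,21,2048,us-phoenix-1
example-b,10.0.0.0/32,10.0.0.0,32,1,
"""


def count_addresses_per_provider(csv_text):
    """Sum ip_address_cnt for every cloud_provider found in the CSV text."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["cloud_provider"]] += int(row["ip_address_cnt"])
    return dict(totals)


totals = count_addresses_per_provider(SAMPLE_CSV)
```

To run this against a real file, the same function could be fed the contents of one of the CSV files from the data directory instead of `SAMPLE_CSV`.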

Columnar schema

The data format of both columnar versions looks like this:

| Column name | Data type | Description |
| --- | --- | --- |
| cloud_provider | VARCHAR | The public cloud provider name |
| cidr_block | VARCHAR | The CIDR block, e.g. 10.0.0.0/32 |
| ip_address | VARCHAR | The IP address, e.g. 10.0.0.0 |
| ip_address_mask | INTEGER | The IP address mask, e.g. 32 |
| ip_address_cnt | INTEGER | The number of IP addresses in this CIDR block |
| region | VARCHAR | The public cloud provider region information (if given) |
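The `ip_address`, `ip_address_mask` and `ip_address_cnt` columns can all be derived from `cidr_block`. The following is a minimal sketch of that relationship using Python's standard `ipaddress` module. The helper name `describe_cidr` is made up for this example, and this is not necessarily how the repository computes the columns.

```python
import ipaddress


def describe_cidr(cidr_block):
    """Derive the address, mask and address count from a CIDR block string."""
    # strict=True raises ValueError if host bits are set, e.g. "10.0.0.1/24".
    network = ipaddress.ip_network(cidr_block, strict=True)
    return {
        "cidr_block": str(network),
        "ip_address": str(network.network_address),
        "ip_address_mask": network.prefixlen,
        "ip_address_cnt": network.num_addresses,
    }


record = describe_cidr("129.146.0.0/21")
```

For the block `129.146.0.0/21`, the derived values match the record shown in the JSON schema examples: a mask of 21 and 2048 addresses.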

JSON schema

The JSON schema of the exported data is:

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Public Cloud Provider JSON export schema",
    "type": "array",
    "items": {
        "title": "CIDR block record",
        "type": "object",
        "required": [
            "cidr_block",
            "ip_address",
            "ip_address_mask",
            "ip_address_cnt",
            "region"
        ],
        "properties": {
            "cidr_block": {
                "title": "The CIDR block",
                "type": "string",
                "examples": [
                    "129.146.0.0/21"
                ]
            },
            "ip_address": {
                "title": "The IP address",
                "type": "string",
                "examples": [
                    "129.146.0.0"
                ]
            },
            "ip_address_mask": {
                "title": "The IP address mask",
                "type": "integer",
                "examples": [
                    21
                ]
            },
            "ip_address_cnt": {
                "title": "The number of IP addresses in this CIDR block",
                "type": "integer",
                "examples": [
                    2048
                ]
            },
            "region": {
                "title": "The respective public cloud provider region",
                "type": "string",
                "examples": [
                    "us-phoenix-1"
                ]
            }
        },
        "examples": [{
            "cidr_block": "129.146.0.0/21",
            "ip_address": "129.146.0.0",
            "ip_address_mask": 21,
            "ip_address_cnt": 2048,
            "region": "us-phoenix-1"
        }]
    }
}
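To show what the schema requires of each record, here is a small hand-rolled check of the required fields and their types. It is a sketch using only the standard library, not a full JSON Schema validator. The names `REQUIRED_FIELDS` and `validation_errors` are invented for this example.

```python
import json

# Required fields and their JSON types, as listed in the schema above.
REQUIRED_FIELDS = {
    "cidr_block": str,
    "ip_address": str,
    "ip_address_mask": int,
    "ip_address_cnt": int,
    "region": str,
}


def validation_errors(record):
    """Return a list of problems found in one CIDR block record."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing required field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, but JSON booleans are not integers.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"wrong type for {name}: expected {expected_type.__name__}")
    return errors


# The example record from the schema.
example = json.loads("""
{
    "cidr_block": "129.146.0.0/21",
    "ip_address": "129.146.0.0",
    "ip_address_mask": 21,
    "ip_address_cnt": 2048,
    "region": "us-phoenix-1"
}
""")
errors = validation_errors(example)
```

Because the exported data is an array of such records, a whole file could be checked by calling `validation_errors` on each element of the parsed array.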

Automatic updates

The CI pipeline checks the providers' published IP ranges for updates every day at 4 AM UTC and automatically publishes a new patch version if updates are detected.

About

License: MIT License