lord-alfred / ipranges

🔨 List all IP ranges from: Google (Cloud & GoogleBot), Bing (Bingbot), Amazon (AWS), Microsoft, Oracle (Cloud), GitHub, Facebook (Meta), OpenAI (GPTBot) and others with daily updates.

Home Page: https://t.me/Lord_Alfred

How to reduce the number of ranges to optimise the search for an IP

jmleglise opened this issue

Hi. First, thank you very much for your list. Very useful!
(This is not an issue, but a comment for all of us who would like to optimise the search for an IP belonging to a range.)

Let's take these two ranges from your file /amazon/ipv4_merged.txt as an example:
R1: 3.0.0.0/15 and R2: 3.2.0.0/24. These two ranges are contiguous and should be merged. (But that is not possible in CIDR notation, because the combined size, 131072 + 256 addresses, is not a power of two.)

"/15" means 2^(32-15) = 131072 addresses.
I convert the IPs to decimal values:
R1: 3.0.0.0/15 starts at 3x256x256x256 + 0x256x256 + 0x256 + 0 = 50331648, and its last address is that number + 131072 - 1.
R2 starts at 3x256x256x256 + 2x256x256 + 0x256 + 0 = 50462720.

beginning(R1) + length(R1) = beginning(R2)
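
The arithmetic above can be checked with a few lines of Python. This is only a sketch using the standard-library ipaddress module (not the netaddr library used by the repository's script); the variable names are chosen here for illustration.

```python
import ipaddress

r1 = ipaddress.ip_network("3.0.0.0/15")
r2 = ipaddress.ip_network("3.2.0.0/24")

# Integer value of the first address and the number of addresses in each block.
r1_start = int(r1.network_address)   # 3 * 256**3
r1_size = r1.num_addresses           # 2 ** (32 - 15)
r2_start = int(r2.network_address)   # 3 * 256**3 + 2 * 256**2

# The blocks are contiguous when R1's start plus its length lands on R2's start.
contiguous = r1_start + r1_size == r2_start
print(r1_start, r1_size, r2_start, contiguous)
```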

So the two ranges are contiguous and should be merged into a new range from beginning(R1) to end(R2). That is the reason I prefer to store IP ranges in decimal, with a start and an end.
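
As a rough sketch of the start/end idea, the snippet below collapses a list of CIDR strings into (start, end) integer pairs and joins any that overlap or touch. It assumes IPv4-only input and uses the standard-library ipaddress module; the helper name merge_ranges and the 8.8.8.0/24 sample entry are hypothetical, not taken from the repository.

```python
import ipaddress

def merge_ranges(cidrs):
    """Collapse IPv4 CIDR strings into sorted, non-adjacent (start, end) integer pairs."""
    spans = sorted(
        (int(net.network_address), int(net.broadcast_address))
        for net in map(ipaddress.ip_network, cidrs)
    )
    merged = []
    for start, end in spans:
        # Extend the previous span when this one overlaps it or begins right after it.
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# The two contiguous Amazon ranges become one span; the unrelated block stays separate.
merged = merge_ranges(["3.2.0.0/24", "3.0.0.0/15", "8.8.8.0/24"])
print(merged)
```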

There are 3200 ranges in the full merged list, and 900 of them are contiguous.
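
Once the ranges are stored as sorted, non-overlapping (start, end) integers, a lookup can use binary search instead of scanning every range. The following is a minimal sketch under that assumption; RANGES holds hypothetical sample data and contains is an illustrative helper name.

```python
import bisect
import ipaddress

# Hypothetical sample data: sorted, non-overlapping (start, end) integer pairs.
RANGES = [(50331648, 50462975), (134744064, 134744319)]
STARTS = [start for start, _ in RANGES]

def contains(ip):
    """Return True when the IPv4 address falls inside one of RANGES."""
    value = int(ipaddress.ip_address(ip))
    # Index of the last range whose start is <= value (-1 if there is none).
    i = bisect.bisect_right(STARTS, value) - 1
    return i >= 0 and value <= RANGES[i][1]

print(contains("3.1.2.3"), contains("3.2.1.0"))
```

Fewer ranges mean a shorter list to search, although with binary search the saving per lookup is small; the larger gain is usually from avoiding a linear scan.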

Thank you very much for your comment, it is an interesting observation. I'm not good at CIDR notation and addresses, so this problem is not obvious to me.

Just to be clear: I use a small script that merges addresses with the cidr_merge function from the netaddr library.
Perhaps I should use spanning_cidr instead, but wouldn't that break something else? I found a similar issue in the library: netaddr/netaddr#27

Example:

In [1]: import netaddr

In [2]: netaddr.cidr_merge(['3.0.0.0/15', '3.2.0.0/24'])
Out[2]: [IPNetwork('3.0.0.0/15'), IPNetwork('3.2.0.0/24')]

In [3]: netaddr.spanning_cidr(['3.0.0.0/15', '3.2.0.0/24'])
Out[3]: IPNetwork('3.0.0.0/14')
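
To see what the /14 returned by spanning_cidr would change, here is a small check written with the standard-library ipaddress module rather than netaddr. The address 3.3.0.1 is just an arbitrary sample picked for illustration.

```python
import ipaddress

r1 = ipaddress.ip_network("3.0.0.0/15")
r2 = ipaddress.ip_network("3.2.0.0/24")
span = ipaddress.ip_network("3.0.0.0/14")  # the spanning_cidr result shown above

listed = r1.num_addresses + r2.num_addresses  # addresses actually in the list
extra = span.num_addresses - listed           # addresses the /14 would add

# An address inside the /14 that is in neither original range.
outsider = ipaddress.ip_address("3.3.0.1")
in_span = outsider in span
in_originals = outsider in r1 or outsider in r2
print(listed, span.num_addresses, extra, in_span, in_originals)

# The exact range 3.0.0.0 - 3.2.0.255 summarises back to the same two CIDR blocks.
exact = list(ipaddress.summarize_address_range(r1[0], r2[-1]))
print(exact)
```

So the spanning block would roughly double the covered address space, while the exact contiguous range cannot be written as fewer than two CIDR blocks.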

Do you think you can improve the merging process in the script?