Zorus / TldExtract

.NET Library to extracts the root domain, subdomain name, and top level domain from a host name using the Public Suffix List

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

TldExtract

This .NET Standard 2 library and Command line tool extracts the root domain, subdomain name, and top level domain from a URL, using the the Public Suffix List.

This is based on the Go TldExtract and the Python TldExtract libraries from:

This library is useful as it uses the public database to sort out what is the domain, the TLD and the root domain, without using assumptions as to what the contents are.

Some examples:

Host name Subdomain Root domain Top-level Domain
www.google.co.uk www google co.uk
forums.news.cnn.com forums.news cnn com
google.notavalidsuffix google notavalidsuffix
media.forums.theregister.co.uk media.forums theregister co.uk
www.cgs.act.edu.au www cgs act.edu.au
joe.blogspot.co.uk joe blogspot.co.uk
wiki.info wiki info

Usage

Add the TldExtract library NuGet package to your solution, and then create an instance of the NStack.TldExtract class. You can either provide a path to the cache file where you want the public suffix list to be downloaded or nothing and the library will choose the proper cache location for you.

Then invoke the Extract method that will return a tuple of values with the subdomain, the root domain and the TLD domain.

Examples


var extractor = new NStack.TldExtract ();
(var sub, var root, var tld) = extractor.Extract ("www.microsoft.com");

About

.NET Library to extracts the root domain, subdomain name, and top level domain from a host name using the Public Suffix List

License:MIT License


Languages

Language:C# 100.0%