samdenty / site_icons

Efficient website icon scraper for Rust, with sizes, ordering, and WASM support

site_icons

An efficient website icon scraper, usable as a Rust library or from the command line.

Features

  • Super fast!
  • Downloads only part of each image to determine its size
  • Can extract a site logo <img> using a weighting system
  • Works with inline data URIs (and automatically converts <svg> elements to them)
  • Supports WASM (and Cloudflare Workers)
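
To illustrate why a partial download can be enough to find an image's size, here is a minimal sketch for the PNG case only. It is not taken from the crate's source; the function name `png_dimensions` is hypothetical. It relies on the PNG file layout, where the width and height sit in the IHDR chunk within the first 24 bytes of the file.

```rust
/// The 8-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Hypothetical helper: reads (width, height) from the first 24 bytes of a
/// PNG stream. Returns None if the prefix is too short or is not a PNG header.
fn png_dimensions(prefix: &[u8]) -> Option<(u32, u32)> {
    if prefix.len() < 24 || !prefix.starts_with(&PNG_SIGNATURE) {
        return None;
    }
    // The first chunk must be IHDR; its type tag sits at bytes 12..16.
    if &prefix[12..16] != b"IHDR" {
        return None;
    }
    // Width and height are big-endian u32 values at bytes 16..20 and 20..24.
    let width = u32::from_be_bytes([prefix[16], prefix[17], prefix[18], prefix[19]]);
    let height = u32::from_be_bytes([prefix[20], prefix[21], prefix[22], prefix[23]]);
    Some((width, height))
}
```

Under this assumption, a scraper could request just the first few dozen bytes of a PNG and stop reading once the dimensions are known. Other formats (ICO, JPEG, GIF) store their sizes differently and would need their own parsers.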

Rust usage

use site_icons::SiteIcons;

// Note: `.await?` requires an async function that returns a Result
let mut icons = SiteIcons::new();
// Scrape the icons from a URL
let entries = icons.load_website("https://github.com", false).await?;

// Entries are sorted from highest to lowest resolution
for icon in entries {
  println!("{:?}", icon);
}

Command line usage

First install the binary:

cargo install site_icons

Then run one of the following:

For text output:

Command:

site-icons https://github.com

Output:

https://github.githubassets.com/favicons/favicon.svg site_favicon svg
https://github.githubassets.com/app-icon-512.png app_icon png 512x512
https://github.githubassets.com/apple-touch-icon-180x180.png app_icon png 180x180

For JSON output:

Command:

site-icons https://reactjs.org --json

Output:

[
  {
    "url": "",
    "headers": {},
    "kind": "site_logo",
    "type": "svg",
    "size": null
  },
  {
    "url": "https://reactjs.org/icons/icon-512x512.png?v=f4d46f030265b4c48a05c999b8d93791",
    "headers": {},
    "kind": "app_icon",
    "type": "png",
    "size": "512x512"
  },
  {
    "url": "https://reactjs.org/favicon.ico",
    "headers": {},
    "kind": "site_favicon",
    "type": "ico",
    "sizes": ["64x64", "32x32", "24x24", "16x16"]
  },
  {
    "url": "https://reactjs.org/favicon-32x32.png?v=f4d46f030265b4c48a05c999b8d93791",
    "headers": {},
    "kind": "site_favicon",
    "type": "png",
    "size": "32x32"
  }
]
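
When consuming this output, the `WIDTHxHEIGHT` size strings can be turned into numbers and ordered largest first, mirroring the ordering described under Rust usage. The sketch below is illustrative only: `parse_size` and `sort_largest_first` are hypothetical helper names, not part of the crate, and comparing by pixel area is an assumption about what "highest resolution" means.

```rust
use std::cmp::Reverse;

/// Hypothetical helper: parses a size string such as "512x512" into
/// (width, height). Returns None for anything that is not WIDTHxHEIGHT.
fn parse_size(size: &str) -> Option<(u32, u32)> {
    let (width, height) = size.split_once('x')?;
    Some((width.parse().ok()?, height.parse().ok()?))
}

/// Hypothetical helper: sorts sizes from largest to smallest pixel area.
/// The area is computed in u64 so that large dimensions cannot overflow.
fn sort_largest_first(sizes: &mut [(u32, u32)]) {
    sizes.sort_by_key(|&(w, h)| Reverse(u64::from(w) * u64::from(h)));
}
```

Entries without a size, such as the SVG logo in the output above, have no dimensions to parse and would need to be ranked by a separate rule.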

Sources

  • HTML favicon tags (falling back to the default /favicon.svg and /favicon.ico locations)
  • Web app manifest icons field
  • <img> tags on the page that are either directly inside the header or have a src, alt, or class containing the text "logo"
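
The `<img>` rule in the last item can be pictured as a small scoring function. This is a sketch under stated assumptions, not the crate's actual weighting: the `ImgCandidate` struct, the function names, the weights (2 for being in the header, 1 per matching attribute), and the case-insensitive match are all invented for illustration.

```rust
/// Hypothetical description of an <img> tag found on a page.
struct ImgCandidate<'a> {
    src: &'a str,
    alt: &'a str,
    class: &'a str,
    /// True when the tag sits directly inside the page header.
    in_header: bool,
}

/// Assumption: the "logo" match is case-insensitive.
fn mentions_logo(value: &str) -> bool {
    value.to_lowercase().contains("logo")
}

/// Scores a candidate with made-up weights: header placement counts 2,
/// and each of src, alt, and class that mentions "logo" counts 1.
fn logo_score(img: &ImgCandidate) -> u32 {
    let mut score = 0;
    if img.in_header {
        score += 2;
    }
    for attr in [img.src, img.alt, img.class] {
        if mentions_logo(attr) {
            score += 1;
        }
    }
    score
}

/// Returns the highest-scoring candidate, ignoring images that score zero.
fn best_logo<'a, 'b>(imgs: &'b [ImgCandidate<'a>]) -> Option<&'b ImgCandidate<'a>> {
    imgs.iter()
        .filter(|img| logo_score(img) > 0)
        .max_by_key(|img| logo_score(img))
}
```

An image that matches neither condition scores zero and is never returned, which corresponds to the "header OR logo" filter described in the list.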

Running locally

git clone https://github.com/samdenty/site_icons
cd site_icons
cargo run https://github.com
