adamcysec / Scrape-Browser-Extensions

Scrape-Browser-Extensions

This repo contains Python scripts for scraping all visible browser extensions for Chrome and Edge.

The scripts output a CSV file with two fields per extension: id (the extension id) and name (the extension name).

The zipped CSV data is available in the directory /extension data/

Why scrape browser extensions?

Malicious Extensions

Malicious browser extensions typically steal user data or visit advertisement-related links in the background.

By the time security news outlets have reported on these malicious extensions, the extensions have already been removed from the extension store. However, an extension is not removed from the end user's browser until the user uninstalls it (if they ever do).

Some repositories aim to compile a list of malicious browser extension ids so that a workplace environment can be scanned for them. However, compiling such a list is very manual work and involves googling and tracking news articles.

My idea - scrape extensions

If we take a snapshot of every extension on the Chrome/Edge store, then we know which extensions are allowed on the store. We can then compare that list to a list of extensions installed in the workplace environment and might discover users with extensions that are no longer on the browser store.

There are several reasons an extension is removed from the store:

  1. The extension comes preinstalled in the browser and was removed from the store.
  2. The developer decided to remove their extension from the store.
  3. Chrome/Edge removed the extension from their store (probably for good reasons).
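
The snapshot comparison described in this section can be sketched as a set difference between installed extension ids and the ids in the scraped CSV. This is only an illustration: it assumes the CSV has a header row with the fields id and name, and the function names, sample ids, and inventory list are hypothetical and not part of this repo.

```python
import csv
import io


def load_store_ids(csv_file):
    """Return {id: name} from a CSV that has 'id' and 'name' columns."""
    reader = csv.DictReader(csv_file)
    return {row["id"]: row["name"] for row in reader}


def find_delisted(installed_ids, store_snapshot):
    """Return installed extension ids that are missing from the snapshot."""
    return sorted(set(installed_ids) - set(store_snapshot))


# Hypothetical snapshot data, standing in for the scraped CSV file.
# Assumption: the CSV starts with an "id,name" header row.
snapshot_csv = io.StringIO(
    "id,name\n"
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,Example Extension A\n"
    "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,Example Extension B\n"
)
store = load_store_ids(snapshot_csv)

# Hypothetical inventory of extension ids collected from workplace machines.
installed = [
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "cccccccccccccccccccccccccccccccc",
]

delisted = find_delisted(installed, store)
print(delisted)
```

An id that shows up in the result is not proof of a malicious extension. It only flags an extension that is worth a closer look, for any of the reasons listed above.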

Visible browser extensions?

I say visible because a developer can list an extension privately. For example, the extension SavingsScout is not searchable on the Chrome Web Store, but it can be viewed if you know the URL.

The Tools

scrape-chromeWebstore.py

This script scrapes extension data by enumerating the Chrome sitemaps.
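
As a rough illustration of sitemap enumeration, the sketch below parses one sitemap document and pulls an id and a name slug out of each detail-page URL. It is not the code from scrape-chromeWebstore.py. It assumes the sitemap follows the standard sitemaps.org schema and that detail URLs end in /detail/<name-slug>/<32-character id>; the sample XML, the example.com host, and the function name are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed URL shape: .../detail/<name-slug>/<32 letters in the range a-p>
DETAIL_URL = re.compile(r"/detail/([^/]+)/([a-p]{32})(?:[/?#]|$)")


def extract_extensions(sitemap_xml):
    """Return sorted (id, name_slug) pairs found in one sitemap document."""
    root = ET.fromstring(sitemap_xml)
    found = set()
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        match = DETAIL_URL.search((loc.text or "").strip())
        if match:
            slug, ext_id = match.groups()
            found.add((ext_id, slug))
    return sorted(found)


# Hypothetical sitemap content; the real sitemap may differ.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/webstore/detail/example-extension/abcdefghijklmnopabcdefghijklmnop</loc></url>
  <url><loc>https://example.com/webstore/category/extensions</loc></url>
</urlset>"""

extensions = extract_extensions(sample)
print(extensions)
```

The name recovered this way is only the URL slug. Getting the display name would need an extra request per extension or another data source.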

Dependencies

Example

py scrape_chromeWebstore.py

  • the CSV is saved in the current working directory

Output

starting work on 16 cores
--- 340.9100058078766 seconds ---
file saved: chrome_webstore_extensions_2022-12-02.csv

scrape-EdgeAddons.py

This script scrapes extension data by enumerating the page numbers for each category in Microsoft's store api.
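
The page-by-page enumeration can be sketched as a loop that requests successive pages of a category until an empty page comes back. This sketch does not call Microsoft's store API. The fetch_page callable, the category names, the response shape, and the assumption that page numbers start at 1 are all hypothetical stand-ins.

```python
def scrape_category(fetch_page, category, max_pages=1000):
    """Collect (id, name) pairs by requesting pages until one is empty."""
    results = []
    for page_number in range(1, max_pages + 1):  # assumption: pages start at 1
        items = fetch_page(category, page_number)
        if not items:
            break
        results.extend((item["id"], item["name"]) for item in items)
    return results


def scrape_all(fetch_page, categories):
    """Scrape every category and de-duplicate by extension id."""
    seen = {}
    for category in categories:
        for ext_id, name in scrape_category(fetch_page, category):
            seen.setdefault(ext_id, name)
    return seen


# Stand-in for the store API: hypothetical categories and responses.
FAKE_PAGES = {
    ("productivity", 1): [{"id": "id-1", "name": "Example One"}],
    ("productivity", 2): [{"id": "id-2", "name": "Example Two"}],
    ("shopping", 1): [{"id": "id-2", "name": "Example Two"}],
}


def fake_fetch_page(category, page_number):
    """Return the fake page, or an empty list past the last page."""
    return FAKE_PAGES.get((category, page_number), [])


extensions = scrape_all(fake_fetch_page, ["productivity", "shopping"])
print(extensions)
```

Passing the fetch function in as an argument keeps the paging logic testable without network access. A real fetcher would wrap an HTTP request to the store.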

Example

py scrape_EdgeAddons.py

  • the CSV is saved in the current working directory

Output

starting work on 16 cores
--- 17.575310230255127 seconds ---
file saved: edge_extensions.csv
