zhunhung / person_scraper

A scraper that checks a person's name against key databases to identify whether they are a person of interest.

Person Scraper

Given a person's name, the scraper cross-checks it against the following databases:

  • OFAC SDN
  • The Panama Papers
  • UN Sanctions
  • US Sanctions
  • MAS Sanctions and Freezing of assets
  • PEP databases
  • CIA Database
  • Credit bureaus
  • Facebook
  • LinkedIn
  • Twitter
  • Criminal Records
  • Court Records
  • Google
  • FATF
  • Reddit
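
As a rough illustration of what cross-checking a name against one of these lists can involve, here is a minimal sketch. It assumes the list has already been loaded as plain strings. The function names normalize and find_matches and the sample entries are hypothetical, and the actual scrapers may match names differently.

```python
import re
import unicodedata


def normalize(name):
    """Lower-case a name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    without_punctuation = re.sub(r"[^\w\s]", " ", without_accents.lower())
    return " ".join(without_punctuation.split())


def find_matches(query, entries):
    """Return the entries that contain every word of the query, in any order."""
    query_words = set(normalize(query).split())
    if not query_words:
        return []  # an empty query should not match everything
    return [e for e in entries if query_words <= set(normalize(e).split())]


# Hypothetical sample list, for illustration only.
sample_entries = ["DOE, John Michael", "Jane  Roe", "José Pérez"]
print(find_matches("john doe", sample_entries))
```

Matching on whole words in any order tolerates "LAST, First" layouts, but it is deliberately strict: it will miss transliteration variants and misspellings, which a real screening tool would probably need to handle.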

Prerequisites

The code runs on Python 3.x and needs the following packages:

pandas
beautifulsoup4
python-linkedin

urllib is also used, but it ships with the Python 3 standard library and needs no separate installation.

Installing

Install the packages if you have not already:

pip install pandas
pip install beautifulsoup4
pip install python-linkedin

Running the scraper

For example, to search for "Osama Bin Laden":

python scraper.py -n "Osama Bin Laden"

And the output will be something like this:

Checking for Osama Bin Laden...
CSL check:
Found 1 matches in CSL
Panama Papers check:
Found 0 matches in Panama Papers
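
The command line and console output above could be produced by something like the following sketch. Only the -n option and the output format come from this README. The function names parse_args and report are hypothetical, and the match counts are copied from the sample output rather than computed.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; -n is the only option shown in this README."""
    parser = argparse.ArgumentParser(description="Check a person against key databases.")
    parser.add_argument("-n", dest="name", required=True, help="name of the person to check")
    return parser.parse_args(argv)


def report(name, match_counts):
    """Build the console report from a mapping of source label to match count."""
    lines = [f"Checking for {name}..."]
    for source, count in match_counts.items():
        lines.append(f"{source} check:")
        lines.append(f"Found {count} matches in {source}")
    return "\n".join(lines)


args = parse_args(["-n", "Osama Bin Laden"])
# The counts below are taken from the sample output, not from a real lookup.
print(report(args.name, {"CSL": 1, "Panama Papers": 0}))
```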

The search results are saved under results/, in a folder named after the person (spaces become underscores):

person-scraper/
|-- scrapers/
|   |-- CSL.py
|   |-- panama.py
|
|-- results/
|   |-- Osama_Bin_Laden/
|   |   |-- Osama_Bin_Laden_CSL.json
|   |   |-- Osama_Bin_Laden_PanamaPapers.csv
|
|-- scraper.py
|-- README
|-- .gitignore
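
The naming pattern in the tree above can be reproduced with a short sketch. The function result_paths is hypothetical, and the rule of replacing spaces with underscores is inferred from the example folder, not taken from the scraper's code.

```python
from pathlib import Path


def result_paths(name, root="results"):
    """Return the result folder and file paths for a name, following the tree above."""
    folder_name = "_".join(name.split())  # "Osama Bin Laden" -> "Osama_Bin_Laden"
    folder = Path(root) / folder_name
    return {
        "folder": folder,
        "CSL": folder / f"{folder_name}_CSL.json",
        "PanamaPapers": folder / f"{folder_name}_PanamaPapers.csv",
    }


paths = result_paths("Osama Bin Laden")
print(paths["CSL"].as_posix())
print(paths["PanamaPapers"].as_posix())
```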
