This set of scripts fetches data on Tor exit nodes from the only currently known valid and authoritative source: https://check.torproject.org/exit-addresses
After the data is downloaded, it is inserted into a database (directory: 'db/'). A web interface written in Python and Flask lives in 'www/'.
www/   website (Flask web interface)
db/    DB structure and initial data import mechanisms
data/  data directory
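To give a feel for the data these scripts handle, here is a minimal Python sketch that parses the exit list. It is not the code used by fetch-tor-list.sh. It assumes the list consists of records with ExitNode, Published, LastStatus and ExitAddress lines and that timestamps are in UTC. The sample record is illustrative: the fingerprint and IP are taken from the error message shown later in this README, the Published and LastStatus values are made up, and the function name parse_exit_addresses is hypothetical.

```python
from datetime import datetime, timezone

# Illustrative sample record (assumed format, see the note above).
SAMPLE = """\
ExitNode 0011BD2485AD45D984EC4159C88FC066E5E3300E
Published 2019-08-08 01:40:39
LastStatus 2019-08-08 08:02:44
ExitAddress 162.247.74.201 2019-08-08 07:12:18
"""


def parse_exit_addresses(text):
    """Yield (node_id, ip, exit_address_ts) for every ExitAddress line."""
    node_id = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ExitNode" and len(fields) == 2:
            # Remember the fingerprint; it applies to the lines that follow.
            node_id = fields[1]
        elif fields[0] == "ExitAddress" and len(fields) == 4 and node_id:
            # Timestamp is split over two fields: date and time (assumed UTC).
            ts = datetime.strptime(fields[2] + " " + fields[3], "%Y-%m-%d %H:%M:%S")
            yield node_id, fields[1], ts.replace(tzinfo=timezone.utc)


records = list(parse_exit_addresses(SAMPLE))
for node_id, ip, ts in records:
    print(node_id, ip, ts.isoformat())
```

One node may list several ExitAddress lines, which is why the sketch yields one tuple per address rather than one per node.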
- wget
- postgresql 9.3 or higher
- python
- flask
- webserver such as nginx
See the requirements.txt file in www/tor
$ cd db
$ sudo su
# su - postgres
$ createuser -s username
$ psql template1 < db.sql
- Activate your virtualenv or conda environment if you used one to install the prerequisites.
- Test if fetching the data works:
First, make sure that the newly created user tordb may access the DB and its tables, and add the user to the PostgreSQL pg_hba.conf file.
$ ./fetch-tor-list.sh
$ psql -U tordb tordb_simple
select count(*) from node;
You should see a non-zero result.
If it works, you can continue to run this automatically...
Running fetch-tor-list.sh is expected to output a lot of error messages like this:
ERROR: duplicate key value violates unique constraint "idx_node_combined"
DETAIL: Key (node_id, ip, exit_address_ts, id_nodetype)=(0011BD2485AD45D984EC4159C88FC066E5E3300E, 162.247.74.201, 2019-08-08 09:12:18+02, 1) already exists.
These errors can be ignored: they only mean that the entry is already in the database.
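Because the duplicate-key messages are harmless, they can hide other errors that do matter. The following Python sketch shows one possible way to drop the expected messages and keep everything else. It is not part of this repository; the function name unexpected_errors is hypothetical, and the last line of the sample is an illustrative example of an error you would not want to ignore.

```python
import re

# Matches the expected, harmless duplicate-key error line.
DUPLICATE = re.compile(r"ERROR:\s+duplicate key value violates unique constraint")


def unexpected_errors(lines):
    """Return all ERROR lines that are not duplicate-key violations."""
    result = []
    skip_detail = False
    for line in lines:
        line = line.rstrip("\n")
        if DUPLICATE.search(line):
            # The DETAIL line that follows belongs to this harmless error.
            skip_detail = True
            continue
        if skip_detail and line.startswith("DETAIL:"):
            skip_detail = False
            continue
        skip_detail = False
        if "ERROR:" in line:
            result.append(line)
    return result


sample = [
    'ERROR: duplicate key value violates unique constraint "idx_node_combined"',
    "DETAIL: Key (node_id, ip, exit_address_ts, id_nodetype)="
    "(0011BD2485AD45D984EC4159C88FC066E5E3300E, 162.247.74.201, "
    "2019-08-08 09:12:18+02, 1) already exists.",
    'ERROR: relation "node" does not exist',  # illustrative unexpected error
]
print(unexpected_errors(sample))
```

To use something like this on real output, the same function could be fed the lines of sys.stdin, assuming the script's error messages are piped into it.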
$ crontab -l
(...)
# fetch the list once a day at 1:05 A.M.
# m h dom mon dow command
5 01 * * * ( cd /home/your_user/torexitnodes_simple; . venv/bin/activate ; ./fetch-tor-list.sh >/dev/null 2>&1 )
(Note that this assumes you installed the prerequisites in a virtualenv located in venv/.)
The built-in web server of Flask is not suitable for production environments. Please follow the Flask documentation on production deployments instead. The instructions vary depending on which web server you use.