mlinksva / c-country-visits

Extract number of homepage visits by country per day and see them in a CSV.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

collect.js looks at gzipped web/cache server log files in Common Log Format, extracts for each day the number of home page hits from each unique IP address, geo-country-locates each IP address, stores in sqlite database.

report.js looks at sqlite database and prints in CSV format the number of unique IP addresses for each country for each day.

These scripts are motivated by a research request for data showing changes in visits by country over time. To comply with privacy policies and not transfer huge log files, extraction had to be done on machine where logs stored. The output of report.js is provided to researcher.

The host to look for home page accesses to is currently hardcoded in collect.js.

There probably is a simple way to extract the desired report from an existing log analyzer; I threw this together instead because I felt like trying ~shell scripting with javascript, which I'd never done before and it probably shows.

#Installation

Clone from git.

You'll need node installed, and the following npms:

  • geoip
  • lazy
  • sqlite

GeoIP.dat is included, and its location hardcoded in collect.js (assumed same directory as script). To download a new copy of GeoLite Country visit http://dev.maxmind.com/geoip/legacy/geolite/ but there is probably no reason to.

#Usage

  node collect.js dbfile <logfiles>

  node report.js dbfile

collect.js does not handle bad files; if you have some in your log directory you might invoke collect.js for each file; then the bad files won't be processed rather than halting all processing, eg:

  find path/to/logs -type f -exec node collect.js a.db {} \;

collect.js may run for a long time after printing to the console that it has processed files. It is making sqlite inserts.

Sqlite may be the reason collect.js is very slow; I haven't bothered trying with a database server.

#License

Code is dedicated to the public domain under CC0.

This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com. GeoIP.dat is distributed under CC-BY-SA.

About

Extract number of homepage visits by country per day and see them in a CSV.

License:Creative Commons Zero v1.0 Universal


Languages

Language:JavaScript 100.0%