A Xyla scraper.
Download and install Microsoft Visual Studio Code from https://code.visualstudio.com/download
Open the app and select the terminal tab in the bottom pane to run terminal commands for the remaining installation steps.
Git is a version control system used to manage development of the Raspador codebase.
Install Xcode using the App Store app, and open the Xcode app to install the command line tools.
To check that Git is installed, run this terminal command:
which git
# the path to the git executable should be printed
# if nothing is printed, git is not installed
# if git is installed, clone the Raspador repo
git clone https://github.com/xyla-io/raspador.git
Download the Git for Windows Setup from https://git-scm.com/download/win and install Git.
git clone https://github.com/xyla-io/raspador.git
Raspador is written in Python and requires Python 3 to be installed.
Homebrew is a package manager for macOS, similar to a free, command-line app store (see https://brew.sh/).
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
With Homebrew, install Python 3.6.1:
brew install python3
brew switch python3 3.6.1
Download and install Python 3.6.1 from https://www.python.org/downloads/windows/
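To confirm that the installed interpreter is recent enough, a short check like the one below can be run with `python3`. This is an illustrative sketch and not part of Raspador: the name `is_supported` and the 3.6 minimum are assumptions based on the 3.6.1 version named above.

```python
import sys

# Assumed minimum, taken from the Python 3.6.1 version named in this guide.
MINIMUM_VERSION = (3, 6)


def is_supported(version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor, ...) tuple meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_supported(sys.version_info) else "too old"
    print("Python {}: {}".format(found, status))
```

The script avoids f-strings so that it still runs, and reports the problem, on an older interpreter.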
geckodriver allows the Selenium Python package to drive Firefox.
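One way to verify this piece of the setup is to check that `geckodriver` is on the `PATH` and then start Firefox once through Selenium. The sketch below is a hedged example, not Raspador code: the function names are hypothetical, and `firefox_smoke_test` assumes the `selenium` package is already installed (for example, in the virtual environment created in the next step).

```python
import shutil


def geckodriver_path():
    """Return the full path to geckodriver if it is on PATH, else None."""
    return shutil.which("geckodriver")


def firefox_smoke_test(url="about:blank"):
    """Start Firefox through Selenium, load a page, and return its title."""
    # Imported here so the PATH check above works before Selenium is installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return driver.title
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


if __name__ == "__main__":
    path = geckodriver_path()
    if path is None:
        print("geckodriver was not found on PATH")
    else:
        print("geckodriver found at", path)
```

Calling `firefox_smoke_test()` opens and closes a Firefox window; if it raises an error, the driver or browser installation is the first thing to check.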
Create a virtual Python environment for running Raspador.
# in the raspador root directory
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
cd development_packages/data_layer/packages/mysql-connector-python-2.1.7
python setup.py install
cd ../..
python setup.py develop
cd ..
deactivate
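If it is unclear whether the virtual environment is active in a given terminal, the interpreter itself can report it. The following is a minimal sketch (the function name `in_virtualenv` is hypothetical) that relies on the standard `venv` behavior of `sys.prefix` differing from `sys.base_prefix` inside an environment.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

Run after `source .venv/bin/activate`, the interpreter path printed should point inside the `.venv` directory.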
Create a user profile in Firefox for the scraper to use.
Open the raspador root directory in Visual Studio Code and run Raspador from the terminal.
source .venv/bin/activate
python main.py <CONFIGURATION> <STARTDATE> <ENDDATE>
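For repeated or scheduled runs, the same command can be launched from a small Python wrapper. This is only a sketch under stated assumptions: `build_command` and `run_scraper` are hypothetical helpers, and the three arguments are passed through unchanged because the accepted configuration names and date format are defined by `main.py`, not by this example.

```python
import subprocess
import sys


def build_command(configuration, start_date, end_date, script="main.py"):
    """Build the argument list for one Raspador run."""
    # The three values are passed through unchanged; this sketch does not
    # assume any particular configuration name or date format.
    return [sys.executable, script, configuration, start_date, end_date]


def run_scraper(configuration, start_date, end_date):
    """Run main.py from the current directory and return its exit code."""
    command = build_command(configuration, start_date, end_date)
    return subprocess.run(command).returncode
```

Using `sys.executable` means the wrapper reuses whichever interpreter started it, so launching it from the activated virtual environment keeps the installed requirements available to `main.py`.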
apt-get update
apt-get install docker.io
# add permissions for the user who will run docker images
usermod -a -G docker <USER>
# in the project root
docker build -t raspador .
docker run --rm --privileged -p 4000:4000 -it raspador bash /usr/src/app/run.sh --help