xyla-io / almacen

A Pythonic ETL application for multi-source mobile marketing data.

Almacén

Almacén schedules the fetching and processing of raw reports and archives the results to the data warehouse.

Install PostgreSQL

Ubuntu

apt update
apt install postgresql
apt install postgresql-server-dev-all

OS X

brew install postgresql

Install Python Virtual Environment

Ubuntu

apt update
apt install software-properties-common
add-apt-repository ppa:deadsnakes/ppa
apt install python3.7
apt install python3.7-venv
apt install python3.7-dev

OS X

brew install python3
python3 -m venv venv

Install R

Install R as root. It provides the statistical tools.

Ubuntu

echo "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
apt-get update
apt-get install r-base
# verify that R was installed
R --version
# install system dependencies for the R devtools packages
apt-get install build-essential libcurl4-gnutls-dev libxml2-dev libssl-dev

OS X

brew install r
brew install libgit2

Install required R packages

Install R packages as the user hosting Almacén.

# In R

# CRAN packages
install.packages("devtools")

# devtools packages
library(devtools)
devtools::install_github('adjust/api-client-r')
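
Once the system packages above are installed, it may help to confirm that the main executables are reachable before continuing. The following is a minimal sketch, not part of Almacén: the helper name `missing_tools` and the list of executable names are assumptions, and the names may differ on your system (for example, the deadsnakes package installs the interpreter as `python3.7`).

```python
import shutil

# Assumed executable names for PostgreSQL, Python, and R; adjust as needed.
REQUIRED_TOOLS = ["psql", "python3", "R"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools were found.")
```

An empty result only shows that the executables exist on PATH. It does not check versions or that the PostgreSQL server is running.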

Run the install script

./install.sh

Set up database environment

source venv/bin/activate
python prepare.py -t all -s <SCHEMA>
deactivate
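
The same step can be scripted without activating and deactivating the virtual environment, since running the interpreter inside `venv/bin` uses the packages installed in that environment. This is a hedged sketch: the helper names `prepare_command` and `prepare_database` are hypothetical, the `-t all` and `-s` flags are copied from the command above without further knowledge of `prepare.py`, and the `venv` directory name is assumed to match the one created during installation.

```python
import subprocess
from pathlib import Path


def prepare_command(schema, venv_dir="venv"):
    """Build the prepare.py command line using the venv's own interpreter."""
    if not schema:
        raise ValueError("a schema name is required")
    # Assumes a POSIX-style venv layout (venv/bin/python).
    python = Path(venv_dir) / "bin" / "python"
    return [str(python), "prepare.py", "-t", "all", "-s", schema]


def prepare_database(schema, venv_dir="venv"):
    """Run prepare.py and raise CalledProcessError if it exits non-zero."""
    subprocess.run(prepare_command(schema, venv_dir), check=True)
```

Calling `prepare_database("my_schema")` from the repository root would then be roughly equivalent to the three shell commands above, where `my_schema` is a placeholder for your own schema name.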

Run tests

Run tests with the command:

python -m pytest [--pdb] [-s] [-k TEST_NAME] [TEST_PATH]

Running tests with the bare pytest command results in import errors, most likely because, unlike python -m pytest, it does not add the current directory to sys.path. It may be possible to fix this.
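
For scripted test runs, the command can be assembled from Python so that it always goes through `python -m pytest`. This is an illustrative sketch rather than part of the project: the function names `pytest_command` and `run_tests` and their parameters are assumptions. The flags are standard pytest options: `--pdb` opens the debugger on failure, `-s` disables output capture, and `-k` selects tests by name expression.

```python
import subprocess
import sys


def pytest_command(pdb=False, show_output=False, keyword=None, path=None):
    """Build a `python -m pytest` command line from the optional flags."""
    # sys.executable keeps the run inside the current (virtualenv) interpreter.
    command = [sys.executable, "-m", "pytest"]
    if pdb:
        command.append("--pdb")
    if show_output:
        command.append("-s")
    if keyword:
        command += ["-k", keyword]
    if path:
        command.append(path)
    return command


def run_tests(**options):
    """Run the tests and return pytest's exit code (0 means all passed)."""
    return subprocess.run(pytest_command(**options)).returncode
```

For example, `run_tests(keyword="TEST_NAME", path="TEST_PATH")` corresponds to `python -m pytest -k TEST_NAME TEST_PATH`.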

License: MIT License