avgra3 / Medical_Data_Wrangling

Prepare and load text data into a MariaDB database


Medical Data Wrangling


The goal of this project is to automate the loading of data into a MariaDB database with data validation.

Setup

Prepare Environment variables

Rename EXAMPLE.env to .env and set DATA_PATH to the path to your data, as a string. Then set DATA_TYPES to a dictionary whose keys are the column names and whose values are the corresponding data types. Follow the example below:

DATA_PATH = './path/to/file'
DATA_TYPES = '{"column_name_1": dataType_1, "column_name_2": dataType_2}'
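As a rough illustration of how these two variables might be consumed, the sketch below reads them from an environment mapping and parses DATA_TYPES as JSON. This is not taken from the project's code: the function name `load_settings` is hypothetical, and it assumes the data type values are written as quoted strings (for example "int64") so that the dictionary is valid JSON.

```python
import json
import os


def load_settings(environ=os.environ):
    """Read DATA_PATH and DATA_TYPES from an environment-style mapping."""
    data_path = environ["DATA_PATH"]
    # Assumption: DATA_TYPES is a JSON object whose values are quoted type names.
    data_types = json.loads(environ["DATA_TYPES"])
    if not isinstance(data_types, dict):
        raise ValueError("DATA_TYPES must map column names to data types")
    return data_path, data_types


# Example values only; a real run would take them from the .env file.
example_env = {
    "DATA_PATH": "./path/to/file",
    "DATA_TYPES": '{"column_name_1": "int64", "column_name_2": "str"}',
}
path, types = load_settings(example_env)
print(path, types)
```

Passing a plain dictionary instead of `os.environ` keeps the sketch runnable without a .env file; in practice the values would first be loaded into the process environment.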

Anaconda/Miniconda

Windows

From the command line or PowerShell, with Anaconda installed, run the following:

conda env update --file environment.yml --name data_loading
conda activate data_loading

Linux (Untested)

From your terminal, with Anaconda/Miniconda installed, run the following:

conda env update --file environment.yml --name data_loading
conda activate data_loading

NOTE: Before running the code, make sure the MariaDB C connector (MariaDB Connector/C) is installed.

Pip Method

You will need to verify that all packages are usable on your machine and installable with pip. It is also recommended that you use a virtual environment such as pyenv. As with the Anaconda/Miniconda method, make sure the MariaDB C connector is available on your machine.
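One way to check that the required packages are available after a pip install is a small script like the one below. It is only a sketch: the helper `missing_packages` and the module names in `required` are hypothetical placeholders, and the real list should come from environment.yml.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Hypothetical list; replace with the modules named in environment.yml.
required = ["mariadb", "pandas"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Note that `importlib.util.find_spec` only locates a module without importing it, so this confirms a package is installed but not that its native dependencies, such as the MariaDB C connector, load correctly.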
