Benita Diop's repositories
FullStackBigData-with-SPARK
Pulled 10 GB of Yelp Business data through the terminal via the Kaggle API. The data was then pushed to an AWS S3 bucket for storage and analyzed on an Elastic MapReduce cluster in a Jupyter Notebook using PySpark.
MiniBOONEClassification
This dataset has 130K observations with 50 features. The features are measurements of Cherenkov light and scintillation light using hit topology and timing. There are 36.5K observations for electron neutrinos and 93.5K observations for muon neutrinos, which yields an imbalance ratio (minority to majority) of 0.39.
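As a quick check of the figures quoted above, the ratio can be reproduced with a few lines of Python. This is a minimal sketch that assumes the imbalance ratio means the minority class count divided by the majority class count; the variable names are illustrative and do not come from the repository.

```python
# Class counts as quoted in the description.
n_electron = 36_500  # minority class: electron neutrinos
n_muon = 93_500      # majority class: muon neutrinos

# Assumption: imbalance ratio = minority count / majority count.
imbalance_ratio = n_electron / n_muon

# Share of the minority class in the full dataset, for comparison.
minority_share = n_electron / (n_electron + n_muon)

print(round(imbalance_ratio, 2))  # 0.39
print(round(minority_share, 2))   # 0.28
```

Under that assumption, electron neutrinos make up about 28% of the 130K observations.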
AnalysisOfCrimeInIndia
The dataset for this regression analysis comes from Kaggle and is titled "Crimes in India". It holds complete information about various aspects of crimes that took place in India over a 17-year span, from 2001 to 2018.
PythonMicroserviceDeployment_SocrataAPI
In this project I used Socrata OPCV API data to build a pipeline of logs from a Docker container to the Elasticsearch and Kibana stack, where the data was collected, analyzed and turned into visuals. The scripts were written in Python and accept command-line arguments on UNIX/Linux operating systems. They were tested for reproducibility by provisioning and configuring an AWS EC2 instance that ran a Docker container, which read in the Python script, parsed the parameters from the command line and pulled the API JSON logs. Git was used for version control and to prevent conflicts between concurrent work.
StatisticalHypothesisTesting
Hypothesis tests on four datasets in SAS, using Hotelling's T-squared test to validate all parametric estimates and to decide whether to accept or reject the given hypothesis.
coding-interview-university
A complete computer science study plan to become a software engineer.
courses-introduction-to-sql
Introduction to SQL by Nick Carchedi
DataStructures-Algorithms
Master Data Structures & Algorithms With Me =]
DesignPatterns101
Let's Fall In Love With Design Patterns Together!
Linear-Algebra
Linear Algebra
NaturalScienceSeminar
Returning to my alma mater to give a talk on statistics and data science. This repo is to provide attendees of the Natural Science Seminar with all the material covered during the talk.
OOP-in-Python
Master Object Oriented Programming in Python With Me =]
python_advanced
Mini series on Python (following the basic series)
SQL-Leetcode-Challenge
Contains all 117 LeetCode questions with their solutions in MySQL, ranging from Easy to Hard.