randleon

Randy Leon's repositories

Information-Architectures

assignments and projects for Yeshiva University's Katz School Information Architectures course, spring 2020

Language: Jupyter Notebook

👋 Hi, I’m Randy! 👀 I’m interested in becoming a data scientist. 🌱 I’m currently learning Python, SQL, Tableau, and AWS. 💞️ I’m looking to collaborate on beginner-to-intermediate data science projects to showcase some skills! 👀 Some of my interests include weightlifting, geopolitics, and Yu-Gi-Oh the Card Game.

Analytics-Programming

assignments and projects for Yeshiva University's Katz School Analytics Programming course, fall 2019

Language: Jupyter Notebook

BI-Dashboards

Sample dashboards to showcase my work in database and BI tools

License: GPL-3.0

Cleaning-a-Messy-Data-Set-w-Python

Cleaning a wine data set using Python 3 in a Jupyter Notebook. Packages include Seaborn, NumPy, and Sklearn.

Language: Jupyter Notebook

MS-Excel-Projects-Using-Healthcare-Data

Projects done using MS Excel

Naive-Bayes-Sentiment-Analysis_Using_Beautiful_Soup

Naïve Bayes classifiers are widely recognized for their efficacy at classifying text data (e.g., sentiment analysis). Many organizations rely on sentiment analysis algorithms to gauge the opinions of both existing and potential customers. Applying these algorithms to online product and service reviews helps inform business decisions.

Language: Jupyter Notebook
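
As a rough illustration of the Naïve Bayes idea described above (not the repository's own code), here is a minimal multinomial classifier with add-one smoothing written in plain Python. The function names `train_naive_bayes` and `predict`, the whitespace tokenizer, and the tiny review set are all assumptions made for this sketch.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in text.lower().split():  # naive whitespace tokenizer
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy reviews, for illustration only.
reviews = [
    ("great product love it", "pos"),
    ("terrible waste of money", "neg"),
    ("love the quality", "pos"),
    ("awful terrible experience", "neg"),
]
model = train_naive_bayes(reviews)
```

Calling `predict(model, "love it")` on this toy data returns `"pos"`. A real project would more likely use a library implementation and a proper tokenizer.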

Portfolio

in progress

Structured-Data-Management-SQL-

assignments and projects for Yeshiva University's Katz School Structured Data Management course, fall 2020

Capstone

Final Project at YU

Data-Analysis

Decision-Tree-versus-Random-Forest-Performance-on-NY-State-Graduation-Data

Decision trees and random forests can both be very effective when applied to classification problems. In this example, we compared the trade-off between performance and complexity for the two models using Pandas and NumPy.

Language: Jupyter Notebook

Linear-Regression-Using-Sklearn-in-Python

Linear Regression project on automobile data featuring checks using k-fold cross validation.

Language: Jupyter Notebook
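
To make the k-fold cross-validation check mentioned above concrete, here is a minimal sketch in plain Python rather than the repository's Sklearn code. It fits a one-variable least-squares line on each training split and reports the mean squared error on the held-out fold. The helper names (`fit_line`, `k_fold_indices`, `cross_validate`) and the synthetic data are assumptions for illustration, and the folds are contiguous and unshuffled, which real data would usually not want.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k=5):
    """Return the held-out mean squared error for each of the k folds."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        intercept, slope = fit_line(train_x, train_y)
        squared_errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx]
        scores.append(sum(squared_errors) / len(test_idx))
    return scores


# Synthetic, noise-free data (hypothetical): y = 2x + 1.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
scores = cross_validate(xs, ys, k=5)
```

Because the synthetic data lie exactly on a line, every fold's error is effectively zero. On real automobile data the spread of the per-fold errors is what indicates how stable the model is.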

Querying-a-Basketball-Ticket-Dataset

REFERENCE-stats_tutorials

Language: Jupyter Notebook, License: GPL-3.0

Visual-Design-and-Storytelling

assignments and projects for Yeshiva University's Katz School Visual Design and Storytelling course, fall 2019

Binary-Logistic-Regression-On-Insurance-Company-Data

Language: Jupyter Notebook

Can-we-predict-if-a-mushroom-is-poisonous-

Prepared the UCI Mushroom data for construction of predictive models. My team and I also cross-trained the models for accuracy and precision.

Language: Jupyter Notebook

Clustering_and_SVM_to_Predict_Online_Purchases

Of particular interest to most online retailers is whether a site visitor ends up making a purchase while on the website. We used supervised learning methods such as K-nearest neighbors and support vector machines in Python to predict whether online shoppers would make a purchase.

Language: Jupyter Notebook

Excel-Exercise-Using-Shipping-Data

Allocated columns of missing data to other workbooks. I transformed and concatenated the column data across workbooks using VLOOKUP, then combined all the tables into one new sheet for reference.

Feature-Selection-and-Dimensionality-Reduction

Data science project applying feature selection and dimensionality reduction techniques to identify the explanatory variables for a linear regression model that predicts how many times an online news article will be shared. Built with Python 3 in a Jupyter Notebook.

Language: Jupyter Notebook

Implementing-a-Series-of-Regression-Models-on-School-Data

Constructing and comparing a series of regression models that predict the number of student “dropouts” in a school dataset relative to certain characteristics of a given school district and its associated student subgroupings.

Language: Jupyter Notebook

K-Nearest-Neighbors-and-Support-Vector-Machine-Models-on-Insurance-Data

Python project that used KNN and SVM models to classify insurance data found on Kaggle.com

Language: Jupyter Notebook

MLB_SQL_Queries

The data is a compilation of historical baseball statistics in a convenient, tidy format, distributed under Open Data terms and downloaded from Sean Lahman's website. It includes the following tables:
• People – player names, DOB, and biographical info
• Batting – batting statistics

nyc-oath-hearings-analysis

Exploratory analysis of NYC OATH Hearings dataset (violations, decisions, and collections) using Python + PostgreSQL.

Language: Python

Sales-Dashboard-PowerBI

Making a dynamic sales dashboard from sample sales data

Sentiment-Analysis---A-Machine-Learning-Approach-into-Hideo-Kojima-s-Divisive-Platformer

Our team performed sentiment analysis on tweets posted in anticipation of Hideo Kojima's 2019 video game release, Death Stranding. We sourced the tweets from two libraries, preprocessed them, stored them in MongoDB, and then performed sentiment analysis.

Language: Jupyter Notebook

Understanding-Classification-Model-Performance-Metrics-On-Diabetes-Dataset

Evaluation of the performance of classification models can be facilitated through a combination of calculating certain types of performance metrics and generating model performance evaluation graphics. The purpose of this exercise is to calculate a suite of classification model performance metrics via Python code functions.

Language: Jupyter Notebook
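
The kind of metric functions described above can be sketched as follows. This is not the repository's code: the function names, the choice of `1` as the positive label, and the example labels are assumptions for illustration. It derives accuracy, precision, recall, and F1 from the four confusion-matrix counts of a binary classifier.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute common metrics, returning 0.0 where a denominator is zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

For these example labels there are 3 true positives, 1 false positive, 3 true negatives, and 1 false negative, so all four metrics come out to 0.75.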