Sam Ehrlich's repositories
bat_speed
Exploration of the new bat speed and swing length data in Statcast
StuffModel6-24-24
A revision to my stuff model from last year with the inclusion of new features, parameter tuning, and visualizations
RiotAWSHackathon
RiotAWSHackathon
BaseballSavantShinyVisuals
Visualizations recreated from Statcast data in a Shiny app
amazon-sagemaker-examples
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
BEA_API_Visualizations
In this notebook, I connect to the Bureau of Economic Analysis API to request data on GDP for all counties in the US from 2001-2021. I then do some cleaning and make some visualizations out of the data.
GCP_reddit_sentiment
Uses GCP to scrape a Reddit RSS feed, pass the posts to GCP's NLP, and store all the data in a database. I then connect to the database from an outside source by IP address and query and visualize the results.
PitchLocationShiny
Pitching summary for pitchers in the 2016 World Series
SentimentAnalysisReddit
This project requests posts from various subreddits. Once the text posts are loaded, they are transferred to a SQL database, where they are converted to tsvectors for database querying. The text is analyzed for sentiment, and the overall sentiment is visualized across many weeks.
GolfProject
This was my portion of the work on a summer project. In this project, I analyze factors that may impact the performance of golfers in the PGA from 2015-2022.
SoniqsDataAnalysis
In this project, I scrape data from the SiegeGG site to analyze the Rainbow Six Siege playoff games played on Feb 22, 2022.
Riot_API_project
In this project, I extract, transform, and load data from the Riot Games API. I pull data from my recent games and use it to create visualizations and predictive models to learn more about my games. I am still working on this project, but I wanted to get my start posted here.
SuperbowlSQLQueries
I run different queries against a Super Bowl data table. I download the data through SQLAlchemy and then query it on the server provided by my school. I copy and paste the results of my queries into markdown cells.
SQLSpotifyTop50
This notebook queries a dataset of Spotify's Top 50 songs of 2019 in different ways. I use Python's SQLAlchemy to convert the CSV to SQL, then use PGSQL to run the queries in the terminal provided by the University of Missouri.
SpotifyTop50-2019
This is a graph I made in R with ggplot2 from a CSV of the top 50 songs on Spotify in 2019, to show some graphing skills.
SQLChicagoCrimeQueries
In this notebook, I import a CSV file into pandas. I clean the data and load it into a SQL server. I then query the data for useful information related to crime in Chicago.
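The import, clean, load, and query workflow described for SQLChicagoCrimeQueries can be sketched roughly as follows. This is only an illustration, not the notebook's actual code: it swaps pandas and the school-hosted SQL server for Python's standard-library `csv` and an in-memory `sqlite3` database so it runs anywhere, and the sample rows, column names, and function names are all made up.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for the real crime CSV; the column
# names and values are invented for illustration only.
RAW_CSV = """id,primary_type,arrest,year
1,THEFT,true,2019
2,BATTERY,false,2019
3,theft ,false,2020
4,,true,2020
5,THEFT,true,2020
"""


def load_clean_rows(text):
    """Parse CSV text, drop rows with no crime type, and normalize fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        crime = rec["primary_type"].strip().upper()
        if not crime:
            continue  # skip records with a missing crime type
        arrest = int(rec["arrest"].strip().lower() == "true")
        rows.append((int(rec["id"]), crime, arrest, int(rec["year"])))
    return rows


def build_db(rows):
    """Load the cleaned rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crimes ("
        "id INTEGER PRIMARY KEY, primary_type TEXT, arrest INTEGER, year INTEGER)"
    )
    conn.executemany("INSERT INTO crimes VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def counts_by_type(conn):
    """Count incidents per crime type, most frequent first."""
    cur = conn.execute(
        "SELECT primary_type, COUNT(*) AS n FROM crimes "
        "GROUP BY primary_type ORDER BY n DESC, primary_type"
    )
    return cur.fetchall()


conn = build_db(load_clean_rows(RAW_CSV))
print(counts_by_type(conn))
```

Cleaning before the insert keeps the SQL side simple: the `GROUP BY` does not need to account for stray whitespace, mixed case, or blank crime types.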
DataVisualizationFinal
This is my final for DSA7040. In this project, I explore a Steam dataset containing the top 100 video games for each month from 2012-2021. I use ggplot to plot different graphs that tell a story of popularity throughout the years.