Derek Lilienthal's repositories
Multiclass-Text-Classification-with-DistilBERT-on-COVID-19-Tweets
I implement a deep learning network that classifies COVID-19 tweets into either 5 or 3 categories, using DistilBERT (a lighter version of BERT) as an embedding layer followed by an LSTM and a dense layer. I achieve 65% accuracy with 5 categories and 80% accuracy with 3 categories.
GDELT-Research-In-The-South-Pacific
In this research, we quantify foreign actors' media events in the South Pacific that involve an environmental theme. We use the GDELT (Global Database of Events, Language, and Tone) database to compare the tones of articles produced by Western, Chinese, and South Pacific (Local) media sources when the article involves an environmental theme and a great power (United States, China, Australia, New Zealand, Japan, and Russia) is involved as an actor. Comparing the three groups, we found that the average tone of Western sources is negative, the average tone of Local sources is slightly positive, and the average tone of Chinese sources is very positive. To compare the difference in means between each pair of source groups, we used Welch’s two-sample t-test, because the Western, Chinese, and Local tones each followed a normal distribution but had unequal variances. Our statistical analysis found strong evidence that the difference in mean tone is statistically significant for every pairwise comparison among the three media sources.
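As a rough illustration of the Welch's t-test step described above, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t_test` and the tone values are hypothetical and made up for illustration; they are not taken from the repository or from GDELT.

```python
import math
from statistics import mean, variance


def welch_t_test(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Hypothetical helper for illustration; assumes each sample has
    at least two observations and non-zero variance.
    """
    n_a, n_b = len(a), len(b)
    # Squared standard error of each group mean (sample variance uses n - 1).
    se2_a = variance(a) / n_a
    se2_b = variance(b) / n_b
    # Unlike Student's t-test, the variances are not pooled.
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up tone scores, NOT data from the study.
western_tones = [-2.1, -1.4, -3.0, -0.8, -1.9]
chinese_tones = [2.5, 3.1, 1.8, 2.9, 4.2, 3.3]

t_stat, dof = welch_t_test(western_tones, chinese_tones)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

A p-value would then come from the t distribution with `dof` degrees of freedom; a statistics library can compute that, and the analysis itself may well have used one rather than a hand-written function like this.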
Multi-Label-Classification-Tutorial-for-NLP
This repository contains a Jupyter notebook showing how to do multi-label classification for NLP using the scikit-learn library.
Web-Scraping
This repository contains my Python notebook, which walks through each logical step of web scraping a job board website.
dblilienthal.github.io
https://dblilienthal.github.io/
Evirn-Sci-Survey
Survey application
Household-Energy-Forecasting
Forecasting energy usage from historic data
Python-Function-Commenter
In this repository, I create an encoder-decoder machine learning model using RNN and Transformer architectures to generate comments when given a Python function.
Visualizing-Pneumonia-using-Deep-Learning
In this repository, I implement a deep convolutional model and visualize its predictions using the Grad-CAM visualization method on chest X-rays of patients with and without pneumonia. I am able to achieve 84% accuracy with a model only ~2 MB in size.
Watsonville-Environmental-Science-Workshop-Survey-Visualizer
Simple time-series data visualizer using Streamlit
Who-What-When-and-Where-are-the-Data-Scientist-Jobs
This repository contains an HTML dashboard that tells a story about WHO is hiring, WHEN you are ready, WHAT you need to know, and WHERE the Data Scientist jobs are. I web scraped all of the data presented from Dice.com from July 2021 to August 2021.
Identifying-Quora-Question-Pairs
In this project, I use sentence embeddings to identify question pairs.
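One common way to compare two sentence embeddings is cosine similarity with a decision threshold. The sketch below shows that idea under loud assumptions: the repository may use a different comparison or a trained classifier, the function names and the 0.8 threshold are hypothetical, and the three-dimensional vectors are toy values standing in for real model output.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def is_question_pair(emb_a, emb_b, threshold=0.8):
    """Hypothetical rule: treat two questions as a pair if similarity >= threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy embeddings, NOT produced by any real sentence-embedding model.
q1 = [0.9, 0.1, 0.3]
q2 = [0.8, 0.2, 0.4]   # points in nearly the same direction as q1
q3 = [-0.2, 0.9, -0.5]  # points in a very different direction

print(is_question_pair(q1, q2))
print(is_question_pair(q1, q3))
```

In practice the threshold would be tuned on labeled question pairs rather than fixed by hand.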
Interactive-Handwritten-Digit-App
In this project, you can draw a digit with your mouse and the deep learning model will tell you which number you drew.
Machine-Learning-Projects
This repository is a collection of machine learning projects I have done in my spare time, as well as some of the projects I have completed at California State University Monterey Bay for the Data Science course.
Multiclass-NLP-Tutorial
A demonstration I made showing many different machine learning algorithms for tackling a multiclass classification NLP problem.
pynsist
Build Windows installers for Python applications
Python-Web-Scrapping
In this repository, I show some web scraping using Selenium and Jupyter notebooks.
Running-MySQL-From-Python
This is a Jupyter notebook containing information about how to insert data into and run queries against a MySQL instance from Python.
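The insert-then-query pattern can be sketched as follows. To keep the example runnable without a database server, it swaps in the standard-library `sqlite3` module for MySQL; this is an assumption for illustration, not what the notebook uses. MySQL drivers for Python follow the same DB-API shape (connect, cursor, execute, commit, fetch), but they take server credentials when connecting and typically use `%s` rather than `?` as the parameter placeholder. The `jobs` table and its rows are hypothetical.

```python
import sqlite3

# sqlite3 stands in for a MySQL connection here (assumption for illustration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table used only for this example.
cur.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT, city TEXT)")

# Insert several rows with a parameterized statement instead of string
# formatting, which avoids SQL injection.
rows = [("Data Scientist", "Monterey"), ("Data Analyst", "Salinas")]
cur.executemany("INSERT INTO jobs (title, city) VALUES (?, ?)", rows)
conn.commit()

# Run a parameterized query and collect the results.
cur.execute("SELECT title FROM jobs WHERE city = ?", ("Monterey",))
titles = [row[0] for row in cur.fetchall()]
print(titles)

conn.close()
```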
Sarcastic-Headline-Detection-with-BERT
In this repository, I implement a fine-tuned BERT model using Keras and TensorFlow to detect sarcastic headlines.
titanic-predictions-heroku
This is my first machine learning web app, which I deployed using Heroku. The app uses a classification tree to predict whether or not someone would have survived on the Titanic.