Cynthia Correa's repositories
TED-plus-and-GenderListener
Dataset of all online TED talks including features from metadata, audio, transcripts, and sound-derived speaker gender labels generated using GenderListener.
Global-City-Climate-Time-Series-Forecasting
Perform autoregression to predict climate change over the next decade from a dataset of temperatures and other environmental metrics for hundreds of cities worldwide.
Time-Series-Forecasting-on-Stocks-Data-2017
Project repository for a stock forecasting and anomaly detection web app
DSLA-TwitterStreamer
Notebook to retrieve a stream of tweets and save to a .txt file
iTunes-SQL-Machine-Learning
Make an SQL database out of your iTunes music library and use Python and machine learning algorithms to predict star ratings for all your songs.
autoregression
Using autoregression on stock and temperature data
InstacartDataManipulation_DSLA
Exploratory analysis of the Instacart grocery store purchase data
BizOpsTools
Code for tools developed by the BizOps team
Characterizing_FitBit_walking_data
I clean up, munge, plot, and characterize personal movement monitoring data. This project offers examples of how to use the lubridate, plyr, and knitr R packages.
Classifying_FitBit_data_with_Random_Forest_model
I test my machine learning prowess by predicting whether entries in FitBit data correspond to walking, running, or going up stairs. The Random Forest algorithm gives almost perfect accuracy and correctly predicts the 20 test cases.
cynco.github.io
Files for my user page, which you can view at
Generate_plots_of_UCI_Machine_Learning_Repository_data
Using data from the UCI Machine Learning Repository, we create code that generates various plots of energy consumption over time.
Looker_Guides
Guides to get up and running with Looker!
mnist_digit_classification
Use a Keras neural network model to perform digit recognition on the MNIST dataset
ODSC-East2018
"Network/Graph Analysis in Python": repository for the 3-hour training session held at ODSC East 2018.
PySpark-Installation
Install and try out PySpark on your local machine
pystock-data
US stock market data since 2009
scikit-learn
scikit-learn: machine learning in Python
SciPy_classification_of_Iris_dataset
I use SciPy to train 6 ML algorithms on the Iris dataset to predict the species of each sample based on the petal and sepal length and width. I use a test harness with 10-fold cross-validation. KNN gives the best results, with 90% accuracy on the validation set.
sdk-examples
Example source code and projects for the Looker SDKs
spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
Spyre_WebApp_Samples-DSLA
Several working samples of web apps made using Spyre and Python 3.
startbootstrap-sb-admin-2
A free, open source, Bootstrap admin theme created by Start Bootstrap
Trends_in_FPM_pollution_by_city
Analysis and visualization of trends over time by city in fine particulate matter pollution from the EPA's National Emissions Inventory.