MystikHub / berkeley-time-series

Time series prediction using python and scikit learn

Berkeley Earth Climate Data Time Series Analysis

Dependencies

This project uses Python and requires the following packages from pip:

  • matplotlib
  • numpy
  • pandas
  • scikit-learn (imported as sklearn)

Preparing the feature sets

Download the data from Berkeley Earth and extract the "GlobalLandTemperaturesByCity.csv" file into the repository's directory. Then run python3 make_feature_sets.py to export the feature set files into a feature-sets directory.
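
As a rough illustration of what a feature set for time series prediction can look like, the sketch below builds lagged-temperature features for one city with pandas. The column names (dt, AverageTemperature, City) are assumed to match the Berkeley Earth CSV. The helper name build_lag_features, the number of lags, and the in-memory sample data are hypothetical, and the features that make_feature_sets.py actually exports may differ.

```python
import pandas as pd


def build_lag_features(df, city, n_lags=3):
    """Return the target temperature plus n_lags earlier readings per row.

    Hypothetical helper for illustration, not the repository's code.
    """
    series = (
        df.loc[df["City"] == city, ["dt", "AverageTemperature"]]
        .assign(dt=lambda d: pd.to_datetime(d["dt"]))
        .sort_values("dt")
        .set_index("dt")["AverageTemperature"]
    )
    features = pd.DataFrame({"target": series})
    for lag in range(1, n_lags + 1):
        # Column lag_k holds the reading from k rows (months) earlier.
        features[f"lag_{lag}"] = series.shift(lag)
    # Drop rows with missing readings or an incomplete lag window.
    return features.dropna()


# Made-up sample standing in for GlobalLandTemperaturesByCity.csv.
sample = pd.DataFrame(
    {
        "dt": ["2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01", "2000-05-01"],
        "AverageTemperature": [1.0, 2.0, 4.0, 7.0, 11.0],
        "City": ["Dublin"] * 5,
    }
)
lagged = build_lag_features(sample, "Dublin", n_lags=2)
print(lagged)
```

With the real data, the sample frame would be replaced by pd.read_csv("GlobalLandTemperaturesByCity.csv").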

Running the machine learning models

You can then configure and run whichever step you want for each model (e.g. cross-validation, training, or evaluation) in that model's file. The step is generally selected by a string near the top of the file.
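
One way such a string-based switch could be laid out is sketched below. The variable name MODE, the option values, and the three placeholder functions are all assumptions made for illustration; check each model's file for the names and values it really uses.

```python
# Hypothetical configuration string, edited by hand before running the script.
MODE = "evaluate"  # assumed options: "cross-validate", "train", "evaluate"


def cross_validate():
    # Placeholder for a penalty search.
    return "cross-validating"


def train():
    # Placeholder for fitting the model.
    return "training"


def evaluate():
    # Placeholder for scoring the model against the baseline.
    return "evaluating"


ACTIONS = {
    "cross-validate": cross_validate,
    "train": train,
    "evaluate": evaluate,
}


def run(mode):
    """Look up and run the step named by the configuration string."""
    if mode not in ACTIONS:
        raise ValueError(f"unknown mode {mode!r}, expected one of {sorted(ACTIONS)}")
    return ACTIONS[mode]()


result = run(MODE)
print(result)
```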

Here's what each file does:

  • lasso.py: cross-validates the lasso regression penalty, then trains and evaluates a lasso regression model against an "average of features" baseline predictor
  • linear.py: trains and evaluates a linear regression model against an "average of features" baseline predictor
  • load_feature_sets.py: contains helper functions used by the three models for loading the feature data generated by make_feature_sets.py
  • make_feature_sets.py: reads the entire Berkeley Earth data set and writes each country's and city's feature set files into a feature-sets directory
  • ridge.py: cross-validates the ridge regression penalty, then trains and evaluates a ridge regression model against an "average of features" baseline predictor
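
The workflow described for lasso.py and ridge.py (cross-validate the penalty, fit, then compare against a baseline that predicts the average of the features) might look roughly like the sketch below. It uses synthetic data in place of a real feature set, and the alpha grid, fold count, and train/test split are assumptions rather than the repository's settings.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a feature set: 5 features per row, and a target
# that is a noisy weighted mix of them.
X = rng.normal(loc=10.0, scale=3.0, size=(300, 5))
true_weights = np.array([0.5, 0.3, 0.1, 0.1, 0.0])
y = X @ true_weights + rng.normal(scale=0.5, size=300)

# shuffle=False keeps the rows in order, as one would for a time series.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False
)


def fit_with_cv(estimator, alphas):
    """Pick the penalty (alpha) by 5-fold cross-validation, then refit."""
    search = GridSearchCV(
        estimator, {"alpha": alphas}, cv=5, scoring="neg_mean_squared_error"
    )
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_["alpha"]


alphas = [0.001, 0.01, 0.1, 1.0, 10.0]  # assumed search grid
results = {}
for name, estimator in [("lasso", Lasso(max_iter=10000)), ("ridge", Ridge())]:
    model, best_alpha = fit_with_cv(estimator, alphas)
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[name] = (best_alpha, test_mse)

# Baseline: predict the plain average of each row's features.
baseline_mse = mean_squared_error(y_test, X_test.mean(axis=1))

for name, (best_alpha, test_mse) in results.items():
    print(f"{name}: alpha={best_alpha}, test MSE={test_mse:.3f}")
print(f"baseline test MSE={baseline_mse:.3f}")
```

For strictly ordered data, sklearn.model_selection.TimeSeriesSplit can be passed as cv so that each validation fold comes after its training fold.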

About

License: GNU Affero General Public License v3.0


Languages

Language: Python 100.0%