Machine Learning Framework

This project aims to create a framework that automates basic machine learning, helping a team quickly get initial results and an idea of which algorithms might be useful. It is not a replacement for custom-built systems that leverage machine learning.

Overview

Purpose

The framework helps developers who use machine learning algorithms find the best algorithms and optimal configurations for their specific situation. It does this by recording as much information about a given model as the developer wants, then analyzing all of the data to find which algorithms work best on a dataset and with which settings.

Features

Automatically applies a selection of machine learning algorithms to the input data, based on the settings it is given
Has a large number of tests that make sure the included algorithms still run and haven't become outdated
Can be put in validation or application mode (train/test mode)
Records results from a machine learning algorithm test
Saves results in a Firebase Database
(Coming Soon) View results in a WebUI
(Coming Soon) Analyze data from results in the WebUI
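
The first feature above can be pictured with a small, library-free sketch. Everything in it is hypothetical: the `ALGORITHMS` registry, the `settings` dict, and the two toy models stand in for real scikit-learn algorithms, and none of these names come from the project itself.

```python
# Hypothetical sketch: run every algorithm named in the settings against
# labeled data and rank them by validation error. Not the project's API.

def mean_model(train_rows, train_targets):
    """Toy model: always predicts the mean of the training targets."""
    mean = sum(train_targets) / float(len(train_targets))
    return lambda row: mean

def nearest_neighbor_model(train_rows, train_targets):
    """Toy model: predicts the target of the closest training row."""
    def predict(row):
        distances = [sum((a - b) ** 2 for a, b in zip(row, other))
                     for other in train_rows]
        return train_targets[distances.index(min(distances))]
    return predict

ALGORITHMS = {"mean": mean_model, "nearest_neighbor": nearest_neighbor_model}

def validate(settings, rows, targets):
    """Hold out the last rows, fit each selected algorithm, and score it."""
    split = int(len(rows) * (1 - settings["holdout_fraction"]))
    results = {}
    for name in settings["algorithms"]:
        predict = ALGORITHMS[name](rows[:split], targets[:split])
        errors = [abs(predict(r) - t)
                  for r, t in zip(rows[split:], targets[split:])]
        results[name] = sum(errors) / float(len(errors))
    # Lowest mean absolute error first.
    return sorted(results.items(), key=lambda item: item[1])

rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
targets = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
settings = {"algorithms": ["mean", "nearest_neighbor"], "holdout_fraction": 0.5}
ranking = validate(settings, rows, targets)
print(ranking)
```

Keeping the algorithms in a registry keyed by name means the settings alone decide what gets run, which matches the "based on the settings it is given" wording above.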

Terminology

Test

Unless otherwise stated, when we say "test" we mean a way of determining whether an algorithm works (as opposed to testing the actual code, etc.).

Test Data

Test data is unlabeled data. In other words, it does not have a column or label representing the target that a model is trying to predict. So if a model predicts housing prices, "test data" will not have the housing prices listed.

Train Data

Training data, on the other hand, does have the target column, because the model needs that column in order to be trained.
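
To make the distinction concrete, here is a minimal sketch using the housing example. The column names (`rooms`, `area`, `price`) and the helper `split_features_and_target` are made up for illustration and are not part of the project.

```python
# Hypothetical housing data; column names are illustrative only.
TARGET = "price"

# Train data includes the target column.
train_data = [
    {"rooms": 3, "area": 120, "price": 250000},
    {"rooms": 2, "area": 80, "price": 180000},
]

# Test data has the same feature columns but no target column.
test_data = [
    {"rooms": 4, "area": 150},
]

def split_features_and_target(rows, target):
    """Separate each row into its features and its target value."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets

features, targets = split_features_and_target(train_data, TARGET)
has_target = [TARGET in row for row in test_data]
print(features, targets, has_target)
```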

Result Record

All the information about a test, including the hyper-parameters of the model used, information on the test data, the results of the test, etc.
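
As a rough illustration, a result record could be represented as a JSON-like document. The field names and values below are placeholders chosen for the example, not the project's actual schema or real results.

```python
import json

# Hypothetical result record; every field name and value is an assumption.
result_record = {
    "algorithm": "RandomForestRegressor",
    "hyper_parameters": {"n_estimators": 100, "max_depth": 5},
    "test_data": {"name": "housing", "rows": 500, "columns": 12},
    "results": {"metric": "mean_absolute_error", "score": 0.12},
}

# Serialize to JSON, the kind of structure a JSON document store accepts.
serialized = json.dumps(result_record, sort_keys=True)
restored = json.loads(serialized)
print(serialized)
```

Storing the hyper-parameters next to the score is what would later allow a comparison of settings across runs.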

Getting Started

These instructions will get you a copy of the entire project up and running on your local machine for development and testing purposes. If you wish to deploy submodules individually, please see the instructions for that specific module. See deployment for notes on how to deploy the project on a live system.

Prerequisites

What you need in order to run this project. Detailed instructions are included in "Installing".

A *nix system.
Python 2.7 (latest version) and various Python libraries (scikit-learn, NumPy, SciPy, etc.)
Javascript, NodeJS, ReactJS, among others.

Installing

The following instructions cover setup and install of the entire system.

Python and Related Libraries

Install the latest Python 2.7
On your Unix system, install SciPy
Note that installation may differ between systems

For Ubuntu, try:

sudo apt-get install python-numpy python-scipy python-pandas python-sympy python-nose

Install scikit-learn using pip
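
A quick way to confirm that the libraries are importable before running the full test suite is a check like the one below. This is a sketch, not part of the project: the module list mirrors the packages named above (`sklearn` is the import name of scikit-learn), and `check_imports` is a hypothetical helper.

```python
import importlib

# Import names for the libraries listed above.
REQUIRED_MODULES = ["numpy", "scipy", "pandas", "sklearn"]

def check_imports(module_names):
    """Return a dict mapping each module name to True if it imports."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status

for name, ok in sorted(check_imports(REQUIRED_MODULES).items()):
    print("%-8s %s" % (name, "ok" if ok else "MISSING"))
```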

Javascript, NodeJS, and Related Libraries

Follow the instructions here, including submodules

Test Data

Download and set up the test data so that the unit tests run properly

To verify correct setup, please run the tests.

Running the tests

Go to the top directory of the project and run the following:

python -m unittest discover

This will run tests against every module, including ones that test R and Javascript modules. It will not run all tests, but every module will be covered.
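
For reference, `unittest` discovery picks up files matching `test*.py` by default and runs every `TestCase` method whose name starts with `test`. A minimal sketch of such a file follows; the file name `test_example.py` and the function `mean_absolute_error` are hypothetical and not taken from the project.

```python
# test_example.py (hypothetical file name; discovery matches test*.py)
import unittest

def mean_absolute_error(predictions, targets):
    """Average absolute difference between predictions and targets."""
    total = sum(abs(p - t) for p, t in zip(predictions, targets))
    return total / float(len(targets))

class MeanAbsoluteErrorTest(unittest.TestCase):
    def test_perfect_predictions_have_zero_error(self):
        self.assertEqual(mean_absolute_error([1, 2, 3], [1, 2, 3]), 0.0)

    def test_error_is_averaged(self):
        self.assertEqual(mean_absolute_error([2, 4], [1, 2]), 1.5)
```

Placing a file like this anywhere under the top directory, inside an importable package, should be enough for the discover command above to find it.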

MLTA Testing

Documentation pending

Break down into end to end tests

Explain what these tests test and why

Give an example

And coding style tests

We measure code quality with CodeClimate. To see that data, go here.

Deployment

Pending

Built With

Bash on Ubuntu on Windows
Pycharm
Python, Javascript
Firebase

Libraries, Frameworks, and Packages Used

Python: Scikit-learn, scipy, pandas, numpy
Javascript: NodeJS

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We do not currently use versioning, but we will likely adopt SemVer eventually. For the versions available, see the tags on this repository.

Authors

Alexander Clines - Initial work - asclines
Isaac Griswold-Steiner - Initial work - ASAAR
Zakery Fyke - Initial work - ZakeryFyke
Ryan McBerg - Initial work - RyanMcBerg

See also the list of contributors who participated in this project.

License

We haven't dealt with licensing yet.

Acknowledgements

Hat tip to anyone whose code was used
template for README
The labels used in the issues section were inspired by this site
Issue and PR Templates were inspired by this site

CodeSpaceHQ / MENGEL
