sizigia / disaster_response_ml

A classification model for SOS messages into categories such as food-related, floods, earthquake, etc., to facilitate faster disaster response.

Home Page: https://faustinamaria.com/data-science/disaster-response/

Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Licensing, Authors, and Acknowledgements

Installation

No libraries are required beyond the Anaconda distribution of Python. The code should run without issues on any Python 3.* version.
I used the following libraries and versions:

Project Motivation

The aim of this project was to build a web application that would categorize disaster-related messages.
This was done in two steps, with two pipelines:

  1. Extract-Transform-Load pipeline, where I worked with two CSV files containing messages and categories, and ended up with a database of categorized messages.
  2. Machine Learning pipeline, where I built, tested, and compared two different models, and ended up with one that predicts the categories of new messages with 95% accuracy.
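
The Extract-Transform-Load step above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code in web_app/data/process_data.py. The column names (`id`, `message`, `categories`), the `name-flag` category format, the table name `messages`, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the real CSV column names and layout may differ.
MESSAGES_CSV = """id,message
1,We need water and food
2,The river flooded our street
"""

CATEGORIES_CSV = """id,categories
1,food-1;floods-0
2,food-0;floods-1
"""


def extract(messages_text, categories_text):
    """Read both CSV sources into lists of dicts."""
    messages = list(csv.DictReader(io.StringIO(messages_text)))
    categories = list(csv.DictReader(io.StringIO(categories_text)))
    return messages, categories


def transform(messages, categories):
    """Join on id and expand 'name-flag' pairs into one 0/1 field per category."""
    flags_by_id = {}
    for row in categories:
        pairs = (item.rsplit("-", 1) for item in row["categories"].split(";"))
        flags_by_id[row["id"]] = {name: int(flag) for name, flag in pairs}
    merged = []
    for row in messages:
        if row["id"] in flags_by_id:  # inner join: skip unmatched messages
            merged.append(
                {"id": int(row["id"]), "message": row["message"], **flags_by_id[row["id"]]}
            )
    return merged


def load(rows, conn, table="messages"):
    """Write the categorized messages to a SQLite table (hypothetical table name)."""
    columns = list(rows[0])
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(r[c] for c in columns) for r in rows],
    )
    conn.commit()


# In-memory database for the sketch; a real run would write to a file instead.
conn = sqlite3.connect(":memory:")
load(transform(*extract(MESSAGES_CSV, CATEGORIES_CSV)), conn)
stored = conn.execute(
    'SELECT id, message, food, floods FROM "messages" ORDER BY id'
).fetchall()
```

Each row in `stored` holds a message next to one 0/1 flag per category, which is the kind of table a multi-label classifier can then be trained on.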

File Descriptions

  • README.md - the file you're reading

Essentials:

  • web_app/ - contains the boilerplate code necessary to visualize the web application
  • web_app/README.md - contains the necessary steps to run the web application in a local environment

Essentials not included:

  • DisasterResponse.db (~6.5 MB) - created in your local environment when you run web_app/data/process_data.py
  • classifier.pkl (~997 MB) - created in your local environment when you run web_app/models/train_classifier.py
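
To check that the database was generated, a small helper like the one below can list its tables. It is a sketch that assumes DisasterResponse.db is a SQLite file; the function name `list_tables` is hypothetical and not part of this repository.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `list_tables("DisasterResponse.db")` returns the table names found in the file. Note that `sqlite3.connect` creates an empty database if the path does not exist, so an empty list can also mean the ETL script has not been run yet.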

Pipeline Building Processes:

Licensing, Authors, and Acknowledgements

The dataset contains 30,000 messages drawn from events including an earthquake in Haiti in 2010, an earthquake in Chile in 2010, floods in Pakistan in 2010, Superstorm Sandy in the U.S.A. in 2012, and news articles spanning many years and hundreds of different disasters. The data has been encoded with 36 different categories related to disaster response, and messages containing sensitive information have been removed entirely.
Source: Multilingual Disaster Response Messages, Appen Open Source Dataset
