jabhij / CrimeRate_Classification

Developing a system that could classify crime descriptions into different categories which would help the authorities to assign officers to crimes based on the report.

CrimeRate Classification

Problem Statement:

The main objective of this project is to apply Big Data technologies to a machine learning problem. We work with the San Francisco Crime Classification dataset obtained from Kaggle. The aim is to develop a system that classifies crime descriptions into categories, which would help the authorities assign officers to crimes based on the report.

Solution:

There are many possible approaches to this problem; here we work directly with the crime dataset. We will train a model on the 39 predefined categories, test its accuracy, and deploy it into production. Given a new crime description, the system should assign it to one of the 39 categories. To solve this multi-class text classification problem, we will use various feature extraction techniques along with different supervised machine learning algorithms in PySpark.
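To make the idea concrete, here is a minimal, dependency-free Python sketch of the same two stages: feature extraction (bag-of-words counts) followed by a supervised classifier (multinomial Naive Bayes with Laplace smoothing). It is not the project's PySpark code. The class name `NaiveBayesTextClassifier`, the helper `tokenize`, and the six training rows are assumptions made up for illustration, and Naive Bayes is only one of the algorithms that could fill the classifier role.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes over bag-of-words counts."""

    def fit(self, descriptions, categories):
        self.class_counts = Counter(categories)   # documents per category
        self.word_counts = defaultdict(Counter)   # token counts per category
        self.vocabulary = set()
        for text, category in zip(descriptions, categories):
            tokens = tokenize(text)
            self.word_counts[category].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, description):
        # Tokens never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(description) if t in self.vocabulary]
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[category].values())
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            for token in tokens:
                count = self.word_counts[category][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Toy rows invented for illustration; the real dataset has 39 categories.
train_descriptions = [
    "GRAND THEFT FROM LOCKED AUTO",
    "PETTY THEFT FROM A BUILDING",
    "STOLEN AUTOMOBILE",
    "STOLEN TRUCK",
    "POSSESSION OF NARCOTICS PARAPHERNALIA",
    "POSSESSION OF MARIJUANA",
]
train_categories = [
    "LARCENY/THEFT", "LARCENY/THEFT",
    "VEHICLE THEFT", "VEHICLE THEFT",
    "DRUG/NARCOTIC", "DRUG/NARCOTIC",
]

model = NaiveBayesTextClassifier().fit(train_descriptions, train_categories)
print(model.predict("POSSESSION OF NARCOTICS"))
```

In PySpark the same roles would typically be filled by stages from `pyspark.ml.feature` (for example `Tokenizer`, `CountVectorizer`, `IDF`) and `pyspark.ml.classification` (for example `NaiveBayes`, `LogisticRegression`) chained in a `Pipeline`; which of these the notebook actually uses is not stated here.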

Project Goals:

We will train several different models to classify crime descriptions and compare their accuracy. This comparative analysis should show which model works best for this kind of dataset and problem.
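The comparison itself can be sketched in a few lines of plain Python: score each model's predictions against the same held-out labels and rank the models by accuracy. The function names `accuracy` and `rank_models`, the model names, and the labels below are hypothetical placeholders, not results from this project.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def rank_models(predictions_by_model, actual):
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = {
        name: accuracy(preds, actual)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up held-out labels and predictions, for illustration only.
actual = ["LARCENY/THEFT", "VEHICLE THEFT", "DRUG/NARCOTIC", "LARCENY/THEFT"]
predictions_by_model = {
    "model_a": ["LARCENY/THEFT", "VEHICLE THEFT", "DRUG/NARCOTIC", "VEHICLE THEFT"],
    "model_b": ["LARCENY/THEFT", "LARCENY/THEFT", "DRUG/NARCOTIC", "VEHICLE THEFT"],
}

for name, score in rank_models(predictions_by_model, actual):
    print(f"{name}: {score:.2f}")
```

In a PySpark workflow, `MulticlassClassificationEvaluator` from `pyspark.ml.evaluation` with `metricName="accuracy"` computes the same quantity on a predictions DataFrame. With 39 unevenly sized categories, accuracy alone may favor the most frequent classes, so a per-class metric could be worth reporting alongside it.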

Code:

Google Colab

References:

  1. Kaggle
  2. ResearchGate
  3. IEEE Xplore
  4. NCBI

Catch me

For any query, ping me on

Hope it helps!! ヅ

About

License: MIT License


Languages

Jupyter Notebook 99.8%, Python 0.2%