gapkim / Enron_Fraud

Enron Fraud

The goal of this project is to identify persons of interest (POIs) involved in the Enron corporate fraud, based on the financial and email data made public during the investigation. The dataset contains financial features (salary, deferral_payments, total_payments, bonus, total_stock_values, etc.), email features (email_address, from_poi_to_this_person, from_this_person_to_poi, etc.), and POI labels. With so many features, it is difficult to select, analyze, and use the information intuitively to identify POIs. Machine learning algorithms are used to narrow down the POI list and so support the investigation of the fraud.
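To make the data layout concrete, here is a minimal sketch of turning such records into labels and feature rows for a classifier. It assumes the dataset is a dictionary keyed by person name, with one dictionary of features per person, and that missing values are stored as the string "NaN"; neither detail is stated in this README. The records, the chosen feature subset, and the helper name `to_labels_and_features` are hypothetical and only for illustration.

```python
# Toy records in the ASSUMED layout: {person_name: {feature_name: value}}.
# All names and numbers below are made up; they are not Enron data.
toy_dataset = {
    "PERSON A": {"poi": True, "salary": 200000, "bonus": 1000000,
                 "from_this_person_to_poi": 30},
    "PERSON B": {"poi": False, "salary": 80000, "bonus": "NaN",
                 "from_this_person_to_poi": 2},
    "PERSON C": {"poi": False, "salary": "NaN", "bonus": 50000,
                 "from_this_person_to_poi": "NaN"},
}

# Hypothetical subset of the features named in the description above.
feature_list = ["salary", "bonus", "from_this_person_to_poi"]


def to_labels_and_features(dataset, features, missing="NaN"):
    """Split records into POI labels (1/0) and numeric feature rows.

    Missing values (assumed to be the string "NaN") are replaced by 0.0.
    People are processed in sorted name order so the output is reproducible.
    """
    labels, rows = [], []
    for name in sorted(dataset):
        record = dataset[name]
        labels.append(1 if record["poi"] else 0)
        rows.append([
            0.0 if record.get(f, missing) == missing else float(record[f])
            for f in features
        ])
    return labels, rows


labels, rows = to_labels_and_features(toy_dataset, feature_list)
```

The resulting `labels` and `rows` lists are in the shape most classifiers expect as targets and inputs. Replacing missing values with zero is only one possible choice; the project itself may handle them differently.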

File Description

  1. Final_project_answers-to-questions.html: Answers to the project questions.
  2. Final_project_supplementary.html: Supplementary file with the code details and analysis results that support the answers.
  3. Final_project(R1).ipynb: Project IPython notebook.
  4. poi_id.py: Python script that generates the .pkl files.
  5. my_classifier.pkl, my_dataset.pkl, my_feature_list.pkl: The three files generated by running poi_id.py.
  6. List_of_references.txt
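
As a rough illustration of how a script like poi_id.py could produce the three .pkl files, the sketch below writes a classifier, a dataset, and a feature list with Python's standard `pickle` module and reads one back. This is an assumption about the mechanism, not the project's actual code: the helper names `dump_outputs` and `load_output` are hypothetical, and the objects being saved are placeholders rather than a trained model or the real data.

```python
import os
import pickle
import tempfile


def dump_outputs(classifier, dataset, feature_list, out_dir):
    """Write the three objects to the file names listed in this README."""
    outputs = {
        "my_classifier.pkl": classifier,
        "my_dataset.pkl": dataset,
        "my_feature_list.pkl": feature_list,
    }
    for filename, obj in outputs.items():
        # Pickle files must be opened in binary mode.
        with open(os.path.join(out_dir, filename), "wb") as handle:
            pickle.dump(obj, handle)
    return sorted(outputs)


def load_output(out_dir, filename):
    """Read one pickled object back from disk."""
    with open(os.path.join(out_dir, filename), "rb") as handle:
        return pickle.load(handle)


# Placeholders standing in for the real classifier, dataset, and features.
placeholder_classifier = {"model": "placeholder", "params": {}}
placeholder_dataset = {"PERSON A": {"poi": False, "salary": "NaN"}}
placeholder_features = ["poi", "salary"]

# A temporary directory keeps the sketch from overwriting the real files.
out_dir = tempfile.mkdtemp()
written = dump_outputs(placeholder_classifier, placeholder_dataset,
                       placeholder_features, out_dir)
restored_features = load_output(out_dir, "my_feature_list.pkl")
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.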

About

License: MIT License

