mikekeith52 / PythonRegression

Code used in Springer Nature/Apress Video Tutorial

Machine Learning with Regression in Python

This is the notebook used in my Apress publication.

Business problem

  1. We work for a fast-food restaurant that is looking to expand its business to new areas.
  2. We have county-level data (housing, income, population, crime, zoning, and more) for over 7,000 counties in which our business currently operates, all stored in the CSV file "existing.csv".
  3. For each of these counties, we have averaged our sales revenue over the past several months; this average is the target we will use to train a regression model.
  4. Using the trained model, we will predict sales for over 20,000 counties in which we do not have any facilities. That data is stored in "new.csv".
  5. The question our manager wants answered: which county should we expand into next, that is, where can we expect the most sales revenue?
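The workflow in steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual model: it assumes plain ordinary least squares fitted with NumPy, and the helper names `fit_ols` and `rank_counties` are hypothetical. The column names `sales` and `county_id` come from the data dictionary; whether "new.csv" carries exactly the same feature columns as "existing.csv" is an assumption.

```python
from typing import Sequence

import numpy as np
import pandas as pd

TARGET = "sales"        # target column, per the data dictionary
ID_COL = "county_id"    # county identifier, per the data dictionary


def fit_ols(train: pd.DataFrame, features: Sequence[str]) -> np.ndarray:
    """Fit ordinary least squares (hypothetical helper).

    Returns the coefficient vector [intercept, coef_1, ..., coef_k].
    """
    # Prepend a column of ones so the first coefficient is the intercept.
    X = np.column_stack(
        [np.ones(len(train)), train[list(features)].to_numpy(dtype=float)]
    )
    y = train[TARGET].to_numpy(dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


def rank_counties(
    new: pd.DataFrame, features: Sequence[str], coefs: np.ndarray
) -> pd.DataFrame:
    """Predict sales for unseen counties and sort them, highest first."""
    X = np.column_stack(
        [np.ones(len(new)), new[list(features)].to_numpy(dtype=float)]
    )
    ranked = new[[ID_COL]].copy()
    ranked["predicted_sales"] = X @ coefs
    return ranked.sort_values("predicted_sales", ascending=False).reset_index(drop=True)
```

With the repository's files, one would presumably load the two tables with `pd.read_csv("existing.csv")` and `pd.read_csv("new.csv")`, pass the first to `fit_ols`, and read the top row of `rank_counties` as the candidate county for expansion.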

Data Dictionary

  • crim: per capita crime rate by town
  • zn: proportion of residential land zoned for lots over 25,000 sq.ft.
  • indus: proportion of non-retail business acres per town
  • rm: average number of rooms per dwelling
  • age: proportion of owner-occupied units built prior to 1940
  • tax: full-value property-tax rate per 10,000 dollars
  • ptratio: pupil/teacher ratio by town
  • black: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
  • median_home_value: median value of owner-occupied homes
  • sales: average monthly sales in the given county (the regression target)
  • county_id: the unique identifier of the county in which the store is operating
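Before modeling, it can help to confirm that a loaded table actually contains the columns listed above. The sketch below is a hypothetical helper (`missing_columns` is not part of the notebook), and it assumes the CSV headers match the data dictionary names exactly and that `sales` is present only in the training data.

```python
import pandas as pd

# Column names as listed in the data dictionary.
FEATURES = [
    "crim", "zn", "indus", "rm", "age",
    "tax", "ptratio", "black", "median_home_value",
]
TARGET = "sales"
ID_COL = "county_id"


def missing_columns(df: pd.DataFrame, require_target: bool = True) -> list:
    """Return the expected columns that are absent from df, in dictionary order.

    Set require_target=False for data without observed sales
    (assumed to be the case for the counties we want to score).
    """
    expected = [ID_COL] + FEATURES + ([TARGET] if require_target else [])
    return [col for col in expected if col not in df.columns]
```

An empty list means the table has every expected column; anything else names the columns to investigate before fitting or predicting.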

Installation (Windows-specific)

  • Download Anaconda
  • Install required libraries
    • Run the following commands in Anaconda Prompt:
      1. cd path/to/this/directory
      2. pip install -r requirements.txt
