This is the code repository for Hands-On Data Preprocessing in Python, published by Packt.
Learn how to effectively prepare data for successful data analytics
Data preprocessing is the first step in data visualization, data analytics, and machine learning, where data is prepared for analytics functions to get the best possible insights. Around 90% of the time spent on data analytics, data visualization, and machine learning projects is dedicated to performing data preprocessing.
This book covers the following exciting features:
- Use Python to perform analytics functions on your data
- Understand the role of databases and how to effectively pull data from databases
- Perform data preprocessing steps defined by your analytics goals
- Recognize and resolve data integration challenges
- Identify the need for data reduction and execute it
If you feel this book is for you, get your copy today!
All of the code is organized into folders. For example, Chapter02.
The code will look like the following:
from ipywidgets import interact, widgets

# plotyear is assumed to be a plotting function, defined elsewhere, that takes a year argument
interact(plotyear, year=widgets.IntSlider(min=2010, max=2019, step=1, value=2010))
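The snippet above relies on a `plotyear` function that is not shown here. As a minimal sketch only, assuming a hypothetical `plotyear` that draws a one-line text bar from made-up yearly totals (`SALES_BY_YEAR` and its values are invented for illustration and are not from the book), the slider could be wired up like this:

```python
# Hypothetical yearly totals; the numbers are made up for illustration only.
SALES_BY_YEAR = {year: 100 + 10 * (year - 2010) for year in range(2010, 2020)}


def plotyear(year):
    """Print and return a one-line text 'plot' for the selected year."""
    value = SALES_BY_YEAR[year]
    bar = "#" * (value // 10)  # one '#' per 10 units
    line = f"{year}: {bar} ({value})"
    print(line)
    return line


try:
    # ipywidgets is a third-party package, mainly useful inside Jupyter.
    from ipywidgets import interact, widgets

    # The slider passes the chosen year to plotyear each time it moves.
    interact(plotyear, year=widgets.IntSlider(min=2010, max=2019, step=1, value=2010))
except ImportError:
    # Without ipywidgets, fall back to a single direct call.
    plotyear(2010)
```

In a Jupyter notebook, dragging the slider re-runs `plotyear` with the selected year. The book's own `plotyear` presumably draws a chart rather than text, so treat this as a stand-in for trying out the `interact` pattern.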
Who this book is for: Junior and senior data analysts, business intelligence professionals, engineering undergraduates, and data enthusiasts looking to preprocess and clean large amounts of data will find this book useful. Basic programming skills, such as working with variables, conditionals, and loops, along with beginner-level knowledge of Python and simple analytics experience, are assumed.
With the following software and hardware, you can run all the code files in the book (Chapters 1-18).
| Chapter | Software required | OS required |
|---|---|---|
| 1 - 18 | Python using the Jupyter Notebook | Windows or macOS |
We also provide a PDF file that has color images of the screenshots/diagrams used in this book.
Roy Jafari, Ph.D., is an assistant professor of business analytics at the University of Redlands. Roy has taught and developed college-level courses that cover data cleaning, decision making, data science, machine learning, and optimization. Roy’s style of teaching is hands-on, and he believes the best way to learn is by doing. He follows an active-learning teaching philosophy, and readers will get to experience active learning in this book. Roy believes that successful data preprocessing only happens when you are equipped with the most efficient tools, have an appropriate understanding of data analytic goals, are aware of data preprocessing steps, and can compare a variety of methods. This belief has shaped the structure of this book.