Swap-Nova/Pandas_Data-Analysis

data-analysis matplotlib-pyplot pandas-dataframe

Getting Started with Pandas:

Pandas is used to explore, analyze, and manipulate data, and it is widely used in the ML field. Raw data often has to be cleaned and reshaped with pandas into a form that ML algorithms can understand.
There are two main data types in Pandas:
1. Series: A one-dimensional data type that holds a sequence of values, much like a single column or array.

import pandas as pd

series = pd.Series(["BMW", "Honda", "Audi"])
# Display the Series (a notebook shows the last expression of a cell automatically)
series

2. DataFrame: A two-dimensional data type that can be built from a Python dictionary. The dictionary values can be lists or Series.

To get started:

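colors = pd.Series(["Red", "Blue", "White"])  # assumed sample values; not defined in the original
car_data = pd.DataFrame({"Car Make": series, "Color": colors})
# Display the DataFrame
car_data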
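As a minimal sketch of the two data types working together, the snippet below builds a DataFrame from a plain dictionary of lists and then pulls one column back out as a Series. The variable names and sample values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample values, used only for illustration.
cars = pd.DataFrame({
    "Car Make": ["BMW", "Honda", "Audi"],
    "Color": ["Red", "Blue", "White"],
})

# Each dictionary key becomes a column name.
print(cars.columns.tolist())

# Selecting a single column of a DataFrame gives back a Series.
make_column = cars["Car Make"]
print(type(make_column))

# shape is (number of rows, number of columns).
print(cars.shape)
```

A DataFrame can be thought of as several Series that share one index, which is why a single column behaves like the Series shown earlier.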

Importing Data through URLs:

Make sure the dataset is in "raw" format by clicking the Raw button on GitHub, then pass that URL to pd.read_csv().

heart_disease = pd.read_csv("https://raw.githubusercontent.com/mrdbourke/zero-to-mastery-ml/master/data/heart-disease.csv")
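pd.read_csv() also accepts a local path or a file-like object, which makes it easy to try out without a network connection. The sketch below uses a small in-memory CSV; the column names and values are hypothetical and are not taken from the heart-disease dataset.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a downloaded file.
csv_text = "age,sex,target\n63,1,1\n37,1,1\n41,0,0\n"

# read_csv accepts a URL, a local path, or a file-like object such as StringIO.
patients = pd.read_csv(io.StringIO(csv_text))

# Check the first rows and the shape to confirm the file parsed as expected.
print(patients.head())
print(patients.shape)
```

Checking head() and shape right after loading is a quick way to confirm that the raw URL returned CSV data rather than an HTML page.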

Describing Data with Pandas

Kind	Example	Description
Attribute	car_sales.dtypes	Meta information stored on the car_sales DataFrame; accessed without parentheses
Function	car_sales.to_csv()	A series of steps that is executed when called; requires parentheses
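The difference between attributes and functions can be sketched as follows. The car_sales DataFrame here is a hypothetical stand-in with made-up values, since the original data is not shown.

```python
import pandas as pd

# Hypothetical stand-in for the car_sales DataFrame.
car_sales = pd.DataFrame({
    "Make": ["Toyota", "Honda", "BMW"],
    "Doors": [4, 4, 5],
    "Odometer (KM)": [150043, 87899, 11179],
})

# Attributes: no parentheses, they report information already stored.
print(car_sales.dtypes)
print(car_sales.columns)
print(car_sales.shape)

# Functions (methods): parentheses, they run steps and return a result.
summary = car_sales.describe()             # statistics for the numeric columns
csv_text = car_sales.to_csv(index=False)   # returns a string when no path is given
print(summary)
print(csv_text)
```

Forgetting the parentheses on a method such as describe returns the method object itself rather than its result, which is a common source of confusing output.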

Difference between .loc and .iloc:

.loc (location): Selects by index label. We can assign our own labels to the items in a Series through the index, and then retrieve an item by its label. If a label appears more than once, .loc returns every item that carries it.

# .loc : Location
animals = pd.Series(["cat", "dog", "panda", "owl"], index=[0, 3, 9, 3])
animals.loc[3]

# OUTPUT:
3    dog
3    owl
dtype: object

.iloc (integer location): Selects by integer position and ignores the index labels. Calling .iloc[3] on the animals Series defined above returns the item at position 3, which is the fourth item because positions start at 0.

# .iloc refers to position 
animals.iloc[3]

# OUTPUT:
'owl'
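The two indexers also differ when slicing. The sketch below uses a hypothetical Series with unique labels so that label and position are easy to tell apart.

```python
import pandas as pd

# Hypothetical Series with unique labels, unlike the duplicate-label example above.
animals = pd.Series(["cat", "dog", "panda", "owl"], index=[10, 20, 30, 40])

by_label = animals.loc[20]       # the item whose label is 20
by_position = animals.iloc[1]    # the item at position 1 (the second item)

# Slicing: .loc includes the end label, .iloc excludes the end position.
label_slice = animals.loc[10:30]
position_slice = animals.iloc[0:2]

print(by_label, by_position)
print(label_slice.tolist())
print(position_slice.tolist())
```

Because .loc slices are inclusive and .iloc slices are exclusive, the same pair of numbers can return a different number of rows depending on which indexer is used.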

Converting Strings to Integers:

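# Raw-string pattern; note that stripping "." turns "$4,000.00" into 400000 (the cents remain as digits)
price_plot = car_sales["Price"].replace(r'[\$\,\.]', '', regex=True).astype(int)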

A regex (regular expression) is a sequence of characters that defines a search pattern. In Python, regex is implemented in the re module, and pandas accepts the same patterns when regex=True is passed. Regex patterns can be used to match, search, replace, or extract specific text from a string.
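To make the pattern concrete, here is a small sketch that applies the same idea first with the re module on one string and then with pandas on a whole Series. The price strings are hypothetical, and the pattern here removes only "$" and "," so that no decimal point is involved.

```python
import re
import pandas as pd

# Hypothetical price strings.
prices = pd.Series(["$4,000", "$5,500", "$22,000"])

# Plain Python: re.sub replaces every match of the pattern in one string.
single = re.sub(r"[\$,]", "", "$4,000")

# pandas: regex=True applies the pattern to each element of the Series.
cleaned = prices.replace(r"[\$,]", "", regex=True).astype(int)

print(single)
print(cleaned.tolist())
```

The square brackets define a character class, so the pattern matches any single "$" or "," wherever it appears in the string.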

Using Matplotlib:

About

Getting started with Pandas and understanding how to build Series as well as DataFrames. It also covers importing a dataset and using pandas to view and manipulate the data according to the algorithm's needs.


Languages

Language: Jupyter Notebook 100.0%