Getting started with Pandas: how to build Series and DataFrames, how to import a dataset, and how to view and manipulate the data so that it meets an algorithm's needs.
Pandas is used to explore, analyze, and manipulate data in the ML field. Raw data often has to be cleaned and reshaped with Pandas before it is in a form that ML algorithms can understand.
There are two data types in Pandas:
Series: A one-dimensional data type that holds a sequence of values, much like a single column or an array.
import pandas as pd
series = pd.Series(["BMW", "Honda", "Audi"])
# To print the output
series
DataFrame: A two-dimensional data type that can be built from a Python dictionary. The dictionary values can also be Series.
To get started
# colors is assumed to be a Series of color names defined earlier
car_data = pd.DataFrame({"Car Make": series, "Color": colors})
# To print the output
car_data
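As a minimal, self-contained sketch of the two data types working together, the snippet below builds two Series and combines them into a DataFrame. The color values are an assumption made for illustration; they are not taken from the original dataset.

```python
import pandas as pd

# Illustrative values; the colors are an assumption, not from the original data.
makes = pd.Series(["BMW", "Honda", "Audi"])
colors = pd.Series(["Red", "Blue", "White"])

# Each dictionary key becomes a column; the Series are aligned on their index.
car_data = pd.DataFrame({"Car Make": makes, "Color": colors})

print(car_data.shape)          # (rows, columns)
print(list(car_data.columns))  # column names
print(car_data)
```

Because both Series share the default index 0, 1, 2, each row pairs a make with the color at the same position.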
Importing Data through URLs:
Make sure the dataset is in the "raw" format by clicking the Raw button on GitHub, and use that URL when importing.
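A rough sketch of importing through a URL might look like the following. The URL is a hypothetical placeholder, and the in-memory CSV with made-up rows only stands in for the remote file so the example runs without network access.

```python
import io
import pandas as pd

# Hypothetical raw-file URL; replace it with the address shown after clicking "Raw".
url = "https://raw.githubusercontent.com/<user>/<repo>/<branch>/car-sales.csv"
# car_sales = pd.read_csv(url)  # read_csv accepts a URL as well as a local path

# Offline stand-in with made-up rows so the sketch runs without network access.
sample_csv = io.StringIO(
    "Make,Color,Price\n"
    "BMW,Red,30000\n"
    "Honda,Blue,18000\n"
)
car_sales = pd.read_csv(sample_csv)
print(car_sales.head())
```

If the URL points at the regular GitHub page instead of the raw file, read_csv receives HTML rather than CSV and the import will fail or produce garbage columns.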
Meta information stored in the car_sales DataFrame
car_sales.to_csv()
Series of steps performed to execute the command
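To make the meta information concrete, here is a small sketch that inspects a DataFrame and then exports it. The car_sales contents are made up for illustration, and the attributes shown are only one reasonable selection of what Pandas exposes.

```python
import pandas as pd

# Made-up stand-in for the car sales data.
car_sales = pd.DataFrame({
    "Make": ["BMW", "Honda", "Audi"],
    "Price": [30000, 18000, 27000],
})

print(car_sales.dtypes)      # data type of each column
print(car_sales.columns)     # column labels
print(car_sales.index)       # row labels
print(car_sales.describe())  # summary statistics for numeric columns
car_sales.info()             # concise summary: non-null counts and dtypes

# With no path, to_csv() returns the CSV text; index=False omits the row labels.
csv_text = car_sales.to_csv(index=False)
print(csv_text)
```

Passing a file name as the first argument to to_csv() writes the file to disk instead of returning a string.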
Difference between .loc and .iloc:
.loc (location): Selects by index label. The index can be assigned manually, and .loc returns every element whose label matches the one requested. This refers to the index.
.iloc (integer location): Selects by integer position, regardless of the index labels. Calling .iloc on the animals series returns the element stored at that position. This refers to position.
# .iloc refers to position
animals.iloc[3]
# OUTPUT: 'owl'
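The difference is easiest to see on a Series whose index labels do not match the positions. The animals Series below is a hypothetical reconstruction; its values and labels are assumptions chosen so that position 3 holds 'owl'.

```python
import pandas as pd

# Hypothetical Series; values and index labels are chosen for illustration only.
animals = pd.Series(
    ["cat", "dog", "bird", "owl", "snake"],
    index=[0, 3, 9, 8, 3],
)

by_label = animals.loc[3]      # every row whose index label is 3
by_position = animals.iloc[3]  # the single element in the fourth position

print(by_label)
print(by_position)
```

Here .loc[3] returns a Series containing 'dog' and 'snake', because two rows carry the label 3, while .iloc[3] returns the single value 'owl'.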
A regex (regular expression) is a sequence of characters that defines a search pattern. In Python, regex is implemented in the re module. Regex patterns can be used to match, search, replace, or extract specific text from a string.
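The four operations mentioned above map onto functions in the re module roughly as sketched here. The sample sentence and the price pattern are made up for illustration.

```python
import re

# Made-up sample text.
text = "BMW sold for $30,000 and Honda sold for $18,500"

# A dollar sign followed by one or more digits or commas.
price_pattern = re.compile(r"\$[\d,]+")

first = price_pattern.search(text)            # first match anywhere, or None
all_prices = price_pattern.findall(text)      # extract every match as a list
masked = price_pattern.sub("<price>", text)   # replace every match
starts_with_make = re.match(r"[A-Z]+", text)  # match only at the start

print(first.group())
print(all_prices)
print(masked)
print(starts_with_make.group())
```

Note that re.match only tests the beginning of the string, whereas re.search scans the whole string for the first occurrence.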
Using Matplotlib: