danmaps/pandas-playground

here are some common data cleaning tasks that you can perform using Pandas, along with some code samples:

Removing duplicates:

import pandas as pd

# Load the data into a DataFrame
df = pd.read_csv('data.csv')

# Remove duplicate rows
df.drop_duplicates(inplace=True)

Handling missing values:

import pandas as pd

# Load the data into a DataFrame
df = pd.read_csv('data.csv')

# Replace missing values with a specific value (e.g. 0)
df.fillna(0, inplace=True)

# Drop rows with missing values
df.dropna(inplace=True)

Renaming columns:

import pandas as pd

# Load the data into a DataFrame
df = pd.read_csv('data.csv')

# Rename columns
df.rename(columns={'old_column_name': 'new_column_name'}, inplace=True)

Filtering data based on conditions:

import pandas as pd

# Load the data into a DataFrame
df = pd.read_csv('data.csv')

# Filter rows based on a condition
filtered_df = df[df['column_name'] > 10]

Splitting and combining columns:

import pandas as pd

# Load the data into a DataFrame
df = pd.read_csv('data.csv')

# Split a column into two columns based on a delimiter
df[['first_name', 'last_name']] = df['full_name'].str.split(' ', expand=True)

# Combine two columns into one column
df['full_name'] = df['first_name'] + ' ' + df['last_name']
These are just a few examples of the many data cleaning tasks that you can perform using Pandas. By using these tools, you can efficiently clean and transform your data for analysis and visualization.

danmaps / pandas-playground

About

Languages