Aisha-Ojey / Data-exploration-and-data-cleaning-in-SQL

Explore COVID-19 data using SQL techniques like joins, CTEs, window functions, and aggregate functions. Also, clean Nashville Housing data with SQL scripts. Repository includes queries, skills used, and dataset sources.

Data Exploration and Data Cleaning Using SQL (COVID-19 Dataset and Nashville Housing Dataset)

This repository contains SQL queries and scripts for exploring COVID-19 data as of May 2021 and for cleaning the Nashville Housing dataset. The project uses SQL techniques such as joins, CTEs (Common Table Expressions), temporary tables, window functions, aggregate functions, views, and data type conversion to analyze and process the data.

Introduction

This repository covers two separate tasks: exploring COVID-19 data up to May 2021 and cleaning the Nashville Housing dataset. Both use SQL (Structured Query Language) to retrieve, analyze, and manipulate the data.

COVID-19 Data Exploration

Dataset

The COVID-19 data is sourced from the Our World in Data COVID-19 Deaths dataset, providing information about COVID-19 related deaths. The dataset contains various attributes including date, location, total deaths, and more.

Skills Used

Joins: Combining multiple tables based on shared columns.

CTEs (Common Table Expressions): Creating named, temporary result sets that a larger query can reference.

Temporary Tables: Storing intermediate results for further analysis.

Window Functions: Performing calculations across a set of rows related to the current row, without collapsing those rows into one.

Aggregate Functions: Computing summary statistics such as sums, counts, and maximums.

Creating Views: Creating virtual tables for simplified querying.

Converting Data Types: Changing the data type of specific columns.
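As an illustration of how a join, a CTE, and a window function can work together, here is a minimal sketch that runs against an in-memory SQLite database from Python. The table names (`deaths`, `populations`), the column names, and the sample rows are hypothetical and are not taken from the repository's scripts; the real dataset has many more columns.

```python
import sqlite3

# Hypothetical miniature tables; names and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deaths (location TEXT, date TEXT, new_deaths INTEGER);
CREATE TABLE populations (location TEXT, population INTEGER);
INSERT INTO deaths VALUES
    ('Atlantis', '2021-05-01', 10),
    ('Atlantis', '2021-05-02', 20),
    ('Atlantis', '2021-05-03', 30);
INSERT INTO populations VALUES ('Atlantis', 1000);
""")

query = """
WITH rolling AS (                          -- CTE: a named intermediate result
    SELECT d.location,
           d.date,
           p.population,
           SUM(d.new_deaths) OVER (        -- window function: running total
               PARTITION BY d.location
               ORDER BY d.date
           ) AS rolling_deaths
    FROM deaths AS d
    JOIN populations AS p                  -- join on the shared column
      ON p.location = d.location
)
SELECT location,
       date,
       rolling_deaths,
       100.0 * rolling_deaths / population AS pct_population
FROM rolling
ORDER BY location, date;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CTE is needed here because the running total computed by the window function cannot be reused in the same SELECT list that defines it; wrapping it in `rolling` makes it available to the outer query.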

Queries

The SQL queries in this section cover exploratory tasks such as calculating statistics, identifying trends, and generating insights from the COVID-19 data. The queries are designed to showcase the mentioned skills.
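A summary statistic of this kind might look like the following sketch, which combines an aggregate function, a data type conversion, and a view. It assumes a hypothetical `covid_deaths` table in which `total_deaths` was imported as text; the table, columns, and rows are invented for illustration and run on an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: total_deaths is stored as TEXT, as can happen after a CSV import.
CREATE TABLE covid_deaths (
    location     TEXT,
    date         TEXT,
    total_cases  INTEGER,
    total_deaths TEXT
);
INSERT INTO covid_deaths VALUES
    ('Atlantis', '2021-05-01', 1000, '9'),
    ('Atlantis', '2021-05-02', 2000, '10'),
    ('Lemuria',  '2021-05-01',  500, '5'),
    ('Lemuria',  '2021-05-02',  800, '20');

-- View: a reusable virtual table holding one summary row per location.
CREATE VIEW death_summary AS
SELECT location,
       MAX(CAST(total_deaths AS INTEGER)) AS deaths_to_date,  -- cast before aggregating
       MAX(total_cases)                   AS cases_to_date
FROM covid_deaths
GROUP BY location;
""")

summary = conn.execute("""
    SELECT location,
           deaths_to_date,
           100.0 * deaths_to_date / cases_to_date AS death_pct
    FROM death_summary
    ORDER BY location
""").fetchall()

# Without the cast, MAX compares the values as text, so '9' sorts above '10'.
text_max = conn.execute(
    "SELECT MAX(total_deaths) FROM covid_deaths WHERE location = 'Atlantis'"
).fetchone()[0]

print(summary)
print(text_max)
```

The cast matters because text values are compared character by character, which gives a wrong maximum as soon as the counts have different numbers of digits.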

Nashville Housing Data Cleaning

Dataset

The Nashville Housing dataset contains information about housing properties in Nashville. The dataset may require cleaning and preparation for further analysis.

Queries

The SQL queries in this section address data cleaning tasks, which might include handling missing values, standardizing formats, and correcting inconsistencies in the Nashville Housing dataset.
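To make these cleaning tasks concrete, here is a small sketch on an in-memory SQLite database. The `housing` table, its columns, and its rows are hypothetical, and the rule used to fill a missing address (copy it from another row with the same parcel ID) is an assumption for illustration, not a description of the repository's scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample with a missing address, stray whitespace, and mixed Y/Yes/N/No flags.
CREATE TABLE housing (
    id             INTEGER PRIMARY KEY,
    parcel_id      TEXT,
    address        TEXT,
    sold_as_vacant TEXT
);
INSERT INTO housing VALUES
    (1, 'P-001', '12 Oak St',    'Y'),
    (2, 'P-001', NULL,           'Yes'),
    (3, 'P-002', ' 7 Elm Ave ',  'N'),
    (4, 'P-003', NULL,           'No');
""")

# Missing values: borrow the address from another row with the same parcel_id.
# Rows with no matching parcel keep NULL rather than receiving a guessed value.
conn.execute("""
    UPDATE housing
    SET address = (
        SELECT MIN(h2.address)
        FROM housing AS h2
        WHERE h2.parcel_id = housing.parcel_id
          AND h2.address IS NOT NULL
    )
    WHERE address IS NULL
""")

# Standardizing formats: trim whitespace and map Y/N onto Yes/No.
conn.execute("""
    UPDATE housing
    SET address = TRIM(address),
        sold_as_vacant = CASE sold_as_vacant
                             WHEN 'Y' THEN 'Yes'
                             WHEN 'N' THEN 'No'
                             ELSE sold_as_vacant
                         END
""")

cleaned = conn.execute(
    "SELECT id, parcel_id, address, sold_as_vacant FROM housing ORDER BY id"
).fetchall()
for row in cleaned:
    print(row)
```

Row 4 is left with a NULL address on purpose: no other row shares its parcel ID, so there is nothing reliable to fill it with.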

Usage

To use this repository, follow these steps:

1. Clone the repository to your local machine.

2. Use a SQL database management system (DBMS) to execute the SQL queries and scripts.

3. Follow the instructions provided within each query/script to analyze COVID-19 data or clean Nashville Housing data.

Contributing

Contributions to this repository are welcome. Feel free to fork the repository, make your changes, and submit a pull request.
