dataanalysis dataanalytics datavisualization googlecolaboratory python sentiment-analysis

Exploratory Data Analysis (EDA) on Airbnb Listings Project

Welcome to the Exploratory Data Analysis (EDA) report based on the Airbnb dataset of Seattle, Washington, US.

Table of contents

Introduction
Ask
Prepare
Process
Analyze
Share
Act

Introduction

Founded in 2008, Airbnb is a global online marketplace that connects hosts offering unique living spaces with guests seeking authentic travel experiences. From cozy city apartments to charming beach bungalows and even quirky treehouses, Airbnb transcends traditional hotels, allowing guests to immerse themselves in local communities and live like residents. With millions of listings in over 191 countries, Airbnb empowers hosts to earn income by sharing their spaces, while offering guests a more diverse and affordable alternative to hotels, fostering a sense of connection and belonging wherever they travel.

In this project, we delve into the world of Airbnb listings in Seattle, WA, to uncover insights, trends, and patterns that are hidden within the data. By performing a thorough analysis, we aim to better understand the factors that influence pricing, occupancy, and customer reviews.

Ask

The ask phase is the start of the data analysis cycle. It involves clearly defining the scope of the project and the problem to be solved, and identifying the stakeholders and their expectations by asking SMART (Specific, Measurable, Action-oriented, Relevant, Time-bound) questions.

Prepare

This phase involves identifying the data source that will be used for the analysis, confirming that the source is reliable, original, comprehensive, current and cited, checking that the data was collected without bias, and respecting data ethics at every step of handling it.

Dataset Description

The dataset consists of three CSV files:

listings.csv: This file contains property details such as room type, host, and price. It has 3819 rows and 92 columns.
reviews.csv: This file contains guest reviews of the properties. It has 84850 rows and 6 columns.
calendar.csv: This file contains daily occupancy and availability information for 2873 listings from January 2016 to January 2017. It includes the listing ID, date, availability (t: available, f: occupied), and price. It has 934542 rows and 4 columns.
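
As an illustration of how the availability flag can be turned into an occupancy figure, here is a minimal sketch using pandas. The column names `listing_id`, `date`, `available` and `price` are assumptions based on the description above, and the rows are made-up sample data, not values from the real file.

```python
import io

import pandas as pd


def occupancy_by_listing(calendar: pd.DataFrame) -> pd.Series:
    """Return the share of calendar days marked occupied ('f') per listing."""
    # 'f' means the listing was not available on that day, i.e. occupied.
    occupied = calendar["available"].eq("f")
    return occupied.groupby(calendar["listing_id"]).mean()


# Tiny made-up sample in the assumed calendar layout.
sample_csv = io.StringIO(
    "listing_id,date,available,price\n"
    "1,2016-01-04,t,$85.00\n"
    "1,2016-01-05,f,\n"
    "2,2016-01-04,f,\n"
    "2,2016-01-05,f,\n"
)
calendar = pd.read_csv(sample_csv, parse_dates=["date"])
occupancy = occupancy_by_listing(calendar)
print(occupancy)
```

With the real data, the `io.StringIO` sample would be replaced by the path to the calendar file. Note that this treats every unavailable day as occupied, which may overstate bookings if hosts block dates for other reasons.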

Process

This phase covers every step taken to clean the data and confirm its integrity (that it is accurate, complete, consistent and trustworthy) before analysis, to align it with the business objective, and to verify it. The process must include checking for misspellings, typos and inconsistent capitalization, for duplicate entries and blank cells, and for a consistent data format within each column.
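
The checks listed above could be sketched with pandas roughly as follows. The helper `integrity_report` and the sample frame are hypothetical and only illustrate the idea; they are not taken from the project notebook.

```python
import pandas as pd


def integrity_report(df: pd.DataFrame, text_columns: list) -> dict:
    """Summarize duplicates, blank cells, dtypes and inconsistent text values."""
    report = {
        "duplicate_rows": int(df.duplicated().sum()),
        "blank_cells": df.isna().sum().to_dict(),
        "column_dtypes": df.dtypes.astype(str).to_dict(),
    }
    # Flag values that differ only by letter case or surrounding whitespace.
    inconsistent = {}
    for col in text_columns:
        values = df[col].dropna().astype(str)
        normalized = values.str.strip().str.lower()
        variants = values.groupby(normalized).nunique()
        inconsistent[col] = sorted(variants[variants > 1].index)
    report["inconsistent_text"] = inconsistent
    return report


# Made-up rows for illustration only.
listings = pd.DataFrame(
    {
        "id": [1, 2, 2, 3],
        "room_type": ["Private room", "private room ", "private room ", None],
        "price": ["$85.00", "$150.00", "$150.00", "$60.00"],
    }
)
report = integrity_report(listings, text_columns=["room_type"])
print(report)
```

A report like this only surfaces problems. Deciding whether to drop duplicates, fill blanks or standardize spellings still depends on the question being asked.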

Analyze

In this phase, we answer the questions defined in the Ask phase by analyzing the datasets we just processed.

Share

In this phase, we share our work with the stakeholders. The work will be posted on LinkedIn, tagging CODING SAMURAI, and the repository will be hosted on GitHub.

Act

In this phase, the key findings from our analysis are summarized and the limitations of the project are noted.

Author

[Abhishek Ramesh Shettigar]

Acknowledgement

The author would like to thank Coding Samurai for providing this internship opportunity. Dataset: https://www.kaggle.com/datasets/airbnb/seattle/?select=listings.csv

About

In this project, you will perform basic data analysis on a dataset of Airbnb listings. EDA is a fundamental step in data science that involves exploring and understanding the data before diving into more complex analysis or modeling.


Languages

Language:Jupyter Notebook 100.0%