gaiaengineer / profitable_app_profiles

Data cleaning and exploratory data analysis of two datasets in Python

Profitable App Profiles for the App Store and Google Play Markets

Project Description

This is a portfolio project in which I act as a data analyst for an imaginary company that builds Android and iOS mobile apps for English-speaking markets. The apps are free to download and install from the App Store and Google Play. The company's main source of revenue is in-app ads, so the more users an app has, the more people are likely to see and engage with those ads. The goal of the project is to analyze data to help the developers understand which types of apps are likely to attract more users.

Data Sets

To do this, I'll need to collect and analyze data about mobile apps available on Google Play and the App Store. As of September 2018, there were approximately 2 million iOS apps on the App Store and 2.1 million Android apps on Google Play. Collecting data for over 4 million apps would require a significant amount of time and money, so I'll analyze a sample of the data instead. To avoid spending resources on collecting new data myself, I'll first check whether relevant existing data is available at no cost. Luckily, there is a learning dataset of App Store apps available on Kaggle, and there is also a public dataset of Google Play apps.
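As a rough illustration of how either dataset could be read into plain Python lists, here is a minimal sketch using the standard `csv` module. The function name `load_dataset` and the tiny sample file are hypothetical and exist only for this example; the real Kaggle files have different names and columns.

```python
import csv
import os
import tempfile


def load_dataset(path, has_header=True):
    """Read a CSV file and return (header, rows) as lists of strings."""
    with open(path, encoding="utf8", newline="") as f:
        data = list(csv.reader(f))
    if has_header and data:
        return data[0], data[1:]
    return [], data


# Demo on a throwaway file (assumption: the real datasets are read the same way).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_apps.csv")  # hypothetical file name
    with open(sample_path, "w", encoding="utf8", newline="") as f:
        csv.writer(f).writerows([["name", "genre"], ["Chess", "Games"]])
    header, rows = load_dataset(sample_path)

print(header)
print(rows)
```

Keeping the header separate from the rows makes the later cleaning steps simpler, since every loop can then assume that each row is an app.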

Technologies

  • Python:
    • data cleaning: opening .csv files, Python functions, checking the data for empty strings and null values, getting rid of a column shift using a del statement, checking for duplicates, Python dictionaries, converting to floats, looping, ASCII standard
    • data analysis: frequency tables, nested loops
  • Jupyter Notebook

License: MIT License


Languages

Language: Jupyter Notebook 100.0%