DataViz Kaggle Competition

This Shiny (R) data visualization helps me understand the data better and find correlations between the variables in the Kaggle competition more easily.

While the project is under development, I will publish its latest status on the following ShinyApps page: xxx

The dataset used for this visualization comes from the Kaggle competition TalkingData AdTracking Fraud Detection Challenge.

About the data

TalkingData, China’s largest independent big data service platform, covers over 70% of active mobile devices nationwide. They handle 3 billion clicks per day, of which 90% are potentially fraudulent. Their current approach to prevent click fraud for app developers is to measure the journey of a user’s click across their portfolio, and flag IP addresses that produce lots of clicks but never end up installing apps. With this information, they've built an IP blacklist and device blacklist.
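
The flagging rule described above (many clicks, no installs) can be illustrated with a small sketch in base R. This is not TalkingData's actual implementation. The column names `ip` and `is_attributed` are assumed to match the competition's training file, the data is made up, and the `min_clicks` threshold is a hypothetical value chosen only for illustration.

```r
# Toy click log. Column names are assumed to mirror the competition data:
# ip = anonymized IP address, is_attributed = 1 if the click led to a download.
clicks <- data.frame(
  ip            = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  is_attributed = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L)
)

# Count clicks and downloads per IP address.
n_clicks    <- tapply(clicks$is_attributed, clicks$ip, length)
n_downloads <- tapply(clicks$is_attributed, clicks$ip, sum)

per_ip <- data.frame(
  ip        = as.integer(names(n_clicks)),
  clicks    = as.vector(n_clicks),
  downloads = as.vector(n_downloads)
)

# Hypothetical threshold: how many clicks count as "lots of clicks".
min_clicks <- 3

# Flag IPs that click a lot but never install anything.
flagged <- per_ip$ip[per_ip$clicks >= min_clicks & per_ip$downloads == 0]
print(flagged)
```

In this toy data, IPs 1 and 3 are flagged, while IP 2 is not because it has only two clicks and one of them led to a download.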

While successful, they want to always be one step ahead of fraudsters and have turned to the Kaggle community for help in further developing their solution. In their 2nd competition with Kaggle, you’re challenged to build an algorithm that predicts whether a user will download an app after clicking a mobile app ad. To support your modeling, they have provided a generous dataset covering approximately 200 million clicks over 4 days!

Installation guide

Clone the repo to your local computer.
Open RStudio and set the working directory to the cloned folder.
Download and install Shiny (see this link).
Click Run App in the RStudio file window.
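
The same steps can also be run from the R console instead of the RStudio buttons. The snippet below is a minimal sketch: `app_dir` is a hypothetical placeholder path that you would replace with the folder where you cloned the repo, and the `interactive()` guard is only there so that nothing is installed or launched when the script is sourced non-interactively.

```r
# Hypothetical path: replace with the folder where you cloned the repo.
app_dir <- "path/to/cloned/repo"

if (interactive()) {
  # Install Shiny only if it is not already available.
  if (!requireNamespace("shiny", quietly = TRUE)) {
    install.packages("shiny")
  }

  # Launch the app; this is what the Run App button does in RStudio.
  shiny::runApp(app_dir)
}
```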

See https://www.shinyapps.io if you want to deploy it online.
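
For reference, deploying to shinyapps.io from R usually goes through the `rsconnect` package. The sketch below assumes you already have a shinyapps.io account. The account name, token, and secret are placeholders to be copied from your own dashboard, and `app_dir` is a hypothetical path to the cloned repo.

```r
# Hypothetical path: replace with the folder where you cloned the repo.
app_dir <- "path/to/cloned/repo"

if (interactive()) {
  if (!requireNamespace("rsconnect", quietly = TRUE)) {
    install.packages("rsconnect")
  }

  # One-time account setup; the values below are placeholders.
  rsconnect::setAccountInfo(
    name   = "<account-name>",
    token  = "<token>",
    secret = "<secret>"
  )

  # Upload and publish the app found in app_dir.
  rsconnect::deployApp(appDir = app_dir)
}
```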

User guide

TBD

How to contribute

Feel free to download the code, make modifications, and open a pull request. :)

License

GNU General Public License v3.0 (GPL-3.0)

About

Shiny visualization to better understand a dataset from a Kaggle competition

https://www.kaggle.com/c/talkingdata-adtracking-fraud-detection
