mgd1984 / Yelp-Analytics-Dashboard

Analyzing the Yelp Academic Dataset w/ Apache Drill & Flexdashboards

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

# Yelp BizOps Analytics Dashboard 

## Getting Started

These instructions allow you to replicate our dashboard on your own local machine for development and testing purposes. The initial set-up requires that you have Apache Drill installed and running in embedded mode.

### Apache Drill
* [Apache Drill](https://drill.apache.org/docs/why-drill/) - Drill is an Apache open-source SQL query engine for Big Data exploration. Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data coming from modern Big Data applications, while still providing the familiarity and ecosystem of ANSI SQL, the industry-standard query language. Drill provides plug-and-play integration with existing Apache Hive and Apache HBase deployments.

  Drill’s agility and flexibility are it's greatest strengths. Along with meeting the table stakes for SQL-on-Hadoop, which is to achieve low latency performance at scale, Drill allows users to analyze the data without any ETL or up-front schema definitions. The data can be in any file format such as text, JSON, or Parquet

  For instructions on installing & running Apache Drill in embedded mode please check see [this quick start guide](https://drill.apache.org/docs/drill-in-10-minutes/)

### R Studio
* [RStudio]() - RStudio is an integrated development environment (IDE) for R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.

### Sergeant
* [Sergeant](https://hrbrmstr.github.io/sergeant/) - 'Apache Drill' is a low-latency distributed query engine designed to enable data exploration and 'analytics' on both relational and non-relational 'datastores', scaling to petabytes of data. The sergeant package provides methods that enable working R to work with Apache Drill instances via the 'REST' 'API', 'JDBC' interface (optional), 'DBI' 'methods' and using 'dplyr'/'dbplyr' idioms.


### Flexdashboards 
* [Flexdashboards](http://rmarkdown.rstudio.com/flexdashboard/) - Easy, interactive dashboards for R:

Install the flexdashboard package in from CRAN as follows:

`install.packages("flexdashboard")`

### Other Packages 
(Included in setup section of R Markdown)

`DT`,`shiny`,`dplyr`,`leaflet`,`leaflet.extras`,`highcharter`,`stringr`,`data.table`, `sp`


## Authors

* **Alec Miller**
* **Paige Tuchner** 

## Acknowledgments

Hat tips to:
* Bob Rudis (hrbmstr)
* RNFC (Nico, Tim & Jon)
* Apache Software Foundation 
* All other open-source developers of packages & plug-ins for R

About

Analyzing the Yelp Academic Dataset w/ Apache Drill & Flexdashboards