nageshsinghc4 / Apache-Beam-pipeline-using-java

Apache-Beam-pipeline-using-java

Apache Beam is an evolution of the Dataflow model created by Google to process massive amounts of data. The name Beam (Batch + strEAM) comes from the idea of having a unified model for both batch and stream data processing. Programs written using Beam can be executed in different processing frameworks (via runners) using a set of different IOs.

This repository implements an analysis of hospital charges data in the United States. The analysis uses Apache Beam's Java BeamSql API and runs on the Google Cloud Dataflow runner to solve the following problems:

Problem 1: Find the amount of Average Covered Charges per state.

Problem 2: Find the amount of Average Medicare Payments charges per state.

Problem 3: Find out the total number of Discharges per state and for each disease.
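
To make the three problems concrete, here is a minimal sketch of the aggregations they describe, written in plain Java with the standard streams API rather than with Beam. It is not the repository's pipeline: the `Charge` class, its field names, and the sample rows are hypothetical, and the averages are assumed to be unweighted means over the per-row values. In a SQL-based pipeline each method would roughly correspond to a `GROUP BY` query using `AVG` or `SUM`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class HospitalChargesSketch {

    // Hypothetical row layout; field names are illustrative, not taken from the repository.
    static final class Charge {
        final String state;
        final String drg;              // diagnosis ("disease") label
        final int discharges;
        final double coveredCharges;
        final double medicarePayments;

        Charge(String state, String drg, int discharges,
               double coveredCharges, double medicarePayments) {
            this.state = state;
            this.drg = drg;
            this.discharges = discharges;
            this.coveredCharges = coveredCharges;
            this.medicarePayments = medicarePayments;
        }
    }

    // Problem 1: unweighted mean of covered charges per state.
    static Map<String, Double> avgCoveredChargesByState(List<Charge> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Charge c) -> c.state, TreeMap::new,
                Collectors.averagingDouble((Charge c) -> c.coveredCharges)));
    }

    // Problem 2: unweighted mean of Medicare payments per state.
    static Map<String, Double> avgMedicarePaymentsByState(List<Charge> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Charge c) -> c.state, TreeMap::new,
                Collectors.averagingDouble((Charge c) -> c.medicarePayments)));
    }

    // Problem 3: total discharges per (state, disease) pair, keyed as "STATE | DRG".
    static Map<String, Integer> dischargesByStateAndDrg(List<Charge> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Charge c) -> c.state + " | " + c.drg, TreeMap::new,
                Collectors.summingInt((Charge c) -> c.discharges)));
    }

    // Made-up sample data, used only to exercise the methods above.
    static List<Charge> sampleRows() {
        return Arrays.asList(
                new Charge("AL", "039", 10, 100.0, 80.0),
                new Charge("AL", "057", 20, 300.0, 120.0),
                new Charge("NY", "039", 5, 500.0, 400.0),
                new Charge("NY", "039", 15, 700.0, 600.0));
    }

    public static void main(String[] args) {
        List<Charge> rows = sampleRows();
        System.out.println(avgCoveredChargesByState(rows));   // {AL=200.0, NY=600.0}
        System.out.println(avgMedicarePaymentsByState(rows)); // {AL=100.0, NY=500.0}
        System.out.println(dischargesByStateAndDrg(rows));    // {AL | 039=10, AL | 057=20, NY | 039=20}
    }
}
```

A `TreeMap` is used only so that the printed output has a stable key order; a distributed runner such as Dataflow gives no such ordering guarantee for grouped results.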

For more details, see the accompanying article:

https://medium.com/analytics-vidhya/apache-beam-a-beginners-approach-4783dfc6fea

Languages

Language:Java 100.0%