Shweta Tanwar's repositories
Kafka-Pyspark-Streaming
Kafka streaming using PySpark
CCA175-Hadoop-and-Spark-Developer
CCA 175 Preparation scripts
Spark-and-Kafka_IoT-Data-Processing-and-Analytics
IoT project for UCSC Internet of Things: a data pipeline built with Spark Streaming and Kafka. JSON messages are simulated by a Python program, data analysis is done with Spark SQL, and visualization is done in Tableau with Hive as the data source.
Assignment-6.2-HIVE-INTRODUCTION
Fetch date and temperature from temperature_data where the zip code is greater than 300000 and less than 399999. Calculate the maximum temperature for every year in the temperature_data table. Calculate the maximum temperature for those years that have at least 2 entries in the table. Create a view on top of the last query and name it temperature_data_vw. Export the contents of temperature_data_vw to a file in the local file system, such that each file is '|' delimited.
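The query steps in this assignment can be sketched roughly as follows. This is not the repository's Hive code: it is a minimal illustration that runs the equivalent SQL against an in-memory SQLite database from Python. The column names (record_date, zip_code, temperature), the date format, and the sample rows are assumptions made for the example; only the table name temperature_data and the view name temperature_data_vw come from the description.

```python
import sqlite3

# Assumed schema: the description names the table but not its columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temperature_data "
    "(record_date TEXT, zip_code INTEGER, temperature INTEGER)"
)

# Made-up sample rows, for illustration only.
rows = [
    ("1990-01-10", 123112, 10),
    ("1991-02-14", 283901, 11),
    ("1990-03-10", 381920, 15),
    ("1991-10-01", 302918, 22),
    ("1993-12-09", 384902, 9),
]
conn.executemany("INSERT INTO temperature_data VALUES (?, ?, ?)", rows)

# Step 1: date and temperature for zip codes strictly between the bounds.
in_range = conn.execute(
    "SELECT record_date, temperature FROM temperature_data "
    "WHERE zip_code > 300000 AND zip_code < 399999 "
    "ORDER BY record_date"
).fetchall()

# Step 2: maximum temperature per year (year = first 4 characters of the date).
max_per_year = conn.execute(
    "SELECT substr(record_date, 1, 4) AS yr, MAX(temperature) "
    "FROM temperature_data "
    "GROUP BY substr(record_date, 1, 4) ORDER BY yr"
).fetchall()

# Steps 3 and 4: same aggregate, restricted to years with at least 2 rows,
# wrapped in a view.
conn.execute(
    "CREATE VIEW temperature_data_vw AS "
    "SELECT substr(record_date, 1, 4) AS yr, MAX(temperature) AS max_temp "
    "FROM temperature_data "
    "GROUP BY substr(record_date, 1, 4) HAVING COUNT(*) >= 2"
)
view_rows = conn.execute(
    "SELECT yr, max_temp FROM temperature_data_vw ORDER BY yr"
).fetchall()

# Step 5: render the view as '|'-delimited lines, ready to write to a file.
pipe_lines = ["|".join(str(value) for value in row) for row in view_rows]
print("\n".join(pipe_lines))
```

In Hive itself the export step would be done with Hive's own facilities rather than in Python; the last two lines only show what the '|'-delimited output looks like.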
bigdata-projects
Student projects in the Big Data field.
Complete-Python-3-Bootcamp
Course Files for Complete Python 3 Bootcamp Course on Udemy
Data-Analysis
Data analysis of NYSE stock data using Hive and Tableau
Hive
Hive Snippets for UCSC Extension course Hadoop: Distributed Processing.
python
Python programs using various Python libraries such as NumPy and Matplotlib.
data_analytics
Assignments and case studies from the PGDDA program at IIITB
Java-Programs
Java Practice Programs
Kafka
Apache Kafka
learning-spark
Example code from Learning Spark book
Million-Song-Dataset-HDF5-to-CSV
Million Song Dataset HDF5 to CSV Converter
MongoDB
UCSC MongoDB Course
PGDDA-Projects
A comprehensive 1 Year program taught by Industry experts and IIITB faculty; 7 case studies & projects; 400+ hours of academic learning & 30+ hours of industry mentoring
python-for-everybody
Python For Everybody
Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository
Tableau
UCSC Extension Tableau Practice