There are 0 repositories under the sqoop-import topic.
Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
This is our first project using Apache Spark. We downloaded loan, customer credit card, and transaction datasets from Kaggle and cleaned the data. We then ran new use cases on these datasets using tools such as Spark, HDFS, and Hive. Apache Spark is a framework for quickly processing very large datasets.
Big Data Engineering Capstone Project with tech stack: Linux, MySQL, Sqoop, HDFS, Hive, Impala, SparkSQL, SparkML, Git
Apache Sqoop tutorial
MapReduce Job Development, RDDs Programming, Medical Data Management, Sales Analysis, And Efficient Data Integration For Big Data Analysis. Spark: Big Data Processing, SQOOP Integration, And Spark Structured Streaming For Real-Time Data.
A utility, implemented in Bash, that imports data from traditional databases into HDFS using Sqoop
[Innopolis University] Big Data Course 2023. Final Project
This repository consists of the source code and the screenshots of the output. This project uses Hive, SQL, and Sqoop to perform analysis.
My first data analytics project, created alongside the Data Analytics Essentials course by Cisco Networking Academy.
Built a data pipeline by creating tables in a MySQL database, ingesting them into Hadoop for data warehousing, and building HiveQL views. The Hive views on a Linux VM were connected to Power BI on Windows to create visualizations.
A Python package that lets you import data from an RDBMS into HDFS/Hive/HBase using Sqoop
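As a rough illustration of what such a wrapper might do, the sketch below assembles a `sqoop import` command line in Python. It is not taken from the package above: the function name `build_sqoop_import`, its parameters, and the connection details are all hypothetical. Only the Sqoop flags themselves (`--connect`, `--username`, `--password-file`, `--table`, `--num-mappers`, `--target-dir`, `--hive-import`, `--hive-table`) are standard Sqoop 1 import options.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table,
                       target_dir=None, hive_table=None, num_mappers=1):
    """Return a `sqoop import` command as an argument list (hypothetical helper)."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # keeps the password off the command line
        "--table", table,
        "--num-mappers", str(num_mappers),
    ]
    if hive_table is not None:
        # Load the imported rows into a Hive table.
        args += ["--hive-import", "--hive-table", hive_table]
    elif target_dir is not None:
        # Plain HDFS import into the given directory.
        args += ["--target-dir", target_dir]
    return args


# Example values are placeholders, not a real database.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://localhost/sales",
    username="etl_user",
    password_file="/user/etl/.sqoop_pw",
    table="orders",
    hive_table="orders_raw",
    num_mappers=4,
)
print(shlex.join(cmd))
```

The list could be passed to `subprocess.run(cmd, check=True)` on a machine where Sqoop is installed. Building a list rather than a single string avoids shell-quoting problems with table names and paths.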
A query system for a hypothetical bank scenario
Build a data pipeline (using hadoop-hdfs, sqoop, hiveql) for data analysis from ambiguous and incomplete instructions.
Import data into Hive using Sqoop.
Real-Time & Batch Data Processing Pipeline
ETL pipeline for Spar Nord Bank to analyze the refilling frequency of ATMs across Europe