icharo-tb / sparkPostgres

Spark ETL

This repository's main purpose is to practice Spark and Scala. I recently started learning both, but working only in the Spark shell became less interesting because I could not really "taste" what Scala was like. I then came across a YouTube video on this specific topic and, to my surprise, its author had published a GitHub repository for it too. So, thanks to ankur334 and his repository.

I mainly rewrote some of the functions and tried to understand the purpose of every action, but I also wanted to extend the project a little and change some small things to make it more professional. First, I added the Typesafe Config library to libraryDependencies, because I was set on having a conf or .env file that stores the database credentials without exposing them. Then, after getting several errors while running tests, I had to cast some columns to specific data types so the data could be stored in my PostgreSQL table.
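As a rough illustration of those two changes, here is a minimal sketch, not the actual code in this repository. It assumes the Typesafe Config library, Spark SQL and the PostgreSQL JDBC driver are on the classpath. The config keys (`db.url`, `db.user`, `db.password`, `db.table`, `etl.input-path`), the column names (`id`, `amount`) and the object name `PostgresLoadSketch` are all hypothetical.

```scala
import java.util.Properties

import com.typesafe.config.{Config, ConfigFactory}
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DoubleType, IntegerType}

object PostgresLoadSketch {

  // Reads the JDBC URL and credentials from configuration instead of hardcoding them.
  // Hypothetical keys, expected e.g. in src/main/resources/application.conf:
  //   db.url, db.user, db.password
  // In HOCON the secrets can come from the environment, e.g. password = ${?DB_PASSWORD}
  def jdbcSettings(conf: Config): (String, Properties) = {
    val props = new Properties()
    props.setProperty("user", conf.getString("db.user"))
    props.setProperty("password", conf.getString("db.password"))
    props.setProperty("driver", "org.postgresql.Driver")
    (conf.getString("db.url"), props)
  }

  // Casts columns to the types the target table expects.
  // The column names here are assumptions made for this example.
  def castForPostgres(df: DataFrame): DataFrame =
    df.withColumn("id", col("id").cast(IntegerType))
      .withColumn("amount", col("amount").cast(DoubleType))

  def main(args: Array[String]): Unit = {
    val conf = ConfigFactory.load() // picks up application.conf from the classpath
    val (url, props) = jdbcSettings(conf)

    val spark = SparkSession.builder()
      .appName("spark-etl-sketch")
      .master("local[*]")
      .getOrCreate()

    // CSV columns are read as strings by default, hence the explicit casts.
    val raw = spark.read.option("header", "true").csv(conf.getString("etl.input-path"))

    castForPostgres(raw)
      .write
      .mode(SaveMode.Append)
      .jdbc(url, conf.getString("db.table"), props)

    spark.stop()
  }
}
```

With this layout the conf file holding the real credentials can stay out of version control, and a missing key fails fast with a `ConfigException` instead of a confusing JDBC error later on.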

Lastly, instead of IntelliJ, I worked with VS Code, Metals and sbt.

I enjoyed every bit of this short project, as it gives a good understanding and a solid starting point to keep practicing and learning Spark and Scala.

Languages

Language:Scala 100.0%