- Source databases: There are two source databases, named Gold and Diamonds. You are given a Dockerfile to build a Postgres image for both of them, and a seed.sql file containing the DDL and DML statements that seed the source data.
- For Gold, build the image using the provided Dockerfile and launch a container named gold.
- For Diamonds, build the image using the provided Dockerfile and launch a container named diamonds. See the ERD below for the relationships between the entities in the source database.
- Destination database: You are given a Dockerfile to build a Postgres image for the destination database. A seed.sql file is also provided that contains the DDL statements to create the destination database. Build an image using the provided Dockerfile and launch a container named warehouse.
- Write a program that does the following:
- Extracts all the data from all the tables in the source databases {gold, diamonds}.
- Transforms the data and loads it into the table in the destination database {warehouse}.
- The ETL must follow these constraints:
- The ETL should be written in C#.
- If the ETL is run multiple times, it should not create duplicate data in the destination table.
- The ETL has "read only" access to the source databases.
- You need to provide well-documented source code for the ETL process.
- Fork this GitHub repo using your GitHub account.
- Create a feature branch (name it whatever you want) off of master.
- Create a src directory under the root directory and add all the source code in there.
- Push all your changes to the forked repo.
- Do not use this repo to create any branches.
- The project should have unit tests (you can use any unit test framework).
- Provide a design document that also lists any assumptions you made while developing your solution.
- Provide clear instructions describing how to run your solution.
- Bonus points if you provide a Dockerfile to create an image of your solution. Please add the Dockerfile to the images directory.
- Email the link to your forked repo containing the solution.
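
The container setup for gold, diamonds, and warehouse described above could be scripted roughly as follows. This is a minimal sketch, not part of the provided repo: the `launch_db` helper, the build-context directories, and the host ports are all hypothetical and must be adapted to where the Dockerfiles actually live. It also assumes the provided Dockerfiles already configure whatever credentials the Postgres image needs; if they do not, the `docker run` line would need the appropriate `-e` options.

```sh
#!/bin/sh
# Build a Postgres image from a Dockerfile directory and start a named container.
# Usage: launch_db <name> <build-context-dir> <host-port>
# Setting DOCKER=echo prints the commands instead of running them.
launch_db() {
  name="$1"      # used as both the image tag and the container name
  context="$2"   # directory containing the Dockerfile and seed.sql (assumed layout)
  port="$3"      # host port mapped to Postgres' default port 5432
  "${DOCKER:-docker}" build -t "$name" "$context" &&
  "${DOCKER:-docker}" run -d --name "$name" -p "$port:5432" "$name"
}

# Hypothetical directories and ports; adjust to the actual repo layout.
# launch_db gold      ./images/gold      5433
# launch_db diamonds  ./images/diamonds  5434
# launch_db warehouse ./images/warehouse 5435
```

Mapping each container to a distinct host port lets the ETL reach all three databases from the host at the same time.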