sagarlimbu0 / Developing_OCO2_satellite_data_visualization_ETL_pipeline

This ongoing project implements an Apache Airflow ETL workflow that regularly fetches OCO2 Level 3 data from an OpenDAP server. The transform step generates visual aids such as PNG/JPEG images, and the transformed data is then loaded into an AWS S3 bucket, giving an automated solution for OCO2 Level 3 data handling and visualization.

ETL_OCO2_GEOS_L3_visualization

The project builds an ETL (Extract, Transform, Load) workflow with Apache Airflow that fetches OCO2 Level 3 data from an OpenDAP server once every 15 days, matching the 15-day cycle of satellite data collection.
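To make the 15-day cadence concrete, here is a minimal sketch of how a date range could be split into consecutive 15-day windows, one per pipeline run. The function name `cycle_windows` and the constant `CYCLE` are illustrative and do not come from the repository.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple

# Assumption: one run per 15-day cycle, as described in the text above.
CYCLE = timedelta(days=15)


def cycle_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (window_start, window_end) pairs covering [start, end).

    Each window_end is exclusive; the last window is shortened
    if the range is not a whole number of cycles.
    """
    current = start
    while current < end:
        nxt = min(current + CYCLE, end)
        yield current, nxt
        current = nxt


if __name__ == "__main__":
    for lo, hi in cycle_windows(date(2020, 1, 1), date(2020, 2, 5)):
        print(lo.isoformat(), "->", hi.isoformat())
```

In Airflow itself, the same interval can be expressed by passing a `timedelta(days=15)` as the DAG's schedule. Whether the repository does it this way is not stated in the source.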

The workflow begins with the Extract phase, which uses Airflow Python and Bash operators to pull data from the OpenDAP server. In the Transform phase, the extracted data is processed and formatted, and PNG/JPEG images are generated to support visual analysis.

The Load phase writes the transformed data, together with the generated images, directly to the specified AWS S3 bucket, so the outputs are available from S3 as soon as each run completes. Subsequent steps in the workflow, including the creation of animations and the cleanup of the S3 bucket, build on this ETL foundation to provide an automated solution for handling and visualizing OCO2 Level 3 data.
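The three phases can be sketched as plain Python callables of the kind an Airflow `PythonOperator` might wrap. This is an illustrative skeleton, not the repository's code: the URL layout, the object key names, and the `fetch`, `render`, and `put` helpers are all hypothetical and are injected so the sketch runs without network access, a plotting library, or AWS credentials.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass
class Artifact:
    key: str        # object key the file would get in the bucket (hypothetical layout)
    payload: bytes  # file contents


def extract(day: date, fetch: Callable[[str], bytes], base_url: str) -> bytes:
    """Extract: download one granule. The URL pattern is an assumption."""
    url = f"{base_url}/{day:%Y}/{day:%Y%m%d}.nc4"
    return fetch(url)


def transform(day: date, raw: bytes, render: Callable[[bytes], bytes]) -> List[Artifact]:
    """Transform: keep the data file and add a rendered image next to it."""
    return [
        Artifact(f"data/{day:%Y%m%d}.nc4", raw),
        Artifact(f"images/{day:%Y%m%d}.png", render(raw)),
    ]


def load(artifacts: List[Artifact], put: Callable[[str, bytes], None]) -> List[str]:
    """Load: hand every artifact to the storage backend and report the keys."""
    for artifact in artifacts:
        put(artifact.key, artifact.payload)
    return [artifact.key for artifact in artifacts]


def run_pipeline(
    day: date,
    fetch: Callable[[str], bytes],
    render: Callable[[bytes], bytes],
    put: Callable[[str, bytes], None],
    base_url: str = "https://example.invalid/oco2_l3",  # placeholder, not a real server
) -> List[str]:
    """Run extract, transform, and load in order for a single day."""
    raw = extract(day, fetch, base_url)
    return load(transform(day, raw, render), put)
```

In a real deployment, `fetch` would be an HTTP or OPeNDAP client call, `render` a plotting routine, and `put` an S3 upload. Keeping them as parameters lets each phase be tested in isolation, for example by passing a dictionary's `__setitem__` as `put`.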

License: MIT License


Languages

Language: Python 100.0%