tonanuvem / airflow

Example of using Airflow in DataOps

airflow

Example of using Airflow in DataOps

https://www.youtube.com/watch?v=K9AnJ9_ZAXE

https://medium.com/agoda-engineering/orchestrating-airflow-tasks-with-docker-swarm-69b5fb2723a7

https://medium.com/analytics-vidhya/setting-up-airflow-to-run-with-docker-swarms-orchestration-b16459cd03a2

https://towardsdatascience.com/using-apache-airflow-dockeroperator-with-docker-compose-57d0217c8219

https://towardsdatascience.com/data-engineering-basics-of-apache-airflow-build-your-first-pipeline-eefecb7f1bb9

https://towardsdatascience.com/airflow-sharing-data-between-tasks-7bbaa27eeb1

https://airflow.apache.org/docs/apache-airflow/stable/best-practices.html#communication

https://www.projectpro.io/recipes/schedule-dag-file-create-table-and-load-data-into-it-mysql-and-hive-airflow

https://www.projectpro.io/recipes/migrate-data-from-mysql-hive-using-airflow


Providers:

https://airflow.apache.org/docs/apache-airflow-providers-apache-hive/2.3.3/_api/airflow/providers/apache/hive/index.html

https://github.com/apache/airflow/blob/providers-apache-hive/3.0.0/tests/system/providers/apache/hive/example_twitter_dag.py

https://github.com/apache/airflow/blob/providers-apache-hive/3.0.0/airflow/providers/apache/hive/transfers/mysql_to_hive.py


Consuming Hive data via Python:

https://github.com/dropbox/PyHive

https://github.com/big-data-europe/docker-hive

https://hshirodkar.medium.com/apache-hive-on-docker-4d7280ac6f8e
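As a minimal sketch of what consuming Hive data from Python can look like with PyHive, the snippet below opens a HiveServer2 connection and reads a few rows through the standard DB-API cursor interface. The host, port `10000`, and username are assumptions about a local Docker setup, not values taken from this repository, and the helper names `connect_to_hive` and `fetch_rows` are hypothetical.

```python
from contextlib import closing


def connect_to_hive(host="localhost", port=10000, username="hive"):
    """Open a HiveServer2 connection with PyHive.

    host, port and username are assumed defaults for a local container;
    adjust them to match your environment.
    """
    # Imported lazily so fetch_rows stays usable without PyHive installed.
    from pyhive import hive

    return hive.Connection(host=host, port=port, username=username)


def fetch_rows(connection, query, limit=10):
    """Run a query on any DB-API 2.0 connection and return up to `limit` rows."""
    with closing(connection.cursor()) as cursor:
        cursor.execute(query)
        return cursor.fetchmany(limit)


if __name__ == "__main__":
    # Requires a reachable HiveServer2 instance.
    with closing(connect_to_hive()) as conn:
        for row in fetch_rows(conn, "SHOW TABLES"):
            print(row)
```

Because `fetch_rows` only relies on the DB-API cursor methods, it can be exercised against another DB-API driver (for example `sqlite3`) before pointing it at Hive.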


Ways to control the load: FULL vs. INCREMENTAL (DIFFERENTIAL):

https://docs.hevodata.com/data-ingestion/query-modes-for-ingesting-data/
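The difference between the two load modes can be sketched with an in-memory SQLite database standing in for the source and target systems. This is an illustration under stated assumptions, not code from this repository: the `events` table, its `updated_at` column (ISO-8601 text, so string comparison follows time order), and the function names are all hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the hypothetical events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, value TEXT, updated_at TEXT)"
    )
    return conn


def full_load(src, dst):
    """Full load: wipe the target and copy every source row."""
    rows = src.execute("SELECT id, value, updated_at FROM events").fetchall()
    dst.execute("DELETE FROM events")
    dst.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    dst.commit()
    return len(rows)


def incremental_load(src, dst):
    """Incremental load: copy only rows newer than the target's high-water mark."""
    (watermark,) = dst.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM events"
    ).fetchone()
    rows = src.execute(
        "SELECT id, value, updated_at FROM events WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE also picks up rows that were updated in the source.
    dst.executemany("INSERT OR REPLACE INTO events VALUES (?, ?, ?)", rows)
    dst.commit()
    return len(rows)


if __name__ == "__main__":
    src, dst = make_db(), make_db()
    src.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [(1, "a", "2024-01-01"), (2, "b", "2024-01-02")],
    )
    print("full load copied", full_load(src, dst), "rows")
    src.execute("INSERT INTO events VALUES (3, 'c', '2024-01-03')")
    print("incremental load copied", incremental_load(src, dst), "rows")
```

The full load is simpler and self-correcting but rereads everything on each run; the incremental load moves less data but depends on a trustworthy change-tracking column and does not notice rows deleted in the source.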


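Running locally:

```bash
# Initialize the Airflow metadata database
airflow db init

# Create an admin user (you will be prompted for a password)
airflow users create \
  --username admin \
  --firstname Peter \
  --lastname Parker \
  --role Admin \
  --email spiderman@superhero.org

# Start the web UI and the scheduler, each in its own terminal
airflow webserver --port 8080

airflow scheduler
```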

Languages

Jupyter Notebook 89.2%, Python 7.4%, Shell 3.3%