aws / aws-mwaa-local-runner

This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.

Request for './mwaa-local-env package-requirements' to provide .whl files only (no .tar.gz sources)

maslick opened this issue

Executing ./mwaa-local-env package-requirements first generates *.whl files within the ./plugins directory and subsequently compresses them into ./requirements/plugins.zip (to be later used in production).

According to the documentation, utilizing plugins.zip during runtime eliminates the necessity to fetch libraries dynamically at Fargate container startup. Additionally, this command creates a ./requirements/packaged_requirements.txt file, which can be employed to initiate aws-mwaa-local-runner, closely emulating the behavior observed in MWAA production environments.

However, it's important to note that not all dependencies are available in the *.whl format. Some dependencies are distributed as source files, such as mysqlclient-2.2.0.tar.gz. Consequently, these libraries need to be built during runtime.

This behavior might not be explicitly mentioned in the documentation. The documentation provides the following instruction:

To package the necessary WHL files for your requirements.txt without running Apache Airflow, use the following script:

./mwaa-local-env package-requirements

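However, when you start mwaa-local-env with the generated packaged_requirements.txt and the packaged files, pip tries to build mysqlclient-2.2.0.tar.gz from source and fails. Because the generated requirements file sets --no-index, pip looks only in /usr/local/airflow/plugins, and the wheel package that the build needs is not there.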

The question arises: Is it feasible to package all dependencies as .whl files without the necessity to build (some of) them at startup?
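
As a starting point, the source archives that package-requirements leaves in ./plugins can be listed with a small helper, and each one could then be pre-built into a wheel with pip wheel. This is only a sketch: list_sdists is a hypothetical helper name, and the commented pip wheel step is an assumption, not something the tool does today. It would have to run inside the local-runner image, with whatever compilers and headers each package needs (mysqlclient, for example, needs the MySQL client development files).

```shell
#!/bin/sh
# list_sdists DIR: print the file name of every .tar.gz source archive in DIR,
# one per line. Prints nothing when DIR contains only wheels.
# (Hypothetical helper, not part of mwaa-local-env.)
list_sdists() {
  for f in "$1"/*.tar.gz; do
    [ -e "$f" ] || continue   # the glob matched nothing
    basename "$f"
  done
}

# Possible follow-up (assumption, untested against the MWAA image):
# build a wheel next to each archive without resolving its dependencies.
#
# for s in $(list_sdists plugins); do
#   python3 -m pip wheel --no-deps --wheel-dir plugins "plugins/$s"
# done
```

If list_sdists prints nothing, every packaged requirement is already a wheel and nothing should need to be built at startup.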

Steps to reproduce:

$ git clone --branch v2.6.3 --depth 1 https://github.com/aws/aws-mwaa-local-runner.git
$ cd aws-mwaa-local-runner
$ ./mwaa-local-env build-image
$ ./mwaa-local-env package-requirements

$ ls -la requirements/
total 70M
-rw-r--r-- 1 ec2-user ec2-user 235 Sep 19 18:28 packaged_requirements.txt
-rw-r--r-- 1 ec2-user ec2-user 70M Sep 19 18:28 plugins.zip
-rw-rw-r-- 1 ec2-user ec2-user 184 Sep 19 18:22 requirements.txt

$ ls -la plugins | grep tar.gz
-rw-r--r-- 1 ec2-user ec2-user    29922 Sep 19 18:28 cron_descriptor-1.4.0.tar.gz
-rw-r--r-- 1 ec2-user ec2-user   151986 Sep 19 18:28 dill-0.3.1.1.tar.gz
-rw-r--r-- 1 ec2-user ec2-user    89543 Sep 19 18:28 mysqlclient-2.2.0.tar.gz
-rw-r--r-- 1 ec2-user ec2-user    81167 Sep 19 18:28 pendulum-2.1.2.tar.gz
-rw-r--r-- 1 ec2-user ec2-user    31954 Sep 19 18:28 python-nvd3-0.15.0.tar.gz
-rw-r--r-- 1 ec2-user ec2-user    10267 Sep 19 18:28 unicodecsv-0.14.1.tar.gz

$ mv requirements/requirements.txt requirements/requirements-original.txt
$ mv requirements/packaged_requirements.txt requirements/requirements.txt
$ ls -la requirements/
total 71616
drwxrwxr-x 2 ec2-user ec2-user     4096 Sep 19 18:32 .
drwxrwxr-x 9 ec2-user ec2-user     4096 Sep 19 18:22 ..
-rw-r--r-- 1 ec2-user ec2-user 73315149 Sep 19 18:28 plugins.zip
-rw-rw-r-- 1 ec2-user ec2-user      184 Sep 19 18:22 requirements-original.txt
-rw-r--r-- 1 ec2-user ec2-user      235 Sep 19 18:28 requirements.txt

$ cat requirements/requirements.txt
--find-links /usr/local/airflow/plugins
--no-index
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.6.3/constraints-3.10.txt"

apache-airflow-providers-snowflake==4.2.0
apache-airflow-providers-mysql==5.1.1

$ ./mwaa-local-env start
local-runner_1  | Verification completed
local-runner_1  | --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.6.3/constraints-3.10.txt"
local-runner_1  | Installing requirements.txt
local-runner_1  | Looking in links: /usr/local/airflow/plugins
local-runner_1  | Processing ./plugins/apache_airflow_providers_snowflake-4.2.0-py3-none-any.whl
local-runner_1  | Processing ./plugins/apache_airflow_providers_mysql-5.1.1-py3-none-any.whl
local-runner_1  | Processing ./plugins/snowflake_sqlalchemy-1.4.7-py2.py3-none-any.whl
local-runner_1  | Requirement already satisfied: apache-airflow-providers-common-sql>=1.3.1 in ./.local/lib/python3.10/site-packages (from apache-airflow-providers-snowflake==4.2.0->-r /usr/local/airflow/requirements/requirements.txt (line 5)) (1.5.2)
local-runner_1  | Requirement already satisfied: apache-airflow>=2.4.0 in ./.local/lib/python3.10/site-packages (from apache-airflow-providers-snowflake==4.2.0->-r /usr/local/airflow/requirements/requirements.txt (line 5)) (2.6.3)
local-runner_1  | Processing ./plugins/snowflake_connector_python-3.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
local-runner_1  | Processing ./plugins/mysqlclient-2.2.0.tar.gz
local-runner_1  |   Installing build dependencies: started
local-runner_1  |   Installing build dependencies: finished with status 'done'
local-runner_1  |   Getting requirements to build wheel: started
local-runner_1  |   Getting requirements to build wheel: finished with status 'done'
local-runner_1  |   Installing backend dependencies: started
local-runner_1  |   Installing backend dependencies: finished with status 'error'
local-runner_1  |   error: subprocess-exited-with-error
local-runner_1  |   
local-runner_1  |   × pip subprocess to install backend dependencies did not run successfully.
local-runner_1  |   │ exit code: 1
local-runner_1  |   ╰─> [3 lines of output]
local-runner_1  |       Looking in links: /usr/local/airflow/plugins
local-runner_1  |       ERROR: Could not find a version that satisfies the requirement wheel (from versions: none)
local-runner_1  |       ERROR: No matching distribution found for wheel
local-runner_1  |       [end of output]
local-runner_1  |   
local-runner_1  |   note: This error originates from a subprocess, and is likely not a problem with pip.
local-runner_1  | error: subprocess-exited-with-error
local-runner_1  | 
local-runner_1  | × pip subprocess to install backend dependencies did not run successfully.
local-runner_1  | │ exit code: 1
local-runner_1  | ╰─> See above for output.
local-runner_1  | 
local-runner_1  | note: This error originates from a subprocess, and is likely not a problem with pip.

I also tried to spin up a real MWAA environment with the resulting requirements.txt and plugins.zip. The same error appeared in the CloudWatch logs, and as a result the environment could not start:

Looking in links: /usr/local/airflow/plugins
--
Requirement already satisfied: apache-airflow==2.6.3 in ./.local/lib/python3.10/site-packages (2.6.3)
Processing ./plugins/apache_airflow_providers_snowflake-4.2.0-py3-none-any.whl (from -r /usr/local/airflow/requirements/requirements.txt (line 5))
Processing ./plugins/apache_airflow_providers_mysql-5.1.1-py3-none-any.whl (from -r /usr/local/airflow/requirements/requirements.txt (line 6))
Requirement already satisfied: alembic<2.0,>=1.6.3 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.11.1)
Requirement already satisfied: argcomplete>=1.10 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.1.1)
Requirement already satisfied: asgiref in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.7.2)
Requirement already satisfied: attrs>=22.1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (23.1.0)
Requirement already satisfied: blinker in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.6.2)
Requirement already satisfied: cattrs>=22.1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (23.1.2)
Requirement already satisfied: colorlog<5.0,>=4.0.2 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (4.8.0)
Requirement already satisfied: configupdater>=3.1.1 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.1.1)
Requirement already satisfied: connexion[flask]>=2.10.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.14.2)
Requirement already satisfied: cron-descriptor>=1.2.24 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.4.0)
Requirement already satisfied: croniter>=0.3.17 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.4.1)
Requirement already satisfied: cryptography>=0.9.3 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (40.0.2)
Requirement already satisfied: deprecated>=1.2.13 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.2.14)
Requirement already satisfied: dill>=0.2.2 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.3.1.1)
Requirement already satisfied: flask<2.3,>=2.2 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.2.5)
Requirement already satisfied: flask-appbuilder==4.3.1 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (4.3.1)
Requirement already satisfied: flask-caching>=1.5.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.0.2)
Requirement already satisfied: flask-login>=0.6.2 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.6.2)
Requirement already satisfied: flask-session>=0.4.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.5.0)
Requirement already satisfied: flask-wtf>=0.15 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.1.1)
Requirement already satisfied: google-re2>=1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.0)
Requirement already satisfied: graphviz>=0.12 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.20.1)
Requirement already satisfied: gunicorn>=20.1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (20.1.0)
Requirement already satisfied: httpx in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.23.3)
Requirement already satisfied: itsdangerous>=2.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.1.2)
Requirement already satisfied: jinja2>=3.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.1.2)
Requirement already satisfied: jsonschema>=4.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (4.18.0)
Requirement already satisfied: lazy-object-proxy in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.9.0)
Requirement already satisfied: linkify-it-py>=2.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.0.2)
Requirement already satisfied: lockfile>=0.12.2 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.12.2)
Requirement already satisfied: markdown>=3.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.4.3)
Requirement already satisfied: markdown-it-py>=2.1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.0.0)
Requirement already satisfied: markupsafe>=1.1.1 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.1.3)
Requirement already satisfied: marshmallow-oneofschema>=2.0.1 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.0.1)
Requirement already satisfied: mdit-py-plugins>=0.3.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.4.0)
Requirement already satisfied: packaging>=14.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (21.3)
Requirement already satisfied: pathspec~=0.9.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.9.0)
Requirement already satisfied: pendulum>=2.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.1.2)
Requirement already satisfied: pluggy>=1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.2.0)
Requirement already satisfied: psutil>=4.2.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (5.9.5)
Requirement already satisfied: pydantic<2.0.0,>=1.10.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.10.11)
Requirement already satisfied: pygments>=2.0.1 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.15.1)
Requirement already satisfied: pyjwt>=2.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.7.0)
Requirement already satisfied: python-daemon>=3.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.0.1)
Requirement already satisfied: python-dateutil>=2.3 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.8.2)
Requirement already satisfied: python-nvd3>=0.15.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.15.0)
Requirement already satisfied: python-slugify>=5.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (8.0.1)
Requirement already satisfied: rfc3339-validator>=0.1.4 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.1.4)
Requirement already satisfied: rich>=12.4.4 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (13.4.2)
Requirement already satisfied: rich-argparse>=1.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.2.0)
Requirement already satisfied: setproctitle>=1.1.8 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.3.2)
Requirement already satisfied: sqlalchemy<2.0,>=1.4 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.4.49)
Requirement already satisfied: sqlalchemy-jsonfield>=1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.0.1.post0)
Requirement already satisfied: tabulate>=0.7.5 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.9.0)
Requirement already satisfied: tenacity!=8.2.0,>=6.2.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (8.2.2)
Requirement already satisfied: termcolor>=1.1.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.3.0)
Requirement already satisfied: typing-extensions>=4.0.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (4.7.1)
Requirement already satisfied: unicodecsv>=0.14.1 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (0.14.1)
Requirement already satisfied: werkzeug>=2.0 in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (2.2.3)
Requirement already satisfied: apache-airflow-providers-common-sql in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (1.5.2)
Requirement already satisfied: apache-airflow-providers-ftp in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.4.2)
Requirement already satisfied: apache-airflow-providers-http in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (4.4.2)
Requirement already satisfied: apache-airflow-providers-imap in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.2.2)
Requirement already satisfied: apache-airflow-providers-sqlite in ./.local/lib/python3.10/site-packages (from apache-airflow==2.6.3) (3.4.2)
Processing ./plugins/snowflake_connector_python-3.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from apache-airflow-providers-snowflake==4.2.0->-r /usr/local/airflow/requirements/requirements.txt (line 5))
Processing ./plugins/snowflake_sqlalchemy-1.4.7-py2.py3-none-any.whl (from apache-airflow-providers-snowflake==4.2.0->-r /usr/local/airflow/requirements/requirements.txt (line 5))
Processing ./plugins/mysqlclient-2.2.0.tar.gz (from apache-airflow-providers-mysql==5.1.1->-r /usr/local/airflow/requirements/requirements.txt (line 6))
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Installing backend dependencies: started
Installing backend dependencies: finished with status 'error'
error: subprocess-exited-with-error
 
× pip subprocess to install backend dependencies did not run successfully.
│ exit code: 1
╰─> [3 lines of output]
Looking in links: /usr/local/airflow/plugins
ERROR: Could not find a version that satisfies the requirement wheel (from versions: none)
ERROR: No matching distribution found for wheel
[end of output]
 
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
 
× pip subprocess to install backend dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.
 
note: This error originates from a subprocess, and is likely not a problem with pip.
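
In a long log like the one above, the packages that pip tried to build from source can be picked out by looking for "Processing" lines that end in .tar.gz. A minimal sketch, assuming the log has been saved to a file; sdists_in_log is a hypothetical helper name:

```shell
#!/bin/sh
# sdists_in_log FILE: print each .tar.gz path that pip reported as
# "Processing ..." in FILE, de-duplicated. (Hypothetical helper.)
sdists_in_log() {
  grep -oE 'Processing [^ ]+\.tar\.gz' "$1" | sed 's/^Processing //' | sort -u
}
```

Wheels (.whl) are skipped, so any path this prints is a package that pip had to build during startup.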


@maslick I ran into the same issue while deploying to MWAA 2.6.3 (but not locally).
I modified my constraints.txt (a copy of this) and downgraded to mysqlclient==2.1.1. The issue went away. I am currently battling other dependency installation failures but just wanted to share what worked for me so far.
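
For anyone who wants to try the same workaround, the pin can be changed with a small script. This is a sketch under assumptions: pin_constraint is a hypothetical helper, the constraints file is assumed to pin the package on a line of the form mysqlclient==<version>, and the comment above does not say how the modified file was referenced from requirements.txt.

```shell
#!/bin/sh
# pin_constraint FILE PACKAGE VERSION: rewrite the "PACKAGE==..." line in FILE
# so that it pins VERSION. Other lines are left untouched.
# (Hypothetical helper; PACKAGE is used as a regex, so keep it to plain names.)
pin_constraint() {
  file="$1"; pkg="$2"; version="$3"
  # Write to a temporary file instead of using sed -i, which differs
  # between GNU and BSD sed.
  sed "s/^${pkg}==.*/${pkg}==${version}/" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}

# Example (the download step needs network access):
# curl -sSfL -o constraints.txt \
#   "https://raw.githubusercontent.com/apache/airflow/constraints-2.6.3/constraints-3.10.txt"
# pin_constraint constraints.txt mysqlclient 2.1.1
```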