lxml import etree ImportError: libxslt.so.1: cannot open shared object file: No such file or directory
psmyth2 opened this issue · comments
I am attempting to import `Salesforce` from `simple-salesforce` in my dag file. I have specified `simple_salesforce==1.12.5` in my `requirements.txt` file and it installs and runs fine on my mwaa-local-runner docker instance. However, when I used the same requirements file in my production MWAA environment the following issue occurs:
- `simple-salesforce` and other packages in `requirements.txt` install without issue (CloudWatch logs confirm this)
- I add my DAG that imports `simple-salesforce` to the S3 dags folder
- the DAG fails to run due to the following import error:
Broken DAG: [/usr/local/airflow/dags/example_dag_with_taskflow_api.py] Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/zeep/transports.py", line 11, in <module>
    from zeep.utils import get_media_type, get_version
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/zeep/utils.py", line 5, in <module>
    from lxml import etree
ImportError: libxslt.so.1: cannot open shared object file: No such file or directory
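The traceback points at a missing system library, not a missing Python package: `lxml` itself is installed, but the dynamic loader cannot find `libxslt.so.1`. A minimal diagnostic sketch, assuming it is run inside the same environment as the failing import (for example from a throwaway task). `can_load` is a hypothetical helper name, and `libxml2.so.2` is an assumed companion library that does not appear in the traceback:

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False


if __name__ == "__main__":
    # libxslt.so.1 is the library named in the ImportError above;
    # libxml2.so.2 is an assumption (lxml usually needs both).
    for soname in ("libxslt.so.1", "libxml2.so.2"):
        print(soname, "loadable" if can_load(soname) else "MISSING")
```

If `libxslt.so.1` is reported as missing, the fix has to happen at the OS package level rather than in `requirements.txt`.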
I also attempted the same workflow using `apache-airflow-providers-salesforce`. Again, this works fine using my aws-mwaa-local-runner but fails when using the same requirements.txt and dag in AWS production MWAA.
My `requirements.txt` is pretty simple. Using the Airflow provider:
`--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.11.txt"
apache-airflow-providers-snowflake==5.0.1
apache-airflow-providers-mysql==5.3.1
apache-airflow-providers-salesforce==5.4.3`
`requirements.txt` using just the simple-salesforce pip install:
`--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.11.txt"
apache-airflow-providers-snowflake==5.0.1
apache-airflow-providers-mysql==5.3.1
simple_salesforce==1.12.5`
Any help/insights would be much appreciated.
I have the same exact issue and can further add that this problem started happening immediately after upgrading from MWAA 2.6.3 to 2.7.2. Our Salesforce DAG is now showing this same exact error.
@psmyth2 Did you find any solution or workaround?
@johnwrf unfortunately I didn't find any solutions. My workaround has been to migrate this particular Salesforce ETL workflow to GitHub Actions for now. My hunch is that it has something to do with Docker container changes at this version, but I didn't have the time to troubleshoot.
See this thread:
https://apache-airflow.slack.com/archives/CCRR5EBA7/p1704789712460549?thread_ts=1701338115.944909&cid=CCRR5EBA7
For the “ImportError: libxslt.so.1: cannot open shared object file: No such file or directory” issue on MWAA -
You should add this to the startup-script:
pip uninstall -y lxml
sudo apt install python3-lxml
I was running into the same issue on MWAA 2.7.2 as well.
I looked into the bootstrap script and see that they add the libxml libraries here:
I added similar lines to my startup script and the import error has gone away and my DAG loads.
#!/bin/sh
set -ex
# Install XML libraries for simple-salesforce to avoid
# `ImportError: libxslt.so.1: cannot open shared object file: No such file or directory`
sudo yum -y install libxml2-devel libxslt-devel
Hope this helps. 👍
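To confirm that a startup-script change like this took effect, one option is to check the import directly in the environment. This is a rough sketch, not an official MWAA check: `lxml_native_versions` is a hypothetical helper name, while `LIBXML_VERSION` and `LIBXSLT_VERSION` are version tuples exposed by `lxml.etree`:

```python
def lxml_native_versions():
    """Return (libxml2, libxslt) runtime versions, or None if lxml cannot be imported."""
    try:
        # This is the same import that fails in the Broken DAG traceback.
        from lxml import etree
    except ImportError as exc:
        print(f"lxml import failed: {exc}")
        return None
    return etree.LIBXML_VERSION, etree.LIBXSLT_VERSION


if __name__ == "__main__":
    versions = lxml_native_versions()
    if versions is not None:
        print("libxml2 runtime version:", versions[0])
        print("libxslt runtime version:", versions[1])
```

A `None` result with the `libxslt.so.1` message means the system libraries are still not visible to the Python process.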