r-session-complete image crashes when selecting/changing the project
securian-bpmcd opened this issue
I've extended the r-session-complete:jammy-2023.03.2 image for use in AWS SageMaker, but the container crashes whenever a user selects a Project from the upper right corner. The logs in CloudWatch haven't revealed much detail. How do I go about getting additional details on the crash? Or is there a well-known solution for this issue?
Hi @securian-bpmcd,
Do you see the same behavior with the default (unmodified) image? If so, you'll want to file an issue with the SageMaker team so they can help troubleshoot it.
If not, we'd be happy to look into the issue if you can show us what's different in your Docker image.
@kfeinauer No. The default image from the SageMaker team works as expected. I noticed that their image is based on one of the Ubuntu Bionic tags that are no longer maintained.
Would you be willing to post the contents of your modified image so we can compare it to the default?
FROM <host>/rstudio/r-session-complete:jammy-2023.03.2
ADD https://<host>/artifactory/generic-all/truststores/securian_trust.pem /usr/local/share/ca-certificates/securian-trust.crt
RUN update-ca-certificates
ADD https://<host>/artifactory/generic-all/truststores/cacerts /etc/ssl/certs/java/cacerts
RUN chmod 644 /etc/ssl/certs/java/cacerts
ADD pip.conf /etc/
RUN sed -i -e "s|http://archive.ubuntu.com/ubuntu|\[trusted=yes\] https://<host>/artifactory/ent-public-remote-ubuntu|g" /etc/apt/sources.list
RUN sed -i -e "s|http://security.ubuntu.com/ubuntu|\[trusted=yes\] https://<host>/artifactory/ent-public-remote-ubuntu-security|g" /etc/apt/sources.list
RUN apt-get update -y \
    && apt-get install -y --no-install-recommends openjdk-11-jdk-headless libgit2-dev libpng-dev file \
    && apt-get autoremove -y \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
    && rm -rf /var/lib/rstudio-server/r-versions
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" \
    && unzip awscliv2.zip \
    && ./aws/install
ARG R_VERSION=4.1.3
ADD Rprofile.site /opt/R/${R_VERSION}/lib/R/etc/Rprofile.site
RUN /opt/R/${R_VERSION}/bin/R -e 'install.packages(c("rJava", "reticulate", "devtools"))'
RUN /opt/R/${R_VERSION}/bin/R CMD javareconf JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
RUN echo -e "\nPath: /opt/R/${R_VERSION}\nScript: /opt/R/${R_VERSION}/lib/R/etc/ldpaths" >> /etc/rstudio/r-versions
ARG R_VERSION=4.2.3
ADD Rprofile.site /opt/R/${R_VERSION}/lib/R/etc/Rprofile.site
RUN /opt/R/${R_VERSION}/bin/R -e 'install.packages(c("rJava", "reticulate", "devtools"))'
RUN /opt/R/${R_VERSION}/bin/R CMD javareconf JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
RUN echo -e "\nPath: /opt/R/${R_VERSION}\nScript: /opt/R/${R_VERSION}/lib/R/etc/ldpaths" >> /etc/rstudio/r-versions
ARG PYTHON_VERSION=3.9.14
RUN /opt/python/${PYTHON_VERSION}/bin/pip install \
    'boto3>1.0<2.0' \
    'awscli>1.0<2.0' \
    'sagemaker[local]<3'
ENV RSTUDIO_CONNECT_URL ""
ENV RSTUDIO_PACKAGE_MANAGER_URL ""
COPY odbc.ini /etc/odbc.ini
Could you try building the image without the per-version R changes for now, to see whether that resolves the crash? If it does, we can dig further there. Otherwise, you'll want to escalate to SageMaker support to get better diagnostics.
For example, removing these lines entirely:
ARG R_VERSION=4.1.3
ADD Rprofile.site /opt/R/${R_VERSION}/lib/R/etc/Rprofile.site
RUN /opt/R/${R_VERSION}/bin/R -e 'install.packages(c("rJava", "reticulate", "devtools"))'
RUN /opt/R/${R_VERSION}/bin/R CMD javareconf JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
RUN echo -e "\nPath: /opt/R/${R_VERSION}\nScript: /opt/R/${R_VERSION}/lib/R/etc/ldpaths" >> /etc/rstudio/r-versions
ARG R_VERSION=4.2.3
ADD Rprofile.site /opt/R/${R_VERSION}/lib/R/etc/Rprofile.site
RUN /opt/R/${R_VERSION}/bin/R -e 'install.packages(c("rJava", "reticulate", "devtools"))'
RUN /opt/R/${R_VERSION}/bin/R CMD javareconf JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
RUN echo -e "\nPath: /opt/R/${R_VERSION}\nScript: /opt/R/${R_VERSION}/lib/R/etc/ldpaths" >> /etc/rstudio/r-versions
At a minimum, SageMaker requires that "reticulate" be installed. The image fails to start otherwise.
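One way to reconcile the suggested test with this constraint is a stripped-down build that drops the per-version R customizations but still installs reticulate. The following is only a sketch for narrowing down the crash, not a recommended final image. The base image tag, the `<host>` placeholder, and the R path come from the Dockerfile posted above; the explicit CRAN mirror is an assumption and may need to be replaced with an internal repository.

```dockerfile
# Bisection build: the base image plus only what SageMaker is said to require.
FROM <host>/rstudio/r-session-complete:jammy-2023.03.2

# R version taken from the original Dockerfile in this thread.
ARG R_VERSION=4.2.3

# Keep reticulate, since the image reportedly fails to start without it.
# Assumption: a public CRAN mirror is reachable from the build environment.
RUN /opt/R/${R_VERSION}/bin/R -e 'install.packages("reticulate", repos = "https://cloud.r-project.org")'
```

If the crash disappears with this image, the removed steps can be added back one group at a time to find the one that triggers it.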
As luck would have it, @MariaSemple just pointed out internally that the documentation is missing a critical piece needed for BYOI images to work properly.
I found that the need to set ENV RSTUDIO_FORCE_NON_ZERO_EXIT_CODE 1 is not documented on this page: https://docs.aws.amazon.com/sagemaker/latest/dg/rstudio-byoi-specs.html.
So to fix the issue you're seeing, just add
ENV RSTUDIO_FORCE_NON_ZERO_EXIT_CODE 1
towards the bottom of your Dockerfile.
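For illustration, the end of the Dockerfile posted earlier in this thread might then look like the sketch below. The `FROM` line and the last three original instructions are copied from that Dockerfile (with the `<host>` placeholder left as posted); the only new instruction is the final `ENV` line, and everything in between is assumed to stay unchanged.

```dockerfile
FROM <host>/rstudio/r-session-complete:jammy-2023.03.2

# ... all other instructions from the original Dockerfile stay as they are ...

ENV RSTUDIO_CONNECT_URL ""
ENV RSTUDIO_PACKAGE_MANAGER_URL ""
COPY odbc.ini /etc/odbc.ini

# Setting reported in this thread as required for BYOI images but missing
# from the SageMaker BYOI specification page.
ENV RSTUDIO_FORCE_NON_ZERO_EXIT_CODE 1
```

The `ENV RSTUDIO_FORCE_NON_ZERO_EXIT_CODE=1` form is equivalent and is the syntax current Docker documentation prefers.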
We've raised this with the SageMaker team: they should apply this setting automatically for all sessions rather than requiring it to be set in each Docker image.
Excellent! I'm testing it now.
Thanks @kfeinauer and @MariaSemple! This resolved the issue!
Great, glad to hear it!