tensorflow / model-analysis

Model analysis tools for TensorFlow

TFMA not rendering in JupyterLab

ConverJens opened this issue · comments

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow Model Analysis): Yes, minor
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 and image python:3.7-slim
  • TensorFlow Model Analysis installed from (source or binary): pip install
  • TensorFlow Model Analysis version (use command below): 0.27.0
  • Python version: 3.7
  • Jupyter Notebook version: jupyterlab 2.2.9
  • Exact command to reproduce: see sample notebook

Describe the problem

tfma.view.render_slicing_metrics shows no output.

Source code / logs

Slim docker image to reproduce the issue:

FROM python:3.7-slim

ENV DEBIAN_FRONTEND=noninteractive

# This is used because our k8s cluster can only access our internal pypi
#COPY pip.conf /etc/pip.conf

# # TFMA is installed in the notebook because pip complained otherwise
RUN python3.7 -m pip install --no-cache-dir jupyterlab==2.2.9

# Install Node (for jupyter lab extensions)
RUN apt update && \
    apt -y install nano curl dirmngr apt-transport-https lsb-release ca-certificates && \
    curl -L https://deb.nodesource.com/setup_15.x | bash - && \
    apt update && apt install -y nodejs && \
    node -v

RUN jupyter labextension install tensorflow_model_analysis@0.27.0 && \
    jupyter labextension install @jupyter-widgets/jupyterlab-manager@2

RUN jupyter lab build

ENV DEBIAN_FRONTEND=

ENV NB_PREFIX /
ENV SHELL=/bin/bash

# Standard KubeFlow jupyter startup command
CMD ["/bin/bash","-c", "jupyter lab --notebook-dir=/home/jovyan --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]

Below are an evaluation artifact from a small TFX pipeline and a minimal notebook to reproduce the issue. The notebook unzips the artifact, installs TFMA, loads the eval result, and tries to display it.
3053.zip
tfma-render-issue.ipynb.zip

Edit: I've also run the above with node 12 instead of 15 with the exact same result.

Thanks for sharing the steps. I haven't tried reproducing yet, but judging from the error you posted at #56 (comment), it seems like the TFMA extension is downloading vulcanized_tfma.js from the wrong URL. As a result, instead of actual Javascript, it gets some kind of markup, probably something like:

<!doctype html>
<html>
  ...
  file not found
  ...
</html>

It reads the first < and returns this error:

Uncaught SyntaxError: Unexpected token '<' vulcanized_tfma.js:1

In the Network tab of the Chrome debugger tool, can you check the url of the request for vulcanized_tfma.js and share it here?
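
The failure mode described above (markup served where a script was expected) can also be checked outside the browser once the response body has been saved. A minimal Python sketch, assuming the body is already available as a string; the helper name `looks_like_html` is hypothetical and not part of TFMA or Jupyter:

```python
def looks_like_html(body: str) -> bool:
    """Return True if a response body appears to be HTML markup rather than JavaScript."""
    head = body.lstrip().lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


# An error page or dashboard page starts with markup ...
print(looks_like_html("<!doctype html><html><body>file not found</body></html>"))
# ... while a script bundle starts with comments or code.
print(looks_like_html("/* license header */ (function () {})();"))
```

If the check reports HTML for the vulcanized_tfma.js request, the browser's "Unexpected token '<'" error is a symptom of the wrong URL being served, not of a broken bundle.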

@atn832 If I understand correctly, the url is: :31380/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

Is TFMA trying to download anything? We are running in an on-prem cluster with no external access so if TFMA needs external access that would be an issue.

Getting the same issue in kubeflow/pipelines#5194. We use the technique in #10 (comment) to render TFMA as HTML and then embed that HTML in an iframe elsewhere.

The problem is similar to this reported issue that vulcanized_tfma.js doesn't load. I verified that TFMA 0.26.0 still works for us, so the regression happens between the two versions.

@Bobgy Is the trick with embedding the html required? This is not something I usually use, for instance when rendering stats from TFDV.

My use case

I generate the HTML in a pipeline step, and the pipeline UI shows the HTML in an iframe, but that's not related to this issue.

Hi @ConverJens, besides what @atn832 asked, I have run the Dockerfile with the notebook and data that you shared above (with docker run -p 8888:8888 {image_name}). I can successfully load the TFMA UI in JupyterLab.

In terms of

Is TFMA trying to download anything?

I don't know what the answer is. Maybe @atn832 can give the answer. I only know that running jupyter labextension install tensorflow_model_analysis@0.27.0 will download TFMA js packages from NPM.

Upon running render_slicing_metrics in a cell, Chrome will download vulcanized_tfma.js. This is hosted by Jupyter Lab once you install the TFMA extension, so it should work even without internet.

@ConverJens, can you share what you see in the Response tab for the vulcanized_tfma.js request? I expect it'll show some HTML with an error message that might tell us what is wrong.

On my machine, since it worked, I can see a bunch of Javascript comments followed by Javascript code like this:
image

@atn832 You are completely right! The response I get is basically "Please enable JavaScript to view this website.", which is very weird since my Chrome settings say JavaScript is allowed for all sites, which is the recommended setting. I'm using the latest version of Chrome (88). Full response below. I get the same response when using Safari.

Any idea how to proceed?

@Bobgy You mentioned that you had this issue in KubeFlow as well. Is there some interaction within KF that leads the browser into believing that JavaScript is disabled, or that otherwise may be causing this issue?

<!doctype html><html lang="en"><head><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1,user-scalable=yes"><meta name="description" content="Kubeflow Central Dashboard"><meta name="theme-color" content="#3f51b5"><title>Kubeflow Central Dashboard</title><link rel="shortcut icon" href="/assets/favicon.ico"><link rel="icon" href="/assets/favicon-32x32.png" sizes="32x32"><link rel="icon" href="/assets/favicon-57x57.png" sizes="57x57"><link rel="icon" href="/assets/favicon-76x76.png" sizes="76x76"><link rel="icon" href="/assets/favicon-96x96.png" sizes="96x96"><link rel="icon" href="/assets/favicon-128x128.png" sizes="128x128"><link rel="icon" href="/assets/favicon-192x192.png" sizes="192x192"><link rel="apple-touch-icon" href="/assets/favicon-152x152.png" sizes="152x152"><link rel="apple-touch-icon" href="/assets/favicon-180x180.png" sizes="180x180"><base href="/"><script src="webcomponentsjs/webcomponents-loader.js"></script><script src="webcomponentsjs/custom-elements-es5-adapter.js"></script><link href="app.css" rel="stylesheet"></head><body><main-page></main-page><noscript>Please enable JavaScript to view this website.</noscript><script src="vendor.bundle.js" defer="defer"></script><script src="app.bundle.js" defer="defer"></script><script src="dashboard_lib.bundle.js" defer="defer"></script></body></html>

@fhuanming @Bobgy Note that rendering TFMA works for me locally as well when running the image and data I supplied, but the exact same rendering fails in my KubeFlow hosted notebook. I don't know if this issue is caused by something in KubeFlow or by the fact that our k8s cluster has no external access.

@atn832 Do you know if TFMA needs to be able to reach NPM at runtime?

@Bobgy Any ideas if KubeFlow itself is blocking something?

@atn832 Hi, I'm also running into a similar issue with a hosted notebook solution (not Kubeflow).
I'm also using the latest version of Chrome (88), and the notebook lives in a k8s cluster with no external access.
However, in my case, no response is returned.

When I try to render, this happens
Network request:
Screen Shot 2021-03-09 at 11 59 06 AM

404 response:
Screen Shot 2021-03-09 at 12 27 55 PM

I'd appreciate your help.

@mwakaba2 the fact that you're getting a 404 makes me wonder about where it's trying to load vulcanized_tfma.js from. Could you hover over the filename in the network request and give us the whole URL?

When I load that, it loads from:

https://localhost:8080/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

which makes me wonder if this is a port problem, or a path problem, file permissions, or maybe an HTTPS problem (but I don't think that would be a 404). My guess is either a path or file permissions problem.

@rcrowe-google Yeah that's the path.

/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

I think the issue is that our hosted notebook solution is using jupyter_server in the backend. Jupyter_server by default doesn't support the nbextensions path.
So to enable that, we have to enable nbclassic server extension.

I still get a 404 after that, so I think I also need to enable the extensions via nbextension install to add the "tensorflow_model_analysis/vulcanized_tfma.js" path.

Ok I got it working in Jupyterlab by installing and enabling the nbextensions.

 $ jupyter nbextension enable --py widgetsnbextension
 $ jupyter nbextension enable --py tensorflow_model_analysis

You need that in order to access the /nbextensions path.

Glad to hear you solved it Mariko! It seems so simple once you know the answer ... 😁

@rcrowe-google there's still an upstream problem here FYI.

in the Jupyter ecosystem, there are two primary types of frontend extensions (this doc may be helpful):

  • nbextensions - Jupyter Classic Notebook extensions - which support the classic Notebook UI.
  • labextensions - Jupyter Lab Extensions - which support the modern Jupyter Lab UI.

nbextensions are effectively deprecated in favor of labextensions, because all modern notebook runtimes (incl AI Platform Notebooks etc) are fronted with Jupyter Lab. so in order to enable nbextensions (Jupyter Classic Notebook extensions) on top of Jupyter Server (which replaces the Jupyter Notebook server) to support serving of this Javascript, we had to install a compatibility layer that otherwise wouldn't need to exist (called nbclassic).

so tl;dr: the problem with TFMA's JupyterLab support is that it's not a pure JupyterLab extension. It's a labextension that depends on a nbextension to work. what we've employed here is a workaround vs a proper fix. the proper fix would involve making the labextension carry its own dependencies vs the existing hybrid model (which is unprecedented btw).

this gap is likely where the other folks like @ConverJens are running into issues making this work - as it's non-obvious vs the way any other lab extension works.

@mwakaba2 @rcrowe-google I tried the fix that you mentioned but there was no difference for me, still no output. Albeit, the error I had was different from the one @mwakaba2 experienced.

@kwlzn I tried installing nbclassic but with no progress. I used to start my server by running jupyter lab ... but should I start it with another command for this to have effect?

@ConverJens
You don't need nbclassic because jupyterlab 2.2.9 by default relies on jupyter/notebook as the backend.
If you can access /tree, then there's no need for nbclassic.
Did you try installing the nbextensions in the docker image right after installing the lab extension?

jupyter nbextension install --py widgetsnbextension
jupyter nbextension enable --py widgetsnbextension
jupyter nbextension install --py tensorflow_model_analysis
jupyter nbextension enable --py tensorflow_model_analysis

My tfma setup is probably different from yours.
I used these instructions to install the tfma npm package.
#56 (comment)

@mwakaba2
That was my understanding but I wanted to verify.

These are the commands I'm currently using:

RUN jupyter contrib nbextension install && \
    jupyter labextension install tensorflow_model_analysis@0.28.0 && \
    jupyter labextension install @jupyter-widgets/jupyterlab-manager@2  && \
    jupyter nbextension install --py --sys-prefix widgetsnbextension && \
    jupyter nbextension enable --py --sys-prefix widgetsnbextension && \
    jupyter nbextension install --py --sys-prefix tensorflow_model_analysis && \
    jupyter nbextension enable --py --sys-prefix tensorflow_model_analysis

along with pip installing the corresponding TFX version first. I've tested 0.26.0, 0.27.0 and 0.28.0 with exactly the same result. I still cannot load the vulcanized_tfma.js file.

@kwlzn @rcrowe-google
I'm not certain the issue that you are mentioning is actually what's causing my problem. I can run my image, notebook and data locally successfully but once I host it, it fails. I can run it locally without the jupyter nbextension steps.

@atn832 @fhuanming @mwakaba2 @kwlzn @rcrowe-google
So I think I found what's causing this issue: URL rewriting in KubeFlow is causing the vulcanized_tfma path to point to the wrong place.

If I check the Sources tab in the developer console, right click vulcanized_tfma.js, and choose "open in new tab", I'm directed to <kubeflow host>/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js, which is incorrect since the notebook lives at <kubeflow host>/notebook/admin/<notebook server name>. I'm also greeted by KubeFlow's 'this page doesn't exist' page.

If I manually concatenate these urls I get the actual js file: <kubeflow host>/notebook/admin/<notebook server name>/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js

So it seems that this is actually a path issue. This also perfectly explains why it works when running the image locally but not with the re-written url.
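
The path behaviour described above follows from ordinary URL resolution: a path with a leading slash is resolved against the host root, so any prefix added by a proxy is dropped. A short Python sketch using the standard library's `urljoin`; the host name `kubeflow.example.com` and the server name `my-server` are hypothetical stand-ins for the elided values:

```python
from urllib.parse import urljoin

# Hypothetical page URL for a notebook served under a Kubeflow-style prefix.
page_url = "https://kubeflow.example.com/notebook/admin/my-server/lab"
asset = "/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js"

# A root-relative path (leading slash) discards the prefix and resolves at the host root.
root_relative = urljoin(page_url, asset)

# Prepending the notebook prefix yields the form of URL reported to return the real file.
prefix = "/notebook/admin/my-server"
prefixed = urljoin(page_url, prefix + asset)

print(root_relative)
print(prefixed)
```

Locally the prefix is empty, so both forms resolve to the same URL, which is consistent with the image working outside KubeFlow.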

How do we proceed with this?

Thank you for the details! We were able to reproduce the loading issue in our own environment and are working on a fix.

@atn832

Out of curiosity, how is vulcanized_tfma.js produced? I was not able to work it out by grepping through the project.

It's ideal to drop the requirement of nbclassic if a user is using JupyterLab since a user should be able to use this plugin with just JupyterLab.

@atn832 Fantastic, thank you!

@jhamet93 TFMA does not have nbclassic as a requirement; it was just proposed as an optional fix if you're running jupyter server as your backend instead of jupyter notebook.

@ConverJens The latest JupyterLab major version (which uses Jupyter Server as the backend) is planning on dropping the nbclassic extension. Currently, there exists a shim to help ease users who are transitioning but this will cease to exist in the future. Thus, it makes sense for this to evolve to work out of the box with just JupyterLab since this seems like an antipattern. The nbclassic plugin is only needed for the JupyterLab extension to serve a static file which should be able to be hosted by a different mechanism such as a server extension.

@atn832 Any update on this?

@atn832 Any update? I'll keep pinging :)

Yes, my fix was merged a few days ago (cc7d75c) and this issue should be resolved on the next release.

Awesome! So it will be part of the 0.31.0 (or, if that's the next one, the 1.0.0) release then?

@ConverJens according to the release notes @atn832 added to, it's in 0.29

Ah yes, I misplaced this release note! It should be part of 0.31.0 instead.

@atn832 I'm running JupyterLab in the Kubeflow Notebook Server and I'm still getting the "Please enable JavaScript..." message.

Because isJupyterLab is set to true, it looks like it's setting the wrong templatePath for Kubeflow.

// Jupyter Lab
else if (window['isJupyterLab']) {
  templatePath = '/nbextensions/tensorflow_model_analysis/';
}
// Kubeflow
else {
  templatePath = __webpack_public_path__;
}

cc7d75c#diff-736e4346295db6d4a1296db9cff539db3a62d53bce1431ede323fe1a91566eabR31

@mwakaba2, Thanks for reporting the issue and checking the flag. Even though you all had explained the problem well, I misunderstood it and fixed the wrong issue. I was fixing @Bobgy's use case posted on Feb 25 (#10 (comment)), which is to show the TFMA UI in a standalone HTML page outside of a notebook environment, that HTML page being exported from a notebook.

This time, I'll see if I can support your use case of running the TFMA UI in Jupyter Lab hosted by Kubeflow Notebook Server.

@atn832 Thanks for keeping at it! This is a crucial piece to our user experience so thank you!

@atn832 thank you for the clarification, and I look forward to the Kubeflow support!

One thing I noticed about the Kubeflow Notebook environment is that there's an environment variable called NB_PREFIX. According to the docs, the Kubeflow notebook controller manages the base URL for the notebook server using that environment variable. To figure out if TFMA is running in a Kubeflow Notebook, one option could be to check if NB_PREFIX exists or not.
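
The NB_PREFIX idea can be sketched in Python as follows. This is only an illustration of the lookup, not TFMA code: the function names are hypothetical, and an actual fix would need the value on the JavaScript side of the widget. Note that the Dockerfile at the top of this thread sets NB_PREFIX to / for local runs, so the sketch treats / the same as unset instead of relying on the variable merely existing.

```python
import os
from typing import Mapping, Optional


def notebook_prefix(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return NB_PREFIX without a trailing slash, or '' when it is unset or '/'."""
    env = os.environ if environ is None else environ
    return env.get("NB_PREFIX", "").rstrip("/")


def nbextension_path(environ: Optional[Mapping[str, str]] = None) -> str:
    """Build the path under which the TFMA nbextension files would be served."""
    return notebook_prefix(environ) + "/nbextensions/tensorflow_model_analysis/"


# Hypothetical Kubeflow-style prefix versus a local run with NB_PREFIX=/.
print(nbextension_path({"NB_PREFIX": "/notebook/admin/my-server"}))
print(nbextension_path({"NB_PREFIX": "/"}))
```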

@atn832 Do you have any updates or know when this fix will be released?

Hi @atn832, sorry to bother you again. Do you have any updates or an ETA for the Kubeflow support? I'd like to make this extension available soon to Twitter employees using Kubeflow.

Ping @atn832. ETA?

@atn832 Any update on this? Did a fix make it into the 0.32.0 release?

Unfortunately I haven't got a chance to work on that yet. Keep you posted!

@atn832 Do you have any updates? Can someone else help with this?

@atn832 Update?

@atn832 I haven't given up on you! Any update? @mdreves? Anyone else on the team? This is still a blocker for all folks using KubeFlow hosted notebooks.

I think @mwakaba2 was planning to take this on as a contrib?

Yup, folks at Twitter are not able to use this extension for Kubeflow Notebooks.
I think my team may be able to contribute sometime in Q4, but if Google can make an update before that, that would be ideal.

I won't be able to work on it after all. If @Bobgy has time, they might be able to fix it, since they have experience with the issue and know how to reproduce it. If not @mdreves or @rcrowe-google will have to find someone else.

Ping @Bobgy, are you up for it?

This issue is still preventing us from actually using TFMA.

We are understaffed in this area, but @Bobgy is going to try to take a look.

@mdreves That would be fantastic! @Bobgy do you have any idea when you will be able to start working on it? I would be happy to test it!

I'm starting to reproduce, just verified using a local jupyter lab instance with instructions like #112 (comment). I can see tfma visualization in the notebook.

In this case, vulcanized_tfma.js is downloaded from http://127.0.0.1:8888/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js.

DISCLAIMER: the urls below are my private Kubeflow instance, so you won't be able to access them.

I reproduced the issue.
However, I think the "Please enable JavaScript" message is not the actual problem. #112 (comment)
The root cause is that when this file is downloaded from, for example, "https://dev-5-14.endpoints.gongyuan-dev.cloud.goog/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js", the content returned is actually the Kubeflow central dashboard HTML (which, as a side effect, shows "please enable javascript" when previewed in Chrome DevTools, because the DevTools preview does not run JavaScript).
What we expect instead is the minified JavaScript of vulcanized_tfma.js.

I'm getting closer, the correct download url should be "https://dev-5-14.endpoints.gongyuan-dev.cloud.goog/notebook/gongyuan/tfma4/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js".
This one properly returns js content of vulcanized_tfma.js.

The NB_PREFIX env var is "/notebook/gongyuan/tfma4" in my case, so the correct URL is
HOST/NB_PREFIX/nbextensions/tensorflow_model_analysis/vulcanized_tfma.js.
Somehow tfma widget needs to know how to inject NB_PREFIX in there.

That's pretty much what I can figure out based on my knowledge.
@mdreves can you or someone you know help figure out how ServerApp.base_url of jupyterlab can be passed to javascript in a jupyterlab widget? Then we need to fix

else if (window['isJupyterLab']) {
  templatePath = '/nbextensions/tensorflow_model_analysis/';
}

so that it adds the base_url in front of '/nbextensions/tensorflow_model_analysis/'.

Inspecting the JupyterLab HTML, I found some config data. We can probably query the content of this script directly, but I'm not sure if that's the official way to get the value.

<script id="jupyter-config-data" type="application/json">{"appName": "JupyterLab", "baseUrl": "/notebook/gongyuan/tfma4/" ... </script>

EDIT: my guess was right, https://github.com/jupyterlab/jupyterlab/blob/ce2229e86fc1bf04fdd2ed4dab226a790ad9cd8d/packages/coreutils/src/pageconfig.ts#L25-L34 is code for reading the config data.
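
To illustrate the lookup described above, here is a minimal Python sketch that extracts baseUrl from the jupyter-config-data script tag and prepends it to the nbextensions path. It only demonstrates the idea on a saved page; the class and function names are hypothetical, the sample page is a made-up minimal document, and the widget itself would do the equivalent in JavaScript.

```python
import json
from html.parser import HTMLParser


class JupyterConfigParser(HTMLParser):
    """Collect the JSON text inside <script id="jupyter-config-data">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "jupyter-config-data":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def config(self) -> dict:
        return json.loads("".join(self._chunks)) if self._chunks else {}


def read_base_url(page_html: str) -> str:
    """Return the baseUrl from the page config, defaulting to '/' when absent."""
    parser = JupyterConfigParser()
    parser.feed(page_html)
    parser.close()
    return parser.config().get("baseUrl", "/")


# Made-up minimal page; a real JupyterLab page carries many more config keys.
sample_page = (
    '<html><head><script id="jupyter-config-data" type="application/json">'
    '{"appName": "JupyterLab", "baseUrl": "/notebook/gongyuan/tfma4/"}'
    "</script></head><body></body></html>"
)

template_path = read_base_url(sample_page) + "nbextensions/tensorflow_model_analysis/"
print(template_path)
```

Because baseUrl ends with a slash, the nbextensions segment is appended without a leading slash to avoid a doubled separator.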

We think this should be fixed now, but would like someone to try it out. Special thanks to @Bobgy and @zijianjoy.

@mdreves I can probably test this during next week.

@mdreves Sorry for the late response! I've had multiple issues that I had to deal with. I'm ready to test this. What do I need to do to install your changes?

@mdreves We can try this out as well. Will commit some time to testing the commit.

The 0.35 release will be out shortly if you want to wait for that so you don't need to install from head, though you can also try with the nightly.

@mdreves
I just tried to test this but when running tfma.view.render_slicing_metrics(eval_result) I got Error displaying widget.
I'm using jupyterlab 3 and I have installed TFMA using the official instructions.
Any idea why I get this?

@mdreves @jhamet93 @Bobgy @zywind Has anyone else tested this and got this to work in JupyterLab?

I've tried it and had the same error with 0.35 release.

My apologies, but I don't have the cycles to work on this and we don't have anyone with expertise in this area at the moment.

Just to clarify, it works except for when you go to display the results using tfma.view.render_slicing_metrics. Is the full error Error displaying widget or is there more information?

This is what I see in jupyter lab:
Screen Shot 2021-11-17 at 3 43 26 PM

@mdreves
I see exactly the same as @zywind posted as well.

We'd be very interested if someone can do a PR to fix this.

@mdreves Do we have any update on this issue? Do you have the time to work on it now?

@embr as FYI since he is the new TL for TFMA.

Unfortunately we still don't have the cycles to work on this and we don't have anyone with expertise in this area at the moment. We would be very happy to see a community contribution in this case.

I've had the same issue with JupyterLab showing "Error displaying widget". My current setup is:

Python 3.8.10
TF version: 2.7.0
TFMA version: 0.36.0

I got TFMA to work by installing the above mentioned notebook extensions

$ jupyter nbextension enable --py widgetsnbextension
$ jupyter nbextension install --py --symlink tensorflow_model_analysis
$ jupyter nbextension enable --py tensorflow_model_analysis

and then launch the file in a traditional Jupyter notebook:

$ jupyter notebook

@mdreves @embr
Any update on this? Can this be prioritised now?

@JanetVictorious That's not fixing this at all; you've just used Jupyter Notebook instead of JupyterLab, and this entire issue is about JupyterLab support.

Hopefully someone has the expertise to work on a PR. Meanwhile, we are actively working on a DataFrame API as an alternative that exposes results as pandas DataFrames for easier plotting.

This may or may not help. I recently got the TFMA visualization working in JupyterLab by changing my browser settings. I'm using Chrome, and under Settings > Privacy and security > Security my setting was "Enhanced protection". Changing it to "Standard protection" fixed the issue, which was caused by Chrome stopping what it sees as a CORB request. I was using this code to display the visualization:

from IPython.display import IFrame
from ipywidgets.embed import embed_minimal_html

embed_minimal_html('tfma_slices_overview.html',
                   views=[store.display_tfma_analysis(13, slicing_column='trip_start_hour')],
                   title='Slices Overview')
IFrame(src='tfma_slices_overview.html', width=900, height=600)

I hope that helps!

Hi @ConverJens

Can you please confirm on the comment from @rcrowe-google and suggest whether this can be closed. Thank you.