Enable running tfc.run() on a notebook from within an AI Platform hosted notebook.
SinaChavoshi opened this issue
Using AI Platform hosted notebooks, we created a Jupyter notebook with the model that we were planning to train and saved it. We created a separate notebook containing our runner script, similar to:
import tensorflow_cloud as tfc

tfc.run(
    docker_config=tfc.DockerConfig(
        image_build_bucket="somebucket",
        parent_image="gcr.io/xyz"),
    entry_point="model.ipynb",
    distribution_strategy="auto",
    worker_count=5,
    requirements_txt='requirements.txt',
    chief_config=tfc.COMMON_MACHINE_CONFIGS["CPU"],
    worker_config=tfc.COMMON_MACHINE_CONFIGS["CPU"],
    job_labels={
        "job": "kaggle_competition",
        "team": "base_line",
    },
    stream_logs=False
)
The run fails with the following error:
/opt/conda/lib/python3.7/site-packages/tensorflow_cloud/core/preprocess.py in _get_colab_notebook_content()
207 def _get_colab_notebook_content():
208 """Returns the colab notebook python code contents."""
--> 209 response = _message.blocking_request("get_ipynb",
210 request="",
211 timeout_sec=200)
AttributeError: 'NoneType' object has no attribute 'blocking_request'
It would be nice to add support for this case, where all requirements and a proper base image are provided directly for the remote run.
Any news on this?
Any updates?
Try the following. I was not running from an AI Platform notebook but from a private GitLab instance, but the error seems to be related to the same bug (or imprecise code):
Within the tensorflow-cloud code, there is detection logic to see whether the code is running from a Google Colab notebook or from a Kaggle notebook. As you can see in the error, it ran into the 'colab' branch, which failed because it was not running from Colab.
Within 'preprocess.py', the proper branch is reached if 'called_from_notebook' has the value 'False'. The detection for this is in the 'run.py' module, which checks whether 'IPython.get_ipython().__class__.__name__' contains the word "Shell".
For me (GitLab) it contains the word "Shell", so the branching goes in the wrong direction.
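To illustrate why this check misfires, here is a minimal sketch of the class-name test described above. It is not the tensorflow-cloud source: `looks_like_notebook` is a hypothetical helper, and the two classes are stand-ins named after the shell classes that Jupyter kernels and the IPython terminal use.

```python
def looks_like_notebook(shell):
    """Return True if the shell's class name contains "Shell".

    Hypothetical helper mirroring the check described above;
    `shell` plays the role of IPython.get_ipython()'s return value.
    """
    return "Shell" in shell.__class__.__name__


# Stand-in classes; only their names matter for the check.
class ZMQInteractiveShell:
    pass


class TerminalInteractiveShell:
    pass


print(looks_like_notebook(ZMQInteractiveShell()))       # True
print(looks_like_notebook(TerminalInteractiveShell()))  # True
print(looks_like_notebook(None))                        # False ("NoneType")
```

Any IPython-based environment passes this test, so a hosted Jupyter notebook is treated the same way as Colab even though the Colab-specific messaging API is not available there.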
Long story short. Quick and dirty fix:
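```python
from unittest.mock import patch

import tensorflow_cloud as tfc


def _called_from_notebook_FIX():
    # Always report "not called from a notebook" so the Colab branch is skipped.
    return False


with patch('tensorflow_cloud.core.run._called_from_notebook', new=_called_from_notebook_FIX):
    tfc.run(entry_point="model.ipynb")  # replace with your full tfc.run(...) call
```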
With 'False' monkey-patched in, it runs for me. Someone (maybe me) should write a pull request for better environment detection in tensorflow-cloud.
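As a rough idea of what stricter detection could look like, the sketch below only reports Colab when the `google.colab` package has actually been loaded. This is an assumption about one possible approach, not what tensorflow-cloud does; `running_in_colab` is a hypothetical name, and the `modules` parameter exists only so the check can be exercised outside Colab.

```python
import sys


def running_in_colab(modules=None):
    """Return True only if the google.colab package is loaded.

    Hypothetical helper, not part of tensorflow_cloud. `modules`
    defaults to sys.modules and can be overridden for testing.
    """
    if modules is None:
        modules = sys.modules
    return "google.colab" in modules


print(running_in_colab({"google.colab": object()}))  # True
print(running_in_colab({}))                          # False
```

A check like this would keep hosted Jupyter or GitLab environments out of the Colab branch without needing a monkey patch.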