dmlc / tensorboard

Standalone TensorBoard for visualizing in deep learning

Merge into MXNet

zihaolucky opened this issue · comments

@piiswrong

Any work or advice before we merge this into MXNet? I was going to write a script to auto-build the wheel file using Travis-CI, so users could install TensorBoard simply with pip install tensorboard, just like TF does. Otherwise, users have to install bazel, which makes MXNet's installation too heavy.

Have you figured out how to do this? Would be great if this works.

@piiswrong @szha

I followed the script in mxnet-distro, but the bazel build for tensorboard is too slow: https://travis-ci.org/zihaolucky/tensorboard/jobs/200914155#L3193-L3214

Any ideas? I found that the build time on mac must be less than 50 mins, or the job gets terminated.

I've tried changing the bazel config (ram_utilization_factor or local_resources) to speed it up, but it didn't work.

It's compiling tensorflow ops. Can you make it compile only tensorboard? Maybe try modifying the bazel config file?

Good idea, I'll try.

When I first noticed this, I compared building tensorboard alone with building the entire tensorflow. The number of jobs is different, but I assumed those jobs were necessary and didn't try modifying the config file ><

I think it might work: since we've replaced the logging part, the tensorflow ops could be removed.

I used snakefood to analyze the dependencies of the rendering part:

sfood -fuq tensorflow/tensorboard/tensorboard.py | sfood-filter-stdlib | sfood-target-files > deps.log

I found that the unnecessary tensorflow ops are imported from several files (tensorflow/tensorflow/python/summary/event_accumulator.py, etc.), while the methods they use don't require tensorflow ops, so it's possible to replace them. The other task is to change the bazel config/BUILD and make sure the SWIG files won't cause major trouble.
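For anyone without snakefood installed, the same kind of check can be roughly sketched with the standard library's ast module. This is only an illustration, not the analysis I actually ran: find_imports is a hypothetical helper, and the sample source below is made up rather than copied from the tensorflow tree.

```python
import ast


def find_imports(source, prefix="tensorflow"):
    """Return the sorted module names imported by `source` that fall under `prefix`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from X import Y": record the module X.
            names = [node.module]
        else:
            continue
        found.update(n for n in names if n == prefix or n.startswith(prefix + "."))
    return sorted(found)


# Made-up sample source, for illustration only.
sample = """
import os
from tensorflow.python.framework import ops
from tensorflow.python.summary import event_accumulator
"""
print(find_imports(sample))
```

Running it over each file of the rendering part would give a list of tensorflow modules to look at, though unlike snakefood it does not follow imports transitively.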

However, changing the rendering part and the BUILD file could make our project harder to maintain. Any ideas? Should we consider building the project with Jenkins like TF does? Ref: https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/

If we use Jenkins, I have to set up a machine and environments for it. But TF has provided reusable scripts.

@piiswrong Could you help check https://travis-ci.org/zihaolucky/tensorboard/jobs/203088383#L645? It's okay on mac, but it always fails on linux.

./configure < ../tools/pip_wheel/configure.conf

I'll take a look later.
We can make a private repo for distribution so you don't need to hack compile time.

Or we can move the linux build to Jenkins.

Thanks! To get around the build time limit in Travis, I have to remove some dependencies in its bazel config. I think that's okay as long as it doesn't cause any unexpected errors or bugs in the future, and technically it shouldn't, as we only remove the unnecessary ops which tensorboard won't use.

So I would like to keep the Travis way for now (for simplicity) and focus on the logging part, as we still have graph, video and even embedding to do.

The error in linux is caused by:

./configure < ../tools/pip_wheel/configure.conf

This step has to configure some install options before bazel can start. Since we have no keyboard input on CI, the script feeds it the answers from /tools/pip_wheel/configure.conf. My branch: https://github.com/zihaolucky/tensorboard/tree/travis_pip_wheel
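To show what the stdin redirection does, here is a minimal Python sketch of the same idea. run_with_answers is a hypothetical helper, the one-line prompt_script is just a stand-in for ./configure, and the answers written to the temporary file are invented, not the real contents of configure.conf.

```python
import subprocess
import sys
import tempfile


def run_with_answers(cmd, answers_path):
    """Run `cmd` with its stdin redirected from the file at `answers_path`."""
    with open(answers_path, "rb") as answers:
        return subprocess.run(cmd, stdin=answers, stdout=subprocess.PIPE, check=True)


# Stand-in for an interactive script: it echoes back every answer it reads.
prompt_script = "import sys; print([line.strip() for line in sys.stdin])"

# Invented answers: an empty line (accept the default) followed by two "n".
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as f:
    f.write("\nn\nn\n")
    conf_path = f.name

result = run_with_answers([sys.executable, "-c", prompt_script], conf_path)
print(result.stdout.decode())
```

Each line of the file answers one prompt, so the number and order of lines has to match the prompts the script asks on that platform.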

@piiswrong @mli @jermainewang

Now that we have PyPI, how about adding tensorboard as a default requirement of the MXNet Python package?

Any work I should do?

  • More docs about tensorboard.
  • Any changes in MXNet?

We can add some callbacks in mxnet.
Let's start with an optional import and see if anyone has trouble installing.
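A rough sketch of what an optional import plus a callback could look like. Nothing here is the final MXNet API: optional_import and make_metric_logger are hypothetical names, the writer's add_scalar(name, value, step) method is an assumption, and the callback only assumes the param object exposes nbatch and eval_metric.get_name_value().

```python
import importlib


def optional_import(name):
    """Return the module called `name`, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def make_metric_logger(writer):
    """Build a batch-end callback that forwards metrics to `writer`.

    `writer` is any object with an add_scalar(name, value, step) method
    (an assumed interface); passing None turns the callback into a no-op.
    """
    def callback(param):
        if writer is None or param.eval_metric is None:
            return
        for name, value in param.eval_metric.get_name_value():
            writer.add_scalar(name, value, param.nbatch)
    return callback


# Training code keeps working whether or not the package is installed.
tensorboard = optional_import("tensorboard")
if tensorboard is None:
    print("tensorboard not installed, metric logging disabled")
```

With this shape, users who never install tensorboard see no change, and we can find out whether installation is a problem before making it a hard requirement.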

Yep.