tensorflow / build

Build-related tools for TensorFlow

Bazel build doesn't check cache actions for source code build

AmosChenYQ opened this issue

I followed the official documentation and compiled the source code successfully on my own PC. But each time I add a VLOG or make some other small code change and then rebuild, Bazel doesn't seem to reuse any cached actions, which results in a very long compilation time. Most of the time is spent recompiling source or dependency files such as LLVM and the files under tensorflow/compiler/xla, which I haven't changed at all.

I also tried the Docker image build method mentioned in the docs above. In the Docker container provided by the docs, Bazel does use the cache and builds incrementally.

So how do I check or set the Bazel configuration on my PC so that it builds incrementally and compiles faster?
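
One way to see why Bazel re-runs actions is its --explain option, which writes the reason each action was executed to a log file; --verbose_explanations adds more detail. A minimal sketch of assembling such a command line follows. The helper name explain_build_argv, the log path, and the target label are placeholders for illustration, not something taken from this thread.

```python
import shlex


def explain_build_argv(target, log_path="/tmp/bazel_explain.log"):
    """Return a `bazel build` command that logs why each action was re-executed.

    --explain and --verbose_explanations are standard Bazel build options;
    the target and log path are placeholders chosen for this sketch.
    """
    return [
        "bazel",
        "build",
        f"--explain={log_path}",
        "--verbose_explanations",
        target,
    ]


# Hypothetical target label; substitute the one you actually build.
argv = explain_build_argv("//tensorflow/tools/pip_package:build_pip_package")
print(shlex.join(argv))
```

Running the printed command and then reading the log should show which input or environment change triggered each rebuild.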

I have been searching for ways to solve this issue over the past few days. Could this be the reason?

commented

If this happens when you rebase/merge/pull upstream -> yes.

commented

Check also:
#5
#48

Thanks for replying. I did update from upstream at the very beginning, but I no longer do that. The behavior you describe is what I would expect from Bazel. But my situation is a bit strange: I have two machines, and one can use the cache while the other can't.

One machine is a shared server used by my classmates in the lab, so I use Docker to isolate my TensorFlow development environment from it. The image I use is tensorflow/tensorflow:devel-gpu. I build the source in a bash shell inside the container and then commit the container, so that next time I can save time by starting from the newly committed image, which includes the build cache. This works and is convenient.

The problem happens on my local machine. There I don't use Docker to isolate the environment, and I don't update the source repo; I just modify a few lines of code and then build. Bazel uses the cache right after a build, but if I build again a few days later it doesn't, and it rebuilds from source. (My local machine was never shut down or restarted during those days, and I did not delete or modify Bazel's cache folders such as ~/.cache/bazel and bazel-in/out/bin.) I think some folders must be getting deleted during that time, but I can't figure out which ones.

commented

Do you have something that is changing your PATH?

If you are on Linux, check it with printenv.

SHELL=/bin/bash
LANGUAGE=en_US:en
LC_ADDRESS=en_US.UTF-8
LC_NAME=en_US.UTF-8
TF_CPP_MIN_LOG_LEVEL=0
LC_MONETARY=en_US.UTF-8
TF_FORCE_GPU_ALLOW_GROWTH=true
PWD=/home/amoschenyq
LOGNAME=amoschenyq
XDG_SESSION_TYPE=tty
TF_CPP_MAX_VLOG_LEVEL=1
MOTD_SHOWN=pam
HOME=/home/amoschenyq
LC_PAPER=en_US.UTF-8
LANG=en_US.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_SESSION_CLASS=user
LC_IDENTIFICATION=en_US.UTF-8
TERM=xterm-256color
LESSOPEN=| /usr/bin/lesspipe %s
USER=amoschenyq
SHLVL=0
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
XDG_SESSION_ID=1491
PAPERSIZE=letter
LD_LIBRARY_PATH=/usr/local/TensorRT-8.4.0.6/lib:/usr/local/cuda-11.6/lib64:
XDG_RUNTIME_DIR=/run/user/1000
LC_TIME=en_US.UTF-8
XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop
TMP=/mnt/hard-disk/tmp
PATH=/home/amoschenyq/.local/bin:/mnt/hard-disk/usr/local/bin:/home/amoschenyq/.vim/plugged/fzf/bin:/home/amoschenyq/.bazel/bin:/usr/local/TensorRT-8.4.0.6/bin:/usr/local/cuda-11.6/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
LC_NUMERIC=en_US.UTF-8
_=/usr/bin/printenv

Maybe it is because I moved the TMP folder to my HDD instead of the SSD to save space?

commented

No, TMP is not involved. You only need to check whether an action-env environment variable (like PATH) changed between builds:

https://github.com/tensorflow/tensorflow/blob/master/.bazelrc#L158

I did change my PATH a few days ago to add my LLVM/MLIR to it... so this strange behavior has a clear cause, and the issue is solved!
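
The check for a changed action-env variable can be automated with a small script that records the relevant variables before one build and compares them before the next. Environment variables passed through to actions are part of what Bazel uses to decide whether a cached action is still valid, so a changed PATH can force rebuilds of otherwise untouched files. This is only a sketch: the function names, the snapshot file, the watched variable list (PATH and LD_LIBRARY_PATH), and the /opt/llvm/bin directory in the demo are assumptions; adjust the list to whatever your .bazelrc passes through with --action_env.

```python
import json
import os
from pathlib import Path

# Assumed list of variables that reach Bazel actions; adjust to your setup.
WATCHED = ("PATH", "LD_LIBRARY_PATH")


def snapshot_env(environ=None, names=WATCHED):
    """Capture the current values of the watched variables ('' if unset)."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, "") for name in names}


def diff_env(old, new):
    """Return {name: (old_value, new_value)} for every variable that differs."""
    return {
        name: (old.get(name, ""), new.get(name, ""))
        for name in sorted(set(old) | set(new))
        if old.get(name, "") != new.get(name, "")
    }


def check_against_snapshot(snapshot_file, environ=None):
    """Compare the environment with the saved snapshot, then refresh the snapshot.

    On the first run there is no snapshot yet, so nothing is reported.
    """
    path = Path(snapshot_file)
    current = snapshot_env(environ)
    previous = json.loads(path.read_text()) if path.exists() else current
    path.write_text(json.dumps(current, indent=2))
    return diff_env(previous, current)


# Demo with in-memory environments: a hypothetical LLVM directory
# prepended to PATH shows up as a change.
before = snapshot_env({"PATH": "/usr/bin"})
after = snapshot_env({"PATH": "/opt/llvm/bin:/usr/bin"})
print(diff_env(before, after))
```

Calling check_against_snapshot with a fixed file path before each build would report any watched variable that changed since the previous run. If PATH is reported, building from a shell with the old PATH (or otherwise keeping the build environment stable) should let Bazel reuse its cache again.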