google-research / vision_transformer

An error occurred when executing the fine-tuning training steps in vit_jax.ipynb on a Colab GPU

jax11235 opened this issue

Model: vit-b/16
Batch_size: 1

Error info:
RuntimeError: UNKNOWN: Failed to determine best cudnn convolution algorithm for:
%cudnn-conv-bias-activation.1 = (f32[512,14,14,768]{2,1,3,0}, u8[0]{0}) custom-call(f32[512,224,224,3]{2,1,3,0} %copy, f32[16,16,3,768]{1,0,2,3} %copy.1, f32[768]{0} %get-tuple-element.198), window={size=16x16 stride=16x16}, dim_labels=b01f_01io->b01f, custom_call_target="__cudnn$convBiasActivationForward", metadata={op_name="pmap()/jit(main)/conv_general_dilated[window_strides=(16, 16) padding=((0, 0), (0, 0)) lhs_dilation=(1, 1) rhs_dilation=(1, 1) dimension_numbers=ConvDimensionNumbers(lhs_spec=(0, 3, 1, 2), rhs_spec=(3, 2, 0, 1), out_spec=(0, 3, 1, 2)) feature_group_count=1 batch_group_count=1 lhs_shape=(512, 224, 224, 3) rhs_shape=(16, 16, 3, 768) precision=None preferred_element_type=None]" source_file="/usr/local/lib/python3.7/dist-packages/flax/linen/linear.py" source_line=434}, backend_config="{"conv_result_scale":1,"activation_mode":"0","side_input_scale":0}"

Original error: UNIMPLEMENTED: DNN library is not found.

To ignore this failure and try to use a fallback algorithm (which may have suboptimal performance), use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false. Please also file a bug for the root cause of failing autotuning.
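
For reference, the fallback named in the error message above could be enabled from a notebook cell roughly as follows. This is only a sketch: the helper name add_xla_flag is made up for illustration, the cell is assumed to run before the first `import jax` so that XLA sees the variable when it initializes, and it does not address the underlying "DNN library is not found" cause.

```python
import os


def add_xla_flag(flag, env=os.environ):
    """Append `flag` to XLA_FLAGS unless it is already present.

    `add_xla_flag` is a hypothetical helper, not part of JAX or XLA.
    """
    flags = env.get("XLA_FLAGS", "").split()
    if flag not in flags:
        flags.append(flag)
    env["XLA_FLAGS"] = " ".join(flags)
    return env["XLA_FLAGS"]


# Flag text copied from the error message; run this before `import jax`.
add_xla_flag("--xla_gpu_strict_conv_algorithm_picker=false")
print(os.environ["XLA_FLAGS"])
```

As the message itself warns, the fallback algorithm may perform worse than the autotuned one, so this is a workaround rather than a fix.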

I just ran the public Colab
https://colab.sandbox.google.com/github/google-research/vision_transformer/blob/main/vit_jax.ipynb
on the public GPU runtime and did not encounter this error.

I have the following packages installed:

!pip freeze
absl-py==1.1.0
alabaster==0.7.12
albumentations==0.1.12
altair==4.2.0
appdirs==1.4.4
aqtp==0.0.9
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arviz==0.12.1
astor==0.8.1
astropy==4.3.1
astunparse==1.6.3
atari-py==0.2.9
atomicwrites==1.4.0
attrs==21.4.0
audioread==2.1.9
autograd==1.4
Babel==2.10.2
backcall==0.2.0
beautifulsoup4==4.6.3
bleach==5.0.0
blis==0.7.7
bokeh==2.3.3
branca==0.5.0
bs4==0.0.1
CacheControl==0.12.11
cached-property==1.5.2
cachetools==4.2.4
catalogue==2.0.7
certifi==2022.6.15
cffi==1.15.0
cftime==1.6.0
chardet==3.0.4
charset-normalizer==2.0.12
chex==0.1.3
click==7.1.2
cloudpickle==1.3.0
clu==0.0.7
cmake==3.22.5
cmdstanpy==0.9.5
colorama==0.4.5
colorcet==3.0.0
colorlover==0.3.0
commonmark==0.9.1
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.4.0
coverage==3.7.1
coveralls==0.5
crcmod==1.7
cufflinks==0.17.3
cupy-cuda111==9.4.0
cvxopt==1.2.7
cvxpy==1.0.31
cycler==0.11.0
cymem==2.0.6
Cython==0.29.30
dacite==1.6.0
daft==0.0.4
dask==2.12.0
datascience==0.10.6
debugpy==1.0.0
decorator==4.4.2
defusedxml==0.7.1
descartes==1.1.0
dill==0.3.5.1
distributed==1.25.3
dlib==19.18.0+zzzcolab20220513001918
dm-tree==0.1.7
docopt==0.6.2
docutils==0.17.1
dopamine-rl==1.0.5
earthengine-api==0.1.315
easydict==1.9
ecos==2.0.10
editdistance==0.5.3
einops==0.4.1
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.3.0/en_core_web_sm-3.3.0-py3-none-any.whl
entrypoints==0.4
ephem==4.1.3
et-xmlfile==1.1.0
etils==0.6.0
fa2==0.3.5
fastai==2.6.3
fastcore==1.4.4
fastdownload==0.0.6
fastdtw==0.3.4
fastjsonschema==2.15.3
fastprogress==1.0.2
fastrlock==0.8
fbprophet==0.7.1
feather-format==0.4.1
filelock==3.7.1
firebase-admin==4.4.0
fix-yahoo-finance==0.0.22
Flask==1.1.4
flatbuffers==1.12
flax==0.5.2
flaxformer @ git+https://github.com/google/flaxformer@9712a16a807ec21ad7cbf816e9f6a9c174ea795d
folium==0.8.3
future==0.16.0
gast==0.4.0
GDAL==2.2.2
gdown==4.4.0
gensim==3.6.0
geographiclib==1.52
geopy==1.17.0
gin-config==0.5.0
glob2==0.7
google==2.0.3
google-api-core==1.31.6
google-api-python-client==1.12.11
google-auth==1.35.0
google-auth-httplib2==0.0.4
google-auth-oauthlib==0.4.6
google-cloud-bigquery==1.21.0
google-cloud-bigquery-storage==1.1.2
google-cloud-core==1.0.3
google-cloud-datastore==1.8.0
google-cloud-firestore==1.7.0
google-cloud-language==1.2.0
google-cloud-storage==1.18.1
google-cloud-translate==1.5.0
google-colab @ file:///colabtools/dist/google-colab-1.0.0.tar.gz
google-pasta==0.2.0
google-resumable-media==0.4.1
googleapis-common-protos==1.56.2
googledrivedownloader==0.4
graphviz==0.10.1
greenlet==1.1.2
grpcio==1.46.3
gspread==3.4.2
gspread-dataframe==3.0.8
gym==0.17.3
h5py==3.1.0
HeapDict==1.0.1
hijri-converter==2.2.4
holidays==0.10.5.2
holoviews==1.14.9
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.17.4
httplib2shim==0.0.3
humanize==0.5.1
hyperopt==0.1.2
ideep4py==2.0.0.post3
idna==2.10
imageio==2.4.1
imagesize==1.3.0
imbalanced-learn==0.8.1
imblearn==0.0
imgaug==0.2.9
importlib-metadata==4.11.4
importlib-resources==5.7.1
imutils==0.5.4
inflect==2.1.0
iniconfig==1.1.1
intel-openmp==2022.1.0
intervaltree==2.1.0
ipykernel==4.10.1
ipython==5.5.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.7.0
itsdangerous==1.1.0
jax==0.3.8
jaxlib @ https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.7+cuda11.cudnn805-cp37-none-manylinux2014_x86_64.whl
jedi==0.18.1
jieba==0.42.1
Jinja2==2.11.3
joblib==1.1.0
jpeg4py==0.1.4
jsonschema==4.3.3
jupyter==1.0.0
jupyter-client==5.3.5
jupyter-console==5.2.0
jupyter-core==4.10.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.0
kaggle==1.5.12
kapre==0.3.7
keras==2.9.0
Keras-Preprocessing==1.1.2
keras-vis==0.4.1
kiwisolver==1.4.3
korean-lunar-calendar==0.2.1
langcodes==3.3.0
libclang==14.0.1
librosa==0.8.1
lightgbm==2.2.3
llvmlite==0.34.0
lmdb==0.99
LunarCalendar==0.0.9
lxml==4.2.6
Markdown==3.3.7
MarkupSafe==2.0.1
matplotlib==3.2.2
matplotlib-inline==0.1.3
matplotlib-venn==0.11.7
missingno==0.5.1
mistune==0.8.4
mizani==0.6.0
mkl==2019.0
ml-collections==0.1.1
mlxtend==0.14.0
more-itertools==8.13.0
moviepy==0.2.3.5
mpmath==1.2.1
msgpack==1.0.4
multiprocess==0.70.13
multitasking==0.0.10
murmurhash==1.0.7
music21==5.5.0
natsort==5.5.0
nbclient==0.6.4
nbconvert==5.6.1
nbformat==5.4.0
nest-asyncio==1.5.5
netCDF4==1.5.8
networkx==2.6.3
nibabel==3.0.2
nltk==3.7
notebook==5.3.1
numba==0.51.2
numexpr==2.8.1
numpy==1.21.6
oauth2client==4.1.3
oauthlib==3.2.0
okgrade==0.4.3
opencv-contrib-python==4.1.2.30
opencv-python==4.1.2.30
openpyxl==3.0.10
opt-einsum==3.3.0
optax==0.1.2
osqp==0.6.2.post0
packaging==21.3
palettable==3.3.0
pandas==1.3.5
pandas-datareader==0.9.0
pandas-gbq==0.13.3
pandas-profiling==1.4.1
pandocfilters==1.5.0
panel==0.12.1
param==1.12.1
parso==0.8.3
pathlib==1.0.1
pathy==0.6.1
patsy==0.5.2
pep517==0.12.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.1.2
pip-tools==6.2.0
plotly==5.5.0
plotnine==0.6.0
pluggy==0.7.1
pooch==1.6.0
portpicker==1.3.9
prefetch-generator==1.0.1
preshed==3.0.6
prettytable==3.3.0
progressbar2==3.38.0
prometheus-client==0.14.1
promise==2.3
prompt-toolkit==1.0.18
protobuf==3.17.3
psutil==5.4.8
psycopg2==2.7.6.1
ptyprocess==0.7.0
py==1.11.0
pyarrow==6.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.4
pycparser==2.21
pyct==0.4.8
pydantic==1.8.2
pydata-google-auth==1.4.0
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyerfa==2.0.0.1
pyglet==1.5.0
Pygments==2.6.1
pygobject==3.26.1
pymc3==3.11.4
PyMeeus==0.5.11
pymongo==4.1.1
pymystem3==0.2.0
PyOpenGL==3.1.6
pyparsing==3.0.9
pyrsistent==0.18.1
pysndfile==1.3.8
PySocks==1.7.1
pystan==2.19.1.1
pytest==3.6.4
python-apt==0.0.0
python-chess==0.23.11
python-dateutil==2.8.2
python-louvain==0.16
python-slugify==6.1.2
python-utils==3.3.3
pytz==2022.1
pyviz-comms==2.2.0
PyWavelets==1.3.0
PyYAML==6.0
pyzmq==23.1.0
qdldl==0.1.5.post2
qtconsole==5.3.1
QtPy==2.1.0
regex==2022.6.2
requests==2.23.0
requests-oauthlib==1.3.1
resampy==0.2.2
rich==11.2.0
rpy2==3.4.5
rsa==4.8
scikit-image==0.18.3
scikit-learn==1.0.2
scipy==1.4.1
screen-resolution-extra==0.0.0
scs==3.2.0
seaborn==0.11.2
semver==2.13.0
Send2Trash==1.8.0
setuptools-git==1.2
Shapely==1.8.2
simplegeneric==0.8.1
six==1.15.0
sklearn==0.0
sklearn-pandas==1.8.0
smart-open==5.2.1
snowballstemmer==2.2.0
sortedcontainers==2.4.0
SoundFile==0.10.3.post1
soupsieve==2.3.2.post1
spacy==3.3.1
spacy-legacy==3.0.9
spacy-loggers==1.0.2
Sphinx==1.8.6
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-websupport==1.2.4
SQLAlchemy==1.4.37
sqlparse==0.4.2
srsly==2.4.3
statsmodels==0.10.2
sympy==1.7.1
tables==3.7.0
tabulate==0.8.9
tblib==1.7.0
tenacity==8.0.1
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.9.1
tensorflow-cpu==2.9.1
tensorflow-datasets==4.6.0
tensorflow-estimator==2.9.0
tensorflow-gcs-config==2.8.0
tensorflow-hub==0.12.0
tensorflow-io-gcs-filesystem==0.26.0
tensorflow-metadata==1.8.0
tensorflow-probability==0.16.0
tensorflow-text==2.9.0
termcolor==1.1.0
terminado==0.13.3
testpath==0.6.0
text-unidecode==1.3
textblob==0.15.3
Theano-PyMC==1.1.2
thinc==8.0.17
threadpoolctl==3.1.0
tifffile==2021.11.2
tinycss2==1.1.1
toml==0.10.2
tomli==2.0.1
toolz==0.11.2
torch @ https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
torchaudio @ https://download.pytorch.org/whl/cu113/torchaudio-0.11.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
torchsummary==1.5.1
torchtext==0.12.0
torchvision @ https://download.pytorch.org/whl/cu113/torchvision-0.12.0%2Bcu113-cp37-cp37m-linux_x86_64.whl
tornado==5.1.1
tqdm==4.64.0
traitlets==5.1.1
tweepy==3.10.0
typeguard==2.7.1
typer==0.4.1
typing-extensions==4.1.1
tzlocal==1.5.1
uritemplate==3.0.1
urllib3==1.24.3
vega-datasets==0.9.0
wasabi==0.9.1
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.6.0
wordcloud==1.5.0
wrapt==1.14.1
xarray==0.20.2
xarray-einstats==0.2.2
xgboost==0.90
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yellowbrick==1.4
zict==2.2.0
zipp==3.8.0
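
A list like the one above is easiest to use by diffing it against the `pip freeze` output of the failing runtime. The following is a minimal sketch of such a comparison; parse_freeze and diff_freeze are hypothetical helper names, and the second environment is invented purely to show the output shape.

```python
def parse_freeze(text):
    """Map package name -> version (or URL for `name @ url` entries)."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if " @ " in line:
            name, spec = line.split(" @ ", 1)
        elif "==" in line:
            name, spec = line.split("==", 1)
        else:
            continue  # skip blanks and anything that is not a pinned entry
        packages[name.strip().lower()] = spec.strip()
    return packages


def diff_freeze(left, right):
    """Return {name: (left_spec, right_spec)} for packages that differ."""
    a, b = parse_freeze(left), parse_freeze(right)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Three entries copied from the list above.
working = """\
flax==0.5.2
jax==0.3.8
jaxlib @ https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.7+cuda11.cudnn805-cp37-none-manylinux2014_x86_64.whl
"""

# Hypothetical second environment; the jaxlib version is a placeholder.
other = """\
flax==0.5.2
jax==0.3.8
jaxlib==0.0.0
"""

for name, (ours, theirs) in diff_freeze(working, other).items():
    print(f"{name}: {ours} != {theirs}")
```

For this kind of error the jax and jaxlib entries are probably the ones to check first, since the jaxlib wheel name encodes the CUDA and cuDNN versions it was built against.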

It now works without any extra steps; I just ran https://github.com/google-research/vision_transformer/blob/main/vit_jax.ipynb as is.

OK, good to know!

This can happen because packages get updated and we don't pin the package versions in our install_requires:

install_requires = [
    'absl-py',
    'clu',
    'einops',
    'flax',
    'flaxformer @ git+https://github.com/google/flaxformer',
    'jax',
    'ml-collections',
    'numpy',
    'packaging',
    'pandas',
    'scipy',
    'tensorflow_datasets',
    'tensorflow_probability',
    'tensorflow',
    'tensorflow_text',
    'tqdm',
]
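
For anyone who wants a reproducible environment in the meantime, one option is to pin these requirements locally. The sketch below is not what the repository does: pin_requirements and KNOWN_GOOD are hypothetical names, and the versions are simply copied from the `pip freeze` output posted earlier in this thread, on the assumption that this set worked on the Colab GPU runtime at the time.

```python
# Versions copied from the `pip freeze` output earlier in this thread.
KNOWN_GOOD = {
    "absl-py": "1.1.0",
    "clu": "0.0.7",
    "einops": "0.4.1",
    "flax": "0.5.2",
    "jax": "0.3.8",
    "ml-collections": "0.1.1",
    "numpy": "1.21.6",
    "packaging": "21.3",
    "pandas": "1.3.5",
    "scipy": "1.4.1",
    "tensorflow": "2.9.1",
    "tensorflow-datasets": "4.6.0",
    "tensorflow-probability": "0.16.0",
    "tensorflow-text": "2.9.0",
    "tqdm": "4.64.0",
}


def pin_requirements(requirements, versions):
    """Append `==version` to each requirement that has a known version."""
    pinned = []
    for req in requirements:
        key = req.replace("_", "-").lower()
        if " @ " in req or key not in versions:
            pinned.append(req)  # direct references and unknown names stay as-is
        else:
            pinned.append(f"{req}=={versions[key]}")
    return pinned


install_requires = pin_requirements(
    [
        "absl-py",
        "flax",
        "jax",
        "tensorflow_datasets",
        "flaxformer @ git+https://github.com/google/flaxformer",
    ],
    KNOWN_GOOD,
)
print(install_requires)
```

Note that jaxlib is not in install_requires at all, so pinning this list alone would not fix the CUDA/cuDNN build of jaxlib that ends up on the runtime.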