osmr / imgclsmob

Sandbox for training deep learning networks

AttributeError: 'Namespace' object has no attribute 'disable_cudnn_autotune'

tucan9389 opened this issue

I'm having trouble running the train_tf2.py script. Do you have an example command for running train_tf2.py with the CocoSeg dataset?

Execution Env

  • Google Colab Pro
  • tensorflow==2.1.0
  • tensorflow-gpu==2.1.0
  • python==3.6.9

Execution Script

python train_tf2.py --model simplepose_mobile_mobilenetv3_small_w1_coco --dataset CocoSeg --batch-size 1

Error Message

(Screenshot of the error; the same text appears at the end of the Full Message section.)

Full Message

2020-03-01 02:41:24.974479: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2020-03-01 02:41:24.974607: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2020-03-01 02:41:24.974629: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
INFO:root:Script command line:
train_tf2.py --model simplepose_mobile_mobilenetv3_small_w1_coco --dataset CocoSeg --batch-size 1
INFO:root:Script arguments:
Namespace(attempt=1, batch_size=1, data_dir='../imgclsmob_data/coco', dataset='CocoSeg', image_base_size=520, image_crop_size=480, in_channels=3, log_interval=50, log_packages='tensorflow-gpu', log_pip_packages='tensorflow-gpu', logging_file_name='train.log', lr=0.1, lr_decay=0.1, lr_decay_epoch='40,60', lr_decay_period=0, lr_mode='cosine', model='simplepose_mobile_mobilenetv3_small_w1_coco', momentum=0.9, num_classes=21, num_epochs=120, num_gpus=0, num_workers=4, optimizer_name='nag', resume='', resume_state='', save_dir='', save_interval=4, seed=2949, start_epoch=1, target_lr=1e-08, use_pretrained=False, wd=0.0001, work_dir='../imgclsmob_data')
fatal: not a git repository (or any of the parent directories): .git
INFO:root:Env_stats:
{
    "tensorflow-gpu": "Name: tensorflow-gpu\nVersion: 2.1.0\nSummary: TensorFlow is an open source machine learning framework for everyone.\nHome-page: https://www.tensorflow.org/\nAuthor: Google Inc.\nAuthor-email: packages@tensorflow.org\nLicense: Apache 2.0\nLocation: /usr/local/lib/python3.6/dist-packages\nRequires: wheel, tensorboard, google-pasta, gast, keras-preprocessing, scipy, termcolor, numpy, astor, absl-py, six, protobuf, grpcio, wrapt, keras-applications, opt-einsum, tensorflow-estimator\nRequired-by:",
    "python": "3.6.9",
    "pwd": "/content/common",
    "git": "unknown",
    "platform": "Linux-4.14.137+-x86_64-with-Ubuntu-18.04-bionic"
}
2020-03-01 02:41:29.288155: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
Traceback (most recent call last):
  File "train_tf2.py", line 291, in <module>
    main()
  File "train_tf2.py", line 253, in main
    assert (ds_metainfo.ml_type != "imgseg") or args.disable_cudnn_autotune
AttributeError: 'Namespace' object has no attribute 'disable_cudnn_autotune'

What I tried

I added a disable_cudnn_autotune argument to the execution command. None of these variants worked 😓

python train_tf2.py --model simplepose_mobile_mobilenetv3_small_w1_coco --dataset CocoSeg --batch-size 1 --disable_cudnn_autotune true
python train_tf2.py --model simplepose_mobile_mobilenetv3_small_w1_coco --dataset CocoSeg --batch-size 1 --disable_cudnn_autotune
python train_tf2.py --model simplepose_mobile_mobilenetv3_small_w1_coco --dataset CocoSeg --batch-size 1 --disable-cudnn-autotune
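For context, this kind of AttributeError is what argparse gives when code reads an attribute that the parser never defined, and an option the parser does not define cannot be supplied from the command line either. Below is a minimal sketch of that behavior. It uses a hypothetical parser (`build_parser` and its `with_flag` switch are made up for illustration) and is not the actual argument list of train_tf2.py.

```python
import argparse


def build_parser(with_flag):
    # Hypothetical parser for illustration only; not the real train_tf2.py parser.
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", type=str, default="CocoSeg")
    if with_flag:
        # Dashes in the option name become underscores in the attribute name.
        parser.add_argument("--disable-cudnn-autotune", action="store_true")
    return parser


# Without the option defined, reading the attribute raises AttributeError.
args = build_parser(with_flag=False).parse_args(["--dataset", "CocoSeg"])
try:
    args.disable_cudnn_autotune
    missing = False
except AttributeError:
    missing = True

# An option the parser does not know is not turned into an attribute.
# parse_known_args returns it in a separate list of leftovers instead.
known, unknown = build_parser(with_flag=False).parse_known_args(
    ["--disable-cudnn-autotune"]
)

# Once the parser defines the option, the attribute exists and is True.
args_ok = build_parser(with_flag=True).parse_args(["--disable-cudnn-autotune"])
```

If the same applies here, the flag would have to be defined by the script itself (or the check skipped for this dataset type) before any command-line variant could take effect.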

Full Source

Here is the full notebook I am working in.

CocoSeg is a data loader for the image segmentation task. simplepose_mobile_mobilenetv3_small_w1_coco is a human pose estimation network.

I’ll try the “CocoHpe” dataset instead. Thanks.