tensorflow / profiler

A profiling and performance analysis tool for TensorFlow

Does this profiler API support AMD GPUs (with rocm>=3.0.0)?

alphaRGB opened this issue

My question

I want to profile my CNN model on an AMD GPU (Vega 20). I tried tensorflow-rocm==2.2.0 with rocm==3.3.0, but it failed. Does the tensorflow/profiler API support AMD GPUs?

environment

  • Ubuntu 18.04
  • tensorflow-rocm==2.2.0
  • rocm==3.3.0

Test code

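    import tensorflow as tf

    # MyModel is the author's own CNN; its definition is not shown in this issue.
    model = MyModel()
    input_shape = [1, 64, 64, 3]
    # Call the model once before profiling so its variables are already built.
    model(tf.ones(input_shape))

    with tf.device('/GPU:0'):
        with tf.profiler.experimental.Profile(logdir='temp'):
            outs = model(tf.ones(input_shape))

    print('Profile v2 done!')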
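
To check what the profiled block actually wrote, the output directory can be listed once the context manager exits. The following is only a minimal sketch: the helper name `list_profile_artifacts` is hypothetical, and the `<logdir>/plugins/profile/<run>/` layout is assumed from the paths printed in the log in this issue, so other TensorFlow versions may lay files out differently.

```python
from pathlib import Path


def list_profile_artifacts(logdir):
    """Map each profile run directory under `logdir` to its sorted file names.

    Assumes the layout <logdir>/plugins/profile/<run>/<files>, as seen in the
    log output in this issue; this is not a documented guarantee.
    """
    root = Path(logdir) / "plugins" / "profile"
    if not root.is_dir():
        # Nothing was written, or the layout differs from the assumption.
        return {}
    return {
        run.name: sorted(f.name for f in run.iterdir() if f.is_file())
        for run in sorted(root.iterdir())
        if run.is_dir()
    }


if __name__ == "__main__":
    # 'temp' matches the logdir used in the test code above.
    for run, files in list_profile_artifacts("temp").items():
        print(run)
        for name in files:
            print("   ", name)
```

An empty result would suggest that no profile was saved at all, which is a different situation from a profile that was saved with GPU events missing.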

Failure message:

2020-07-06 13:19:19.304584: I tensorflow/core/graph/gpu_fusion_pass.cc:505] ROCm Fusion is enabled.
2020-07-06 13:19:19.306426: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
2020-07-06 13:19:19.307372: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
MIOpen(HIP): Warning [SQLiteBase] Unable to read system database file:/opt/rocm-3.3.0/miopen/share/miopen/db/miopen.db Performance may degrade
MIOpen(HIP): Warning [FindDataDirectSolutions] /root/driver/MLOpen/src/include/miopen/sqlite_db.hpp:209: Internal error while accessing SQLite database: unable to open database file
MIOpen(HIP): Warning [FindDataDirectSolutions] /root/driver/MLOpen/src/include/miopen/sqlite_db.hpp:209: Internal error while accessing SQLite database: unable to open database file
MIOpen(HIP): Warning [FindDataDirectSolutions] /root/driver/MLOpen/src/include/miopen/sqlite_db.hpp:209: Internal error while accessing SQLite database: unable to open database file
2020-07-06 13:19:27.832672: I tensorflow/core/profiler/lib/profiler_session.cc:154] Profiler session started.
2020-07-06 13:19:27.832742: I tensorflow/core/profiler/internal/gpu/rocm_tracer.cc:743] Profiler found 1 GPUs
2020-07-06 13:19:27.832763: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:583] GpuTracer created.
2020-07-06 13:19:27.836269: I tensorflow/core/profiler/internal/gpu/rocm_tracer.cc:757] GpuTracer started
2020-07-06 13:19:27.839595: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : Activity event encountered before a corresponding API event.
2020-07-06 13:19:27.840591: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : Activity event encountered before a corresponding API event.
2020-07-06 13:19:27.840633: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840683: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840701: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840716: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840732: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840746: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:199] Mapping physical device id 0 to logical device id 0
2020-07-06 13:19:27.840760: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840775: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840790: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840816: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840840: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840855: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840869: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840883: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840899: I tensorflow/core/profiler/internal/gpu/device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
2020-07-06 13:19:27.840927: I tensorflow/core/profiler/internal/gpu/rocm_tracer.cc:768] GpuTracer stopped
2020-07-06 13:19:27.842305: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: temp/plugins/profile/2020_07_06_13_19_27
2020-07-06 13:19:27.842913: I tensorflow/core/profiler/rpc/client/save_profile.cc:174] Dumped gzipped tool data for trace.json.gz to temp/plugins/profile/2020_07_06_13_19_27/ubuntu.trace.json.gz
2020-07-06 13:19:27.842992: E tensorflow/core/profiler/utils/hardware_type_utils.cc:60] Invalid GPU compute capability.
2020-07-06 13:19:27.843468: I tensorflow/core/profiler/utils/event_span.cc:288] Generation of step-events took 0 ms

2020-07-06 13:19:27.844651: I tensorflow/python/profiler/internal/profiler_wrapper.cc:91] Creating directory: temp/plugins/profile/2020_07_06_13_19_27
Dumped tool data for overview_page.pb to temp/plugins/profile/2020_07_06_13_19_27/ubuntu.overview_page.pb
Dumped tool data for input_pipeline.pb to temp/plugins/profile/2020_07_06_13_19_27/ubuntu.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to temp/plugins/profile/2020_07_06_13_19_27/ubuntu.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to temp/plugins/profile/2020_07_06_13_19_27/ubuntu.kernel_stats.pb
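
Most of the log above consists of repeated "RocmTracerEvent(s) dropped" lines, so it can help to tally them by reason before reading further. This is a rough sketch, not part of the profiler: the function name `tally_dropped_events` is hypothetical, and the regular expression is written only against the message format shown in this log.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... device_tracer_rocm.cc:177] RocmTracerEvent(s) dropped (1) : invalid stream id.
# Group 1 is the dropped-event count, group 2 is the reason without the final period.
DROPPED = re.compile(r"RocmTracerEvent\(s\) dropped \((\d+)\) : (.+?)\.?\s*$")


def tally_dropped_events(log_text):
    """Return a Counter mapping each drop reason to the total events dropped."""
    totals = Counter()
    for line in log_text.splitlines():
        match = DROPPED.search(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


# Example usage, assuming the log was saved to a file named profile.log
# (a hypothetical file name):
#
#     with open("profile.log") as fh:
#         for reason, count in tally_dropped_events(fh.read()).most_common():
#             print(count, reason)
```

Grouping by reason makes it easier to see whether the tracer is dropping events for one cause or several.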

commented

@ckluk Thank you. I understand now that the profiler in official TensorFlow does not support AMD GPUs. I'll try it with tensorflow-rocm, the fork maintained by AMD ROCm.