gustavz / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV


High inference time using r1.0 and master

harsh-agar opened this issue

Hi @gustavz
The model ran successfully on the Jetson TX2, but inference was quite slow. I tried both the r1.0 branch and the master branch; the inference times were:
For master:
18.15, 2.39, 2.62, 2.53 seconds
while for r1.0:
22.34, 0.27, 0.17, 0.13 seconds
for 4 images respectively.
Visualization was switched off.
Is there anything I'm missing that makes it this slow?

Thanks
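A note on those numbers: the large first value typically reflects one-time costs (graph deserialization, CUDA/cuDNN initialization, memory allocation) rather than steady-state inference, so it is usually excluded when measuring throughput. A minimal timing sketch, assuming a standard TF 1.x frozen graph from the TF Object Detection API (the model path here is illustrative):

```python
import time
import numpy as np
import tensorflow as tf

# Load a frozen TF 1.x inference graph (path is illustrative).
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('models/ssd_mobilenet_v11_coco/frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    # Standard output tensor names for TF Object Detection API graphs.
    fetches = [graph.get_tensor_by_name(n + ':0') for n in
               ('detection_boxes', 'detection_scores',
                'detection_classes', 'num_detections')]
    feed = {graph.get_tensor_by_name('image_tensor:0'):
            np.zeros((1, 300, 300, 3), dtype=np.uint8)}

    sess.run(fetches, feed_dict=feed)  # warm-up run: one-time init happens here

    runs = 20
    start = time.time()
    for _ in range(runs):
        sess.run(fetches, feed_dict=feed)
    print('steady state: %.3f s/frame' % ((time.time() - start) / runs))
```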

@harsh-agar

  1. Are you using the current master?
  2. What does your config look like?
  3. Did you change the code?
  4. Which Python / OpenCV / JetPack versions are you using?

@gustavz

  1. I tried both master and r1.0; the results obtained are shown above.

  2. This is my config.yml for master:


### Inference Config ###

VIDEO_INPUT: 0 # Input Must be OpenCV readable
VISUALIZE: True # Disable for performance increase
VIS_FPS: True # Draw current FPS in the top left Image corner
CPU_ONLY: False # CPU Placement for speed test
USE_OPTIMIZED: False # whether to use the optimized model (only possible if transformed with the script)
DISCO_MODE: False # Secret Disco Visualization Mode

### Testing ###

IMAGE_PATH: 'test_images' # path for test_*.py test_images
LIMIT_IMAGES: None # if set to None, all images are used
WRITE_TIMELINE: True # write json timeline file (slows inference)
SAVE_RESULT: False # save detection results to disk
RESULT_PATH: 'test_results' # path to save detection results
SEQ_MODELS: [] # List of Models to sequentially test (Default all Models)

### Object_Detection ###

WIDTH: 600 # OpenCV only supports 4:3 formats; others will be converted
HEIGHT: 600 # 600x600 leads to 640x480
MAX_FRAMES: 5000 # only used if visualize==False
FPS_INTERVAL: 5 # Interval [s] to print fps of the last interval in console
PRINT_INTERVAL: 500 # interval [frames] to print detections to console
PRINT_TH: 0.5 # detection threshold for printed detections

### Speed Hack ###

SPLIT_MODEL: True # Splits model into a GPU and a CPU session (currently only works for ssd_mobilenets; see the sketch after this config)
SSD_SHAPE: 300 # used for the split model algorithm (currently only supports ssd networks trained on 300x300 and 600x600 input)

### Tracking ###

USE_TRACKER: False # Use a Tracker (currently only works properly WITHOUT split_model)
TRACKER_FRAMES: 20 # Number of tracked frames between detections
NUM_TRACKERS: 5 # Max number of objects to track

### Model ###

OD_MODEL_NAME: 'ssd_mobilenet_v11_coco'
OD_MODEL_PATH: 'models/ssd_mobilenet_v11_coco/{}'
LABEL_PATH: 'rod/data/tf_coco_label_map.pbtxt'
NUM_CLASSES: 90

### DeepLab ###

ALPHA: 0.3 # mask overlay factor (also for mask_rcnn)
BBOX: True # compute bounding boxes in postprocessing
MINAREA: 500 # min Pixel Area to apply bounding boxes (avoid noise)

### Model ###

DL_MODEL_NAME: 'deeplabv3_mnv2_pascal_train_aug_2018_01_29'
DL_MODEL_PATH: 'models/deeplabv3_mnv2_pascal_train_aug/{}'
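For context on the SPLIT_MODEL option above: it runs the detector as two sessions, with the convolutional half placed on the GPU and the largely sequential postprocessing/NMS half on the CPU, so the two do not block each other. A rough sketch of the idea, assuming the frozen graph has already been split with placeholder inputs at the cut point (all tensor names below are hypothetical and depend on the actual graph):

```python
import tensorflow as tf

# Hypothetical names for the cut point between the two graph halves.
GPU_OUTPUTS = ['Postprocessor/convert_scores:0', 'Postprocessor/ExpandDims_1:0']
CPU_INPUTS = ['Postprocessor/convert_scores_1:0', 'Postprocessor/ExpandDims_1_1:0']
DETECTIONS = ['detection_boxes:0', 'detection_scores:0',
              'detection_classes:0', 'num_detections:0']

def detect_split(sess, graph, image):
    # 1) GPU half: feature extraction and box/score prediction.
    scores, boxes_enc = sess.run(
        [graph.get_tensor_by_name(n) for n in GPU_OUTPUTS],
        feed_dict={graph.get_tensor_by_name('image_tensor:0'): image})
    # 2) CPU half: box decoding and non-max suppression, fed from the GPU results.
    return sess.run(
        [graph.get_tensor_by_name(n) for n in DETECTIONS],
        feed_dict={graph.get_tensor_by_name(CPU_INPUTS[0]): scores,
                   graph.get_tensor_by_name(CPU_INPUTS[1]): boxes_enc})
```

In the app itself the two halves can then be overlapped across frames (the GPU starts on the next frame while the CPU finishes the current one), which is where the speed-up comes from.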

  3. I did not change the code.

  4. Python 2.7 | OpenCV 3.3.1 | JetPack 3.1

Thanks again

I ran the script test_objectdetection.py, and what I observed is that the GPU is used while loading the model, but during detection GPU usage is 0%.
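One way to verify the placement is to create the session with device-placement logging enabled (TF 1.x, as shipped with JetPack); the console will then show which device each op was mapped to. A minimal sketch:

```python
import tensorflow as tf

# Log where each op is placed when the session is created; GPU-eligible ops
# should appear on /device:GPU:0 in the console output.
config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True  # avoid pre-allocating all TX2 memory
sess = tf.Session(config=config)
```

On the TX2 itself, `sudo ~/tegrastats` reports live GR3D (GPU) utilization, which is more reliable than a single usage sample.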