octoml / triton-modelanalyzer-experiments


This repo contains artifacts and configurations for running experiments with the Triton Model Analyzer.

Instructions

  1. Clone the Model Analyzer repo.
    git clone https://github.com/triton-inference-server/model_analyzer.git -b r22.10
  2. Build the Model Analyzer container following the build instructions in the model_analyzer repo.
    cd ./model_analyzer
    docker build --pull -t model-analyzer .
  3. Clone this repo.
    git clone TBD
    cd triton-modelanalyzer-experiments
  4. Launch the Model Analyzer container with the following mounts. Add --gpus=all when GPUs are available.
    docker run -it --rm \
        -v $(pwd)/config:/config \
        -v $(pwd)/models:/models \
        -v $(pwd)/export:/export \
        -v $(pwd)/output:/output \
        --net=host model-analyzer    
  5. Run experiments with the following command, modifying the config as necessary (a sketch of a starting config follows this list).
    model-analyzer --verbose profile --config-file /config/modelanalyzer.yaml
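
The mounted /models directory is a standard Triton model repository: one subdirectory per model containing a config.pbtxt and numbered version directories, for example:

    models/
    └── yolov5s/
        ├── config.pbtxt
        └── 1/
            └── model.onnx

The /config/modelanalyzer.yaml referenced above is a Model Analyzer profile config. Below is a minimal sketch as a starting point; the model name yolov5s comes from the Notes section, and all other values are generic assumptions, not this repo's actual config:

    # Hypothetical starting config; adjust paths and models to your setup.
    model_repository: /models
    export_path: /export
    output_model_repository_path: /output/output_model_repository
    override_output_model_repository: true
    profile_models:
      - yolov5s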

Notes

To Supply Custom Input Data to Perf Analyzer
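
Perf Analyzer accepts a JSON file of custom input data via its --input-data flag, which Model Analyzer can forward per model through perf_analyzer_flags. A minimal sketch, assuming an input tensor named "images" and a data file mounted at /config/input_data.json (both illustrative, not from this repo):

    yolov5s:
      perf_analyzer_flags:
        input-data: /config/input_data.json

A matching input_data.json in Perf Analyzer's documented format would look like:

    {
      "data": [
        {
          "images": {
            "content": [0.0, 0.1, 0.2, 0.3],
            "shape": [2, 2]
          }
        }
      ]
    }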

To Configure ONNX Runtime Backend Parameters

There are two options:

  1. Set the following in the model's config.pbtxt:
    parameters { key: "intra_op_thread_count" value: { string_value: "16" } }
    parameters { key: "inter_op_thread_count" value: { string_value: "1" } }
  2. Set the following as flags in the Model Analyzer YAML (per model, under profile_models):
    yolov5s:
      triton_server_flags:
        log_verbose: True
        backend-config:
          - onnxruntime,inter_op_thread_count=1
          - onnxruntime,intra_op_thread_count=1
          - onnxruntime,enable-global-threadpool=1
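
For reference, Model Analyzer passes triton_server_flags through to the tritonserver instance it launches, so the YAML above corresponds to an invocation along these lines (the /models path is the mount from step 4):

    tritonserver --model-repository=/models \
        --log-verbose=1 \
        --backend-config=onnxruntime,inter_op_thread_count=1 \
        --backend-config=onnxruntime,intra_op_thread_count=1 \
        --backend-config=onnxruntime,enable-global-threadpool=1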

About

License: Apache License 2.0