alibaba / BladeDISC

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

[TorchBench] Performance Signal Detected

zzpmiracle opened this issue

TorchBench CI has detected a performance signal.

Affected Tests:

  • eval-cuda-fp32:
    • attention_is_all_you_need_pytorch[dynamo-blade (latency)] 6.6 -> 5.808, +12.0%
    • attention_is_all_you_need_pytorch[dynamo-disc (latency)] 5.587 -> 4.762, +14.7664%
    • DALLE2_pytorch[disc (latency)] status changed, 90.621 -> OSError
    • DALLE2_pytorch[disc (compiled)] status changed, 22800.0 -> N/A
    • DALLE2_pytorch[disc (clusters)] status changed, 184.0 -> N/A
    • detectron2_fasterrcnn_r_101_c4[dynamo-blade (latency)] status changed, AssertionError -> 80.596
    • detectron2_fasterrcnn_r_101_c4[dynamo-disc (latency)] status changed, AssertionError -> 146.281
    • detectron2_fasterrcnn_r_101_c4[dynamo-disc (clusters)] status changed, N/A -> 15.0
    • detectron2_fasterrcnn_r_101_c4[dynamo-disc (compiled)] status changed, N/A -> 1445.0
    • detectron2_fasterrcnn_r_101_dc5[dynamo-blade (latency)] status changed, AssertionError -> 41.958
    • detectron2_fasterrcnn_r_101_dc5[dynamo-disc (latency)] status changed, AssertionError -> 54.014
    • detectron2_fasterrcnn_r_101_dc5[dynamo-disc (clusters)] status changed, N/A -> 14.0
    • detectron2_fasterrcnn_r_101_dc5[dynamo-disc (compiled)] status changed, N/A -> 1453.0
    • detectron2_fasterrcnn_r_101_fpn[dynamo-blade (latency)] status changed, AssertionError -> 30.999
    • detectron2_fasterrcnn_r_50_c4[dynamo-blade (latency)] status changed, AssertionError -> 76.17
    • detectron2_fasterrcnn_r_50_c4[dynamo-disc (latency)] status changed, AssertionError -> 33.887
    • detectron2_fasterrcnn_r_50_c4[dynamo-disc (clusters)] status changed, N/A -> 16.0
    • detectron2_fasterrcnn_r_50_c4[dynamo-disc (compiled)] status changed, N/A -> 958.0
    • detectron2_fasterrcnn_r_50_dc5[dynamo-blade (latency)] status changed, AssertionError -> 38.074
    • detectron2_fasterrcnn_r_50_fpn[dynamo-blade (latency)] status changed, AssertionError -> 25.299
    • detectron2_maskrcnn_r_101_c4[dynamo-blade (latency)] status changed, AssertionError -> 90.31
    • detectron2_maskrcnn_r_50_c4[dynamo-blade (latency)] status changed, AssertionError -> 84.526
    • detectron2_maskrcnn_r_50_fpn[dynamo-blade (latency)] status changed, AssertionError -> 29.93
    • dlrm[disc (latency)] 1.861 -> 2.019, -8.4901%
    • dlrm[blade (latency)] 1.681 -> 1.925, -14.5152%
    • dlrm[dynamo-blade (latency)] 1.784 -> 2.033, -13.9574%
    • dlrm[dynamo-disc (latency)] 1.9 -> 2.131, -12.1579%
    • drq[dynamo-blade (latency)] status changed, 1.4 -> UnserializableException
    • drq[dynamo-disc (latency)] status changed, 1.353 -> UnserializableException
    • drq[dynamo-disc (clusters)] status changed, 1.0 -> N/A
    • drq[dynamo-disc (compiled)] status changed, 84.0 -> N/A
    • fambench_xlmr[dynamo-blade (latency)] 247.478 -> 144.744, +41.5124%
    • fambench_xlmr[dynamo-disc (latency)] status changed, 174.359 -> OSError
    • fambench_xlmr[dynamo-disc (clusters)] status changed, 50.0 -> N/A
    • fambench_xlmr[dynamo-disc (compiled)] status changed, 2868.0 -> N/A
    • functorch_dp_cifar10[disc (latency)] 1.692 -> 1.779, -5.1418%
    • functorch_maml_omniglot[blade (latency)] 0.592 -> 0.71, -19.9324%
    • functorch_maml_omniglot[dynamo-disc (latency)] 0.663 -> 0.546, +17.6471%
    • hf_Bart[dynamo-blade (latency)] 12.191 -> 9.559, +21.5897%
    • hf_Bart[dynamo-disc (latency)] 12.929 -> 10.165, +21.3783%
    • hf_Bart[dynamo-disc (clusters)] 8 -> 1
    • hf_Bart[dynamo-disc (compiled)] 1414 -> 1426
    • hf_Bert[dynamo-blade (latency)] 8.03 -> 7.029, +12.4658%
    • hf_Bert[dynamo-disc (latency)] 8.199 -> 7.304, +10.916%
    • hf_Bert_large[dynamo-blade (latency)] 19.724 -> 18.176, +7.8483%
    • hf_Bert_large[dynamo-disc (latency)] 21.283 -> 19.555, +8.1192%
    • hf_Bert_mini[blade (latency)] 0.653 -> 0.546, +16.3859%
    • hf_Bert_mini[dynamo-blade (latency)] 1.025 -> 0.57, +44.3902%
    • hf_Bert_mini[dynamo-disc (latency)] 1.566 -> 0.87, +44.4444%
    • hf_BigBird[disc (latency)] status changed, 160.796 -> RuntimeError
    • hf_BigBird[dynamo-blade (latency)] status changed, OSError -> 117.253
    • hf_BigBird[dynamo-disc (latency)] status changed, RuntimeError -> 113.384
    • hf_BigBird[disc (compiled)] status changed, 5007.0 -> N/A
    • hf_BigBird[disc (clusters)] status changed, 61.0 -> N/A
    • hf_BigBird[dynamo-disc (clusters)] status changed, N/A -> 135.0
    • hf_BigBird[dynamo-disc (compiled)] status changed, N/A -> 11446.0
    • hf_DistilBert[dynamo-blade (latency)] 4.095 -> 3.823, +6.6422%
    • hf_DistilBert[dynamo-disc (latency)] 4.298 -> 3.966, +7.7245%
    • hf_Longformer[disc (latency)] status changed, 137.037 -> RuntimeError
    • hf_Longformer[dynamo-disc (latency)] status changed, 119.214 -> RuntimeError
    • hf_Longformer[disc (compiled)] status changed, 9201.0 -> N/A
    • hf_Longformer[disc (clusters)] status changed, 169.0 -> N/A
    • hf_Longformer[dynamo-disc (clusters)] status changed, 172.0 -> N/A
    • hf_Longformer[dynamo-disc (compiled)] status changed, 6130.0 -> N/A
    • hf_T5_large[dynamo-blade (latency)] 105.369 -> 98.895, +6.1441%
    • pyhpc_isoneutral_mixing[blade (latency)] 11.621 -> 10.566, +9.0784%
    • pyhpc_isoneutral_mixing[dynamo-blade (latency)] 10.665 -> 9.696, +9.0858%
    • pyhpc_turbulent_kinetic_energy[dynamo-blade (latency)] 10.589 -> 11.627, -9.8026%
    • resnet18[dynamo-blade (latency)] 2.185 -> 2.022, +7.46%
    • timm_efficientdet[blade (latency)] status changed, 1055.4 -> RuntimeError
    • timm_vision_transformer[dynamo-disc (latency)] 7.076 -> 6.702, +5.2855%
    • timm_vovnet[dynamo-blade (latency)] 22.05 -> 20.904, +5.1973%
    • yolov3[dynamo-disc (latency)] status changed, 44.194 -> RuntimeError
    • yolov3[dynamo-disc (clusters)] status changed, 18.0 -> N/A
    • yolov3[dynamo-disc (compiled)] status changed, 141.0 -> N/A
  • eval-cuda-fp16:
    • alexnet[dynamo-blade (latency)] 3.964 -> 3.759, +5.1715%
    • attention_is_all_you_need_pytorch[dynamo-blade (latency)] 5.219 -> 4.411, +15.4819%
    • attention_is_all_you_need_pytorch[dynamo-disc (latency)] 3.67 -> 2.688, +26.7575%
    • detectron2_fasterrcnn_r_101_c4[dynamo-blade (latency)] status changed, AssertionError -> 49.353
    • detectron2_fasterrcnn_r_101_c4[dynamo-disc (latency)] status changed, AssertionError -> 58.206
    • detectron2_fasterrcnn_r_101_c4[dynamo-disc (clusters)] status changed, N/A -> 15.0
    • detectron2_fasterrcnn_r_101_c4[dynamo-disc (compiled)] status changed, N/A -> 1445.0
    • detectron2_fasterrcnn_r_101_dc5[dynamo-blade (latency)] status changed, AssertionError -> 26.425
    • detectron2_fasterrcnn_r_101_dc5[dynamo-disc (latency)] status changed, AssertionError -> 31.067
    • detectron2_fasterrcnn_r_101_dc5[dynamo-disc (clusters)] status changed, N/A -> 14.0
    • detectron2_fasterrcnn_r_101_dc5[dynamo-disc (compiled)] status changed, N/A -> 1453.0
    • detectron2_fasterrcnn_r_101_fpn[dynamo-blade (latency)] status changed, AssertionError -> 19.83
    • detectron2_fasterrcnn_r_50_c4[dynamo-blade (latency)] status changed, AssertionError -> 47.455
    • detectron2_fasterrcnn_r_50_c4[dynamo-disc (latency)] status changed, AssertionError -> 18.222
    • detectron2_fasterrcnn_r_50_c4[dynamo-disc (clusters)] status changed, N/A -> 16.0
    • detectron2_fasterrcnn_r_50_c4[dynamo-disc (compiled)] status changed, N/A -> 958.0
    • detectron2_fasterrcnn_r_50_dc5[dynamo-blade (latency)] status changed, AssertionError -> 24.353
    • detectron2_fasterrcnn_r_50_fpn[dynamo-blade (latency)] status changed, AssertionError -> 16.632
    • detectron2_maskrcnn_r_101_c4[dynamo-blade (latency)] status changed, AssertionError -> 53.855
    • detectron2_maskrcnn_r_50_c4[dynamo-blade (latency)] status changed, AssertionError -> 49.114
    • detectron2_maskrcnn_r_50_fpn[dynamo-blade (latency)] status changed, AssertionError -> 18.985
    • dlrm[disc (latency)] 1.132 -> 1.59, -40.4594%
    • dlrm[blade (latency)] 1.144 -> 1.612, -40.9091%
    • dlrm[dynamo-blade (latency)] 1.22 -> 1.672, -37.0492%
    • dlrm[dynamo-disc (latency)] 1.225 -> 1.754, -43.1837%
    • fambench_xlmr[dynamo-disc (latency)] status changed, 90.117 -> OSError
    • fambench_xlmr[dynamo-disc (clusters)] status changed, 50.0 -> N/A
    • fambench_xlmr[dynamo-disc (compiled)] status changed, 2868.0 -> N/A
    • hf_Bart[dynamo-disc (clusters)] 13 -> 7
    • hf_Bart[dynamo-disc (compiled)] 1397 -> 1403
    • hf_Bert[dynamo-blade (latency)] 4.5 -> 3.2, +28.8889%
    • hf_Bert[dynamo-disc (latency)] 4.563 -> 3.344, +26.7149%
    • hf_Bert_large[dynamo-blade (latency)] 10.575 -> 8.432, +20.2648%
    • hf_Bert_large[dynamo-disc (latency)] 10.625 -> 9.043, +14.8894%
    • hf_Bert_mini[dynamo-blade (latency)] 0.961 -> 0.509, +47.0343%
    • hf_Bert_mini[dynamo-disc (latency)] 1.374 -> 0.783, +43.0131%
    • hf_BigBird[dynamo-disc (latency)] status changed, RuntimeError -> 67.762
    • hf_BigBird[dynamo-disc (clusters)] status changed, N/A -> 135.0
    • hf_BigBird[dynamo-disc (compiled)] status changed, N/A -> 11446.0
    • hf_DistilBert[dynamo-blade (latency)] 3.234 -> 2.922, +9.6475%
    • hf_DistilBert[dynamo-disc (latency)] 2.936 -> 2.577, +12.2275%
    • hf_GPT2[dynamo-blade (latency)] 14.768 -> 13.811, +6.4802%
    • hf_GPT2[dynamo-disc (latency)] 11.466 -> 10.781, +5.9742%
    • hf_GPT2_large[dynamo-disc (latency)] 54.855 -> 52.108, +5.0077%
    • hf_Longformer[disc (latency)] status changed, 85.172 -> RuntimeError
    • hf_Longformer[dynamo-blade (latency)] 88.71 -> 107.706, -21.4136%
    • hf_Longformer[dynamo-disc (latency)] status changed, 67.529 -> RuntimeError
    • hf_Longformer[disc (compiled)] status changed, 9201.0 -> N/A
    • hf_Longformer[disc (clusters)] status changed, 169.0 -> N/A
    • hf_Longformer[dynamo-disc (clusters)] status changed, 172.0 -> N/A
    • hf_Longformer[dynamo-disc (compiled)] status changed, 6130.0 -> N/A
    • hf_T5[dynamo-disc (latency)] status changed, 32.34 -> OSError
    • hf_T5[dynamo-disc (clusters)] status changed, 35.0 -> N/A
    • hf_T5[dynamo-disc (compiled)] status changed, 1775.0 -> N/A
    • hf_T5_base[dynamo-disc (latency)] status changed, 92.944 -> OSError
    • hf_T5_base[dynamo-disc (clusters)] status changed, 65.0 -> N/A
    • hf_T5_base[dynamo-disc (compiled)] status changed, 3431.0 -> N/A
    • hf_T5_large[dynamo-blade (latency)] 67.117 -> 57.946, +13.6642%
    • hf_T5_large[dynamo-disc (latency)] status changed, 61.988 -> OSError
    • hf_T5_large[dynamo-disc (clusters)] status changed, 125.0 -> N/A
    • hf_T5_large[dynamo-disc (compiled)] status changed, 6743.0 -> N/A
    • maml_omniglot[dynamo-blade (latency)] 0.376 -> 0.486, -29.2553%
    • maml_omniglot[dynamo-disc (latency)] 0.645 -> 0.513, +20.4651%
    • mnasnet1_0[dynamo-blade (latency)] 2.78 -> 2.629, +5.4317%
    • mnasnet1_0[dynamo-disc (latency)] 4.175 -> 3.879, +7.0898%
    • mobilenet_v3_large[dynamo-blade (latency)] 3.87 -> 3.581, +7.4677%
    • phlippe_densenet[dynamo-blade (latency)] 4.085 -> 3.742, +8.3966%
    • phlippe_densenet[dynamo-disc (latency)] 5.089 -> 4.749, +6.6811%
    • pyhpc_equation_of_state[blade (latency)] 2.0 -> 2.102, -5.1%
    • pyhpc_equation_of_state[dynamo-blade (latency)] 2.134 -> 2.023, +5.2015%
    • pyhpc_isoneutral_mixing[blade (latency)] 7.07 -> 7.581, -7.2277%
    • pyhpc_isoneutral_mixing[dynamo-blade (latency)] 6.201 -> 6.667, -7.5149%
    • pyhpc_turbulent_kinetic_energy[blade (latency)] 5.746 -> 6.198, -7.8663%
    • pyhpc_turbulent_kinetic_energy[dynamo-blade (latency)] 5.872 -> 6.174, -5.1431%
    • pytorch_stargan[blade (latency)] 9.462 -> 25.75, -172.1412%
    • resnet18[blade (latency)] 0.887 -> 0.994, -12.0631%
    • resnet18[dynamo-blade (latency)] 0.993 -> 0.925, +6.8479%
    • resnet18[dynamo-disc (latency)] 1.459 -> 1.385, +5.072%
    • squeezenet1_1[dynamo-blade (latency)] 0.977 -> 0.91, +6.8577%
    • timm_efficientnet[dynamo-blade (latency)] 11.535 -> 10.837, +6.0511%
    • timm_vision_transformer[dynamo-disc (latency)] 3.77 -> 3.493, +7.3475%
    • timm_vovnet[dynamo-blade (latency)] 10.484 -> 9.427, +10.082%
    • timm_vovnet[dynamo-disc (latency)] 16.368 -> 15.232, +6.9404%
    • yolov3[dynamo-disc (latency)] status changed, 27.479 -> RuntimeError
    • yolov3[dynamo-disc (clusters)] status changed, 18.0 -> N/A
    • yolov3[dynamo-disc (compiled)] status changed, 141.0 -> N/A
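
The signed percentages in the list above appear to be the relative change in latency with respect to the earlier measurement, with a positive sign meaning latency went down (an improvement) and a negative sign meaning it went up (a regression). The sketch below reproduces that arithmetic for three entries from the list. The function name `latency_change_pct` is hypothetical, and the formula is inferred from the reported numbers rather than taken from the TorchBench CI source.

```python
def latency_change_pct(before: float, after: float) -> float:
    """Signed relative latency change in percent.

    Inferred convention (an assumption, not confirmed by the CI code):
    positive = latency decreased, negative = latency increased.
    """
    return (before - after) / before * 100.0


# (test name, before, after, percentage as listed in the report)
samples = [
    ("attention_is_all_you_need_pytorch[dynamo-blade] fp32", 6.6, 5.808, 12.0),
    ("dlrm[disc] fp32", 1.861, 2.019, -8.4901),
    ("pytorch_stargan[blade] fp16", 9.462, 25.75, -172.1412),
]

for name, before, after, listed in samples:
    computed = round(latency_change_pct(before, after), 4)
    print(f"{name}: computed {computed:+}%, listed {listed:+}%")
```

Under this reading, the `pytorch_stargan[blade (latency)]` fp16 entry (9.462 -> 25.75) is the largest regression in the report, and the `dlrm` entries regress on every backend in both precisions.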

Detailed data is available at oss://bladedisc-ci/TorchBench/gpu/full/20230623-15
Created automatically by TorchBench CI.

Duplicate of #1180.