SthPhoenix / InsightFace-REST

InsightFace REST API for easy deployment of face recognition services with TensorRT in Docker.

Builds successfully, but the container doesn't start

bltcn opened this issue

commented

Windows 11, WSL2, Ubuntu 18.04
[screenshot: 微信图片_20211103184203]
How can I deal with it?

Hi! I haven't tested the image on Windows.
Have you checked container logs?

commented

Preparing models...
[04:56:39] INFO - Preparing 'glintr100' model...
[04:56:39] INFO - Building TRT engine for glintr100...
[TensorRT] WARNING: onnx2trt_utils.cpp:362: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: GPU error during getBestTactic: Conv_0 : invalid argument
[TensorRT] ERROR: 10: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node Conv_0.)
Traceback (most recent call last):
File "prepare_models.py", line 54, in
prepare_models()
File "prepare_models.py", line 49, in prepare_models
prepare_backend(model_name=model, backend_name=backend_name, im_size=max_size, force_fp16=force_fp16,
File "/app/modules/model_zoo/getter.py", line 157, in prepare_backend
convert_onnx(temp_onnx_model,
File "/app/modules/converters/onnx_to_trt.py", line 84, in convert_onnx
assert not isinstance(engine, type(None))
AssertionError
Starting InsightFace-REST using 1 workers.
[04:56:51] INFO - 1
[04:56:51] INFO - MAX_BATCH_SIZE: 1
[04:56:51] INFO - Reshaping ONNX inputs to: (1, 3, 640, 640)
[04:56:51] INFO - In shape: [dim_value: 1
, dim_value: 3
, dim_param: "?"
, dim_param: "?"
]
[04:56:51] INFO - Building TRT engine for scrfd_10g_gnkps...
[TensorRT] WARNING: onnx2trt_utils.cpp:362: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: GPU error during getBestTactic: Conv_0 + Relu_1 : invalid argument
[TensorRT] ERROR: 10: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node Conv_0 + Relu_1.)
Traceback (most recent call last):
File "/usr/local/bin/uvicorn", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1128, in call
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/uvicorn/main.py", line 425, in main
run(app, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/uvicorn/main.py", line 447, in run
server.run()
File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 68, in run
return asyncio.run(self.serve(sockets=sockets))
File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 76, in serve
config.load()
File "/usr/local/lib/python3.8/dist-packages/uvicorn/config.py", line 448, in load
self.loaded_app = import_from_string(self.app)
File "/usr/local/lib/python3.8/dist-packages/uvicorn/importer.py", line 21, in import_from_string
module = importlib.import_module(module_str)
File "/usr/lib/python3.8/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1014, in _gcd_import
File "", line 991, in _find_and_load
File "", line 975, in _find_and_load_unlocked
File "", line 671, in _load_unlocked
File "", line 848, in exec_module
File "", line 219, in _call_with_frames_removed
File "/app/./app.py", line 36, in
processing = Processing(det_name=configs.models.det_name, rec_name=configs.models.rec_name,
File "/app/./modules/processing.py", line 180, in init
self.model = FaceAnalysis(det_name=det_name, rec_name=rec_name, ga_name=ga_name, device=device,
File "/app/./modules/face_model.py", line 78, in init
self.det_model = Detector(det_name=det_name, device=device, max_size=self.max_size,
File "/app/./modules/face_model.py", line 37, in init
self.retina = get_model(det_name, backend_name=backend_name, force_fp16=force_fp16, im_size=max_size,
File "/app/./modules/model_zoo/getter.py", line 203, in get_model
model_path = prepare_backend(model_name, backend_name, im_size=im_size, max_batch_size=max_batch_size,
File "/app/./modules/model_zoo/getter.py", line 157, in prepare_backend
convert_onnx(temp_onnx_model,
File "/app/./modules/converters/onnx_to_trt.py", line 84, in convert_onnx
assert not isinstance(engine, type(None))
AssertionError

Have you tried running other GPU-based containers on WSL2, like TensorFlow benchmarks, to verify your WSL2 is properly configured for GPU usage?
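For example, NVIDIA's CUDA sample container makes a quick sanity check (image tag taken from NVIDIA's CUDA-on-WSL docs; adjust if it has moved):

```sh
# Quick check that Docker inside WSL2 can actually reach the GPU.
# If this prints the GPU name and an n-body benchmark result, the container runtime side is fine.
docker run --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
```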

commented

Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1

Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1060]
10240 bodies, total time for 10 iterations: 8.868 ms
= 118.245 billion interactions per second
= 2364.896 single-precision GFLOP/s at 20 flops per interaction

Hm, then TensorRT should work as expected.

I can double-check that the latest published version of InsightFace-REST works out of the box, but unfortunately I can't help you with running it on Windows.

I have checked building from scratch with a clean clone of the repo; everything works as intended on Ubuntu 20.04.
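Roughly the steps I used (a sketch only; the script name matches what is referenced later in this thread, adjust paths to your checkout):

```sh
# Clean clone and TensorRT deployment, verified here on Ubuntu 20.04.
git clone https://github.com/SthPhoenix/InsightFace-REST.git
cd InsightFace-REST
# Build the TensorRT image and start the container with default settings.
./deploy_trt.sh
```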

Looks like it's a WSL-related problem.

commented

Thanks, I have tested the CPU version and it works fine. Maybe there is something wrong with the parameters in this case.

Quote from the Nvidia page above:

With the NVIDIA Container Toolkit for Docker 19.03, only --gpus all is supported.

This might be the case, since deploy_trt.sh tries to set a specific GPU. Try replacing line 99 with --gpus all.
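Roughly like this (a sketch only; the image name and the other flags in deploy_trt.sh are placeholders here, keep your existing values):

```sh
# Before (hypothetical): deploy_trt.sh pins a specific device, e.g.
#   docker run -d --gpus "device=0" ... insightface-rest
# After: expose all GPUs, which is all that older nvidia-container-toolkit
# versions support under Docker 19.03 / WSL2.
docker run -d \
    --gpus all \
    --name insightface-rest \
    insightface-rest
```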

Though according to the same document, there might also be issues with pinned memory required by TensorRT, and with concurrent CUDA streams.

If pinned memory is also an issue, you can try adding RUN $PIP_INSTALL onnxruntime-gpu to Dockerfile_trt and switching the inference backend to onnx in deploy_trt.sh at line 105.
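A sketch of both changes (the INFERENCE_BACKEND variable name and the build context are assumptions; check how your copy of deploy_trt.sh actually selects the backend):

```sh
# 1. In Dockerfile_trt, add the ONNX Runtime GPU wheel
#    ($PIP_INSTALL is assumed to be the pip wrapper already defined there):
#      RUN $PIP_INSTALL onnxruntime-gpu
# 2. Rebuild the image from the directory containing Dockerfile_trt:
docker build -t insightface-rest:onnx -f Dockerfile_trt .
# 3. In deploy_trt.sh, pass the ONNX backend to the container instead of TensorRT, e.g.:
#      -e INFERENCE_BACKEND=onnx \
```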

commented

Thanks, I will try.

Hi! Any updates? Have you managed to run it under WSL2?

commented

Sorry, I just saw your reply. I will try.

Looks like WSL2 just wasn't supported by TensorRT, but according to the changelog the latest TensorRT version should support it. Try using the 21.12 TensorRT image.
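For example (tag assumed from the usual NGC naming; the FROM line lives in Dockerfile_trt):

```sh
# Pull the 21.12 TensorRT base image from NGC...
docker pull nvcr.io/nvidia/tensorrt:21.12-py3
# ...then point the FROM line in Dockerfile_trt at it, e.g.:
#   FROM nvcr.io/nvidia/tensorrt:21.12-py3
# and re-run the build / deploy script.
```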

I tried the 21.12 and 22.01 TensorRT images; unfortunately, both failed. 21.12 reports "GPU error during getBestTactic", and 22.01 reports "Cuda failure: integrity checks failed".

Have you tried running other GPU-based containers on WSL2 to ensure everything is installed correctly?