Question: ... ?[h264_nvenc @ 0x55d020323ac0] dl_fn->cuda_dl->cuCtxCreate(&ctx->cu_context_internal, 0, cu_device) failed -> CUDA_ERROR_OUT_OF_MEMORY: out of memory
zlucode opened this issue · comments
Hello,
My question is: ...
I have a problem that reproduces 100% of the time. Could anyone help me troubleshoot it?
[problem desc]
I wrote a very simple sample program to test the robustness of the "initialize / encode frame / destroy" procedure with FFmpeg's h264_nvenc encoder.
My test code repeats "initialize / encode frame / destroy" in an infinite loop.
The result is:
(1) Under Windows it works fine: it runs 65000+ times with no errors reported;
(2) Under Ubuntu it can only loop 65528 times. On loop number 65529 it reports the error below.
[h264_nvenc @ 0x55d020323ac0] dl_fn->cuda_dl->cuCtxCreate(&ctx->cu_context_internal, 0, cu_device) failed -> CUDA_ERROR_OUT_OF_MEMORY: out of memory
[h264_nvenc @ 0x55d020323ac0] No capable devices found
When the issue appears, I check memory with nvidia-smi, but the memory usage is normal; there is no sign of a memory leak or anything like that.
If I kill the program and re-run it, it works fine again, but it still reports the error when it reaches loop 65529.
Every time I run the program, it reports the error on loop 65529.
The main program is pasted below. It is very simple, and I cannot find a problem with it.
------------ code snippet BEGIN-----------
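// Build on Linux (assumed command): g++ test.cpp -lavcodec -lavutil
extern "C" {
#include <libavcodec/avcodec.h>
}
#include <cstdio>

// function: init & encode frames & destroy; returns false on failure
bool test1(int nFrames_total)
{
    // Initialize h264_nvenc codec and context
    const AVCodec* codec = avcodec_find_encoder_by_name("h264_nvenc");
    if (!codec) return false;
    AVCodecContext* codec_context = avcodec_alloc_context3(codec);
    // Placeholder parameters; the original post did not show the real values
    codec_context->width = 640;
    codec_context->height = 480;
    codec_context->pix_fmt = AV_PIX_FMT_YUV420P;
    codec_context->time_base = AVRational{1, 25};
    AVFrame* frame = av_frame_alloc();
    AVPacket* avpkt = av_packet_alloc();
    bool ok = avcodec_open2(codec_context, codec, nullptr) >= 0;
    if (ok) {
        frame->format = codec_context->pix_fmt;
        frame->width = codec_context->width;
        frame->height = codec_context->height;
        ok = av_frame_get_buffer(frame, 0) >= 0;
    }
    // encode 1 frame, or n frames, whatever; the final result is the same
    for (int i = 0; ok && i < nFrames_total; ++i) {
        frame->pts = i;
        avcodec_send_frame(codec_context, frame);
        if (avcodec_receive_packet(codec_context, avpkt) >= 0)
            av_packet_unref(avpkt);
    }
    // destroy objects and release resources
    // (avcodec_free_context also frees the context itself, so no extra av_free)
    avcodec_free_context(&codec_context);
    av_frame_free(&frame);
    av_packet_free(&avpkt);
    return ok;
}

int main()
{
    for (int counter = 1; ; ++counter) {
        /* when counter >= 65529, error will be reported */
        if (!test1(1)) {
            std::printf("failed at loop %d\n", counter);
            return 1;
        }
    }
}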
------------ code snippet END-----------
The FFmpeg version, Ubuntu version, and GPU information are:
(ffmpeg version)
||/ Name                  Version                  Architecture Description
+++-=====================-========================-============-===========================================================
ii  libavcodec-dev:amd64  7:4.4.2-0ubuntu0.22.04.1 amd64        FFmpeg library with de/encoders for audio/video codecs - development files
ii  libavfilter-dev:amd64 7:4.4.2-0ubuntu0.22.04.1 amd64        FFmpeg library containing media filters - development files
..
(os version)
$ cat /etc/issue
Ubuntu 22.04.3 LTS \n \l
(GPU info)
$ nvidia-smi
Tue Oct 17 07:37:18 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro P6000                   Off | 00000000:01:00.0 Off |                  Off |
| 16%   46C    P0              61W / 250W |      0MiB / 24576MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
This is not related to the patch. Please ask on ffmpeg mailing lists.
65529 is suspiciously close to 2^16 (65536) btw - this could be caused by an integer overflow.