cmuratori / computer_enhance

Source code for the https://computerenhance.com programming series

Undefined behavior in circular buffer

VasyaPRO opened this issue

The circular buffer implementation that uses page mapping, mentioned in the recent video (which is a great video, btw), behaves inconsistently at different optimization levels, which is likely caused by undefined behavior. The probable cause is the aliasing rules that modern compilers exploit aggressively when optimizing code. The following code uses the circular buffer defined in perfaware/part3/listing_0121_circular_buffer_main.cpp:

int main(void)
{
    printf("Circular buffer test:\n");
    
    const size_t BUF_SIZE = 64 * 4096;

    circular_buffer Circular = AllocateCircularBuffer(BUF_SIZE, 3);
    
    if(IsValid(Circular))
    {
        u8 *Data = Circular.Base.Data + BUF_SIZE;

        Data[0] = 1;
        Data[BUF_SIZE] = 2;

        printf("%u\n", Data[0]);

        DeallocateCircularBuffer(&Circular);
    }
    else
    {
        printf("  FAILED\n");
    }
    
    // NOTE(casey): Since we do not use these functions in this particular build, we reference their pointers
    // here to prevent the compiler from complaining about "unused functions".
    (void)&IsInBounds;
    (void)&AreEqual;
    (void)&AllocateBuffer;
    (void)&FreeBuffer;
    
    return 0;
}

With optimizations off (cl /Od, g++ -O0, clang++ -O0), this code prints the expected result on every compiler:

Circular buffer test:
2

But it gives the following output when optimizations are on (cl /O2, g++ -O2, clang++ -O2):

Circular buffer test:
1

It seems the compilers assume that writing to Data[BUF_SIZE] cannot possibly affect the value of Data[0], so they can safely pass the known value of Data[0] directly to printf.
Here is the assembly generated with g++ -O2 (g++ version 13.1, mingw-w64):

   140007eba:   c6 80 00 00 04 00 01    mov    BYTE PTR [rax+0x40000],0x1   ; write 1 to Data[0]
   140007ec1:   48 8d 0d 8b 21 00 00    lea    rcx,[rip+0x218b]
   140007ec8:   ba 01 00 00 00          mov    edx,0x1                      ; put 1 directly into printf args
   140007ecd:   c6 80 00 00 08 00 02    mov    BYTE PTR [rax+0x80000],0x2   ; write 2 to Data[BUF_SIZE]
   140007ed4:   e8 f7 fd ff ff          call   140007cd0 <_Z6printfPKcz>    ; call printf

And here is the assembly generated with g++ -O0:

   140001aec:   c6 00 01                mov    BYTE PTR [rax],0x1           ; write 1 to Data[0]
   140001aef:   48 8b 45 f0             mov    rax,QWORD PTR [rbp-0x10]
   140001af3:   48 05 00 00 04 00       add    rax,0x40000
   140001af9:   c6 00 02                mov    BYTE PTR [rax],0x2           ; write 2 to Data[BUF_SIZE]
   140001afc:   48 8b 45 f0             mov    rax,QWORD PTR [rbp-0x10]
   140001b00:   0f b6 00                movzx  eax,BYTE PTR [rax]           ; read Data[0] again
   140001b03:   0f b6 c0                movzx  eax,al
   140001b06:   89 c2                   mov    edx,eax                      ; put the value of Data[0] into printf args
   140001b08:   48 8d 05 6b 85 00 00    lea    rax,[rip+0x856b]
   140001b0f:   48 89 c1                mov    rcx,rax
   140001b12:   e8 39 68 00 00          call   140008350 <_Z6printfPKcz>    ; call printf
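
The same store-forwarding can be reproduced in ordinary memory, where it is harmless. The following is only a minimal sketch: the function names and the small offset of 16 are made up for illustration and are not part of the listing. With a constant offset the compiler is allowed to return the known value of Data[0]; with an offset that is only known at run time it generally has to reload, because the offset might be 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the offset is a compile-time constant, as in the test
// above. The compiler may treat Data[0] and Data[16] as distinct bytes and
// return the constant 1 without re-reading memory.
static int ReadBackConstantOffset(uint8_t *Data)
{
    Data[0] = 1;
    Data[16] = 2;
    return Data[0];
}

// Hypothetical helper: the offset is only known at run time. It could be 0,
// so unless the compiler can prove otherwise it has to reload Data[0] after
// the second store.
static int ReadBackRuntimeOffset(uint8_t *Data, size_t Offset)
{
    Data[0] = 1;
    Data[Offset] = 2;
    return Data[0];
}

// In ordinary memory (no pages mapped twice) both versions give the same
// answers, so the optimization is invisible.
static void CheckOrdinaryMemory(void)
{
    uint8_t Buffer[32] = {};
    assert(ReadBackConstantOffset(Buffer) == 1);
    assert(ReadBackRuntimeOffset(Buffer, 16) == 1);
    assert(ReadBackRuntimeOffset(Buffer, 0) == 2);
}
```

The difference only becomes observable when two different offsets refer to the same physical byte, which is exactly what the page-mapped circular buffer sets up.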

Sorry if this is not the right place to discuss this, but YouTube comments are disabled, and Computerenhance comments are for subscribers only. I believe it should be mentioned somewhere that this kind of circular buffer is not really safe to use with modern compilers unless someone figures out how to reliably tell the compiler that this kind of page manipulation is involved.

I can certainly add a comment to that effect, although I've never actually seen any real code that you would hand a circular buffer to that does this. In general, if you are writing more than one buffer's worth of data, as this test does, then it's unclear what you would want to happen in the circular buffer case anyway, since the output is larger than the buffer to begin with.

Separately, if you do want this to work for some reason, you can add "volatile" to the pointer so the compiler knows it can't optimize based on assumed values.

  • Casey
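
A minimal sketch of the "volatile" suggestion, under stated assumptions: it uses POSIX mmap on a temporary file instead of the Windows calls in the listing, and MapTwice and ReadThroughVolatile are hypothetical names that do not appear in the repository. Because every access through a volatile pointer has to be performed as written, the final read is a real load and sees the value stored through the second mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: maps the same Size bytes twice, back to back.
// Size must be a multiple of the page size. Returns nullptr on failure.
static uint8_t *MapTwice(size_t Size)
{
    assert(Size % (size_t)sysconf(_SC_PAGESIZE) == 0);

    char Path[] = "/tmp/ring-XXXXXX";
    int File = mkstemp(Path);
    if(File < 0) return nullptr;
    unlink(Path); // the open descriptor keeps the storage alive

    uint8_t *Result = nullptr;
    if(ftruncate(File, (off_t)Size) == 0)
    {
        // Reserve 2*Size of contiguous address space, then replace it with
        // two shared views of the same file.
        void *Reserve = mmap(nullptr, 2*Size, PROT_NONE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if(Reserve != MAP_FAILED)
        {
            uint8_t *Base = (uint8_t *)Reserve;
            void *A = mmap(Base, Size, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_FIXED, File, 0);
            void *B = mmap(Base + Size, Size, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_FIXED, File, 0);
            if((A == Base) && (B == Base + Size)) Result = Base;
            else munmap(Reserve, 2*Size);
        }
    }
    close(File);
    return Result;
}

// Same access pattern as the test in the issue, but through a volatile
// pointer. Returns the value read back from Data[0], or -1 if mapping failed.
static int ReadThroughVolatile(void)
{
    const size_t Size = 64 * 4096;
    uint8_t *Base = MapTwice(Size);
    if(!Base) return -1;

    volatile uint8_t *Data = Base;
    Data[0] = 1;
    Data[Size] = 2;       // same physical byte as Data[0]
    int Result = Data[0]; // volatile: must be an actual load from memory

    munmap(Base, 2*Size);
    return Result;
}
```

When the mapping succeeds, ReadThroughVolatile() returns 2 regardless of optimization level. The cost is that the compiler can no longer combine or reorder any access made through that pointer, so this is probably best limited to the places that really need it.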

Usually when I use this kind of "magic" ring buffer with the virtual memory trick, I access it only through a variable index that the compiler knows nothing about at compile time, and I only read/write it forwards, never with static offsets larger than the ring buffer size. So far I have not seen compilers break such code.

You can see an example of such a ring buffer in my code here: https://github.com/mmozeiko/wstream/blob/main/rtmp_stream.c#L278-L355

The reader calls RB_BeginRead, gets a pointer, reads the values it wants, and calls RB_EndRead to advance the read offset. The writer does the same with RB_BeginWrite / RB_EndWrite. An example of where a write happens: https://github.com/mmozeiko/wstream/blob/main/rtmp_stream.c#L413-L424
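
The begin/end pattern described above can be sketched roughly as follows. This is not the code from the linked file: the ring struct, the Ring* functions, MapTwice and WrapAroundDemo are all hypothetical, the mapping uses POSIX mmap on a temporary file, and the sketch is single-threaded. It only shows that a write straddling the end of the buffer stays one contiguous memcpy, with offsets that are computed at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: maps the same Size bytes twice, back to back.
// Size must be a multiple of the page size. Returns nullptr on failure.
static uint8_t *MapTwice(size_t Size)
{
    assert(Size % (size_t)sysconf(_SC_PAGESIZE) == 0);
    char Path[] = "/tmp/ring-XXXXXX";
    int File = mkstemp(Path);
    if(File < 0) return nullptr;
    unlink(Path);

    uint8_t *Result = nullptr;
    if(ftruncate(File, (off_t)Size) == 0)
    {
        void *Reserve = mmap(nullptr, 2*Size, PROT_NONE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if(Reserve != MAP_FAILED)
        {
            uint8_t *Base = (uint8_t *)Reserve;
            void *A = mmap(Base, Size, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_FIXED, File, 0);
            void *B = mmap(Base + Size, Size, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_FIXED, File, 0);
            if((A == Base) && (B == Base + Size)) Result = Base;
            else munmap(Reserve, 2*Size);
        }
    }
    close(File);
    return Result;
}

struct ring
{
    uint8_t *Base;      // Size bytes, mapped twice back to back
    size_t Size;
    size_t ReadOffset;  // total bytes consumed so far
    size_t WriteOffset; // total bytes produced so far
};

// Each Begin call returns a pointer to a contiguous span and its length.
static uint8_t *RingBeginWrite(ring *Ring, size_t *Available)
{
    *Available = Ring->Size - (Ring->WriteOffset - Ring->ReadOffset);
    return Ring->Base + (Ring->WriteOffset % Ring->Size);
}
static void RingEndWrite(ring *Ring, size_t Count) { Ring->WriteOffset += Count; }

static uint8_t *RingBeginRead(ring *Ring, size_t *Available)
{
    *Available = Ring->WriteOffset - Ring->ReadOffset;
    return Ring->Base + (Ring->ReadOffset % Ring->Size);
}
static void RingEndRead(ring *Ring, size_t Count) { Ring->ReadOffset += Count; }

// Writes and reads 4 bytes that straddle the end of the buffer.
static bool WrapAroundDemo(void)
{
    const size_t Size = 65536;
    ring Ring = {};
    Ring.Base = MapTwice(Size);
    Ring.Size = Size;
    if(!Ring.Base) return false;

    // Advance both offsets to two bytes before the end of the buffer.
    size_t Available = 0;
    uint8_t *Dest = RingBeginWrite(&Ring, &Available);
    memset(Dest, 0, Size - 2);
    RingEndWrite(&Ring, Size - 2);
    RingBeginRead(&Ring, &Available);
    RingEndRead(&Ring, Available);

    // This write crosses the end of the buffer but is still one memcpy.
    Dest = RingBeginWrite(&Ring, &Available);
    bool Ok = (Available == Size);
    memcpy(Dest, "ABCD", 4);
    RingEndWrite(&Ring, 4);

    uint8_t *Source = RingBeginRead(&Ring, &Available);
    Ok = Ok && (Available == 4) && (memcmp(Source, "ABCD", 4) == 0);
    RingEndRead(&Ring, Available);

    munmap(Ring.Base, 2*Size);
    return Ok;
}
```

In this sketch the reader and the writer reach the straddling bytes through the same addresses, and no access uses a constant offset of a full buffer size from another, which is the situation the original test constructs on purpose.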