smuehlst / circle-stdlib

Standard C and C++ Library Support for Circle

Bad memmove behaviour

giuseppe-avon opened this issue · comments

memmove fails without any warning when moving overlapping heap regions. The data inside the array is corrupted; it looks as if the last value is smeared across the whole moved portion.
Below is a simple example to reproduce it.

#define SIZE_A 10

uint32_t *testArray = (uint32_t*)malloc(sizeof(uint32_t) * SIZE_A + 1);
for(int i = 0; i < SIZE_A; i++) {
	testArray[i] = i;
}
memmove(&testArray[1], &testArray[0], sizeof(uint32_t)* SIZE_A);

testArray[0] = 20;
for(int i = 0; i < SIZE_A; i++) {
	printf("%d\r\n", testArray[i]);
}

I made a first attempt to reproduce this, but here all looks good. A test program derived from your code above and executed in QEMU prints the numbers 20, 0, ... 8, which is what I would expect.
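For readers who want to check a given build quickly, here is a minimal self-checking variant of the snippet above. It is only a sketch, not the test program that was actually run in QEMU; the function name `check_overlapping_memmove` is hypothetical, and the buffer is deliberately allocated with room for `SIZE_A + 1` elements so that the shifted copy fits.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SIZE_A 10

/* Hypothetical helper: returns 0 if memmove() shifted the array correctly,
   1 if the result is corrupted, and -1 if the allocation failed. */
int check_overlapping_memmove(void)
{
    /* Room for SIZE_A + 1 elements so the shifted copy stays in bounds. */
    uint32_t *a = (uint32_t *)malloc(sizeof(uint32_t) * (SIZE_A + 1));
    if (a == NULL)
        return -1;

    for (uint32_t i = 0; i < SIZE_A; i++)
        a[i] = i;

    /* Shift elements 0..SIZE_A-1 up by one slot; the regions overlap. */
    memmove(&a[1], &a[0], sizeof(uint32_t) * SIZE_A);
    a[0] = 20;

    /* Expected contents: 20, 0, 1, ..., SIZE_A-1. */
    int bad = (a[0] != 20);
    for (uint32_t i = 1; i <= SIZE_A; i++) {
        if (a[i] != i - 1)
            bad = 1;
    }

    free(a);
    return bad;
}
```

A return value of 0 means the overlapping move behaved as the C standard requires; a return value of 1 would indicate the kind of corruption reported here.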

So can you please provide more information:

  • The full program to reproduce this.
  • What compiler do you use?
  • How do you invoke the circle-stdlib configure script?

The full program is there, tested on a real Raspberry Pi 4 - 4GB RAM, compiled with aarch64-none-g++
Everything was built starting from the 04-std sample inside the samples folder.
The configure script is invoked with 4 as the Raspberry Pi version and aarch64-none as the prefix.

I can reproduce this now with a 64-bit build. A preliminary analysis indicates that there's a problem in the assembler implementation of memmove for AArch64 in newlib.

As a workaround you can disable the assembler implementations in newlib by applying the following patch:

diff --git a/configure b/configure
index 66a2bfd..e177688 100755
--- a/configure
+++ b/configure
@@ -189,9 +189,9 @@ export \

 if [ $DEBUG -eq 1 ]
 then
-    CFLAGS_FOR_TARGET="$ARCH -O0 -g -Wno-parentheses"
+    CFLAGS_FOR_TARGET="$ARCH -O0 -g -Wno-parentheses -DPREFER_SIZE_OVER_SPEED"
 else
-    CFLAGS_FOR_TARGET="$ARCH -Wno-parentheses"
+    CFLAGS_FOR_TARGET="$ARCH -Wno-parentheses -DPREFER_SIZE_OVER_SPEED"
 fi
 export CFLAGS_FOR_TARGET

Then rerun configure and rebuild everything.

While this is not responsible for the observed problem, I wanted to note that there is a bug in the test program itself:

#define SIZE_A 10
uint32_t *testArray = (uint32_t*)malloc(sizeof(uint32_t) * SIZE_A + 1);

While the intention is to allocate space for (SIZE_A + 1) uint32_t elements, it allocates space for SIZE_A uint32_t elements plus 1 byte.

The correct version is:

#define SIZE_A 10
uint32_t *testArray = (uint32_t*)malloc(sizeof(uint32_t) * (SIZE_A + 1));
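The difference between the two expressions can be made concrete with a small sketch. The helper names `bytes_original` and `bytes_corrected` are hypothetical and exist only to illustrate the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes requested by the original test program: n elements plus 1 byte. */
size_t bytes_original(size_t n)
{
    return sizeof(uint32_t) * n + 1;
}

/* Bytes needed for n + 1 elements, which the shifted copy requires. */
size_t bytes_corrected(size_t n)
{
    return sizeof(uint32_t) * (n + 1);
}
```

With SIZE_A = 10 and a 4-byte uint32_t, the original expression requests 41 bytes while the corrected one requests 44. The memmove() in the test program writes 40 bytes starting at byte offset 4, so it needs all 44 bytes and overruns the original allocation by 3 bytes.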

The problem is caused by the fact that newlib's assembler implementation of memmove wants to call its own assembler implementation of memcpy:

https://github.com/smuehlst/circle-newlib/blob/9e241244edf500443db61f28fbc01930daefba5e/newlib/libc/machine/aarch64/memmove.S#L90-L100

/* All memmoves up to 96 bytes are done by memcpy as it supports overlaps.
   Larger backwards copies are also handled by memcpy. The only remaining
   case is forward large copies.  The destination is aligned, and an
   unrolled loop processes 64 bytes per iteration.
*/

def_fn memmove, 6
	sub	tmp1, dstin, src
	cmp	count, 96
	ccmp	tmp1, count, 2, hi
	b.hs	memcpy

But in the debugger I can see that the call to memcpy jumps to Circle's assembler implementation of memcpy in util_fast.S:

https://github.com/rsta2/circle/blob/bd762e6c37823e166189791f624ed633764dec84/lib/util_fast.S#L61-L63

	.globl	memcpy
memcpy:
	mov	x8, x0

This obviously cannot work: according to the comment above, newlib's memmove relies on its own memcpy supporting overlapping regions, and that assumption is very likely violated by Circle's memcpy.
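To illustrate why a memcpy that is not overlap-safe breaks this kind of move, here is a minimal sketch. It uses a hypothetical forward-only word copy, `forward_copy_u32`, which is an assumption for illustration and not the actual code of Circle's or newlib's memcpy. The exact corruption pattern on real hardware depends on how the real routine orders its loads and stores.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical forward-only copy: reads and writes from low to high
   addresses, one word at a time. NOT Circle's or newlib's real code. */
void forward_copy_u32(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Fills buf[0..n-1] with 0..n-1, then shifts those n elements up by one
   slot using the forward-only copy. buf must hold n + 1 elements. */
void shift_up_forward(uint32_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint32_t)i;

    /* dst is one element above src, so every store overwrites the word
       that the next iteration is about to read. */
    forward_copy_u32(&buf[1], &buf[0], n);
}
```

After `shift_up_forward(buf, 10)` every element from `buf[0]` to `buf[10]` holds 0: each iteration copies the value that the previous iteration just wrote, so a single value is smeared across the destination.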

@rsta2 How can we avoid this clash between the duplicate assembler implementations of memmove and memcpy in newlib and in Circle itself? Would this require a change to Circle so that the memory assembler routines in Circle are disabled with STDLIB_SUPPORT >= 2?

Theoretically it's possible to remove memcpy() (and maybe memset()?) from libcircle.a when STDLIB_SUPPORT >= 2. Unfortunately there may be problems, because these functions are used in the early stages of system initialization and must also work with the MMU and D-cache disabled. This has already caused problems multiple times. I don't know whether the newlib implementations of these functions are safe to use without the MMU.

In any case it would have to be tested carefully in different configurations (32-bit and 64-bit) on real hardware. How can I enable memcpy() and memset() in newlib to test this? If you want to test this on your own, memcpy() is implemented in lib/util_fast.S in Circle and memset() in lib/util.cpp.

> Theoretically it's possible to remove memcpy() (and maybe memset()?) from libcircle.a when STDLIB_SUPPORT >= 2. Unfortunately there may be problems, because these functions are used in the early stages of system initialization and must also work with the MMU and D-cache disabled. This has already caused problems multiple times. I don't know whether the newlib implementations of these functions are safe to use without the MMU.

I have no idea whether the newlib implementations have any dependency on the MMU. How can that be determined? By the use of certain assembler instructions? If so, which are those?

> In any case it would have to be tested carefully in different configurations (32-bit and 64-bit) on real hardware. How can I enable memcpy() and memset() in newlib to test this? If you want to test this on your own, memcpy() is implemented in lib/util_fast.S in Circle and memset() in lib/util.cpp.

As far as I understand the assembler implementations of memcpy(), memmove(), etc. are enabled by default in the circle-stdlib build.

I know of only two cases where it does not work without the MMU:

  • Unaligned memory accesses (32-bit and 64-bit)
  • Using the floating point registers (32-bit, the floating point unit must be enabled first there, which is not the case in the early stages)
  • Maybe more

In the end, one can only test whether it works.

I tried to build samples/04-std with memcpy() and memset() disabled in Circle. I got many unresolved externals for these functions. Thus they must be disabled in circle-stdlib by default.

I think we decided to have Circle itself provide these functions when you started the circle-stdlib project. This was before the STDLIB_SUPPORT option existed, and memcpy() and memset() had to be provided by Circle because the compiler generates calls to them. So there was no possibility of defining these functions under a different name in Circle.

> I think we decided to have Circle itself provide these functions when you started the circle-stdlib project. This was before the STDLIB_SUPPORT option existed, and memcpy() and memset() had to be provided by Circle because the compiler generates calls to them. So there was no possibility of defining these functions under a different name in Circle.

I only vaguely remember this. After thinking about this some more, I think the best strategy will be to disable these functions in newlib altogether and to rely only on the Circle implementations. That way the behavior will not change in any way when, for example, one switches from standalone Circle to circle-stdlib.

Yes, I agree. Which functions exactly does this affect, besides memcpy() and memset()?

> Which functions exactly does this affect, besides memcpy() and memset()?

Only memmove(). I have now tested the following:

  • Disabled building of all modules in circle-stdlib that are related to memcpy(), memset() and memmove(), including the assembler implementations for arm and aarch64.
  • Enabled compilation of the memmove() function unconditionally in circle/lib/util.cpp, e.g. with the patch below:
diff --git a/lib/util.cpp b/lib/util.cpp
index 0ca3b6c..0ba045c 100644
--- a/lib/util.cpp
+++ b/lib/util.cpp
@@ -51,7 +51,6 @@ void *memset (void *pBuffer, int nValue, size_t nLength)
        return pBuffer;
 }

-#if STDLIB_SUPPORT <= 1

 void *memmove (void *pDest, const void *pSrc, size_t nLength)
 {
@@ -75,6 +74,7 @@ void *memmove (void *pDest, const void *pSrc, size_t nLength)
        return memcpy (pDest, pSrc, nLength);
 }

+#if STDLIB_SUPPORT <= 1
 int memcmp (const void *pBuffer1, const void *pBuffer2, size_t nLength)
 {
        const unsigned char *p1 = (const unsigned char *) pBuffer1;

After that the test program for memmove() works fine in 32-bit and 64-bit mode under QEMU.
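For context, the general technique behind an overlap-safe memmove() can be sketched as follows. This is an illustrative byte-wise version under the hypothetical name `sketch_memmove`; it is not the code from Circle's lib/util.cpp, whose full body is not shown in the patch above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative overlap-safe move (hypothetical name, not Circle's source).
   It picks the copy direction so that no source byte is overwritten
   before it has been read. */
void *sketch_memmove(void *dest, const void *src, size_t len)
{
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;
    uintptr_t da = (uintptr_t)dest;
    uintptr_t sa = (uintptr_t)src;

    if (da > sa && da - sa < len) {
        /* Destination starts inside the source region:
           copy from the last byte down to the first. */
        for (size_t i = len; i > 0; i--)
            d[i - 1] = s[i - 1];
    } else {
        /* No overlap, or destination below source:
           a plain forward copy is safe. */
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    }
    return dest;
}
```

Shifting a 10-byte sequence 0..9 up by one position with this function leaves 0..9 in positions 1..10, which is the result the test program in this issue expects from memmove().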

Should I prepare a pull request for the patch above?

> Should I prepare a pull request for the patch above?

Yes, please do.

There is a new hotfix release 44.1 of Circle available, which can be used to fix this issue.

@rsta2 Thanks for Circle Step 44.1. By exclusively using the memmove(), memset() and memcpy() implementations from Circle, this issue is now fixed in circle-stdlib v15.8.