google / tcmalloc

GuardedPageAllocator appears in stack trace when ActivateGuardedSampling() has not been called

erin2722 opened this issue

When running tests with tcmalloc, I have occasionally seen the program crash with the following lines appearing in the backtrace:

tcmalloc::tcmalloc_internal::GuardedPageAllocator::Deallocate(void*)

tcmalloc::tcmalloc_internal::(anonymous namespace)::InvokeHooksAndFreePages(void*, std::optional<unsigned long>)

TCMallocInternalDeleteArraySized

When examining the program with gdb, I can see that it is crashing because it detected a memory error in the GuardedPageAllocator. However, the application has not called ActivateGuardedSampling, so this is unexpected behavior.

From examining the tcmalloc code, I see that ActivateGuardedSampling flips a setting that allows the GuardedPageAllocator to allocate bytes within its defined address space. However, on deallocation that setting is not checked: tcmalloc simply checks whether the deallocated pointer is within the GuardedPageAllocator's address space, and if so goes on with the memory checks (and possible crashes). Is it possible that some other sampled memory is ending up in the address space of the GuardedPageAllocator, and is therefore being validated upon deallocation when it is not intended to be?

Is it a bug on tcmalloc's end that it is crashing on deallocations like this? Or is there anything else that can explain this behavior?

Hi @erin2722,

> Is it possible that some other sampled memory is ending up in the address space of the GuardedPageAllocator, and is therefore being validated upon deallocation when it is not intended to?

If that were possible, it would be a serious bug that could lead to arbitrary memory corruption.
I don't immediately see how this is possible. GuardedPageAllocator allocates that memory with mmap in its Init method.

GuardedPageAllocator circumvents system-alloc's spinlock, which may be unintentional, but the mmap in system-alloc does not use MAP_FIXED, only MAP_FIXED_NOREPLACE (if available). Without MAP_FIXED, overlapping hints must not lead to overlapping ranges being allocated.

If you can reproduce this at least semi-reliably, I would suggest tracing the mmap calls with strace or printf to confirm or rule out overlapping ranges.
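For the printf route, something along these lines might work as a starting point. It is only a sketch: `MappedRange`, `Overlaps`, and `LoggedMmap` are hypothetical names, not tcmalloc functions, and the wrapper does not track munmap, so a range that was unmapped and later reused would show up as a false overlap.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Half-open address range [begin, end) returned by one mmap call.
struct MappedRange {
  std::uintptr_t begin;
  std::uintptr_t end;
};

// Two half-open ranges overlap iff each one starts before the other ends.
bool Overlaps(const MappedRange& a, const MappedRange& b) {
  return a.begin < b.end && b.begin < a.end;
}

// Hypothetical tracing wrapper: performs an anonymous mmap, logs the resulting
// range to stderr, and warns if it overlaps any range recorded earlier.
void* LoggedMmap(std::vector<MappedRange>& seen, void* hint, std::size_t len,
                 int prot, int flags) {
  void* p = mmap(hint, len, prot, flags, -1, 0);
  if (p == MAP_FAILED) {
    std::perror("mmap");
    return p;
  }
  const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(p);
  const MappedRange r{begin, begin + len};
  std::fprintf(stderr, "mmap(hint=%p, len=%zu) -> [%p, %p)\n", hint, len, p,
               static_cast<void*>(static_cast<char*>(p) + len));
  for (const MappedRange& old : seen) {
    if (Overlaps(old, r)) {
      std::fprintf(stderr, "  OVERLAP with an earlier mapping\n");
    }
  }
  seen.push_back(r);
  return p;
}
```

Routing both the GuardedPageAllocator mapping and the system-alloc mappings through one such wrapper (or comparing the ranges in strace output by hand with the same test) would show directly whether two live regions ever intersect.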

Hi @dvyukov,

Thank you so much for the quick response! I will see what I can do in terms of reproing this, and let you know what I find.

Since you have a core dump, I was curious if anything obvious stood out about the contents of guarded_page_allocator_ and the faulting address.

Without the call to ActivateGuardedSampling, I'd expect the begin/end address ranges of guarded_page_allocator_ to be 0 and PointerIsMine to always fail, but maybe something unusual is happening.

When I last looked at the code I read it as: GuardedPageAllocator::Init mmaps memory and initializes begin/end, and then ActivateGuardedSampling sets the flag to start allocating guarded allocations, and they are separate.
If that's the case, we can have begin/end non-0, but no allocations, and PointerIsMine can still return true (due to some corruption presumably).

Yup, that is also my interpretation of the code, @dvyukov, and it is confirmed by the state of the GuardedPageAllocator when the crash happens:

(gdb) f 0
#0  tcmalloc::tcmalloc_internal::GuardedPageAllocator::Deallocate (this=0x7fc1873258e0 <tcmalloc::tcmalloc_internal::Static::guardedpage_allocator_>, ptr=ptr@entry=0x438f3fe00000) at src/third_party/tcmalloc/dist/tcmalloc/guarded_page_allocator.cc:223
223	    *reinterpret_cast<char*>(ptr) = 'X';  // Trigger SEGV handler.
(gdb) p *this
$1 = {
  stacktrace_filter_ = {
    stack_hashes_with_count_ = {{
        <std::__atomic_base<unsigned long>> = {
          _M_i = 0
        }, 
      } <repeats 256 times>},
    max_slots_used_ = {
      <std::__atomic_base<unsigned long>> = {
        _M_i = 0
      }, 
    },
    replacement_inserts_ = {
      <std::__atomic_base<unsigned long>> = {
        _M_i = 0
      }, 
    }
  },
  guarded_page_lock_ = {
    lockword_ = {
      <std::__atomic_base<unsigned int>> = {
        _M_i = 0
      }, 
    }
  },
  free_pages_ = {true <repeats 128 times>, false <repeats 384 times>},
  num_alloced_pages_ = 0,
  num_alloced_pages_max_ = 0,
  num_successful_allocations_ = {
    value_ = {
      <std::__atomic_base<long>> = {
        _M_i = 0
      }, 
    }
  },
  num_failed_allocations_ = {
    value_ = {
      <std::__atomic_base<long>> = {
        _M_i = 0
      }, 
    }
  },
  data_ = 0x2d563ff86120,
  pages_base_addr_ = 0x438f3fc00000,
  pages_end_addr_ = 0x438f3fe02000,
  first_page_addr_ = 0x438f3fc02000,
  max_alloced_pages_ = 64,
  total_pages_ = 128,
  total_pages_used_ = 0,
  alloced_page_count_when_all_used_once_ = 0,
  page_size_ = 8192,
  rand_ = {
    <std::__atomic_base<unsigned long>> = {
      _M_i = 140469173639392
    }, 
  },
  initialized_ = true,
  allow_allocations_ = false,
  double_free_detected_ = true,
  write_overflow_detected_ = false
}

We can see here that although allow_allocations_ is false, and num_successful_allocations_ is 0, the ptr argument is 0x438f3fe00000, which falls within the range of pages_base_addr_ to pages_end_addr_, causing PointerIsMine to succeed and the deallocation to go through validation.
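To make the arithmetic explicit, here is a small check using the values from the dump. `GuardedRange` and `PointerInRange` are simplified stand-ins written for this comment, not tcmalloc's actual PointerIsMine code; only the addresses and the page size come from the gdb output.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the allocator's range fields (names follow the dump).
struct GuardedRange {
  std::uint64_t pages_base_addr;
  std::uint64_t pages_end_addr;
};

// Half-open range test: base <= ptr < end.
constexpr bool PointerInRange(const GuardedRange& r, std::uint64_t ptr) {
  return r.pages_base_addr <= ptr && ptr < r.pages_end_addr;
}

// Values copied from the core dump.
constexpr GuardedRange kDumpRange{0x438f3fc00000, 0x438f3fe02000};
constexpr std::uint64_t kFaultingPtr = 0x438f3fe00000;
constexpr std::uint64_t kPageSize = 8192;

// The faulting pointer lies inside the guarded region.
static_assert(PointerInRange(kDumpRange, kFaultingPtr),
              "ptr should fall inside [pages_base_addr_, pages_end_addr_)");

// The region spans 0x202000 bytes, i.e. 257 pages of 8192 bytes (about 2MB).
static_assert(kDumpRange.pages_end_addr - kDumpRange.pages_base_addr ==
                  257 * kPageSize,
              "region size should be 257 pages");
```

The pointer sits 0x2000 bytes (one 8192-byte page) below pages_end_addr_, so it is close to the end of the region but still inside it.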

It then detects a double free even though none occurred. free_pages_ is filled with true during initialization, and ReserveFreeSlot, which is what sets free_pages_ entries to false for specific slots, returns early because allow_allocations_ is false. IsFreed therefore always returns true, producing a false double-free detection.
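The logic can be reproduced with a toy model. This is not tcmalloc's implementation: `ToyGuardedAllocator` and its methods are made up for illustration and only mirror the behavior described above (slots start free, reservation is gated on a flag, deallocation is not).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model of the slot bookkeeping described in this thread.
class ToyGuardedAllocator {
 public:
  // Every slot starts out marked free, as in the dump's free_pages_.
  ToyGuardedAllocator() { free_pages_.fill(true); }

  // Plays the role of ActivateGuardedSampling.
  void AllowAllocations() { allow_allocations_ = true; }

  // Returns a reserved slot index, or -1 if allocations are disabled or no
  // slot is free. With the flag off, free_pages_ is never modified.
  int ReserveFreeSlot() {
    if (!allow_allocations_) return -1;
    for (std::size_t i = 0; i < free_pages_.size(); ++i) {
      if (free_pages_[i]) {
        free_pages_[i] = false;
        return static_cast<int>(i);
      }
    }
    return -1;
  }

  // Returns true if freeing `slot` looks like a double free. Note that it does
  // not consult allow_allocations_.
  bool DeallocateLooksLikeDoubleFree(std::size_t slot) {
    if (free_pages_[slot]) return true;  // already marked free
    free_pages_[slot] = true;
    return false;
  }

 private:
  std::array<bool, 128> free_pages_;
  bool allow_allocations_ = false;
};
```

With allocations never enabled, any pointer that passes the range check maps to a slot that is still marked free, so the very first deallocation is reported as a double free.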

Just for extra info, I am using the tcmalloc version as of this commit 18777b1, and the issue started appearing after we upgraded from 093ba93.

Hi all! After investigating, I believe that this was an issue with the porting of the tcmalloc build from bazel into our native build system. We dropped the linkstatic=1 flags, and I think improper symbol resolution on dynamic builds was leading to this issue. Closing this issue, and thanks for the help!

We may need to reopen this issue: we ended up tracking down what's going on, and it's not related to linking.

TCMalloc started using MAP_FIXED_NOREPLACE with this commit. That mmap flag is broken on Linux kernel versions 4.17 and 4.18, and was fixed in 4.19.

This is what causes the crash described at the beginning of this issue. In our testing on a machine with kernel version 4.18, this sequence of events can happen:

  1. The GuardedPageAllocator maps its pages, allocating roughly 2MB.
  2. In SampleifyAllocation, we end up creating a sampled page, which flows through to creating the first mmap region for sampled allocations and allocates 1GB for that region.
  3. If we are unlucky, the region created for sampled allocations will encompass the pages created for the GuardedPageAllocator, clobbering the GuardedPageAllocator's pages.
  4. A page that the GuardedPageAllocator believes it owns can be deallocated, tripping the check that the allocation is guarded, and ultimately causing a segfault because the GuardedPageAllocator believes it is seeing a double free.
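A standalone probe along the following lines could help confirm whether a given machine shows this behavior. It is a sketch under assumptions: `ProbeResult` and `ProbePartialOverlap` are hypothetical names, the fallback value for MAP_FIXED_NOREPLACE is the one Linux uses on x86 and most other architectures, and the probe assumes that the failure mode is a request that starts in free space but whose tail overlaps an existing mapping.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

#ifndef MAP_FIXED_NOREPLACE
// Assumption: older libc headers may lack the flag; this is the value used by
// Linux on x86 and most other architectures.
#define MAP_FIXED_NOREPLACE 0x100000
#endif

enum class ProbeResult {
  kRefused,      // mmap failed with EEXIST; the existing mapping is intact
  kHintIgnored,  // mapping was placed elsewhere (flag unknown to this kernel)
  kClobbered,    // mapping was placed at the requested address over a live page
  kError,        // setup failed, or mmap failed for an unrelated reason
};

ProbeResult ProbePartialOverlap() {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  const int flags = MAP_PRIVATE | MAP_ANONYMOUS;

  // Map four pages, then unmap the first three. This leaves a known hole at
  // [base, base + 3 pages) immediately followed by one live page.
  void* raw = mmap(nullptr, 4 * page, PROT_READ | PROT_WRITE, flags, -1, 0);
  if (raw == MAP_FAILED) return ProbeResult::kError;
  char* base = static_cast<char*>(raw);
  char* live = base + 3 * page;
  if (munmap(base, 3 * page) != 0) {
    munmap(base, 4 * page);
    return ProbeResult::kError;
  }

  // Request all four pages at `base`. The request starts in the hole, but its
  // last page overlaps the live mapping.
  errno = 0;
  void* p = mmap(base, 4 * page, PROT_READ | PROT_WRITE,
                 flags | MAP_FIXED_NOREPLACE, -1, 0);

  ProbeResult result;
  if (p == MAP_FAILED) {
    result = (errno == EEXIST) ? ProbeResult::kRefused : ProbeResult::kError;
    munmap(live, page);
  } else if (p == base) {
    result = ProbeResult::kClobbered;
    munmap(base, 4 * page);
  } else {
    result = ProbeResult::kHintIgnored;
    munmap(p, 4 * page);
    munmap(live, page);
  }
  return result;
}
```

On a kernel where the flag works as documented I would expect `kRefused`. A `kClobbered` result would correspond to step 3 above, where the larger region silently replaces pages that another component still believes it owns.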

There are a couple of things we could do here, but I think the least invasive change would be to add another check to MapFixedNoReplaceFlagAvailable() that tests whether the currently running kernel version is susceptible to the MAP_FIXED_NOREPLACE bug.
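As a rough idea of what such a check might look like (a sketch only, not a patch against tcmalloc; `KernelReleaseIsAffected` and `RunningKernelIsAffected` are hypothetical names):

```cpp
#include <sys/utsname.h>

#include <cassert>
#include <cstdio>

// Returns true if a release string such as "4.18.0-generic" parses as a
// 4.17.x or 4.18.x kernel. Unparsable strings are treated as unaffected here;
// a more conservative version could treat them as affected instead.
bool KernelReleaseIsAffected(const char* release) {
  int major = 0;
  int minor = 0;
  if (std::sscanf(release, "%d.%d", &major, &minor) != 2) return false;
  return major == 4 && (minor == 17 || minor == 18);
}

// Queries the running kernel via uname(2) and applies the check above.
bool RunningKernelIsAffected() {
  struct utsname u;
  if (uname(&u) != 0) return false;
  return KernelReleaseIsAffected(u.release);
}
```

MapFixedNoReplaceFlagAvailable() could then report the flag as unavailable when `RunningKernelIsAffected()` returns true. One caveat: a version string is only a heuristic, since distribution kernels may carry backported fixes under an older version number.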