buffer8848 / gperftools

Automatically exported from code.google.com/p/gperftools

Crash In RHEL5

GoogleCodeExporter opened this issue · comments

What steps will reproduce the problem?
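1. Save this simple program as t_malloc.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* BUF_SIZE was not defined in the original report; 1024 is an assumed value. */
#define BUF_SIZE 1024

int main(void)
{
    char *buf = malloc(BUF_SIZE);
    if (buf == NULL)
        return 1;
    memset(buf, 'A', BUF_SIZE);
    printf("I am Here.\n");
    free(buf);
    return 0;
}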
2. gcc t_malloc.c -ltcmalloc -o t_malloc
3. Run t_malloc

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?

Version: 1.4
OS: RHEL 5

Please provide any additional information below.

[root@254 hit]# gdb -q ./t_malloc 
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /home/hit/t_malloc 
(no debugging symbols found)
(no debugging symbols found)

Program received signal SIGSEGV, Segmentation fault.
base::VDSOSupport::ElfMemImage::GetNumSymbols (this=0xbf9006bc) at src/base/vdso_support.cc:139
139 src/base/vdso_support.cc: No such file or directory.
    in src/base/vdso_support.cc
Current language:  auto; currently c++
(gdb) bt
#0  base::VDSOSupport::ElfMemImage::GetNumSymbols (this=0xbf9006bc) at src/base/vdso_support.cc:139
#1  0x006c507e in base::VDSOSupport::SymbolIterator::Update (this=0xbf900674, increment=0) at src/base/vdso_support.cc:487
#2  0x006c5281 in base::VDSOSupport::begin (this=0xbf9006bc) at src/base/vdso_support.cc:472
#3  0x006c59c9 in base::VDSOSupport::LookupSymbol (this=0xbf9006bc, name=0x6cf8e8 "__vdso_getcpu", version=0x6cf8de "LINUX_2.6", type=2, info=0xbf9006e0) at src/base/vdso_support.cc:407
#4  0x006c5af5 in base::VDSOSupport::Init () at src/base/vdso_support.cc:381
#5  0x006c5bf7 in global constructors keyed to vdso_support.cc () at src/base/vdso_support.cc:556
#6  0x006c7a7d in __do_global_ctors_aux () at ./src/base/spinlock.h:74
#7  0x006a89bc in _init () from /usr/lib/libtcmalloc.so.0
#8  0x007e9f03 in call_init () from /lib/ld-linux.so.2
#9  0x007ea013 in _dl_init_internal () from /lib/ld-linux.so.2
#10 0x007dc84f in _dl_start_user () from /lib/ld-linux.so.2

Original issue reported on code.google.com by ipconfi...@gmail.com on 29 Sep 2009 at 2:46

Hmm, interesting, a problem with the VDSO code.  I'll ask the local VDSO expert if he
has any ideas.  We may need help from you to do a bit of debugging.

In the short term, you can get things working again by changing this line in
vdso_support.h:
#if defined(__ELF__) && defined(__GLIBC__)
to be
#if 0 && defined(__ELF__) && defined(__GLIBC__)

This will just disable vdso, which is not necessary for correctness (it's a
performance optimization).

Original comment by csilv...@gmail.com on 29 Sep 2009 at 10:50

  • Added labels: Priority-Medium, Type-Defect

I don't have easy access to RHEL 5 machines :-(

Please do the following to help diagnose this problem:

gdb -q ./t_malloc
run
  # wait till crash
info locals
print *this

info auxv
  # note the address of AT_SYSINFO_EHDR, likely 0xffffe000 (but may be
  # different on RHEL 5).
dump memory vdso32.so 0xffffe000 0xffffe000+4096
  # use the address of AT_SYSINFO_EHDR instead of 0xffffe000 above.
quit

You should now have a vdso32.so file in the current directory.
Please attach output from "readelf --all vdso32.so",
and from the GDB session above.

I do have easy access to Fedora 11; if you have access to such a machine
and can reproduce the crash there, that would also make debugging this
easier.

Thanks,

Original comment by ppluzhni...@gmail.com on 29 Sep 2009 at 11:17

[root@254 hit]# gdb -q ./t_malloc 
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r
Starting program: /home/hit/t_malloc 
(no debugging symbols found)
(no debugging symbols found)

Program received signal SIGSEGV, Segmentation fault.
base::VDSOSupport::ElfMemImage::GetNumSymbols (this=0xbfdcf8bc) at src/base/vdso_support.cc:139
139 src/base/vdso_support.cc: No such file or directory.
    in src/base/vdso_support.cc
Current language:  auto; currently c++
(gdb) info locals
No locals.
(gdb) print *this
$1 = {ehdr_ = 0x72c000, dynsym_ = 0xdc, versym_ = 0x182, verdef_ = 0x72c18c, hash_ = 0xb4,
  dynstr_ = 0x12c <Address 0x12c out of bounds>, strsize_ = 86, verdefnum_ = 2, link_base_ = 0}
(gdb) info auxv
32   AT_SYSINFO           Special system info/entry points 0x72c400
33   AT_SYSINFO_EHDR      System-supplied DSO's ELF header 0x72c000
16   AT_HWCAP             Machine-dependent CPU capability hints 0xbfebf3ff
6    AT_PAGESZ            System page size               4096
17   AT_CLKTCK            Frequency of times()           100
3    AT_PHDR              Program headers for program    0x8048034
4    AT_PHENT             Size of program header entry   32
5    AT_PHNUM             Number of program headers      7
7    AT_BASE              Base address of interpreter    0x0
8    AT_FLAGS             Flags                          0x0
9    AT_ENTRY             Entry point of program         0x80483e0
11   AT_UID               Real user ID                   0
12   AT_EUID              Effective user ID              0
13   AT_GID               Real group ID                  0
14   AT_EGID              Effective group ID             0
23   AT_SECURE            Boolean, was exec setuid-like? 0
15   AT_PLATFORM          String identifying platform    0xbfdcfadb "i686"
0    AT_NULL              End of vector                  0x0
(gdb) dump memory vdso32.so 0x72c000 0x72c000+4096
(gdb) quit
The program is running.  Exit anyway? (y or n) y


=========================================================================

[root@254 hit]# readelf --all vdso32.so 
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x400
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1672 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         4
  Size of section headers:           40 (bytes)
  Number of section headers:         13
  Section header string table index: 12

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .hash             HASH            000000b4 0000b4 000028 04   A  2   0  4
  [ 2] .dynsym           DYNSYM          000000dc 0000dc 000050 10   A  3   1  4
  [ 3] .dynstr           STRTAB          0000012c 00012c 000056 00   A  0   0  1
  [ 4] .gnu.version      VERSYM          00000182 000182 00000a 02   A  2   0  2
  [ 5] .gnu.version_d    VERDEF          0000018c 00018c 000038 00   A  3   2  4
  [ 6] .text             PROGBITS        00000400 000400 000060 00  AX  0   0 32
  [ 7] .note             NOTE            00000460 000460 000018 00   A  0   0  4
  [ 8] .eh_frame_hdr     PROGBITS        00000478 000478 000024 00   A  0   0  4
  [ 9] .eh_frame         PROGBITS        0000049c 00049c 0000f4 00   A  0   0  4
  [10] .dynamic          DYNAMIC         00000590 000590 000078 08  WA  3   0  4
  [11] .useless          PROGBITS        00000608 000608 00000c 04  WA  0   0  4
  [12] .shstrtab         STRTAB          00000000 000614 000073 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x00614 0x00614 R E 0x1000
  DYNAMIC        0x000590 0x00000590 0x00000590 0x00078 0x00078 R   0x4
  NOTE           0x000460 0x00000460 0x00000460 0x00018 0x00018 R   0x4
  GNU_EH_FRAME   0x000478 0x00000478 0x00000478 0x00024 0x00024 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00     .hash .dynsym .dynstr .gnu.version .gnu.version_d .text .note .eh_frame_hdr .eh_frame .dynamic .useless 
   01     .dynamic 
   02     .note 
   03     .eh_frame_hdr 

Dynamic section at offset 0x590 contains 10 entries:
  Tag        Type                         Name/Value
 0x0000000e (SONAME)                     Library soname: [linux-gate.so.1]
 0x00000004 (HASH)                       0xb4
 0x00000005 (STRTAB)                     0x12c
 0x00000006 (SYMTAB)                     0xdc
 0x0000000a (STRSZ)                      86 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x6ffffffc (VERDEF)                     0x18c
 0x6ffffffd (VERDEFNUM)                  2
 0x6ffffff0 (VERSYM)                     0x182
 0x00000000 (NULL)                       0x0

There are no relocations in this file.

There are no unwind sections in this file.

Symbol table '.dynsym' contains 5 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000400     3 FUNC    GLOBAL DEFAULT    6 __kernel_vsyscall@@LINUX_2.5
     2: 00000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.5
     3: 00000440     7 FUNC    GLOBAL DEFAULT    6 __kernel_rt_sigreturn@@LINUX_2.5
     4: 00000420     8 FUNC    GLOBAL DEFAULT    6 __kernel_sigreturn@@LINUX_2.5

Histogram for bucket list length (total of 3 buckets):
 Length  Number     % of total  Coverage
      0  1          ( 33.3%)
      1  1          ( 33.3%)     25.0%
      2  0          (  0.0%)     25.0%
      3  1          ( 33.3%)    100.0%

Version symbols section '.gnu.version' contains 5 entries:
 Addr: 0000000000000182  Offset: 0x000182  Link: 2 (.dynsym)
  000:   0 (*local*)       2 (LINUX_2.5)     2 (LINUX_2.5)     2 (LINUX_2.5)  
  004:   2 (LINUX_2.5)  

Version definition section '.gnu.version_d' contains 2 entries:
  Addr: 0x000000000000018c  Offset: 0x00018c  Link: 3 (.dynstr)
  000000: Rev: 1  Flags: BASE   Index: 1  Cnt: 1  Name: linux-gate.so.1
  0x001c: Rev: 1  Flags: none  Index: 2  Cnt: 1  Name: LINUX_2.5

Notes at offset 0x00000460 with length 0x00000018:
  Owner     Data size   Description
  Linux     0x00000004  Unknown note type: (0x00000000)

Original comment by ipconfi...@gmail.com on 30 Sep 2009 at 2:30

Disabling VDSO makes it work ;-)
Will VDSO accelerate tcmalloc a lot?

Original comment by ipconfi...@gmail.com on 30 Sep 2009 at 2:40

The problem is that this VDSO is linked at virtual address 0:

> Program Headers:
>   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>   LOAD           0x000000 0x00000000 0x00000000 0x00614 0x00614 R E 0x1000

and that contradicts my expectations in src/base/vdso_support.cc, line 560:

    // Since real VDSO is never linked at address 0, and "fake" vdso
    // library always is, we know we are dealing with the "fake" one here.

Could you please try with this patch:

--- src/base/vdso_support.cc.orig       2009-09-29 20:09:59.635796890 -0700
+++ src/base/vdso_support.cc    2009-09-29 20:10:33.046781571 -0700
@@ -268,7 +268,7 @@
                                     relocation);
   for (; dynamic_entry->d_tag != DT_NULL; ++dynamic_entry) {
     ElfW(Xword) value = dynamic_entry->d_un.d_val;
-    if (link_base_ == 0) {
+    if (false && link_base_ == 0) {
       // A complication: in the real VDSO, dynamic entries are not relocated
       // (it wasn't loaded by a dynamic loader). But when testing with a
       // "fake" dlopen()ed vdso library, the loader relocates some (but

> Will VDSO accelerate tcmalloc a lot?

Contrary to what csilvers said, this isn't about speed at all.
The VDSO code is mostly used to get better stack traces when doing CPU profiling.

If you don't care about CPU profiling, it is safe to disable VDSO code.
I would still like you to try the patch above first.

Thanks,

Original comment by ppluzhni...@gmail.com on 30 Sep 2009 at 3:18

ppluzhnikov has managed to rewrite this functionality to better tell whether we're
'fake' or not, which I expect will fix this bug.  It'll be in the next release.

Original comment by csilv...@gmail.com on 14 Oct 2009 at 11:27

  • Changed state: Started

This should be fixed in perftools 1.5, just released.

Original comment by csilv...@gmail.com on 20 Jan 2010 at 11:11

  • Changed state: Fixed