dmlc / MXNet.jl

MXNet Julia Package - flexible and efficient deep learning in Julia

memory corruption

jingpengw opened this issue

Something went wrong with my MXNet.

julia> using MXNet
INFO: Precompiling module MXNet.
*** Error in `/usr/people/jingpeng/lib/julia.download/bin/julia': malloc(): memory corruption: 0x00000000039ac0f0 ***


julia> Pkg.build("MXNet")
INFO: Building MXNet
INFO: MXNET_HOME environment detected: /opt/mxnet
INFO: Trying to load existing libmxnet...
julia: malloc.c:2372: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.

According to a Julia issue (JuliaLang/julia#18098), this might be caused by memory allocation inside a package.

Another possibility is a GCC version mismatch. The GCC I am using is:

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.5-2ubuntu1~14.04.1' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.5 (Ubuntu 4.8.5-2ubuntu1~14.04.1) 

The downloaded Julia was built with:

strings -a ~/lib/julia.download/bin/julia-debug | grep GCC
GCC: (GNU) 4.1.2 20080704 (Red Hat 4.1.2-55)
GCC: (GNU) 4.1.2 20080704 (Red Hat 4.1.2-55)
GCC: (GNU) 6.2.0
GCC: (GNU) 4.1.2 20080704 (Red Hat 4.1.2-55)
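
To run the same check on both sides of a suspected mismatch, a small script along these lines might help. It is only a sketch: the function name `compiler_tags` is made up for illustration, and the `libmxnet.so` location under `/opt/mxnet/lib` is an assumption based on the `MXNET_HOME` shown above, so adjust both paths to your setup. The fallback branch uses `grep -a -o`, which GNU and BSD grep support, and reports only the leading version number of each tag.

```shell
#!/bin/sh
# List the distinct compiler identification strings embedded in a binary.
# Uses `strings` (binutils) when available, otherwise falls back to grep.
compiler_tags() {
    if command -v strings >/dev/null 2>&1; then
        strings -a "$1" | grep 'GCC: (' | sort -u
    else
        # Coarser fallback: keeps only "GCC: (<vendor>) <version>"
        grep -a -o 'GCC: ([^)]*) [0-9][0-9.]*' "$1" | sort -u
    fi
}

# Assumed locations; override via the environment if yours differ.
JULIA_BIN="${JULIA_BIN:-$HOME/lib/julia.download/bin/julia}"
LIBMXNET="${LIBMXNET:-/opt/mxnet/lib/libmxnet.so}"

for f in "$JULIA_BIN" "$LIBMXNET"; do
    if [ -f "$f" ]; then
        echo "== $f"
        compiler_tags "$f"
    fi
done
```

Differing tags between the two files do not prove an incompatibility on their own, but they show which toolchains contributed code to each binary.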

My MXNet was compiled with Caffe in order to use Caffe operations. After recompiling without Caffe, the error is gone. This might be a Caffe interface issue, so I am closing this issue here.

My MXNet and Julia were compiled with Intel icc and MKL. After recompiling Caffe with icc, following a Caffe issue (BVLC/caffe#2157), the memory issue still occurs. So it does not seem to be caused by a g++ version mismatch.