ROCm / ROCm

AMD ROCm™ Software - GitHub Home

Home Page:https://rocm.docs.amd.com

[Issue]: Segfault while importing `bitsandbytes` (hipBLASLt is suspicious)

seungduk-yanolja opened this issue

Problem Description

  1. Installed bitsandbytes from https://github.com/ROCm/bitsandbytes/tree/rocm_enabled
  2. Ran python -c 'import bitsandbytes'
The import aborted with SIGABRT. The full gdb log from the core file follows:
root@localhost:~/apps/bitsandbytes# gdb $(which python) /tmp/core-python.1987716.localhost.1707311281
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
    .

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /root/miniconda3/envs/axo/bin/python...

warning: core file may not match specified executable file.
[New LWP 1987716]
[New LWP 1987717]

warning: Section `.reg-xstate/1987716' in core file too small.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `python -c import bitsandbytes'.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/1987716' in core file too small.
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=139957344658496) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7f4a5bcfe440 (LWP 1987716))]
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=139957344658496) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=139957344658496) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=139957344658496, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007f4a5bd41476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007f4a5bd277f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007f49c8e8535a in __cxxabiv1::__terminate (handler=) at /opt/conda/conda-bld/gcc-compiler_1654084175708/work/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
#6 0x00007f49c8e853c5 in std::terminate () at /opt/conda/conda-bld/gcc-compiler_1654084175708/work/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
#7 0x00007f49c8e85658 in __cxxabiv1::__cxa_throw (obj=, tinfo=0x7f49c8fda220 , dest=0x7f49c8e9a110 std::runtime_error::~runtime_error())
at /opt/conda/conda-bld/gcc-compiler_1654084175708/work/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
#8 0x00007f49c9340f2a in ExtOpMasterLibrary::load(std::string const&) () from /root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so
#9 0x00007f49c933e0c2 in ExtOpMasterLibrary::ExtOpMasterLibrary(std::string const&) () from /root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so
#10 0x00007f49c933ce11 in (anonymous namespace)::getExtOpMasterLibrary() () from /root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so
#11 0x00007f49c934e944 in _GLOBAL__sub_I_hipblaslt_ext_op.cpp () from /root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so
#12 0x00007f4a5c02f47e in call_init (l=, argc=argc@entry=3, argv=argv@entry=0x7ffe73280ce8, env=env@entry=0x7ffe73280d08) at ./elf/dl-init.c:70
#13 0x00007f4a5c02f568 in call_init (env=0x7ffe73280d08, argv=0x7ffe73280ce8, argc=3, l=) at ./elf/dl-init.c:33
#14 _dl_init (main_map=0x2356010, argc=3, argv=0x7ffe73280ce8, env=0x7ffe73280d08) at ./elf/dl-init.c:117
#15 0x00007f4a5be73af5 in __GI__dl_catch_exception (exception=exception@entry=0x0, operate=operate@entry=0x7f4a5c036f40 <call_dl_init>, args=args@entry=0x7ffe7327c220) at ./elf/dl-error-skeleton.c:182
#16 0x00007f4a5c036ff6 in dl_open_worker (a=0x7ffe7327c3c0) at ./elf/dl-open.c:808
#17 dl_open_worker (a=a@entry=0x7ffe7327c3c0) at ./elf/dl-open.c:771
#18 0x00007f4a5be73a98 in __GI__dl_catch_exception (exception=exception@entry=0x7ffe7327c3a0, operate=operate@entry=0x7f4a5c036f60 <dl_open_worker>, args=args@entry=0x7ffe7327c3c0) at ./elf/dl-error-skeleton.c:208
#19 0x00007f4a5c03734e in _dl_open (file=, mode=-2147483646, caller_dlopen=0x5ca216 <_PyImport_FindSharedFuncptr+134>, nsid=-2, argc=3, argv=, env=0x7ffe73280d08) at ./elf/dl-open.c:883
#20 0x00007f4a5bd8f63c in dlopen_doit (a=a@entry=0x7ffe7327c630) at ./dlfcn/dlopen.c:56
#21 0x00007f4a5be73a98 in __GI__dl_catch_exception (exception=exception@entry=0x7ffe7327c590, operate=, args=) at ./elf/dl-error-skeleton.c:208
#22 0x00007f4a5be73b63 in __GI__dl_catch_error (objname=0x7ffe7327c5e8, errstring=0x7ffe7327c5f0, mallocedp=0x7ffe7327c5e7, operate=, args=) at ./elf/dl-error-skeleton.c:227
#23 0x00007f4a5bd8f12e in _dlerror_run (operate=operate@entry=0x7f4a5bd8f5e0 <dlopen_doit>, args=args@entry=0x7ffe7327c630) at ./dlfcn/dlerror.c:138
#24 0x00007f4a5bd8f6c8 in dlopen_implementation (dl_caller=, mode=, file=0x7f4a5b461050 "/root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so")
at ./dlfcn/dlopen.c:71
#25 ___dlopen (file=file@entry=0x7f4a5b461050 "/root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so", mode=) at ./dlfcn/dlopen.c:81
#26 0x00000000005ca216 in _PyImport_FindSharedFuncptr (prefix=0x61e4a5 "PyInit", shortname=0x7f4a5b4770e0 "_C",
pathname=0x7f4a5b461050 "/root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so", fp=0x0) at /usr/local/src/conda/python-3.10.13/Python/dynload_shlib.c:100
#27 0x00000000005c9b34 in _PyImport_LoadDynamicModuleWithSpec (fp=0x0, spec=0x7f4a5b477f10) at /usr/local/src/conda/python-3.10.13/Python/importdl.c:137
#28 _imp_create_dynamic_impl (module=, file=, spec=0x7f4a5b477f10) at /usr/local/src/conda/python-3.10.13/Python/import.c:2050
#29 _imp_create_dynamic (module=, args=args@entry=0x7f4a5b5aa4b8, nargs=) at /usr/local/src/conda/python-3.10.13/Python/clinic/import.c.h:330
#30 0x00000000004fccc4 in cfunction_vectorcall_FASTCALL (func=0x7f4a5b9698a0, args=0x7f4a5b5aa4b8, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/methodobject.c:430
#31 0x00000000004f2a14 in do_call_core (kwdict=0x7f4a5b483d40, callargs=0x7f4a5b5aa4a0, func=0x7f4a5b9698a0, trace_info=0x7ffe7327cab0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5917
#32 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b89e200, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4277
#33 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b89e200, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#34 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91d520, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#35 _PyFunction_Vectorcall (func=0x7f4a5b91d510, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#36 0x00000000004f1ac6 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b983648, callable=0x7f4a5b91d510, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#37 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b983648, callable=0x7f4a5b91d510) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#38 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327cc70, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#39 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b9834c0, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4181
#40 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b9834c0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#41 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b9a2570, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#42 _PyFunction_Vectorcall (func=0x7f4a5b9a2560, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#43 0x00000000004ed6d1 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b974838, callable=0x7f4a5b9a2560, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#44 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b974838, callable=0x7f4a5b9a2560) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#45 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327ce30, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#46 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b9746c0, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4198
#47 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b9746c0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#48 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91deb0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#49 _PyFunction_Vectorcall (func=0x7f4a5b91dea0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#50 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b4fa0c0, callable=0x7f4a5b91dea0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#51 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b4fa0c0, callable=0x7f4a5b91dea0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#52 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327cff0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#53 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b4f9f40, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#54 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b4f9f40, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#55 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91e0f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#56 _PyFunction_Vectorcall (func=0x7f4a5b91e0e0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#57 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b85b9f0, callable=0x7f4a5b91e0e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#58 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b85b9f0, callable=0x7f4a5b91e0e0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#59 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327d1b0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#60 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b85b840, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#61 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b85b840, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#62 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91f2f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#63 _PyFunction_Vectorcall (func=0x7f4a5b91f2e0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#64 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b70fdd0, callable=0x7f4a5b91f2e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#65 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b70fdd0, callable=0x7f4a5b91f2e0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#66 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327d370, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#67 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b70fc40, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#68 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b70fc40, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#69 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91f380, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#70 _PyFunction_Vectorcall (func=0x7f4a5b91f370, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#71 0x00000000004fc2a4 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7ffe7327d4b0, callable=0x7f4a5b91f370, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#72 object_vacall (tstate=0x21d0f40, base=0x7ffe7327d4b0, callable=0x7f4a5b91f370, vargs=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:734
#73 0x000000000050aa77 in _PyObject_CallMethodIdObjArgs (obj=0x7ffe7327d650, name=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:825
#74 0x0000000000509dd5 in import_find_and_load (abs_name=0x7f4a5b7605f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/import.c:1522
#75 PyImport_ImportModuleLevelObject (name=0x7f4a5b7605f0, globals=, locals=, fromlist=0x7f4a5b6b8f40, level=0) at /usr/local/src/conda/python-3.10.13/Python/import.c:1623
#76 0x00000000004f04df in import_name (level=0x7f4a5b8bc0d0, fromlist=0x7f4a5b6b8f40, name=0x7f4a5b7605f0, f=0x22ac460, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:6018
#77 _PyEval_EvalFrameDefault (tstate=, f=0x22ac460, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:3695
#78 0x0000000000591d92 in _PyEval_EvalFrame (throwflag=0, f=0x22ac460, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#79 _PyEval_Vector (tstate=0x21d0f40, con=, locals=, args=, argcount=, kwnames=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#80 0x0000000000591cd7 in PyEval_EvalCode (co=co@entry=0x7f4a5b75c500, globals=globals@entry=0x7f4a5b853640, locals=locals@entry=0x7f4a5b853640) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:1134
#81 0x00000000005991cd in builtin_exec_impl (module=, locals=0x7f4a5b853640, globals=0x7f4a5b853640, source=0x7f4a5b75c500) at /usr/local/src/conda/python-3.10.13/Python/bltinmodule.c:1058
#82 builtin_exec (module=, args=args@entry=0x7f4a5b760e98, nargs=) at /usr/local/src/conda/python-3.10.13/Python/clinic/bltinmodule.c.h:371
#83 0x00000000004fccc4 in cfunction_vectorcall_FASTCALL (func=0x7f4a5b954e00, args=0x7f4a5b760e98, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/methodobject.c:430
#84 0x00000000004f2a14 in do_call_core (kwdict=0x7f4a5b740940, callargs=0x7f4a5b760e80, func=0x7f4a5b954e00, trace_info=0x7ffe7327da70, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5917
#85 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b89dcf0, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4277
#86 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b89dcf0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#87 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91d520, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#88 _PyFunction_Vectorcall (func=0x7f4a5b91d510, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#89 0x00000000004f1ac6 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b882148, callable=0x7f4a5b91d510, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#90 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b882148, callable=0x7f4a5b91d510) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#91 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327dc30, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#92 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b881fc0, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4181
#93 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b881fc0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#94 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b9a1640, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#95 _PyFunction_Vectorcall (func=0x7f4a5b9a1630, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#96 0x00000000004ed6d1 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b97b9f0, callable=0x7f4a5b9a1630, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#97 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b97b9f0, callable=0x7f4a5b9a1630) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#98 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327ddf0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#99 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b97b870, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4198
#100 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b97b870, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#101 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91e0f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#102 _PyFunction_Vectorcall (func=0x7f4a5b91e0e0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#103 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b8599f0, callable=0x7f4a5b91e0e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#104 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b8599f0, callable=0x7f4a5b91e0e0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#105 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327dfb0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#106 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b859840, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#107 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b859840, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#108 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91f2f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#109 _PyFunction_Vectorcall (func=0x7f4a5b91f2e0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#110 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b70c770, callable=0x7f4a5b91f2e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#111 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b70c770, callable=0x7f4a5b91f2e0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#112 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327e170, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#113 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b70c5e0, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#114 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b70c5e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#115 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91f380, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#116 _PyFunction_Vectorcall (func=0x7f4a5b91f370, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#117 0x00000000004fc2a4 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7ffe7327e2b0, callable=0x7f4a5b91f370, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#118 object_vacall (tstate=0x21d0f40, base=0x7ffe7327e2b0, callable=0x7f4a5b91f370, vargs=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:734
#119 0x000000000050aa77 in _PyObject_CallMethodIdObjArgs (obj=0x7ffe7327e450, name=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:825
#120 0x0000000000509dd5 in import_find_and_load (abs_name=0x7f4a5b84b4f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/import.c:1522
#121 PyImport_ImportModuleLevelObject (name=0x7f4a5b84b4f0, globals=, locals=, fromlist=0x743720 <_Py_NoneStruct>, level=0) at /usr/local/src/conda/python-3.10.13/Python/import.c:1623
#122 0x00000000004f04df in import_name (level=0x7f4a5b8bc0d0, fromlist=0x743720 <_Py_NoneStruct>, name=0x7f4a5b84b4f0, f=0x7f4a5b902c20, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:6018
#123 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b902c20, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:3695
#124 0x0000000000591d92 in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b902c20, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#125 _PyEval_Vector (tstate=0x21d0f40, con=, locals=, args=, argcount=, kwnames=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#126 0x0000000000591cd7 in PyEval_EvalCode (co=co@entry=0x7f4a5b83b5d0, globals=globals@entry=0x7f4a5b84ad00, locals=locals@entry=0x7f4a5b84ad00) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:1134
#127 0x00000000005991cd in builtin_exec_impl (module=, locals=0x7f4a5b84ad00, globals=0x7f4a5b84ad00, source=0x7f4a5b83b5d0) at /usr/local/src/conda/python-3.10.13/Python/bltinmodule.c:1058
#128 builtin_exec (module=, args=args@entry=0x7f4a5b850e58, nargs=) at /usr/local/src/conda/python-3.10.13/Python/clinic/bltinmodule.c.h:371
#129 0x00000000004fccc4 in cfunction_vectorcall_FASTCALL (func=0x7f4a5b954e00, args=0x7f4a5b850e58, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/methodobject.c:430
#130 0x00000000004f2a14 in do_call_core (kwdict=0x7f4a5b84b400, callargs=0x7f4a5b850e40, func=0x7f4a5b954e00, trace_info=0x7ffe7327e870, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5917
#131 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b978fc0, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4277
#132 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b978fc0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#133 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91d520, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#134 _PyFunction_Vectorcall (func=0x7f4a5b91d510, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#135 0x00000000004f1ac6 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b9816c8, callable=0x7f4a5b91d510, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#136 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b9816c8, callable=0x7f4a5b91d510) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#137 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327ea30, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#138 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b981540, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4181
#139 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b981540, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#140 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b9a1640, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#141 _PyFunction_Vectorcall (func=0x7f4a5b9a1630, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#142 0x00000000004ed6d1 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b979af0, callable=0x7f4a5b9a1630, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#143 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b979af0, callable=0x7f4a5b9a1630) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#144 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327ebf0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#145 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b979970, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4198
#146 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b979970, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#147 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91e0f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#148 _PyFunction_Vectorcall (func=0x7f4a5b91e0e0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#149 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b8e75f0, callable=0x7f4a5b91e0e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#150 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b8e75f0, callable=0x7f4a5b91e0e0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#151 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327edb0, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#152 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b8e7440, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#153 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b8e7440, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#154 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x7f4a5b91f2f0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#155 _PyFunction_Vectorcall (func=0x7f4a5b91f2e0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.13/Objects/call.c:342
#156 0x00000000004ed2bf in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x7f4a5b8eb0b0, callable=0x7f4a5b91f2e0, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:114
#157 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7f4a5b8eb0b0, callable=0x7f4a5b91f2e0) at /usr/local/src/conda/python-3.10.13/Include/cpython/abstract.h:123
#158 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x7ffe7327ef70, tstate=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5893
#159 _PyEval_EvalFrameDefault (tstate=, f=0x7f4a5b8eaf20, throwflag=) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:4213
#160 0x00000000004fcadf in _PyEval_EvalFrame (throwflag=0, f=0x7f4a5b8eaf20, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Include/internal/pycore_ceval.h:46
#161 _PyEval_Vector (kwnames=optimized out, argcount=optimized out, args=optimized out, locals=0x0, con=0x7f4a5b91f380, tstate=0x21d0f40) at /usr/local/src/conda/python-3.10.13/Python/ceval.c:5067
#162 _PyFunction_Vectorcall (func=0x7f4a5b91f370, stack=, nargsf=
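
The backtrace above ends in SIGABRT: std::terminate is called after an exception thrown from ExtOpMasterLibrary::load while libhipblaslt.so is being initialized during the load of torch/_C. An abort like this kills the interpreter outright, so a Python-level try/except cannot catch it. A minimal sketch of one way to reproduce it safely is to run the import in a child interpreter and report how the child exited. The helper name probe_import and the 120-second timeout are illustrative assumptions, not part of bitsandbytes, PyTorch, or ROCm, and the signal decoding assumes a POSIX system.

```python
import signal
import subprocess
import sys


def probe_import(module: str, timeout: float = 120.0) -> str:
    """Import `module` in a child interpreter and describe how it exited.

    Hypothetical helper for illustration only. Running the import in a
    subprocess keeps a native abort from taking down the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # that signal number (6 corresponds to SIGABRT).
        return f"killed by {signal.Signals(-proc.returncode).name}"
    # A positive code is an ordinary exit, e.g. an uncaught ImportError.
    return f"exited with status {proc.returncode}"


if __name__ == "__main__":
    # Probe torch first: the backtrace shows the abort happening while
    # torch/_C is loaded, before any bitsandbytes code runs.
    for name in ("torch", "bitsandbytes"):
        print(f"{name}: {probe_import(name)}")
```

If torch on its own is reported as killed by a signal, the problem lies in the PyTorch installation rather than in bitsandbytes.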

Internal ticket has been created for investigation.

It seems that Python cannot load the libhipblaslt.so library. Can you check whether this file exists?
/root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so

Hi @seungduk-yanolja, can you please check whether the file /root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so exists? Thanks.

Hi @seungduk-yanolja, please check whether the file /root/miniconda3/envs/axo/lib/python3.10/site-packages/torch/lib/libhipblaslt.so exists. Thanks.
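
Since importing torch is what aborts, the check requested above has to be done without importing it. A minimal sketch, assuming the library sits in the lib directory of the installed torch package as the path in the backtrace suggests: importlib.util.find_spec locates a top-level package on disk without executing it. The function name find_bundled_lib is a hypothetical helper, not part of any of the libraries discussed here.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_bundled_lib(package: str = "torch",
                     lib_name: str = "libhipblaslt.so") -> Optional[Path]:
    """Return the path of `lib_name` under <package>/lib, or None if absent.

    Hypothetical helper for illustration only. find_spec locates a
    top-level package without running its __init__, so the abort seen
    during `import torch` is not triggered.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        # Package is not installed, or it is not a regular package.
        return None
    for root in spec.submodule_search_locations:
        candidate = Path(root) / "lib" / lib_name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_bundled_lib()
    if path is None:
        print("libhipblaslt.so not found under the torch package's lib directory")
    else:
        print(f"found {path} ({path.stat().st_size} bytes)")
```

Note that in the backtrace the exception is thrown from inside libhipblaslt.so itself, so the file being present does not rule out a problem with something that library tries to load in turn.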

Hi @nartmada, I can no longer access the machine, since it was a short-term PoC. I may get a chance to access the MI300X machine again this month. When access is available, I will check and let you know.
Thank you!

Thanks @seungduk-yanolja. Please let us know when you have access to the machine again.

Hi @seungduk-yanolja, I hope you have had a chance to access the MI300X machine again. Please let us know if you have any updates. Thanks.

Hi @nartmada,

Thanks for checking in. Yes, I did, and this time I had no issues with the bitsandbytes module.
However, I ran into a different issue with flash attention 2:
ROCm/flash-attention#40 (comment)

I used the following resources when testing on the MI300X:
https://erichartford.com/from-zero-to-fineturning-with-axolotl-on-rocm
https://docs.vllm.ai/en/latest/getting_started/amd-installation.html

Thanks,
Seungduk

Oh, I remember there was something wrong with my procedure. I will try setting it up again from scratch and post an update.

@ehartford, which issue did you encounter? Thanks.