neurocard / neurocard

State-of-the-art neural cardinality estimators for join queries

date format error.

xchuwenbo opened this issue · comments

Hi there,

I got this strange error. Possibly some date format error happened here and it crashed. Moreover, it seems that I can run normally on another machine without a GPU. Is there anything wrong with my GPU or my machine's time settings? Thanks if you could help. The logs are attached below.

Wenbo

(pid=3795897) 2021-10-09 16:24:38,223 WARNING trainable.py:640 -- Trainable._train is deprecated and will be removed in a future version of Ray. Override Trainable.step instead.
(pid=3795897) wandb: WARNING Can't convert tensor summary, upgrade tensorboard with pip install tensorboard --upgrade
(pid=3795897) 2021-10-09 16:24:38,239 WARNING trainable.py:797 -- Trainable._log_result is deprecated and will be removed in a future version of Ray. Override Trainable.log_result instead.
(pid=3795897) 2021-10-09 16:24:38,250 WARNING trainable.py:691 -- Trainable._save is deprecated and will be removed in a future version of Ray. Override Trainable.save_checkpoint instead.
(pid=3795897) Test bits: 72.52929152758935
(pid=3795897)
(pid=3795897) wandb: Waiting for W&B process to finish, PID 3796955
(pid=3795897) wandb: Program ended successfully.
*** Error in `python run.py --run test-job-light': double free or corruption (!prev): 0x000055888dab9990 ***
*** Aborted at 1633767878 (unix time) try "date -d @1633767878" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGABRT (@0x3ea0039ea80) received by PID 3795584 (TID 0x7f2533f2a700) from PID 3795584; stack trace: ***
@ 0x7f2533b15890 (unknown)
@ 0x7f2533790067 gsignal
@ 0x7f2533791448 abort
@ 0x7f25337ce1b4 (unknown)
@ 0x7f25337d398e (unknown)
@ 0x7f25337d4696 (unknown)
@ 0x7f2533d34958 _dl_deallocate_tls
@ 0x7f2533b0d107 __free_stacks
@ 0x7f2533b0d21f __deallocate_stack
@ 0x7f2533b0f4d4 pthread_join
@ 0x7f2311e2d87f std::thread::join()
@ 0x7f252f977ab4 ray::stats::MetricsAgentExporter::~MetricsAgentExporter()
@ 0x7f252f5d09d9 std::_Sp_counted_base<>::_M_release()
@ 0x7f252f66a1bf ray::CoreWorkerProcess::~CoreWorkerProcess()
@ 0x7f252f66eef5 ray::CoreWorkerProcess::Shutdown()
@ 0x7f252f57bc7f __pyx_tp_dealloc_3ray_7_raylet_CoreWorker()
@ 0x55888bf35eeb _PyDict_DelItem_KnownHash
@ 0x55888bf3647c _PyObjectDict_SetItem
@ 0x55888bf8fcd7 _PyObject_GenericSetAttrWithDict
@ 0x55888bf909d7 PyObject_SetAttr
@ 0x55888bf9ad1e _PyEval_EvalFrameDefault
@ 0x55888bee12b9 _PyEval_EvalCodeWithName
@ 0x55888bee23e5 _PyFunction_FastCallDict
@ 0x55888bff1ef4 atexit_callfuncs
@ 0x55888bff0957 Py_FinalizeEx
@ 0x55888c0033d3 pymain_main
@ 0x55888c0036fc _Py_UnixMain
@ 0x7f253377cb45 __libc_start_main
@ 0x55888bfa83c0 (unknown)

@xchuwenbo What is your Ray version? We tested this repo with environment.yml, which pins older package versions.
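
If it helps, here is a minimal sketch for checking the installed Ray version against the one pinned in environment.yml (it only assumes Ray is importable in the active environment; the conda commands in the comments are the usual workflow, and the environment name is whatever environment.yml defines):

    # Print the Ray version in the current environment, to compare with the
    # version pinned in the repo's environment.yml.
    import ray

    print("ray:", ray.__version__)

    # To reproduce the tested environment instead, the usual conda workflow is:
    #   conda env create -f environment.yml
    #   conda activate <env-name>   # <env-name> as defined in environment.yml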

Ray version: 0.8.7
I suspect that there's something wrong with my CUDA and PyTorch versions... I am still checking.

I fixed this issue after reinstalling a cudatoolkit version consistent with my PyTorch build.
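
For anyone hitting the same crash, a sketch of the kind of check that surfaces the mismatch (the cudatoolkit version in the comment is only an example; match it to whatever your PyTorch wheel was built against):

    # Verify which CUDA toolkit PyTorch was compiled against and whether the GPU
    # is usable; a mismatch with the installed cudatoolkit is one common cause
    # of crashes like this at interpreter exit.
    import torch

    print("torch:", torch.__version__)
    print("compiled against CUDA:", torch.version.cuda)  # e.g. '10.2'
    print("GPU usable:", torch.cuda.is_available())

    # If this disagrees with the installed cudatoolkit, reinstall a matching one, e.g.:
    #   conda install pytorch cudatoolkit=10.2 -c pytorch   # version is illustrative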