Segmentation fault

skirpichenko opened this issue 9 years ago

Hello,

Thank you for your excellent method, software and description.

I ran into a problem while using libffm in my ML task: I get a segmentation fault when I run it with the cross-validation option (-v). Here are my setup and data:
Ubuntu 13.10
~/libffm$ ./ffm-train -k 5 -t 30 -r 0.03 -v 2 data.txt
fold logloss
0 0.1080
Segmentation fault (core dumped)

The data.txt can be downloaded here https://drive.google.com/open?id=0B9HyQ7ZccW4-VFE0VWtxUHF2R3c

The problem arises only with large data files like this one, which is around 250K lines. If you cut it down to 100K lines, everything works.

Regards,
Sergey

Sergey commented 8 years ago

Thanks!

ycjuan · Answer 1 · Thu Jul 16 2015 01:01:02 GMT+0800 (China Standard Time)

Thanks for reporting the issue. Will fix the bug soon.

maybeluo · Answer 2 · Mon Mar 14 2016 14:54:25 GMT+0800 (China Standard Time)

I ran into the same problem. The input file was generated by "svm-scale" from libsvm, so I believe the format is correct. It has 30k lines and at most 6k columns.

The OS I used was Debian wheezy with g++ 4.7.

Thanks.

ycjuan · Answer 3 · Mon Mar 14 2016 23:18:24 GMT+0800 (China Standard Time)

Thanks for the bug report! I have fixed it. Sorry for the delay; I had actually forgotten about this issue.

Bishwarup Bhattacharjee · Answer 4 · Wed Mar 16 2016 06:55:09 GMT+0800 (China Standard Time)

Hi @guestwalk

I just downloaded your repository and tried to run my file (svmlight format) through ffm-train, and I get the same error. Could you please shed some light on how the issue can be resolved?

Thanks,
Bishwarup

ycjuan · Answer 5 · Wed Mar 16 2016 08:14:42 GMT+0800 (China Standard Time)

Have you converted it to FFM format? If not, please see the README. Thanks.
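
For readers hitting the same error with svmlight/libsvm input, here is a minimal sketch of such a conversion. It assumes the FFM text format is `label field:index:value ...` and that the input lines look like `label index:value ...`. The helper name `libsvm_to_ffm_line` and the `field_of` callback are hypothetical, not part of libffm. Putting every feature into field 0 by default is only a placeholder; a meaningful field assignment depends on your data.

```python
def libsvm_to_ffm_line(line, field_of=None):
    """Convert one 'label idx:val ...' line to 'label field:idx:val ...'.

    field_of is a hypothetical callback mapping a feature index to a
    field id. If it is None, every feature goes into field 0 (an
    assumption made only so the sketch runs; choose real fields yourself).
    """
    # Drop a trailing svmlight-style comment, if any.
    parts = line.split("#", 1)[0].split()
    if not parts:
        return ""
    out = [parts[0]]  # the label is copied unchanged
    for tok in parts[1:]:
        idx, val = tok.split(":", 1)
        field = field_of(int(idx)) if field_of else 0
        out.append(f"{field}:{idx}:{val}")
    return " ".join(out)


def convert_file(src_path, dst_path, field_of=None):
    """Convert a whole file line by line, skipping blank lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            converted = libsvm_to_ffm_line(line, field_of)
            if converted:
                dst.write(converted + "\n")
```

For example, `libsvm_to_ffm_line("1 3:0.5 7:1")` gives `1 0:3:0.5 0:7:1`. The README remains the authoritative description of the format.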

Bishwarup Bhattacharjee · Answer 6 · Thu Mar 17 2016 00:58:55 GMT+0800 (China Standard Time)

Thanks for your reply @guestwalk. Following your suggestion, I was able to resolve the issue I reported. However, my tr_logloss and va_logloss now come out as 'nan' for all iterations. Could you please give me some idea of the possible causes?

Thanks in advance,
Bishwarup

ycjuan · Answer 7 · Sat Mar 19 2016 05:04:59 GMT+0800 (China Standard Time)

It's hard to say. NaN does sometimes happen with certain data sets (NLP data sets, for example). If you can mail me a tiny data set that leads to NaN, it would be easier to investigate. Thanks.
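
Before mailing a sample, it may be worth ruling out malformed input. The sketch below checks each line of a file against the `label field:index:value ...` layout and flags values that are not finite numbers. The function names are hypothetical and not part of libffm, and a clean report does not explain or rule out NaN during training; it only removes one possible source of confusion.

```python
import math


def check_ffm_line(line):
    """Return a list of problems found in one FFM-format line."""
    parts = line.split()
    if not parts:
        return ["empty line"]
    problems = []
    try:
        if not math.isfinite(float(parts[0])):
            problems.append(f"non-finite label: {parts[0]!r}")
    except ValueError:
        problems.append(f"bad label: {parts[0]!r}")
    for tok in parts[1:]:
        pieces = tok.split(":")
        if len(pieces) != 3:
            problems.append(f"expected field:index:value, got {tok!r}")
            continue
        field, index, value = pieces
        if not (field.isdigit() and index.isdigit()):
            problems.append(f"non-integer field or index in {tok!r}")
        try:
            if not math.isfinite(float(value)):
                problems.append(f"non-finite value in {tok!r}")
        except ValueError:
            problems.append(f"bad value in {tok!r}")
    return problems


def check_ffm_file(path):
    """Yield (line_number, problems) for every line that has problems."""
    with open(path) as fh:
        for lineno, line in enumerate(fh, start=1):
            problems = check_ffm_line(line)
            if problems:
                yield lineno, problems
```

A line still in libsvm format, such as `1 3:0.5`, is reported as not matching `field:index:value`, which is the mistake discussed earlier in this thread.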