ycjuan / libffm

A Library for Field-aware Factorization Machines

Segmentation fault

skirpichenko opened this issue · comments

Hello,

Thank you for your excellent method, software and description.

I ran into a problem while trying to use libffm in my ML task: I get a segmentation fault when running it with the cross-validation option. Here are my setup and data:
Ubuntu 13.10
~/libffm$ ./ffm-train -k 5 -t 30 -r 0.03 -v 2 data.txt
fold logloss
0 0.1080
Segmentation fault (core dumped)

The data.txt can be downloaded here https://drive.google.com/open?id=0B9HyQ7ZccW4-VFE0VWtxUHF2R3c

The problem arises only with large data files like this one (around 250K lines). If you cut it down to 100K lines, everything works.

Regards,
Sergey

Thanks for reporting the issue. Will fix the bug soon.

I ran into the same problem. The input file was generated by "svm-scale" from libsvm (so I believe the format is correct). It has 30k lines and at most 6k columns.

The OS I used was Debian wheezy with g++ 4.7.

Thanks.

Thanks for the bug report! I have fixed it. Sorry for the delay; I had actually forgotten about this issue.

Thanks!

Hi @guestwalk

I just downloaded your repository and ran my file (svmlight format) through ffm-train, and I get the same error. Could you please shed some light on how the issue can be resolved?

Thanks,
Bishwarup

Have you converted it to FFM format? If not, please see the README. Thanks.
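
For anyone landing here with the same question, a rough sketch of such a conversion follows. It assumes the README's layout of `label field:feature:value` per token, and the helper name `svmlight_to_ffm` and its `field_of` argument are made up for illustration, not part of libffm. Since svmlight data carries no field information, the default here treats every feature index as its own field, which is only an assumption; pass your own mapping if your features group into real fields.

```python
def svmlight_to_ffm(line, field_of=None):
    """Convert one svmlight line 'label idx:val ...' to 'label field:idx:val ...'.

    field_of is a hypothetical hook mapping a feature index (int) to a field id.
    When omitted, each feature index is used as its own field (an assumption).
    """
    # Drop a trailing svmlight comment, if any.
    line = line.split("#", 1)[0]
    tokens = line.split()
    if not tokens:
        return ""
    label, feats = tokens[0], tokens[1:]
    out = [label]
    for tok in feats:
        # svmlight ranking files may carry a qid token; it has no FFM equivalent.
        if tok.startswith("qid:"):
            continue
        idx, val = tok.split(":", 1)
        field = field_of(int(idx)) if field_of else int(idx)
        out.append(f"{field}:{idx}:{val}")
    return " ".join(out)


if __name__ == "__main__":
    print(svmlight_to_ffm("1 3:0.5 7:1"))
    print(svmlight_to_ffm("0 2:0.25 9:1 # note", field_of=lambda i: 0))
```

Putting every feature in field 0 (the second call) is the other obvious choice when no field structure is known; which one suits a given data set is something to try rather than assume.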

Thanks for your reply, @guestwalk. Following your suggestion, I was able to resolve the issue I reported. However, my tr_logloss and va_logloss now come out as 'nan' for all iterations. Could you please give me some idea of the possible causes?

Thanks in advance,
Bishwarup

It's hard to say. NaN does happen with some data sets (for example, NLP data sets). If you can mail me a tiny data set that leads to NaN, it would be easier to investigate. Thanks.
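
Before cutting the data down, it may be worth ruling out problems in the file itself. The sketch below is not from libffm and does not claim to identify the cause of the NaN; it only scans lines in `label field:feature:value` form and reports which ones contain malformed or non-finite values, together with the largest absolute value seen. The function name `scan_ffm_values` is hypothetical.

```python
import math


def scan_ffm_values(lines):
    """Return (bad_lines, max_abs) for an iterable of FFM-format lines.

    bad_lines holds the 1-based numbers of lines with a token that is not
    'field:feature:value' or whose value is not a finite number.
    max_abs is the largest finite |value| seen across all lines.
    """
    bad_lines = []
    max_abs = 0.0
    for number, line in enumerate(lines, start=1):
        # The first token is the label; the rest are feature tokens.
        for token in line.split()[1:]:
            parts = token.split(":")
            try:
                if len(parts) != 3:
                    raise ValueError(token)
                value = float(parts[2])
            except ValueError:
                bad_lines.append(number)
                break
            if not math.isfinite(value):
                bad_lines.append(number)
                break
            max_abs = max(max_abs, abs(value))
    return bad_lines, max_abs


if __name__ == "__main__":
    sample = ["1 0:1:0.5 1:2:3.0", "0 0:1:nan", "1 0:1", "0 1:4:-7.5"]
    print(scan_ffm_values(sample))
```

If the file is clean, the flagged-line list gives nothing to cut, but the maximum value at least shows whether the features are on the scale you expect.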