ycjuan / libffm

A Library for Field-aware Factorization Machines

nan predictions

eloiup opened this issue · comments

Using the Python wrapper (libffm-python).

For some reason, when the input dataset has too many "fields" (about 29 or more), the predictions are all NaN. This holds at least for the first iterations; I haven't checked whether it eventually changes after N iterations.

*edit: a few samples of data, even a one-row dataframe, show the same issue, so it appears to be related to the number of "fields".

*edit2: tested, it doesn't converge after N iterations.

Having the same issue. I'm only using 1 field (i.e. the same as a regular FM), and the training data has around 700k rows.

commented

Not sure this is your case, but still...
This could be the result of a division: https://github.com/guestwalk/libffm/blob/26c13b22ae7eb829b8ed6ac191d890c25dbc5733/ffm.cpp#L598
That division can sometimes be inf/inf because of the exponentiation a few lines earlier:
https://github.com/guestwalk/libffm/blob/26c13b22ae7eb829b8ed6ac191d890c25dbc5733/ffm.cpp#L592
The NaN then propagates to the rest of the coefficients through the interactions.

If that's the case, it can be fixed easily by clipping t to some range (as is done in VW, for instance).
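
To illustrate the idea, here is a minimal Python sketch, not the actual libffm code. It assumes the gradient scale has the logistic-loss form -y * exp(-y*t) / (1 + exp(-y*t)), which matches the exponentiation followed by a division described above. The names `exp_c`, `kappa_naive`, `kappa_clipped` and the bound `CLIP = 50.0` are made up for this example; VW's own bound may differ.

```python
import math

# Hypothetical clipping bound, chosen only for illustration.
CLIP = 50.0


def exp_c(x):
    """Mimic C's exp(): return inf on overflow instead of raising."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def kappa_naive(y, t):
    """Gradient scale without clipping (assumed form, see lead-in)."""
    e = exp_c(-y * t)
    # When e overflows to inf this is inf / inf, which is nan.
    return -y * e / (1.0 + e)


def kappa_clipped(y, t, bound=CLIP):
    """Same computation, but with t clipped to [-bound, bound] first."""
    t = max(-bound, min(bound, t))
    e = exp_c(-y * t)
    return -y * e / (1.0 + e)


if __name__ == "__main__":
    # A very wrong, very confident prediction: label +1, raw score -1000.
    print(kappa_naive(1.0, -1000.0))    # nan
    print(kappa_clipped(1.0, -1000.0))  # finite, close to -1
```

With clipping, the exponent can never overflow, so the division stays finite and no NaN is fed back into the weights.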