TypeError when fitting a MultinomialLogit model

Question

smmaurer opened this issue 7 years ago

Arezoo (@Arezoo-bz) is getting an unusual error trying to fit MultinomialLogit models. The same notebook runs on other machines without a problem. We'll update this issue when we find a solution or work-around.

Geoff Boeing · Answer 1 · Sat Jul 29 2017 08:38:38 GMT+0800 (China Standard Time)

Is there a NaN somewhere that's being cast to int?

arezoo besharati · Answer 2 · Sun Jul 30 2017 05:04:42 GMT+0800 (China Standard Time)

@gboeing No NaN in the dataset.
@smmaurer I updated Anaconda and all the other requirements, but I still get the same error. I also deleted the %%time from my code, but it didn't change anything. Weird problem!

Paul Waddell · Answer 3 · Sun Jul 30 2017 07:03:02 GMT+0800 (China Standard Time)

I just replicated the error on my iMac. Conda is up to date, and I did a fresh install of pylogit and choicemodels and updated urbansim. NumPy is 1.12.1.

Paul Waddell · Answer 4 · Sun Jul 30 2017 07:17:21 GMT+0800 (China Standard Time)

pandas is 0.20.1, and Python is 3.5.3. Followed the install instructions in choicemodels.

Sam Maurer · Answer 5 · Sun Jul 30 2017 07:25:20 GMT+0800 (China Standard Time)

Interesting. How much of the destination choice notebook will run before triggering this error? Here's a shortcut to download the data files.

I'll try to replicate the error in a virtual environment so I can troubleshoot things.

Paul Waddell · Answer 6 · Sun Jul 30 2017 13:03:46 GMT+0800 (China Standard Time)

It runs all the way to the estimation cell without error.

Paul Waddell · Answer 7 · Sun Jul 30 2017 13:04:28 GMT+0800 (China Standard Time)

And it fails even if I drop all but one variable, regardless of which one remains.

Sam Maurer · Answer 8 · Tue Aug 01 2017 09:50:58 GMT+0800 (China Standard Time)

Ok, I tracked down the cause and this should be fixed in the latest PR (#13). Full explanation below for the curious!

@Arezoo-bz, when you have a chance, do a git pull on choicemodels, restart the Jupyter kernel, and try running the code again. Let me know if it works.

The culprit was this division operator: https://github.com/UDST/choicemodels/pull/13/files

In Python 3, division returns a float even if both operands are integers (see PEP 238).

NumPy used to silently accept floats as indexes if they were "close" in value to ints, but as of v1.12 it raises a TypeError (release notes).

So the combination of these two things caused a crash on machines running Python 3 with NumPy 1.12 or later. It's now fixed by using the integer (floor) division operator, //, instead of /.
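
As a rough illustration of the mechanism described above, here is a minimal sketch. It is not the actual choicemodels code: the variable names (n_rows, n_alts) and the reshape call are assumptions chosen only to show how a float produced by / gets rejected where NumPy expects an integer, while // keeps the value an int.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_rows = 6
n_alts = 2

# In Python 3, "/" is true division and always returns a float,
# even when both operands are ints (PEP 238).
float_count = n_rows / n_alts

# "//" is floor division and returns an int for int operands.
int_count = n_rows // n_alts

values = np.arange(n_rows)


def try_reshape(count):
    """Reshape `values` to (count, n_alts); return the exception on failure."""
    try:
        return values.reshape(count, n_alts)
    except TypeError as exc:
        # NumPy 1.12 and later refuse a float where an integer size is required.
        return exc


print(type(float_count).__name__, type(int_count).__name__)
print(repr(try_reshape(float_count)))
print(try_reshape(int_count).shape)
```

On Python 3 with NumPy 1.12 or later, the first reshape should come back as a TypeError and the second should succeed, which matches the pattern in this issue: the same notebook ran on some machines and crashed on others depending on their Python and NumPy versions.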

Paul Waddell · Answer 9 · Tue Aug 01 2017 11:01:35 GMT+0800 (China Standard Time)

Confirmed, this fixed the problem. Nice detective work!