UDST / choicemodels

Python library for discrete choice modeling

Home Page: https://udst.github.io/choicemodels

TypeError when fitting a MultinomialLogit model

smmaurer opened this issue

Arezoo (@Arezoo-bz) is getting an unusual error trying to fit MultinomialLogit models. The same notebook runs on other machines without a problem. We'll update this issue when we find a solution or work-around.

[Screenshot: 2017-07-26 at 10:56:09]

[Screenshot: 2017-07-26 at 10:56:19]

Is there a NaN somewhere that's being cast to int?

@gboeing No NaN in the dataset.
@smmaurer I updated Anaconda and all the other requirements, but I still get the same error. I also removed the %%time magic from my code, but that didn't change anything. Weird problem!

I just replicated the error on my iMac. Conda is up to date, and I did a fresh install of pylogit and choicemodels and updated urbansim. NumPy is 1.12.1, pandas is 0.20.1, and Python is 3.5.3. I followed the install instructions in choicemodels.

Interesting. How much of the destination choice notebook will run before triggering this error? Here's a shortcut to download the data files.

I'll try to replicate the error in a virtual environment so I can troubleshoot things.

It runs all the way to the estimation cell without error.

And it fails even if I drop all but one variable, regardless of which one remains.

Ok, I tracked down the cause and this should be fixed in the latest PR (#13). Full explanation below for the curious!

@Arezoo-bz, when you have a chance, do a git pull on choicemodels, restart the Jupyter kernel, and try running the code again. Let me know if it works.


The culprit was this division operator: https://github.com/UDST/choicemodels/pull/13/files

In Python 3, division returns a float even if both operands are integers (see PEP 238).
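A quick way to see the difference, as a minimal sketch (the variable names and values are hypothetical and not taken from choicemodels):

```python
# Hypothetical counts, chosen so the division is exact.
n_rows, n_alts = 12, 4

true_div = n_rows / n_alts    # "/" is true division in Python 3
floor_div = n_rows // n_alts  # "//" is floor (integer) division

print(type(true_div).__name__, true_div)    # float 3.0
print(type(floor_div).__name__, floor_div)  # int 3
```

Even though 12 divides evenly by 4, the result of "/" is the float 3.0 rather than the int 3.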

NumPy used to silently accept floats as indexes if they were "close" in value to ints, but as of v1.12 it raises a TypeError (release notes).

So the combination of these two things caused a crash on machines running Python 3 and NumPy 1.12 or later. It's now fixed by using the integer division operator (//).
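For anyone who wants to reproduce the failure mode outside the notebook, here is a minimal sketch. It assumes Python 3 with NumPy 1.12 or later, and it uses reshape as one example of a call that expects integer arguments. The array and names are made up for illustration; this is not the choicemodels code, which is in the PR linked above.

```python
import numpy as np

data = np.arange(12)  # hypothetical flat array of 12 values
n_alts = 4            # hypothetical number of alternatives per observation

# True division gives the float 3.0, which NumPy rejects as a dimension.
try:
    data.reshape(len(data) / n_alts, n_alts)
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)

# Floor division gives the int 3, so the reshape succeeds.
reshaped = data.reshape(len(data) // n_alts, n_alts)
print(reshaped.shape)
```

On Python 2 the first call would have worked, because "/" on two ints already returned an int there.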

Confirmed, this fixed the problem. Nice detective work!