ValueError raised without a dimension mismatch.
baharcos opened this issue · comments
I am getting ValueError: dependent and exog must have the same number of observations. from PanelOLS, raised by line 414 in the source code: if y.shape[0] != x.shape[0]:
However, when I run the same comparison myself I get False, so y.shape[0] == x.shape[0].
Can you post some example code that will produce the problem? Any chance that your x data has missing values somewhere in a row?
Sorry, the problem was the missing indexes, which took me a while to figure out from the error message.
I think stating the expected data type and format of the arguments more explicitly in the PanelOLS documentation could improve the user experience. At least it would have for me!
@bashtage I have a similar problem with my own data:
import pandas as pd
from linearmodels.panel.model import PanelOLS
from patsy import dmatrices
df = pd.DataFrame({'HUD_TOTAL_UNITS': {0: 287,
1: 309, 2: 106,3: 48,4: 133,5: 2767,6: 354,7: 78,8: 1063,9: 87},
'ACS_SHARE_STUDENT': {0: 0.1667319663924078,1: 0.17409332238503,2: 0.1531424340974591,3: 0.140645770849126,4: 0.1874776433804533,
5: 0.1661870742518456,6: 0.1171084956864537,7: 0.1359496792732456,8: 0.2012157348613907,9: 0.181423395015628},
'SD_TOTALREV_w': {0: 169245923.26139638,1: 505645392.4999371,2: 130964271.42492916,3: 224003870.101752,4: 212067226.9743809,
5: 1074644265.1849256,6: 230845244.15128115,7: 241304003.7668597,8: 211527019.86738232,9: 219589384.34815124}})
y, X = dmatrices('HUD_TOTAL_UNITS ~ ACS_SHARE_STUDENT + SD_TOTALREV_w', data = df, return_type='dataframe')
model = PanelOLS(y,X, entity_effects=True, time_effects=True).fit()
Running the code above produces the following error:
ValueError: dependent and exog must have the same number of observations. The number of observations in dependent is 10, and the number of observations in exog is 30.
even though print(y.shape[0] == X.shape[0]) prints True, y.shape == (10, 1), and X.shape == (10, 3).
@baharcos's point seemingly does not apply to my own issue.
- What is the full error?
- What is the index of x and y?
- What are their full shapes?
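The details requested above can be collected with a few lines of pandas. This is only a sketch: the frames below are hypothetical stand-ins for the reporter's y and X, and describe is an illustrative helper, not part of linearmodels.

```python
import pandas as pd

# Hypothetical stand-ins for the y and X frames from the report above.
X = pd.DataFrame({"Intercept": [1.0] * 10, "a": range(10), "b": range(10)})
y = pd.DataFrame({"dep": range(10)})


def describe(name, frame):
    """Print and return the shape and index structure of a frame."""
    info = {
        "name": name,
        "shape": frame.shape,
        "index_type": type(frame.index).__name__,
        "index_levels": frame.index.nlevels,
    }
    print(info)
    return info


y_info = describe("y", y)
X_info = describe("X", X)
```

A single-level index here (index_levels equal to 1) would suggest the frames carry no entity/time structure.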
This is happening because PanelOLS requires a 2-level MultiIndex. When you pass X as a frame without a MultiIndex, it assumes the columns are one of the levels of the index, so the X matrix, which is (10,3), is reshaped to (30,1), which does not agree with the 10 observations of y.
See https://bashtage.github.io/linearmodels/panel/examples/data-formats.html
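A minimal sketch of the fix, using pandas only. The entity and year columns and all values are made up for illustration, since the posted data has no such identifiers, and the reshape line is only a rough analogy for the behaviour described above, not the library's actual code.

```python
import pandas as pd

# Hypothetical panel: 5 entities observed in 2 years each (labels are made up).
df = pd.DataFrame({
    "entity": ["a", "a", "b", "b", "c", "c", "d", "d", "e", "e"],
    "year": [2019, 2020] * 5,
    "y": [287, 309, 106, 48, 133, 2767, 354, 78, 1063, 87],
    "Intercept": [1.0] * 10,
    "x1": [0.17, 0.17, 0.15, 0.14, 0.19, 0.17, 0.12, 0.14, 0.20, 0.18],
    "x2": [1.7, 5.1, 1.3, 2.2, 2.1, 10.7, 2.3, 2.4, 2.1, 2.2],
})

# Without a MultiIndex, a (10, 3) regressor frame flattened into a single
# column has 30 rows, which no longer matches the 10 rows of y.
flat_shape = df[["Intercept", "x1", "x2"]].to_numpy().reshape(-1, 1).shape
print(flat_shape)

# Setting a 2-level (entity, time) MultiIndex gives each row an explicit
# entity and time label.
panel = df.set_index(["entity", "year"])
y = panel["y"]
X = panel[["Intercept", "x1", "x2"]]
print(X.index.nlevels, y.shape, X.shape)
```

With y and X indexed this way, both share the same 2-level index and can be passed to PanelOLS in place of the unindexed frames.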
@bashtage got it, I used the solution from this link: https://stackoverflow.com/questions/35798862/pandas-and-panelols-only-2-level-multiindex-are-supported. Thank you!