CI Table Not Outputting For All Classes In Column
sapaca opened this issue · comments
John Denham commented
For this code:
for name, grouped_df in df.groupby('mar'):
kmf = KaplanMeierFitter(alpha=alpha)
kmf.fit(durations=grouped_df['week'], event_observed=grouped_df['arrest'], label=name)
After running the fit above and executing this:
kmf.confidence_interval_
only 1 class is generated for the output table even though plotting will show the CI estimates (shaded area) for all classes in the column of interest. In the below output I would expect to see a 0_lower_0.95 and 0_upper_0.5 column as well.
1_lower_0.95 | 1_upper_0.95 | |
---|---|---|
0.0 | 1.000000 | 1.000000 |
12.0 | 0.873516 | 0.997320 |
23.0 | 0.857428 | 0.990427 |
26.0 | 0.834689 | 0.981385 |
39.0 | 0.811287 | 0.970985 |
42.0 | 0.788081 | 0.959609 |
43.0 | 0.742813 | 0.934739 |
46.0 | 0.720761 | 0.921486 |
52.0 | 0.720761 | 0.921486 |
Cameron Davidson-Pilon commented
I suspect it's because you are re-assigning the kmf
variable, so the last iteration defines what kmf
is (which is a KaplanMeierFitter trained only on data where mar=1).
I suggest breaking this into something like:
kmfs = {}
for name, grouped_df in df.groupby('mar'):
kmf = KaplanMeierFitter(alpha=alpha)
kmf.fit(durations=grouped_df['week'], event_observed=grouped_df['arrest'], label=name)
kmfs[name] = kmf
And then you have access to all kfms:
print(kmfs[0]. confidence_interval_)
print(kmfs[1]. confidence_interval_)
John Denham commented
Perfect, thank you. That works for me, I appreciate it!