Network input

Question

XinyeYang opened this issue 3 months ago

Hello, I recently ran into a new problem in my experiments. Follow-up time is divided into equal-width intervals, but one patient may have no measurements within a given interval while another has several. This leads to inconsistently shaped inputs to the neural network. How should I deal with this? With a small dataset in particular, I cannot simply drop the patients who have few follow-up measurements, so the missing values have to be filled in, but too much imputation will surely affect model performance.

Michael Gensheimer · Answer 1 · Wed Apr 24 2024 02:07:18 GMT+0800 (China Standard Time)

It is fine to divide follow-up time into unequal-width intervals if that fits your data well. For instance, if most patients have their last follow-up only a few weeks after baseline but a few have extended follow-up of several years, you probably want to make the first few follow-up time bins narrower (maybe only a few days or a few weeks) and the subsequent ones wider. Having a similar number of events (death, etc.) in each follow-up time bin helps ensure there is enough information for the model to learn the coefficients for each follow-up time period well.
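One way to get bins with a similar number of events is to place the cut points at quantiles of the observed event times. The following is only a minimal sketch under that assumption; the function name `event_quantile_breaks` and its arguments are made up for illustration and are not part of any library.

```python
import numpy as np

def event_quantile_breaks(times, events, n_bins):
    """Return interval boundaries so each bin holds a similar number of events.

    times  : follow-up time per patient (time to event or to censoring)
    events : 1 if the event was observed, 0 if the patient was censored
    n_bins : desired number of follow-up intervals (hypothetical parameter)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = times[events]
    # Interior cut points at evenly spaced quantiles of the observed event times.
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    interior = np.quantile(event_times, qs)
    # Start at 0 and end at the longest follow-up so every patient is covered.
    breaks = np.concatenate(([0.0], interior, [times.max()]))
    # Tied event times can produce duplicate cut points; drop them.
    return np.unique(breaks)

# Toy data: most events happen early, two patients are censored late.
times = [3, 7, 10, 14, 21, 30, 45, 90, 400, 1500]
events = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(event_quantile_breaks(times, events, n_bins=4))
```

With data like this, the early bins come out only a few days wide and the last bin spans years, which matches the narrow-then-wide layout described above.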
If many patients survived past a certain follow-up time period and there were no events in that period, that's fine: the model will still learn from this and will simply predict a high chance of surviving that period for all patients.
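To illustrate why an interval without events is not a problem, here is a rough sketch of how follow-up could be encoded as fixed-length targets, one entry per interval. The helper `encode_targets` is hypothetical, and the censoring rule (a patient counts as having survived an interval only if follow-up reaches its upper edge) is a simplifying assumption; a real implementation may handle censoring within an interval differently.

```python
import numpy as np

def encode_targets(times, events, breaks):
    """Encode each patient as two fixed-length 0/1 vectors over the intervals.

    survived[i, j] = 1 if patient i is known to have survived interval j.
    failed[i, j]   = 1 if patient i had the event in interval j.
    Assumes breaks[-1] is larger than every observed event time.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    breaks = np.asarray(breaks, dtype=float)
    lower, upper = breaks[:-1], breaks[1:]
    # Survived an interval if follow-up reaches the interval's upper edge.
    survived = (times[:, None] >= upper[None, :]).astype(float)
    # The event belongs to interval j if lower[j] <= time < upper[j].
    in_bin = (times[:, None] >= lower[None, :]) & (times[:, None] < upper[None, :])
    failed = (in_bin & events[:, None]).astype(float)
    return survived, failed

# Four patients, four intervals (in days); the breaks are illustrative only.
survived, failed = encode_targets(
    times=[20, 100, 400, 2000],
    events=[1, 0, 1, 0],
    breaks=[0, 30, 90, 365, 1825],
)
print(survived.shape, failed.shape)
```

Every patient gets a row of the same length regardless of follow-up duration. In this toy data no event falls in the second or third interval, so those columns of `failed` are all zero and the patients who got through them still contribute "survived" entries.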
The right choice depends on the exact use case, so I probably can't give more specific advice.