rmnldwg / lymph

Python package for statistical modelling of lymphatic metastatic spread in head & neck cancer.

Home Page:https://lymph-model.readthedocs.io

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Wrong product in `comp_posterior_joint_state_dist` function.

rmnldwg opened this issue · comments

I found a second bug in the bilateral code in the comp_posterior_joint_state_dist function. In line 605 we multiply by the ipsilateral diagnose vector. Which is correct, but the transposition of the matrix does not have the desired effect. To reach the desired effect we need to build a different vector array.

Current state:

        joint_state_dist = self.comp_joint_state_dist(t_stage=t_stage, mode=mode)
        # matrix with P(Zi=zi,Zc=zc|Xi,Xc) * P(Xi,Xc) for all states Xi,Xc.
        joint_diagnose_and_state = (
            diagnose_given_state["ipsi"].T
            * joint_state_dist
            * diagnose_given_state["contra"]
        )

Potential fix:

joint_state_dist = self.comp_joint_state_dist(t_stage='early', mode='HMM')
# matrix with P(Zi=zi,Zc=zc|Xi,Xc) * P(Xi,Xc) for all states Xi,Xc.
joint_diagnose_and_state = (
    diagnose_given_state["ipsi"][:, np.newaxis] 
    * joint_state_dist
    * diagnose_given_state["contra"]
)

Originally posted by @YoelPH in #60 (comment)

@YoelPH, nice catch! I would not have noticed that on my own. Including your solution, which of the three fixes looks more readable?

# 1. Your solution
joint_diagnose_and_state = (
    diagnose_given_state["ipsi"][:, np.newaxis] 
    * joint_state_dist
    * diagnose_given_state["contra"]
)

# 2. Using the outer product
joint_diagnose_and_state = (
    np.outer(
        diagnose_given_state["ipsi"],
        diagnose_given_state["contra"],
    ) * joint_state_dist
)

# 3. Using einsum
joint_diagnose_and_state = np.einsum(
    "i,ij,j->ij",
    diagnose_given_state["ipsi"],
    joint_state_dist,
    diagnose_given_state["contra"],
)

They all produce the same result. What do you think?

Personally I prefer the second option. My solution is not very readable and the Einsum always gives me a headache bcs I need to readup how it works, hahaha.