Lack of clarity with multidimensional implicit ODE discovered models

Question

lucafusarbassini opened this issue a year ago

Hi, congratulations on the very nice idea and package - I am sure it will help lots of biologists!
I'm trying to set up a SINDy workflow on a toy model. Here's the toy model:

```python
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

# Define the system of ODEs
def system(y, t, alpha_TF, beta_TF, alpha_mRNA, beta_mRNA, K, n):
    TF, mRNA = y

    dTF_dt = alpha_TF - beta_TF * TF
    mRNA_production_rate = alpha_mRNA * (TF**n) / (K**n + TF**n)
    dmRNA_dt = mRNA_production_rate - beta_mRNA * mRNA

    return [dTF_dt, dmRNA_dt]

# Parameters
alpha_TF = 0.8
beta_TF = 0.5
alpha_mRNA = 2.0
beta_mRNA = 0.1
K = 2
n = 2

# Initial conditions: assume TF is present initially and mRNA is absent
TF0 = 1.0
mRNA0 = 0.0

# Time grid for simulation
t = np.linspace(0, 50, 500)

# Solve the system of ODEs
result = odeint(system, [TF0, mRNA0], t, args=(alpha_TF, beta_TF, alpha_mRNA, beta_mRNA, K, n))

# Plot the results
plt.figure(figsize=(10, 6))
plt.plot(t, result[:, 0], label='TF', color='blue')
plt.plot(t, result[:, 1], label='mRNA', color='green')
plt.title("Dynamics of Transcription Factor and Target mRNA")
plt.xlabel("Time")
plt.ylabel("Concentration")
plt.legend()
plt.grid(True)
plt.show()
```

It's a naive simulation of the expression of a gene (mRNA) controlled by a transcription factor (TF). I want to be able to retrieve the underlying dynamics with SINDy.
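For reference, here is a sketch of why this model has to be treated as implicit, writing x0 = TF and x1 = mRNA and plugging in the parameter values above (K = 2, n = 2). The TF equation is already linear, while the mRNA equation becomes polynomial only after multiplying through by the Hill denominator:

```latex
\begin{aligned}
\dot{x}_0 &= 0.8 - 0.5\,x_0 \\
\dot{x}_1 &= \frac{2\,x_0^2}{4 + x_0^2} - 0.1\,x_1 \\
(4 + x_0^2)\,\dot{x}_1 &= 2\,x_0^2 - 0.1\,x_1\,(4 + x_0^2) \\
4\,\dot{x}_1 + x_0^2\,\dot{x}_1 &= 2\,x_0^2 - 0.4\,x_1 - 0.1\,x_0^2\,x_1
\end{aligned}
```

The last line contains the three-factor products x0·x0·x1 and x0·x0·x1_t, so, if this derivation is right, a library limited to second-order products could not represent the mRNA equation exactly.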

So here's the core of the code I've been working on, adapting your Michaelis-Menten tutorial:

import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import pysindy as ps
from pysindy.feature_library import CustomLibrary

library_functions = [
    lambda x: x,
    lambda x, y: x * y,
    lambda x: x ** 2,
    #lambda x, y, z: x * y * z,
    #lambda x, y: x * y ** 2,
    #lambda x: x ** 3,
    #lambda x, y, z, w: x * y * z * w,
    #lambda x, y, z: x * y * z ** 2,
    #lambda x, y: x * y ** 3,
    #lambda x: x ** 4,
]

# library function names include both
# the x_library_functions and x_dot_library_functions names
library_function_names = [
    lambda x: x,
    lambda x, y: x + y,
    lambda x: x + x,
    #lambda x, y, z: x + y + z,
    #lambda x, y: x + y + y,
    #lambda x: x + x + x,
    #lambda x, y, z, w: x + y + z + w,
    #lambda x, y, z: x + y + z + z,
    #lambda x, y: x + y + y + y,
    #lambda x: x + x + x + x,
    #lambda x: x,
]

sindy_library = ps.PDELibrary(
    library_functions=library_functions,
    temporal_grid=t,
    function_names=library_function_names,
    include_bias=True, # True
    implicit_terms=True,
    derivative_order=1)

sindy_opt = ps.SINDyPI(
    threshold=1e-6,
    tol=1e-8,
    thresholder="l1", ### regularization parameter...
    max_iter=20000,
)

# Initialize the SINDy model with the custom library
model = ps.SINDy(optimizer=sindy_opt, feature_library=sindy_library, differentiation_method=ps.FiniteDifference(drop_endpoints=True))

# Fit the SINDy model
model.fit(x_train, t=t)

model.print()  # prints the equations itself; wrapping it in print() only adds a trailing "None"

I get these kinds of models:

1 = 0.625 x0 + 1.249 x0_t
x0 = 0.121 1 + 0.570 x0x0 + 0.826 x0x0_t + -0.013 x1x0_t + 0.203 x1x1_t + 0.017 x0x1x0_t + 0.006 x0x0x1_t + -0.001 x1x1x1_t
x1 = 0.000
x0x1 = 0.000
x0x0 = 0.650 x0 + 0.085 x0x1 + 0.008 x1x1 + -0.287 x0_t + 0.892 x0x1x1_t + 0.049 x0x0x1_t + 0.199 x1x1x0_t
x1x1 = 0.000
x0_t = 0.000
x1_t = 0.067 x0 + -0.062 x0x1 + 0.263 x0x0 + -0.006 x0_t + -0.076 x1x0_t + 0.240 x1x1_t + 0.001 x0x1x0_t
x0x0_t = 0.235 1 + 0.171 x0 + -0.013 x0x1 + -0.006 x1x1 + 0.027 x0x1x0_t + -0.030 x0x0x1_t + -0.266 x1x1x0_t + -0.001 x1x1x1_t
x0x1_t = 0.315 x1 + -0.040 x1x1 + -0.196 x0_t + 0.205 x1x1_t + 0.077 x0x1x0_t + -0.249 x0x0x1_t + -0.007 x1x1x0_t
x1x0_t = 0.163 1 + 0.662 x1 + -0.066 x0x0 + -0.085 x1x1 + 0.069 x0_t + -0.392 x1x1_t + 0.084 x0x1x0_t + -0.528 x0x0x1_t
x1x1_t = 0.101 x0 + 0.009 x1 + 0.458 x0x0 + -0.023 x1x1 + -0.012 x0x1x0_t + -0.154 x0x0x1_t + -0.646 x1x1x0_t + 0.001 x1x1x1_t
x0x1x0_t = 0.000
x0x1x1_t = -0.074 1 + -0.078 x0x1 + 0.410 x0x0 + -0.040 x0x1x0_t + -0.010 x0x0x1_t + 0.158 x1x1x0_t + 0.001 x1x1x1_t
x0x0x0_t = 0.000
x0x0x1_t = 0.015 x0 + 0.586 x0x1 + -0.120 x1x1 + -1.114 x1x0_t + 0.234 x0x1x0_t + 0.414 x0x0x0_t + -0.036 x1x1x0_t + -0.001 x1x1x1_t
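One way to read this output is to test a single row against the simulated data. The sketch below does that for the first row, `1 = 0.625 x0 + 1.249 x0_t`, which looks like the true TF equation `x0_t = 0.8 - 0.5 x0` divided through by 0.8. It re-simulates the toy model with the parameters given earlier and uses `np.gradient` as a stand-in for the derivative estimate, which is an assumption and not what PySINDy does internally.

```python
import numpy as np
from scipy.integrate import odeint


def system(y, t, alpha_TF, beta_TF, alpha_mRNA, beta_mRNA, K, n):
    """Toy TF -> mRNA model from the top of this issue."""
    TF, mRNA = y
    dTF_dt = alpha_TF - beta_TF * TF
    dmRNA_dt = alpha_mRNA * TF**n / (K**n + TF**n) - beta_mRNA * mRNA
    return [dTF_dt, dmRNA_dt]


t = np.linspace(0, 50, 500)
params = (0.8, 0.5, 2.0, 0.1, 2, 2)
x = odeint(system, [1.0, 0.0], t, args=params)

x0 = x[:, 0]
# np.gradient uses central differences at interior points
x0_t = np.gradient(x0, t)

# Candidate row copied from the SINDyPI output: 1 = 0.625 x0 + 1.249 x0_t
rhs = 0.625 * x0 + 1.249 * x0_t

# Skip the two endpoints, where the derivative estimate is less accurate
residual = np.max(np.abs(rhs[1:-1] - 1.0))
print(f"max |rhs - 1| on interior points: {residual:.2e}")
```

A small residual here would suggest that this row is one valid implicit equation for the TF dynamics. The same check could be applied to any other row by building its right-hand side from the data.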

I've also run the downstream formatting and rearranging code, but I'm still stuck. I don't understand why I get a single equation per model rather than two. I understand that separating x0_t and x1_t explicitly might not be possible, but I'd still expect two governing equations. What's more, for biological reasons I'm only interested in models where the x0_t and x1_t equations are indeed separable, and I do get two such equations if I run SINDy without the SINDyPI optimizer, for example with `model = ps.SINDy(feature_library=sindy_library, differentiation_method=ps.FiniteDifference(drop_endpoints=True))` followed by `model.fit(x_train, t=t)`

which returns a clearly wrong model, but at least in that case it is clear what output I am receiving:
(x0)' = 1.000 x0_t
(x1)' = 1.999 x0 + -1.202 x0x0

How can I understand my output? Am I doing something wrong?

Also, the tutorial uses odeint to integrate the discovered system because it's a trivial 1D system, which is not my case here. I've tried integrating mine with diffeqpy and fipy, with very poor results. Something with a coarse time discretization would be more than enough for my purposes.
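To make "something coarse" concrete, here is a minimal sketch of a fixed-step Runge-Kutta (RK4) integrator for a 2D system, written with NumPy only. The names `rhs` and `rk4_integrate` are hypothetical, and `rhs` uses the known toy model as a placeholder: a discovered implicit model would first have to be solved for x0_t and x1_t before it could be plugged in here.

```python
import numpy as np


def rhs(state):
    """Explicit right-hand side (placeholder: the known toy model)."""
    x0, x1 = state
    dx0 = 0.8 - 0.5 * x0
    dx1 = 2.0 * x0**2 / (4.0 + x0**2) - 0.1 * x1
    return np.array([dx0, dx1])


def rk4_integrate(f, y0, t):
    """Classical fourth-order Runge-Kutta on a given time grid."""
    y = np.empty((len(t), len(y0)))
    y[0] = y0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = f(y[i])
        k2 = f(y[i] + 0.5 * h * k1)
        k3 = f(y[i] + 0.5 * h * k2)
        k4 = f(y[i] + h * k3)
        y[i + 1] = y[i] + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y


# Coarse grid: 101 points on [0, 50], i.e. a step of 0.5
t_coarse = np.linspace(0, 50, 101)
traj = rk4_integrate(rhs, [1.0, 0.0], t_coarse)
print(traj[-1])  # final [TF, mRNA] values
```

An explicit fixed-step scheme like this is only a reasonable choice when the step is small relative to the fastest time scale of the system, which holds for the toy parameters used here.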

Thank you very much,
Luca