Deep Regularized Compound Gaussian Network (DR-CG-Net) for Solving Linear Inverse Problems
All rights to this code are reserved. Commercial and research licenses are available on request. Please contact aly.hoeher@colostate.edu with any requests for licenses.
An implementation of Projected Gradient Descent DR-CG-Net (PGD DR-CG-Net) and Iterative Shrinkage and Thresholding Algorithm DR-CG-Net (ISTA DR-CG-Net) from
Lyons C., Raj R. G., & Cheney M. (2023). "Deep Regularized Compound Gaussian Network for Solving Linear Inverse Problems." arXiv preprint arXiv:2311.17248.
Lyons C., Raj R. G., & Cheney M. (2024). "Deep Regularized Compound Gaussian Network for Solving Linear Inverse Problems," in IEEE Transactions on Computational Imaging, vol. 10, pp. 399-414, 2024, doi: 10.1109/TCI.2024.3369394.
See 'requirements.txt' for the Python version, packages, and package versions used for this project.
Implementation Overview:
The DR-CG-Net method is an algorithm-unrolled deep neural network (DNN) that solves linear inverse problems arising from the forward measurement model
$$y = \Psi\Phi c + \nu \equiv Ac+\nu$$
where $\Psi\in\mathbb{R}^{m\times n}$ is a measurement matrix, $\Phi\in\mathbb{R}^{n\times n}$ is a change-of-basis dictionary, $x = \Phi c$ is an underlying signal of interest (e.g., an image) with $c\in\mathbb{R}^n$ the change-of-basis coefficients, $\nu\in\mathbb{R}^m$ is additive white Gaussian noise, and $y\in\mathbb{R}^m$ is the measurement/observation of $x$. Since DR-CG-Net is an algorithm-unrolled DNN, we first construct a compound Gaussian-based iterative algorithm to solve the above linear inverse problem and then unroll this algorithm into the DR-CG-Net framework.
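As a minimal NumPy sketch of this measurement model (the problem sizes, noise level, and random orthonormal $\Phi$ below are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 32, 64                                        # hypothetical problem sizes

Psi = rng.standard_normal((m, n))                    # measurement matrix (m x n)
Phi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal dictionary (n x n)
c = rng.standard_normal(n)                           # change-of-basis coefficients
nu = 0.01 * rng.standard_normal(m)                   # additive white Gaussian noise

A = Psi @ Phi                                        # combined sensing matrix A = Psi Phi
x = Phi @ c                                          # underlying signal of interest
y = A @ c + nu                                       # noisy measurement of x
```

By construction, $y = \Psi x + \nu = Ac + \nu$ holds for this synthetic data.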
Iterative Algorithm
Inverse problems are often solved by an iterative algorithm that incorporates statistical prior information on $c$ (or $x$). We use a compound Gaussian (CG) prior that decomposes $c = z\odot u$ for two independent random vectors $z$ and $u$, where $u$ is Gaussian and $z$ is positive and non-Gaussian. The maximum a posteriori (MAP) estimate of $z$ and $u$ from $y = A(z\odot u)+\nu$ is given by
$$[\hat{z}, \hat{u}] = \arg\min_{[z, u]} F(z,u)$$
for
$$F(z,u) \coloneqq \frac{1}{2} ||y - A(z\odot u)||_2^2 + \frac{1}{2} u^TP_u^{-1} u + R(z)$$
where $P_u$ is the covariance matrix of $u$ and $R$ is a regularization function equal to the negative log prior of $z$ (which can be specified on a problem-specific basis). For notation, we write
$$A_z = A\text{diag}(z).$$
We consider an alternating block-coordinate descent to approximate $\hat{z}$ and $\hat{u}$. Thus, on iteration $k$,
$$u_k = \arg\min_u F(z_k, u) = \mathcal{T}(z_k; P_u), \quad \text{where } \mathcal{T}(z; P_u) \coloneqq P_u A_z^T(I+A_zP_uA_z^T)^{-1}y$$
(for larger signal sizes, where this matrix inverse is intractable, we instead approximate it with Nesterov accelerated gradient descent steps) and
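A minimal NumPy sketch of this closed-form $u$-update (the function name and test sizes are illustrative; in the network, $P_u$ is a learned matrix):

```python
import numpy as np

def u_update(z, P_u, A, y):
    """Closed-form u-update: u = P_u A_z^T (I + A_z P_u A_z^T)^{-1} y,
    with A_z = A @ diag(z). Solves an m x m linear system rather than
    forming the inverse explicitly."""
    A_z = A * z                                   # A @ diag(z) via column scaling
    G = np.eye(A.shape[0]) + A_z @ P_u @ A_z.T    # m x m system matrix
    return P_u @ A_z.T @ np.linalg.solve(G, y)
```

By the Woodbury identity, this equals the direct minimizer $(A_z^T A_z + P_u^{-1})^{-1} A_z^T y$ of $F(z,\cdot)$, but only requires solving an $m\times m$ system.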
$z_k$ is obtained by $J$ applications of a descent function $g(z,u):\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^n$. Two descent functions we employ are a projected gradient descent (PGD) step and an iterative shrinkage and thresholding algorithm (ISTA) step. Writing $A_u = A\,\text{diag}(u)$ and letting $\eta > 0$ be a step size, these take the forms
$$g_{\text{PGD}}(z,u) = z - \eta A_u^T(A_u z - y) - \eta\nabla R(z), \qquad g_{\text{ISTA}}(z,u) = \text{prox}_{\eta R}\left(z - \eta A_u^T(A_u z - y)\right),$$
up to a projection of $z$ onto its feasible set in the PGD case.
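A minimal NumPy sketch of the two descent steps. The soft-threshold is one concrete choice of prox, under the assumption $R = \lambda\|\cdot\|_1$; in DR-CG-Net, the $\eta\nabla R$ / $\text{prox}_{\eta R}$ terms are instead replaced by learned subnetworks:

```python
import numpy as np

def soft_threshold(v, t):
    """prox of t*||.||_1: one possible prox_{eta R} when R = lambda*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def pgd_step(z, u, A, y, eta, grad_R):
    """PGD-style descent: gradient step on the data fit, minus eta * grad R(z)."""
    A_u = A * u                                   # A @ diag(u)
    r = z - eta * (A_u.T @ (A_u @ z - y))         # data-fidelity gradient step
    return r - eta * grad_R(z)

def ista_step(z, u, A, y, eta, lam):
    """ISTA-style descent: gradient step on the data fit, then prox of eta*R."""
    A_u = A * u
    r = z - eta * (A_u.T @ (A_u @ z - y))
    return soft_threshold(r, eta * lam)
```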
The diagram below displays the DR-CG-Net architecture, which consists of alternating blocks $U_k$ and $\mathcal{Z}_k$ that estimate $u$ and $z$, respectively. Each $U_k = \mathcal{T}(Z_k; P_u)$, where $P_u$ is a positive definite matrix that is learned by DR-CG-Net and shared across the $U_k$'s. Each $\mathcal{Z}_k$ consists of $J$ blocks $g_k^{(j)}$ that correspond to the PGD or ISTA descent functions $g(z,u)$ above. In each $g_k^{(j)}$, the layer $r_k^{(j)} = z - \eta A_u^T(A_uz-y)$ for a learned step size $\eta$, and $W_k^{(j)}$ is an embedded convolutional subnetwork that approximates $\eta\nabla R(z)$ in PGD or $\text{prox}_{\eta R}$ in ISTA.
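Putting the pieces together, a minimal non-learned NumPy sketch of the unrolled alternating structure: $K$ blocks, each a closed-form $u$-update followed by $J$ ISTA-style $z$-steps. The fixed $\eta$, $\lambda$, $P_u$, and the all-ones initialization are illustrative stand-ins for quantities the actual network learns:

```python
import numpy as np

def dr_cg_net_sketch(y, A, P_u, eta=1e-3, lam=1e-3, K=4, J=3):
    """Unrolled alternating estimate of c = z * u with fixed (non-learned)
    parameters; a structural sketch of the network, not the trained model."""
    m, n = A.shape
    z = np.ones(n)                                # simple positive init (assumption)
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    for k in range(K):
        # U_k block: u = P_u A_z^T (I + A_z P_u A_z^T)^{-1} y with A_z = A diag(z)
        A_z = A * z
        u = P_u @ A_z.T @ np.linalg.solve(np.eye(m) + A_z @ P_u @ A_z.T, y)
        # Z_k block: J ISTA-style descent steps g_k^{(j)}
        A_u = A * u
        for _ in range(J):
            r = z - eta * (A_u.T @ (A_u @ z - y))
            z = soft(r, eta * lam)
    return z * u                                  # estimate of c = z * u
```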
DR-CG-Net has been empirically shown to provide state-of-the-art performance when limited training data is available. Below is a sample reconstructed image.