Theory of inversion problem#
Definition of the ill-posed linear equation#
The inversion problem is described as a linear equation:
where \(\mathbf{T}\in\mathbb{R}^{M\times N}\) is a linear operator, \(\mathbf{x}\in\mathbb{R}^N\) is the a solution vector of the inversion problem, and \(\mathbf{b}\in\mathbb{R}^M\) is the given data vector.
Frequently, the above equation cannot be solved directly because of the following reasons:
The number of data \(M\) is less than the number of unknowns \(N\).
The data \(\mathbf{b}\) is contaminated by noise.
The operator \(\mathbf{T}\) (or \(\mathbf{T}^\mathsf{T}\mathbf{T}\)) is not full rank or not invertible.
The above equation is called an ill-posed linear equation.
Generalized Tikhonov Regularization#
In order to solve the ill-posed linear equation \eqref{eq:linear_equation}, we need to introduce a objective (or penalty) functional: \(O(\mathbf{x})\) and minimize the following functional [1]:
where \(\|\mathbf{x}\|_\mathbf{Q}^2\) stands for the weighted norm squared \(\mathbf{x}^\mathsf{T} \mathbf{Q} \mathbf{x}\) (compare with the Mahalanobis distance) with a symmetric positive semi-definite matrix \(\mathbf{Q}\). \(\lambda\) is a regularization parameter that controls the trade-off between the data misfit and the objective functional.
The objective functional is often defined as a quadratic form: \(O(\mathbf{x}) = \mathbf{x}^\mathsf{T} \mathbf{H} \mathbf{x} = \|\mathbf{x}\|_\mathbf{H}^2\), where \(\mathbf{H}\) is called the regularization matrix. Thus, the functional \eqref{eq:minimize-functional} can be rewritten as:
This regularization scheme is called the Generalized Tikhonov Regularization. The conventional Tikhonov regularization is a special case for \(\mathbf{Q}\) and \(\mathbf{H}\) being identity matrices. Additionally, the first and second terms are called the residual and regularization terms, respectively.
The solution \(\mathbf{x}\) can be obtained by differentiating the functional \eqref{eq: generalized-tikhonov} with respect to \(\mathbf{x}\) and setting it to zero as follows:
Therefore, the solution \(\mathbf{x}\) is given by:
Series expansion of the solution#
Although a direct inverse calculation for \eqref{eq:solution} is possible, it often needs a lot of computational resources. Additionally, to comprehend the solution, the cholesky decomposition and the singular value decomposition are often used [2].
1. Cholesky decomposition#
Let \(\mathbf{Q}\) and \(\mathbf{H}\) be factorized as follows:
where \(\mathbf{P}_\mathbf{Q}, \mathbf{P}_\mathbf{H}\) are fill-reducing permutation matrices and \(\mathbf{L}_\mathbf{Q}, \mathbf{L}_\mathbf{H}\) are lower triangular matrices. Let \eqref{eq:cholesky} be simple as follows:
2. Singular Value Decomposition#
Let us substitute the result of the cholesky decomposition \eqref{eq:cholesky-simple} into \eqref{eq:solution}:
where \(\mathbf{A} \equiv \mathbf{B} \mathbf{T} \mathbf{C}^{-1}\) and \(\hat{\mathbf{b}} \equiv \mathbf{B} \mathbf{b}\).
Let us perform the singular value decomposition to \(\mathbf{A}\):
where \(\mathbf{U}\in\mathbb{R}^{M\times r}\) and \(\mathbf{V}\in\mathbb{R}^{N\times r}\) are the left and right singular vectors, respectively, and \(\mathbf{S}\in\mathbb{R}^{r\times r}\) is a diagonal matrix with the singular values \(\sigma_i\). Here, \(r\) is the rank of \(\mathbf{A}\) and \(r\leq\min(M,N)\).
Hence, the solution \(\mathbf{x}\) can be written as:
where \(\tilde{\mathbf{V}}\in\mathbb{R}^{N\times r}\) is called the inverted solution basis and \(\mathbf{F}_\lambda\in\mathbb{R}^{r\times r}\) is a diagonal matrix, the element \(f_{\lambda, i}\) of which plays the role of a filter that suppresses the small singular values. The diagonal elements of \(\mathbf{F}_\lambda\) are given by:
where \(\sigma_i\) is the \(i\)-th diagonal element of \(\mathbf{S}\).
If matrices have the following forms:
then the \eqref{eq:sol-matrix} can be calculated as follows:
The solution of the ill-posed linear equation \eqref{eq:solution} can be expressed as a linear combination of the inverted solution basis vectors \(\tilde{\mathbf{v}}_i\). The weight of the \(i\)-th inverted solution basis vector is determined by the \(f_{\lambda, i}\). The larger the index \(i\) is, the smaller the singular value \(\sigma_i\) is and the much smaller the \(f_{\lambda, i}\) is if \(\lambda\) is sufficiently large. Therefore, the noisy components of the solution are suppressed by the regularization parameter \(\lambda\).
Expression of the squared norm using the series expansion#
For the sake of futher discussion, let us derive the expression of the squared residual norm and the squared regularization norm using the series expansion \eqref{eq:sol-expansion}.
Let each squared norm be \(\rho\) and \(\eta\) respectively:
where \(\mathbf{Q} = \mathbf{B}^\mathsf{T}\mathbf{B}\) and \(\mathbf{H} = \mathbf{C}^\mathsf{T}\mathbf{C}\).
Firstly we transform \(\mathbf{B}\mathbf{T}\tilde{\mathbf{V}}\) into the following form:
Additionally, using \(\|\mathbf{a}\|_\mathbf{Q}^2 = \|\mathbf{B}\mathbf{a}\|^2\) (\(\|\cdot\|\) is a Euclidean norm), the \(\rho\) is expressed as follows:
Also the \(\eta\) is expressed as follows:
Lastly, we derive the series-expansion form of the vector: \(\mathbf{T}\mathbf{x}_\lambda - \mathbf{b}\) as follows: