{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Analysis of discrete ill-posed problem with different types of noise\n", "\n", "This notebook is aimed at analyzing the behavior of different regularization methods for discrete ill-posed problems with different types of noise.\n", "The referenced papare is (Hansen 1992).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt\n", "from scipy.optimize import minimize_scalar\n", "from scipy.sparse import diags\n", "\n", "from cherab.inversion import GCV, Lcurve, _SVDBase, compute_svd\n", "from cherab.inversion.tools import parse_scientific_notation\n", "\n", "plt.rcParams[\"figure.dpi\"] = 150\n", "plt.rcParams[\"figure.constrained_layout.use\"] = True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Definition of the example problem\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us consider the following kernel and solution function derived from the Fredholm intgral equation of the first kind:\n", "\n", "$$\n", "\\begin{align}\n", "K(s, t)\n", "&\\equiv\n", "\\left(\n", " \\cos(s) + \\cos(t)\n", "\\right)^2\n", "\\left(\n", " \\frac{\\sin \\psi(s, t)}{\\psi(s, t)}\n", "\\right)^2,\\quad\n", "\\psi(s, t)\\equiv\\pi\\left(\\sin(s) + \\sin(t)\\right)\\\\\n", "x(t)\n", "&\\equiv\n", "2.0\\exp(-4(t-0.5)^2) + \\exp(-4(t+0.5)^2)\n", "\\end{align}\n", "$$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def _func(s: float, t: float) -> float:\n", " return np.pi * (np.sin(s) + np.sin(t))\n", "\n", "\n", "def kernel(s: float, t: float) -> float:\n", " \"\"\"The kernel function.\"\"\"\n", " u = _func(s, t)\n", " if u == 0:\n", " return (np.cos(s) + np.cos(t)) ** 2.0\n", " else:\n", " return ((np.cos(s) + np.cos(t)) * np.sin(u) / u) ** 2.0\n", "\n", "\n", "def solution(t: float) -> float:\n", " \"\"\"The solution function.\"\"\"\n", " return 2.0 * np.exp(-4.0 * (t - 0.5) ** 2.0) + np.exp(-4.0 * (t + 0.5) ** 2.0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Discrete $s$ and $t$ in the range $\\displaystyle\\left[-\\frac{\\pi}{2}, \\frac{\\pi}{2}\\right]$ are used and the noise-free data $\\bar{\\mathbf{b}}$ is given by $\\bar{\\mathbf{b}} = \\mathbf{K}\\mathbf{x}$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s = np.linspace(-0.5 * np.pi, 0.5 * np.pi, num=64, endpoint=True)\n", "t = np.linspace(-0.5 * np.pi, 0.5 * np.pi, num=64, endpoint=True)\n", "\n", "# discretize kernel\n", "k_mat = np.zeros((s.size, t.size))\n", "k_mat = np.array([[kernel(i, j) for j in t] for i in s])\n", "\n", "# trapezoidal rule\n", "k_mat[:, 0] *= 0.5\n", "k_mat[:, -1] *= 0.5\n", "k_mat *= t[1] - t[0]\n", "\n", "# discretize solution\n", "x_t = np.array([solution(i) for i in t])\n", "\n", "# noise-free data\n", "b_bar = k_mat @ x_t\n", "\n", "print(f\"{k_mat.shape = }\")\n", "print(f\"{x_t.shape = }\")\n", "print(f\"condition number of K is {np.linalg.cond(k_mat):.4g}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Perturbation by wite noise\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we consider the case of uncorrelated errors (white nise), i.e., the elements $e_i$ of the purturbation $\\mathbf{e}$ are normally distributed with zero mean and standard deviation $10^{-3}$. Hence the right-hand side $\\mathbf{b}$ of the perturbed problem is given by $\\mathbf{b} = \\bar{\\mathbf{b}} + \\mathbf{e}$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rng = np.random.default_rng()\n", "noise = rng.normal(0, 1.0e-3, b_bar.size)\n", "b = b_bar + noise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The solution $\\mathbf{x}_\\lambda = \\left(\\mathbf{K}^\\mathsf{T}\\mathbf{K} + \\lambda\\mathbf{I}\\right)^{-1}\\mathbf{K}^\\mathsf{T}\\mathbf{b}$ is computed and opimized for the regularization parameter $\\lambda$ using the GCV and L-curve criteria.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute SVD using K & I matrices\n", "Imat = diags([1], shape=(t.size, t.size))\n", "s, u, basis = compute_svd(k_mat, Imat)\n", "\n", "# create GCV and L-curve objects\n", "gcv = GCV(s, u, basis, data=b)\n", "lcurve = Lcurve(s, u, basis, data=b)\n", "\n", "# solve for the regularization parameter\n", "sol_gcv, _ = gcv.solve()\n", "sol_lcurve, _ = lcurve.solve()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Show the L-curve's curvature and the GCV as a function of $\\lambda$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, axes = plt.subplots(1, 2, figsize=(10, 5))\n", "lcurve.plot_curvature(fig=fig, axes=axes[0])\n", "gcv.plot_gcv(fig=fig, axes=axes[1])\n", "lambda_opt_lcurve = parse_scientific_notation(f\"{lcurve.lambda_opt:.2e}\")\n", "lambda_opt_gcv = parse_scientific_notation(f\"{gcv.lambda_opt:.2e}\")\n", "axes[0].set_title(f\"L-curve curvature ($\\\\lambda_\\\\mathrm{{opt}} = {lambda_opt_lcurve}$)\")\n", "axes[1].set_title(f\"GCV ($\\\\lambda_\\\\mathrm{{opt}} = {lambda_opt_gcv}$)\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us calculate the relative error for each optimal solution and compare the results.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X_T_NORM = np.linalg.norm(x_t, axis=0)\n", "\n", "\n", "def relative_error(\n", " log_lambda: float,\n", " x_t: np.ndarray = x_t,\n", " x_t_norm: float = X_T_NORM,\n", " regularizer: _SVDBase = lcurve,\n", ") -> float:\n", " \"\"\"Calculate relative error.\"\"\"\n", " beta = 10**log_lambda\n", " sol = regularizer.solution(beta=beta)\n", " return np.linalg.norm(x_t - sol, axis=0) / x_t_norm\n", "\n", "\n", "# minimize relative error\n", "bounds = (-40, 0)\n", "res = minimize_scalar(\n", " relative_error,\n", " bounds=bounds,\n", " method=\"bounded\",\n", " args=(x_t, X_T_NORM, gcv),\n", " options={\"xatol\": 1.0e-10, \"maxiter\": 1000},\n", ")\n", "\n", "# obtain minimum relative error and lambda\n", "error_min = res.fun\n", "lambda_min = 10**res.x" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\" Regularization Parameter Relative Error\")\n", "print(\" ------------------------ ----------------\")\n", "print(f\"GCV {gcv.lambda_opt:^27.4e} {relative_error(np.log10(gcv.lambda_opt)):^12.4%}\")\n", "print(f\"L-curve {lcurve.lambda_opt:^27.4e} {relative_error(np.log10(lcurve.lambda_opt)):^12.4%}\")\n", "print(f\"Min Error {lambda_min:^27.4e} {error_min:^12.4%}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each optimal solution is plotted together with the original solution and the minimum-error one.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [], "source": [ "sols = [sol_gcv, sol_lcurve, lcurve.solution(beta=lambda_min)]\n", "labels = [\"GCV\", \"L-curve\", \"Min Error\"]\n", "\n", "fig, ax = plt.subplots()\n", "ax.plot(t, x_t, ls=\"--\", lw=0.75, color=\"k\", label=\"True\")\n", "for sol, label in zip(sols, labels, strict=False):\n", " ax.plot(t, sol, label=label)\n", "\n", "ax.set_xlim(t.min(), t.max())\n", "ax.set_ylim(0, x_t.max() * 1.1)\n", "ax.set_xlabel(\"$t$\")\n", "ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n", "ax.legend(loc=\"upper left\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Purturbation by uncorrelated noise\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we consider the case of highly correlated errors, which are derived from a regular smoothing of the matrix $\\mathbf{K}$ and the right-hand side $\\bar{\\mathbf{b}}$:\n", "\n", "$$\n", "\\begin{align}\n", "\\tilde{\\mathbf{K}}_{i,j}\n", "&\\equiv\n", "\\mathbf{K}_{i,j}\n", "+ \\mu\n", "\\left(\n", " \\mathbf{K}_{i,j-1} + \\mathbf{K}_{i,j+1} + \\mathbf{K}_{i-1,j} + \\mathbf{K}_{i+1,j}\n", "\\right),\\\\\n", "\\tilde{\\mathbf{b}}_i\n", "&\\equiv\n", "\\bar{\\mathbf{b}}_i\n", "+ \\mu\n", "\\left(\n", " \\bar{\\mathbf{b}}_{i-1} + \\bar{\\mathbf{b}}_{i+1}\n", "\\right),\n", "\\end{align}\n", "$$\n", "\n", "where $\\mu$ is set to $0.05$ and the outside elements are set to zero.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mu = 0.05\n", "\n", "k_mat_smooth = np.zeros_like(k_mat)\n", "b = np.zeros_like(b_bar)\n", "\n", "for i in range(len(b_bar)):\n", " if i < 1:\n", " b[i] = b_bar[i] + mu * (0 + b_bar[i + 1])\n", " elif i > len(b_bar) - 2:\n", " b[i] = b_bar[i] + mu * (b_bar[i - 1] + 0)\n", " else:\n", " b[i] = b_bar[i] + mu * (b_bar[i - 1] + b_bar[i + 1])\n", "\n", "\n", "for i in range(k_mat.shape[0]):\n", " for j in range(k_mat.shape[1]):\n", " if i - 1 < 0:\n", " k1 = 0\n", " else:\n", " k1 = k_mat[i - 1, j]\n", " if i + 1 > k_mat.shape[0] - 1:\n", " k2 = 0\n", " else:\n", " k2 = k_mat[i + 1, j]\n", " if j - 1 < 0:\n", " k3 = 0\n", " else:\n", " k3 = k_mat[i, j - 1]\n", " if j + 1 > k_mat.shape[1] - 1:\n", " k4 = 0\n", " else:\n", " k4 = k_mat[i, j + 1]\n", "\n", " k_mat_smooth[i, j] = k_mat[i, j] + mu * (k1 + k2 + k3 + k4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We solve the perturbed problem with the same regularization methods and plot both criteria as a function of $\\lambda$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute SVD using K & I matrices\n", "s, u, basis = compute_svd(k_mat_smooth, Imat)\n", "\n", "# create GCV and L-curve objects\n", "gcv = GCV(s, u, basis, data=b)\n", "lcurve = Lcurve(s, u, basis, data=b)\n", "\n", "# solve for the regularization parameter\n", "sol_gcv, _ = gcv.solve()\n", "sol_lcurve, _ = lcurve.solve()\n", "\n", "# plot the L-curve's curvature and GCV\n", "fig, axes = plt.subplots(1, 2, figsize=(10, 5))\n", "lcurve.plot_curvature(fig=fig, axes=axes[0])\n", "gcv.plot_gcv(fig=fig, axes=axes[1])\n", "lambda_opt_lcurve = parse_scientific_notation(f\"{lcurve.lambda_opt:.2e}\")\n", "lambda_opt_gcv = parse_scientific_notation(f\"{gcv.lambda_opt:.2e}\")\n", "axes[0].set_title(f\"L-curve curvature ($\\\\lambda_\\\\mathrm{{opt}} = {lambda_opt_lcurve}$)\")\n", "axes[1].set_title(f\"GCV ($\\\\lambda_\\\\mathrm{{opt}} = {lambda_opt_gcv}$)\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The L-curve method still works well by finding the corner of the L-curve, however, the GCV method fails to find the minimum of the GCV function in the same range of $\\lambda$.\n", "\n", "Let us calculate relative errors and plot the optimal solutions as well as the original solution and the minimum-error one.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# minimize relative error\n", "bounds = (-20, 0)\n", "res = minimize_scalar(\n", " relative_error,\n", " bounds=bounds,\n", " method=\"bounded\",\n", " args=(x_t, X_T_NORM, gcv),\n", " options={\"xatol\": 1.0e-10, \"maxiter\": 1000},\n", ")\n", "\n", "# obtain minimum relative error and lambda\n", "error_min = res.fun\n", "lambda_min = 10**res.x" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\" Regularization Parameter Relative Error\")\n", "print(\" ------------------------ ----------------\")\n", "print(f\"GCV {gcv.lambda_opt:^27.4e} {relative_error(np.log10(gcv.lambda_opt)):^12.4%}\")\n", "print(f\"L-curve {lcurve.lambda_opt:^27.4e} {relative_error(np.log10(lcurve.lambda_opt)):^12.4%}\")\n", "print(f\"Min Error {lambda_min:^27.4e} {error_min:^12.4%}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sols = [sol_lcurve, lcurve.solution(beta=lambda_min)]\n", "labels = [\"L-curve\", \"Min Error\"]\n", "\n", "fig, ax = plt.subplots()\n", "ax.plot(t, x_t, ls=\"--\", lw=0.75, color=\"k\", label=\"True\")\n", "for sol, label in zip(sols, labels, strict=False):\n", " ax.plot(t, sol, label=label)\n", "\n", "ax.set_xlim(t.min(), t.max())\n", "ax.set_ylim(0, x_t.max() * 1.1)\n", "ax.set_xlabel(\"$t$\")\n", "ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n", "ax.legend(loc=\"upper left\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. footbibliography::" ] } ], "metadata": { "kernelspec": { "display_name": "cherab-inv-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 2 }