{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# GCV criterion\n", "\n", "Here we show how to use the Generalized Cross Validation (GCV) criterion to select the optimal\n", "regularization parameter for an example ill-posed inverse problem.
\n", "The definition of the GCV criterion is introduced in [this page](../../user/theory/gcv.ipynb).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Definition of the example problem\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We refer to the famous Fredholm integral equation of the first kind devised by Pillips.\n", "Define the function\n", "\n", "$$\n", "\\begin{equation}\n", "f(x) \\equiv\n", "\\begin{cases}\n", "1 + \\displaystyle\\cos\\left(\\frac{\\pi x}{3}\\right), & \\text{if } | x | < 3, \\\\\n", "0, & \\text{otherwise}\n", "\\end{cases}.\n", "\\end{equation}\n", "$$\n", "\n", "Then, the kernel $K(s, t)$ and the solution $x(t)$ are given by\n", "\n", "$$\n", "\\begin{align}\n", "K(s, t) &\\equiv f(s - t)\\\\\n", "x(t) &\\equiv f(t)\n", "\\end{align}\n", "$$\n", "\n", "Both integral intervals are $[-6, 6]$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt\n", "from scipy.sparse import diags\n", "\n", "from cherab.inversion import GCV, compute_svd\n", "from cherab.inversion.tools import parse_scientific_notation\n", "\n", "plt.rcParams[\"figure.dpi\"] = 150" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us define functions for the kernel, the solution and the right-hand side.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def _func(x: float) -> float:\n", " if abs(x) < 3.0:\n", " return 1.0 + np.cos(np.pi * x / 3.0)\n", " else:\n", " return 0.0\n", "\n", "\n", "def kernel(t: float, s: float) -> float:\n", " \"\"\"The kernel function.\"\"\"\n", " return _func(t - s)\n", "\n", "\n", "def solution(t: float) -> float:\n", " \"\"\"The solution function.\"\"\"\n", " return _func(t)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we will generate the discretized version of the kernel $\\mathbf{K}$, solution $\\mathbf{x}$ and right-hand side $\\mathbf{b}$ using\n", "the trapezoidal rule leading to the following linear system:\n", "\n", "$$\n", "\\begin{equation}\n", "\\mathbf{K}\\mathbf{x} = \\mathbf{b},\n", "\\end{equation}\n", "$$\n", "\n", "The size of matrix $\\mathbf{K}\\in\\mathbb{R}^{M\\times N}$ is set to $M = N = 64$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s = np.linspace(-6.0, 6.0, num=64, endpoint=True)\n", "t = np.linspace(-6.0, 6.0, num=64, endpoint=True)\n", "\n", "# discretize kernel\n", "k_mat = np.zeros((s.size, t.size))\n", "k_mat = np.array([[kernel(i, j) for j in t] for i in s])\n", "\n", "# trapezoidal rule\n", "k_mat[:, 0] *= 0.5\n", "k_mat[:, -1] *= 0.5\n", "k_mat *= t[1] - t[0]\n", "\n", "# discretize solution\n", "x_t = np.array([solution(i) for i in t])\n", "\n", "print(f\"{k_mat.shape = }\")\n", "print(f\"{x_t.shape = }\")\n", "print(f\"condition number of K is {np.linalg.cond(k_mat):.4g}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The right-hand side $\\mathbf{b}\\in\\mathbb{R}^{M}$ is usually contaminated by noise.\n", "So, we will add a Gaussian noise to the right-hand side.\n", "$$ \\mathbf{b} = \\bar{\\mathbf{b}} + \\mathbf{e}, $$\n", "where $\\bar{\\mathbf{b}}$ is the original right-hand side ($\\bar{\\mathbf{b}}\\equiv \\mathbf{K}\\mathbf{x}$), $\\mathbf{e}$ is a vector whose elements are independently sampled from the normal distribution with mean 0 and standard deviation $10^{-3}$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b_bar = k_mat @ x_t\n", "rng = np.random.default_rng()\n", "noise = rng.normal(0, 1.0e-3, b_bar.size)\n", "b = b_bar + noise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solve the inverse problem using the GCV criterion\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The solution of the ill-posed linear equation is obtained with the regularization procedure:\n", "\n", "$$\n", "\\begin{equation}\n", "\\mathbf{x}_\\lambda = \\left(\\mathbf{K}^\\mathsf{T}\\mathbf{K} + \\lambda\\mathbf{H}\\right)^{-1}\\mathbf{K}^\\mathsf{T}\\mathbf{b},\n", "\\end{equation}\n", "$$\n", "\n", "where $\\mathbf{H}$ is the regularization matrix.\n", "Here we set $\\mathbf{H} = \\mathbf{D_2}^\\mathsf{T}\\mathbf{D_2}$, where $\\mathbf{D_2}$ is the second-order difference matrix.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dmat = diags([1, -2, 1], [-1, 0, 1], shape=(t.size, t.size)).tocsr()\n", "print(f\"{dmat.shape = }\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we create GCV solver object after calculating the singular value decomposition according\n", "to the [series expansion of solution](../../user/theory/inversion.ipynb#Series-expansion-of-the-solution).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hmat = dmat.T @ dmat\n", "s, u, basis = compute_svd(k_mat, hmat)\n", "gcv = GCV(s, u, basis, data=b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us solve the inverse problem.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sol, status = gcv.solve()\n", "print(status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluate the GCV criterion\n", "\n", "Next we evaluate the solution obtained by the GCV criterion\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot GCV curve\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gcv.plot_gcv();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Compare $\\mathbf{x}_\\lambda$ with $\\mathbf{x}_\\mathrm{true}$\n", "\n", "Let us compare solutions at different regularization parameters $\\lambda=10^{-9}$,\n", "$\\lambda_\\text{opt}$, $10^5$ with the true solution $\\mathbf{x}_\\mathrm{true}$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lambdas = [1.0e-9, gcv.lambda_opt, 1.0e5]\n", "\n", "fig, axes = plt.subplots(1, 3, figsize=(10, 4), sharey=True, layout=\"constrained\")\n", "\n", "for ax, beta in zip(axes, lambdas, strict=False):\n", " ax.plot(t, x_t, \"--\", label=\"$\\\\mathbf{x}_\\\\mathrm{true}$\")\n", " ax.plot(t, gcv.solution(beta=beta), label=\"$\\\\mathbf{x}_\\\\lambda$\")\n", " ax.axhline(0, color=\"black\", lw=0.75, ls=\"--\", zorder=-1)\n", "\n", " ax.set_xlim(t.min(), t.max())\n", " ax.set_ylim(-0.5, x_t.max() * 1.1)\n", " ax.set_xlabel(\"$t$\")\n", " parsed_lambda = parse_scientific_notation(f\"{beta:.2e}\")\n", " ax.set_title(f\"$\\\\lambda = {parsed_lambda}$\")\n", " ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n", " ax.legend(loc=\"upper left\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Check the relative error\n", "\n", "The relative error between the solution $\\mathbf{x}_\\lambda$ and the true solution\n", "$\\mathbf{x}_\\mathrm{true}$ is defined as follows:\n", "\n", "$$\n", "\\begin{equation}\n", "\\epsilon_\\mathrm{rel} = \\frac{\\|\\mathbf{x}_\\lambda - \\mathbf{x}_\\mathrm{true}\\|_2}{\\|\\mathbf{x}_\\mathrm{true}\\|_2}.\n", "\\end{equation}\n", "$$\n", "\n", "Let us seek the minimum $\\epsilon_\\mathrm{rel}$ as a function of $\\lambda$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.optimize import minimize_scalar\n", "\n", "X_T_NORM = np.linalg.norm(x_t, axis=0)\n", "\n", "\n", "def relative_error(\n", " log_lambda: float, x_t: np.ndarray = x_t, x_t_norm: float = X_T_NORM, gcv: GCV = gcv\n", ") -> float:\n", " \"\"\"Calculate relative error.\"\"\"\n", " beta = 10**log_lambda\n", " sol = gcv.solution(beta=beta)\n", " return np.linalg.norm(x_t - sol, axis=0) / x_t_norm\n", "\n", "\n", "# minimize relative error\n", "bounds = gcv.bounds\n", "res = minimize_scalar(\n", " relative_error,\n", " bounds=bounds,\n", " method=\"bounded\",\n", " args=(x_t, X_T_NORM, gcv),\n", " options={\"xatol\": 1.0e-10, \"maxiter\": 1000},\n", ")\n", "\n", "# obtain minimum relative error and lambda\n", "error_min = res.fun\n", "lambda_min = 10**res.x\n", "\n", "print(f\"minimum relative error: {error_min:.2%} at lambda = {lambda_min:.4g}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us plot the relative error and curvature as a function of $\\lambda$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set regularization parameters\n", "num = 500\n", "lambdas = np.logspace(*bounds, num=num)\n", "\n", "# calculate errors and gcv\n", "errors = np.asarray([relative_error(log_lambda) for log_lambda in np.linspace(*bounds, num=num)])\n", "gcvs = np.asarray([gcv.gcv(beta) for beta in lambdas])\n", "\n", "# create figure\n", "fig, ax1 = plt.subplots()\n", "ax2 = ax1.twinx()\n", "\n", "# plot errors and gcv\n", "(p1,) = ax1.loglog(lambdas, errors, color=\"C0\")\n", "(p2,) = ax2.loglog(lambdas, gcvs, color=\"C1\")\n", "\n", "# plot minimum error vertical line and point\n", "ax1.axvline(lambda_min, color=\"r\", linestyle=\"--\", linewidth=0.75)\n", "ax1.scatter(lambda_min, error_min, color=\"r\", marker=\"o\", s=10, zorder=2)\n", "ax1.text(\n", " lambda_min,\n", " error_min,\n", " \"$\\\\lambda_\\\\mathrm{min}$\",\n", " color=\"r\",\n", " horizontalalignment=\"left\",\n", " verticalalignment=\"top\",\n", ")\n", "\n", "# plot minimum gcv vertical line and point\n", "assert gcv.lambda_opt is not None\n", "min_gcv = gcv.gcv(gcv.lambda_opt)\n", "ax1.axvline(gcv.lambda_opt, color=\"g\", linestyle=\"--\", linewidth=0.75)\n", "ax2.scatter(gcv.lambda_opt, min_gcv, color=\"g\", marker=\"o\", s=10, zorder=2)\n", "ax2.text(\n", " gcv.lambda_opt,\n", " min_gcv,\n", " \"$\\\\lambda_\\\\mathrm{opt}$\",\n", " color=\"g\",\n", " horizontalalignment=\"right\",\n", " verticalalignment=\"bottom\",\n", ")\n", "\n", "# set axes\n", "ax1.set(\n", " xlim=(lambdas[0], lambdas[-1]),\n", " ylim=(1.0e-3, 1),\n", " xlabel=\"$\\\\lambda$\",\n", " ylabel=\"Relative error $\\\\epsilon_\\\\mathrm{rel}$\",\n", ")\n", "ax2.set(ylabel=\"GCV function\")\n", "\n", "ax1.yaxis.label.set_color(p1.get_color())\n", "ax2.yaxis.label.set_color(p2.get_color())\n", "\n", "ax1.tick_params(axis=\"x\", which=\"both\", direction=\"in\", top=True)\n", "ax1.tick_params(axis=\"y\", which=\"both\", direction=\"in\", colors=p1.get_color())\n", "ax2.tick_params(axis=\"y\", which=\"both\", direction=\"in\", colors=p2.get_color())\n", "\n", "ax2.spines[\"left\"].set_color(p1.get_color())\n", "ax2.spines[\"right\"].set_color(p2.get_color())\n", "\n", "error_opt = relative_error(np.log10(gcv.lambda_opt))\n", "ax1.set_title(\n", " f\"$\\\\epsilon_\\\\mathrm{{rel}}(\\\\lambda_\\\\mathrm{{opt}})$ = {error_opt:.2%}, \"\n", " + f\"$\\\\epsilon_\\\\mathrm{{rel}}(\\\\lambda_\\\\mathrm{{min}}) = ${error_min:.2%}\"\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. footbibliography::" ] } ], "metadata": { "kernelspec": { "display_name": "cherab-phix-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 2 }