{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# GCV criterion\n",
"\n",
"Here we show how to use the Generalized Cross Validation (GCV) criterion to select the optimal\n",
"regularization parameter for an example ill-posed inverse problem.
\n",
"The definition of the GCV criterion is introduced in [this page](../../user/theory/gcv.ipynb).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Definition of the example problem\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We refer to the famous Fredholm integral equation of the first kind devised by Pillips.\n",
"Define the function\n",
"\n",
"$$\n",
"\\begin{equation}\n",
"f(x) \\equiv\n",
"\\begin{cases}\n",
"1 + \\displaystyle\\cos\\left(\\frac{\\pi x}{3}\\right), & \\text{if } | x | < 3, \\\\\n",
"0, & \\text{otherwise}\n",
"\\end{cases}.\n",
"\\end{equation}\n",
"$$\n",
"\n",
"Then, the kernel $K(s, t)$ and the solution $x(t)$ are given by\n",
"\n",
"$$\n",
"\\begin{align}\n",
"K(s, t) &\\equiv f(s - t)\\\\\n",
"x(t) &\\equiv f(t)\n",
"\\end{align}\n",
"$$\n",
"\n",
"Both integral intervals are $[-6, 6]$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from matplotlib import pyplot as plt\n",
"from scipy.sparse import diags\n",
"\n",
"from cherab.inversion import GCV, compute_svd\n",
"from cherab.inversion.tools import parse_scientific_notation\n",
"\n",
"plt.rcParams[\"figure.dpi\"] = 150"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us define functions for the kernel, the solution and the right-hand side.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def _func(x: float) -> float:\n",
" if abs(x) < 3.0:\n",
" return 1.0 + np.cos(np.pi * x / 3.0)\n",
" else:\n",
" return 0.0\n",
"\n",
"\n",
"def kernel(t: float, s: float) -> float:\n",
" \"\"\"The kernel function.\"\"\"\n",
" return _func(t - s)\n",
"\n",
"\n",
"def solution(t: float) -> float:\n",
" \"\"\"The solution function.\"\"\"\n",
" return _func(t)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we will generate the discretized version of the kernel $\\mathbf{K}$, solution $\\mathbf{x}$ and right-hand side $\\mathbf{b}$ using\n",
"the trapezoidal rule leading to the following linear system:\n",
"\n",
"$$\n",
"\\begin{equation}\n",
"\\mathbf{K}\\mathbf{x} = \\mathbf{b},\n",
"\\end{equation}\n",
"$$\n",
"\n",
"The size of matrix $\\mathbf{K}\\in\\mathbb{R}^{M\\times N}$ is set to $M = N = 64$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s = np.linspace(-6.0, 6.0, num=64, endpoint=True)\n",
"t = np.linspace(-6.0, 6.0, num=64, endpoint=True)\n",
"\n",
"# discretize kernel\n",
"k_mat = np.zeros((s.size, t.size))\n",
"k_mat = np.array([[kernel(i, j) for j in t] for i in s])\n",
"\n",
"# trapezoidal rule\n",
"k_mat[:, 0] *= 0.5\n",
"k_mat[:, -1] *= 0.5\n",
"k_mat *= t[1] - t[0]\n",
"\n",
"# discretize solution\n",
"x_t = np.array([solution(i) for i in t])\n",
"\n",
"print(f\"{k_mat.shape = }\")\n",
"print(f\"{x_t.shape = }\")\n",
"print(f\"condition number of K is {np.linalg.cond(k_mat):.4g}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The right-hand side $\\mathbf{b}\\in\\mathbb{R}^{M}$ is usually contaminated by noise.\n",
"So, we will add a Gaussian noise to the right-hand side.\n",
"$$ \\mathbf{b} = \\bar{\\mathbf{b}} + \\mathbf{e}, $$\n",
"where $\\bar{\\mathbf{b}}$ is the original right-hand side ($\\bar{\\mathbf{b}}\\equiv \\mathbf{K}\\mathbf{x}$), $\\mathbf{e}$ is a vector whose elements are independently sampled from the normal distribution with mean 0 and standard deviation $10^{-3}$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"b_bar = k_mat @ x_t\n",
"rng = np.random.default_rng()\n",
"noise = rng.normal(0, 1.0e-3, b_bar.size)\n",
"b = b_bar + noise"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solve the inverse problem using the GCV criterion\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The solution of the ill-posed linear equation is obtained with the regularization procedure:\n",
"\n",
"$$\n",
"\\begin{equation}\n",
"\\mathbf{x}_\\lambda = \\left(\\mathbf{K}^\\mathsf{T}\\mathbf{K} + \\lambda\\mathbf{H}\\right)^{-1}\\mathbf{K}^\\mathsf{T}\\mathbf{b},\n",
"\\end{equation}\n",
"$$\n",
"\n",
"where $\\mathbf{H}$ is the regularization matrix.\n",
"Here we set $\\mathbf{H} = \\mathbf{D_2}^\\mathsf{T}\\mathbf{D_2}$, where $\\mathbf{D_2}$ is the second-order difference matrix.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dmat = diags([1, -2, 1], [-1, 0, 1], shape=(t.size, t.size)).tocsr()\n",
"print(f\"{dmat.shape = }\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we create GCV solver object after calculating the singular value decomposition according\n",
"to the [series expansion of solution](../../user/theory/inversion.ipynb#Series-expansion-of-the-solution).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hmat = dmat.T @ dmat\n",
"s, u, basis = compute_svd(k_mat, hmat)\n",
"gcv = GCV(s, u, basis, data=b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us solve the inverse problem.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sol, status = gcv.solve()\n",
"print(status)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Evaluate the GCV criterion\n",
"\n",
"Next we evaluate the solution obtained by the GCV criterion\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot GCV curve\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gcv.plot_gcv();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compare $\\mathbf{x}_\\lambda$ with $\\mathbf{x}_\\mathrm{true}$\n",
"\n",
"Let us compare solutions at different regularization parameters $\\lambda=10^{-9}$,\n",
"$\\lambda_\\text{opt}$, $10^5$ with the true solution $\\mathbf{x}_\\mathrm{true}$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lambdas = [1.0e-9, gcv.lambda_opt, 1.0e5]\n",
"\n",
"fig, axes = plt.subplots(1, 3, figsize=(10, 4), sharey=True, layout=\"constrained\")\n",
"\n",
"for ax, beta in zip(axes, lambdas, strict=False):\n",
" ax.plot(t, x_t, \"--\", label=\"$\\\\mathbf{x}_\\\\mathrm{true}$\")\n",
" ax.plot(t, gcv.solution(beta=beta), label=\"$\\\\mathbf{x}_\\\\lambda$\")\n",
" ax.axhline(0, color=\"black\", lw=0.75, ls=\"--\", zorder=-1)\n",
"\n",
" ax.set_xlim(t.min(), t.max())\n",
" ax.set_ylim(-0.5, x_t.max() * 1.1)\n",
" ax.set_xlabel(\"$t$\")\n",
" parsed_lambda = parse_scientific_notation(f\"{beta:.2e}\")\n",
" ax.set_title(f\"$\\\\lambda = {parsed_lambda}$\")\n",
" ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n",
" ax.legend(loc=\"upper left\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check the relative error\n",
"\n",
"The relative error between the solution $\\mathbf{x}_\\lambda$ and the true solution\n",
"$\\mathbf{x}_\\mathrm{true}$ is defined as follows:\n",
"\n",
"$$\n",
"\\begin{equation}\n",
"\\epsilon_\\mathrm{rel} = \\frac{\\|\\mathbf{x}_\\lambda - \\mathbf{x}_\\mathrm{true}\\|_2}{\\|\\mathbf{x}_\\mathrm{true}\\|_2}.\n",
"\\end{equation}\n",
"$$\n",
"\n",
"Let us seek the minimum $\\epsilon_\\mathrm{rel}$ as a function of $\\lambda$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from scipy.optimize import minimize_scalar\n",
"\n",
"X_T_NORM = np.linalg.norm(x_t, axis=0)\n",
"\n",
"\n",
"def relative_error(\n",
" log_lambda: float, x_t: np.ndarray = x_t, x_t_norm: float = X_T_NORM, gcv: GCV = gcv\n",
") -> float:\n",
" \"\"\"Calculate relative error.\"\"\"\n",
" beta = 10**log_lambda\n",
" sol = gcv.solution(beta=beta)\n",
" return np.linalg.norm(x_t - sol, axis=0) / x_t_norm\n",
"\n",
"\n",
"# minimize relative error\n",
"bounds = gcv.bounds\n",
"res = minimize_scalar(\n",
" relative_error,\n",
" bounds=bounds,\n",
" method=\"bounded\",\n",
" args=(x_t, X_T_NORM, gcv),\n",
" options={\"xatol\": 1.0e-10, \"maxiter\": 1000},\n",
")\n",
"\n",
"# obtain minimum relative error and lambda\n",
"error_min = res.fun\n",
"lambda_min = 10**res.x\n",
"\n",
"print(f\"minimum relative error: {error_min:.2%} at lambda = {lambda_min:.4g}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us plot the relative error and curvature as a function of $\\lambda$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# set regularization parameters\n",
"num = 500\n",
"lambdas = np.logspace(*bounds, num=num)\n",
"\n",
"# calculate errors and gcv\n",
"errors = np.asarray([relative_error(log_lambda) for log_lambda in np.linspace(*bounds, num=num)])\n",
"gcvs = np.asarray([gcv.gcv(beta) for beta in lambdas])\n",
"\n",
"# create figure\n",
"fig, ax1 = plt.subplots()\n",
"ax2 = ax1.twinx()\n",
"\n",
"# plot errors and gcv\n",
"(p1,) = ax1.loglog(lambdas, errors, color=\"C0\")\n",
"(p2,) = ax2.loglog(lambdas, gcvs, color=\"C1\")\n",
"\n",
"# plot minimum error vertical line and point\n",
"ax1.axvline(lambda_min, color=\"r\", linestyle=\"--\", linewidth=0.75)\n",
"ax1.scatter(lambda_min, error_min, color=\"r\", marker=\"o\", s=10, zorder=2)\n",
"ax1.text(\n",
" lambda_min,\n",
" error_min,\n",
" \"$\\\\lambda_\\\\mathrm{min}$\",\n",
" color=\"r\",\n",
" horizontalalignment=\"left\",\n",
" verticalalignment=\"top\",\n",
")\n",
"\n",
"# plot minimum gcv vertical line and point\n",
"assert gcv.lambda_opt is not None\n",
"min_gcv = gcv.gcv(gcv.lambda_opt)\n",
"ax1.axvline(gcv.lambda_opt, color=\"g\", linestyle=\"--\", linewidth=0.75)\n",
"ax2.scatter(gcv.lambda_opt, min_gcv, color=\"g\", marker=\"o\", s=10, zorder=2)\n",
"ax2.text(\n",
" gcv.lambda_opt,\n",
" min_gcv,\n",
" \"$\\\\lambda_\\\\mathrm{opt}$\",\n",
" color=\"g\",\n",
" horizontalalignment=\"right\",\n",
" verticalalignment=\"bottom\",\n",
")\n",
"\n",
"# set axes\n",
"ax1.set(\n",
" xlim=(lambdas[0], lambdas[-1]),\n",
" ylim=(1.0e-3, 1),\n",
" xlabel=\"$\\\\lambda$\",\n",
" ylabel=\"Relative error $\\\\epsilon_\\\\mathrm{rel}$\",\n",
")\n",
"ax2.set(ylabel=\"GCV function\")\n",
"\n",
"ax1.yaxis.label.set_color(p1.get_color())\n",
"ax2.yaxis.label.set_color(p2.get_color())\n",
"\n",
"ax1.tick_params(axis=\"x\", which=\"both\", direction=\"in\", top=True)\n",
"ax1.tick_params(axis=\"y\", which=\"both\", direction=\"in\", colors=p1.get_color())\n",
"ax2.tick_params(axis=\"y\", which=\"both\", direction=\"in\", colors=p2.get_color())\n",
"\n",
"ax2.spines[\"left\"].set_color(p1.get_color())\n",
"ax2.spines[\"right\"].set_color(p2.get_color())\n",
"\n",
"error_opt = relative_error(np.log10(gcv.lambda_opt))\n",
"ax1.set_title(\n",
" f\"$\\\\epsilon_\\\\mathrm{{rel}}(\\\\lambda_\\\\mathrm{{opt}})$ = {error_opt:.2%}, \"\n",
" + f\"$\\\\epsilon_\\\\mathrm{{rel}}(\\\\lambda_\\\\mathrm{{min}}) = ${error_min:.2%}\"\n",
");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References\n"
]
},
{
"cell_type": "raw",
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"source": [
".. footbibliography::"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "cherab-phix-dev",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}