{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# GCV criterion\n",
    "\n",
    "Here we show how to use the Generalized Cross Validation (GCV) criterion to select the optimal\n",
    "regularization parameter for an example ill-posed inverse problem.<br>\n",
    "The definition of the GCV criterion is introduced in [this page](../../user/theory/gcv.ipynb).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Definition of the example problem\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We refer to the famous Fredholm integral equation of the first kind devised by <cite data-footcite-t=\"Phillips1962-pk\">Pillips</cite>.\n",
    "Define the function\n",
    "\n",
    "$$\n",
    "\\begin{equation}\n",
    "f(x) \\equiv\n",
    "\\begin{cases}\n",
    "1 + \\displaystyle\\cos\\left(\\frac{\\pi x}{3}\\right), & \\text{if } | x | < 3, \\\\\n",
    "0, & \\text{otherwise}\n",
    "\\end{cases}.\n",
    "\\end{equation}\n",
    "$$\n",
    "\n",
    "Then, the kernel $K(s, t)$ and the solution $x(t)$ are given by\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "K(s, t) &\\equiv f(s - t)\\\\\n",
    "x(t) &\\equiv f(t)\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "Both integral intervals are $[-6, 6]$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from matplotlib import pyplot as plt\n",
    "from scipy.sparse import diags\n",
    "\n",
    "from cherab.inversion import GCV, compute_svd\n",
    "from cherab.inversion.tools import parse_scientific_notation\n",
    "\n",
    "plt.rcParams[\"figure.dpi\"] = 150"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us define functions for the kernel, the solution and the right-hand side.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def _func(x: float) -> float:\n",
    "    if abs(x) < 3.0:\n",
    "        return 1.0 + np.cos(np.pi * x / 3.0)\n",
    "    else:\n",
    "        return 0.0\n",
    "\n",
    "\n",
    "def kernel(t: float, s: float) -> float:\n",
    "    \"\"\"The kernel function.\"\"\"\n",
    "    return _func(t - s)\n",
    "\n",
    "\n",
    "def solution(t: float) -> float:\n",
    "    \"\"\"The solution function.\"\"\"\n",
    "    return _func(t)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we will generate the discretized version of the kernel $\\mathbf{K}$, solution $\\mathbf{x}$ and right-hand side $\\mathbf{b}$ using\n",
    "the trapezoidal rule leading to the following linear system:\n",
    "\n",
    "$$\n",
    "\\begin{equation}\n",
    "\\mathbf{K}\\mathbf{x} = \\mathbf{b},\n",
    "\\end{equation}\n",
    "$$\n",
    "\n",
    "The size of matrix $\\mathbf{K}\\in\\mathbb{R}^{M\\times N}$ is set to $M = N = 64$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "s = np.linspace(-6.0, 6.0, num=64, endpoint=True)\n",
    "t = np.linspace(-6.0, 6.0, num=64, endpoint=True)\n",
    "\n",
    "# discretize kernel\n",
    "k_mat = np.zeros((s.size, t.size))\n",
    "k_mat = np.array([[kernel(i, j) for j in t] for i in s])\n",
    "\n",
    "# trapezoidal rule\n",
    "k_mat[:, 0] *= 0.5\n",
    "k_mat[:, -1] *= 0.5\n",
    "k_mat *= t[1] - t[0]\n",
    "\n",
    "# discretize solution\n",
    "x_t = np.array([solution(i) for i in t])\n",
    "\n",
    "print(f\"{k_mat.shape = }\")\n",
    "print(f\"{x_t.shape = }\")\n",
    "print(f\"condition number of K is {np.linalg.cond(k_mat):.4g}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The right-hand side $\\mathbf{b}\\in\\mathbb{R}^{M}$ is usually contaminated by noise.\n",
    "So, we will add a Gaussian noise to the right-hand side.\n",
    "$$ \\mathbf{b} = \\bar{\\mathbf{b}} + \\mathbf{e}, $$\n",
    "where $\\bar{\\mathbf{b}}$ is the original right-hand side ($\\bar{\\mathbf{b}}\\equiv \\mathbf{K}\\mathbf{x}$), $\\mathbf{e}$ is a vector whose elements are independently sampled from the normal distribution with mean 0 and standard deviation $10^{-3}$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "b_bar = k_mat @ x_t\n",
    "rng = np.random.default_rng()\n",
    "noise = rng.normal(0, 1.0e-3, b_bar.size)\n",
    "b = b_bar + noise"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Solve the inverse problem using the GCV criterion\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The solution of the ill-posed linear equation is obtained with the regularization procedure:\n",
    "\n",
    "$$\n",
    "\\begin{equation}\n",
    "\\mathbf{x}_\\lambda = \\left(\\mathbf{K}^\\mathsf{T}\\mathbf{K} + \\lambda\\mathbf{H}\\right)^{-1}\\mathbf{K}^\\mathsf{T}\\mathbf{b},\n",
    "\\end{equation}\n",
    "$$\n",
    "\n",
    "where $\\mathbf{H}$ is the regularization matrix.\n",
    "Here we set $\\mathbf{H} = \\mathbf{D_2}^\\mathsf{T}\\mathbf{D_2}$, where $\\mathbf{D_2}$ is the second-order difference matrix.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dmat = diags([1, -2, 1], [-1, 0, 1], shape=(t.size, t.size)).tocsr()\n",
    "print(f\"{dmat.shape = }\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we create GCV solver object after calculating the singular value decomposition according\n",
    "to the [series expansion of solution](../../user/theory/inversion.ipynb#Series-expansion-of-the-solution).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hmat = dmat.T @ dmat\n",
    "s, u, basis = compute_svd(k_mat, hmat)\n",
    "gcv = GCV(s, u, basis, data=b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us solve the inverse problem.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sol, status = gcv.solve()\n",
    "print(status)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluate the GCV criterion\n",
    "\n",
    "Next we evaluate the solution obtained by the GCV criterion\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plot GCV curve\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "gcv.plot_gcv();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compare $\\mathbf{x}_\\lambda$ with $\\mathbf{x}_\\mathrm{true}$\n",
    "\n",
    "Let us compare solutions at different regularization parameters $\\lambda=10^{-9}$,\n",
    "$\\lambda_\\text{opt}$, $10^5$ with the true solution $\\mathbf{x}_\\mathrm{true}$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lambdas = [1.0e-9, gcv.lambda_opt, 1.0e5]\n",
    "\n",
    "fig, axes = plt.subplots(1, 3, figsize=(10, 4), sharey=True, layout=\"constrained\")\n",
    "\n",
    "for ax, beta in zip(axes, lambdas, strict=False):\n",
    "    ax.plot(t, x_t, \"--\", label=\"$\\\\mathbf{x}_\\\\mathrm{true}$\")\n",
    "    ax.plot(t, gcv.solution(beta=beta), label=\"$\\\\mathbf{x}_\\\\lambda$\")\n",
    "    ax.axhline(0, color=\"black\", lw=0.75, ls=\"--\", zorder=-1)\n",
    "\n",
    "    ax.set_xlim(t.min(), t.max())\n",
    "    ax.set_ylim(-0.5, x_t.max() * 1.1)\n",
    "    ax.set_xlabel(\"$t$\")\n",
    "    parsed_lambda = parse_scientific_notation(f\"{beta:.2e}\")\n",
    "    ax.set_title(f\"$\\\\lambda = {parsed_lambda}$\")\n",
    "    ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n",
    "    ax.legend(loc=\"upper left\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Check the relative error\n",
    "\n",
    "The relative error between the solution $\\mathbf{x}_\\lambda$ and the true solution\n",
    "$\\mathbf{x}_\\mathrm{true}$ is defined as follows:\n",
    "\n",
    "$$\n",
    "\\begin{equation}\n",
    "\\epsilon_\\mathrm{rel} = \\frac{\\|\\mathbf{x}_\\lambda - \\mathbf{x}_\\mathrm{true}\\|_2}{\\|\\mathbf{x}_\\mathrm{true}\\|_2}.\n",
    "\\end{equation}\n",
    "$$\n",
    "\n",
    "Let us seek the minimum $\\epsilon_\\mathrm{rel}$ as a function of $\\lambda$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.optimize import minimize_scalar\n",
    "\n",
    "X_T_NORM = np.linalg.norm(x_t, axis=0)\n",
    "\n",
    "\n",
    "def relative_error(\n",
    "    log_lambda: float, x_t: np.ndarray = x_t, x_t_norm: float = X_T_NORM, gcv: GCV = gcv\n",
    ") -> float:\n",
    "    \"\"\"Calculate relative error.\"\"\"\n",
    "    beta = 10**log_lambda\n",
    "    sol = gcv.solution(beta=beta)\n",
    "    return np.linalg.norm(x_t - sol, axis=0) / x_t_norm\n",
    "\n",
    "\n",
    "# minimize relative error\n",
    "bounds = gcv.bounds\n",
    "res = minimize_scalar(\n",
    "    relative_error,\n",
    "    bounds=bounds,\n",
    "    method=\"bounded\",\n",
    "    args=(x_t, X_T_NORM, gcv),\n",
    "    options={\"xatol\": 1.0e-10, \"maxiter\": 1000},\n",
    ")\n",
    "\n",
    "# obtain minimum relative error and lambda\n",
    "error_min = res.fun\n",
    "lambda_min = 10**res.x\n",
    "\n",
    "print(f\"minimum relative error: {error_min:.2%} at lambda = {lambda_min:.4g}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us plot the relative error and curvature as a function of $\\lambda$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set regularization parameters\n",
    "num = 500\n",
    "lambdas = np.logspace(*bounds, num=num)\n",
    "\n",
    "# calculate errors and gcv\n",
    "errors = np.asarray([relative_error(log_lambda) for log_lambda in np.linspace(*bounds, num=num)])\n",
    "gcvs = np.asarray([gcv.gcv(beta) for beta in lambdas])\n",
    "\n",
    "# create figure\n",
    "fig, ax1 = plt.subplots()\n",
    "ax2 = ax1.twinx()\n",
    "\n",
    "# plot errors and gcv\n",
    "(p1,) = ax1.loglog(lambdas, errors, color=\"C0\")\n",
    "(p2,) = ax2.loglog(lambdas, gcvs, color=\"C1\")\n",
    "\n",
    "# plot minimum error vertical line and point\n",
    "ax1.axvline(lambda_min, color=\"r\", linestyle=\"--\", linewidth=0.75)\n",
    "ax1.scatter(lambda_min, error_min, color=\"r\", marker=\"o\", s=10, zorder=2)\n",
    "ax1.text(\n",
    "    lambda_min,\n",
    "    error_min,\n",
    "    \"$\\\\lambda_\\\\mathrm{min}$\",\n",
    "    color=\"r\",\n",
    "    horizontalalignment=\"left\",\n",
    "    verticalalignment=\"top\",\n",
    ")\n",
    "\n",
    "# plot minimum gcv vertical line and point\n",
    "assert gcv.lambda_opt is not None\n",
    "min_gcv = gcv.gcv(gcv.lambda_opt)\n",
    "ax1.axvline(gcv.lambda_opt, color=\"g\", linestyle=\"--\", linewidth=0.75)\n",
    "ax2.scatter(gcv.lambda_opt, min_gcv, color=\"g\", marker=\"o\", s=10, zorder=2)\n",
    "ax2.text(\n",
    "    gcv.lambda_opt,\n",
    "    min_gcv,\n",
    "    \"$\\\\lambda_\\\\mathrm{opt}$\",\n",
    "    color=\"g\",\n",
    "    horizontalalignment=\"right\",\n",
    "    verticalalignment=\"bottom\",\n",
    ")\n",
    "\n",
    "# set axes\n",
    "ax1.set(\n",
    "    xlim=(lambdas[0], lambdas[-1]),\n",
    "    ylim=(1.0e-3, 1),\n",
    "    xlabel=\"$\\\\lambda$\",\n",
    "    ylabel=\"Relative error $\\\\epsilon_\\\\mathrm{rel}$\",\n",
    ")\n",
    "ax2.set(ylabel=\"GCV function\")\n",
    "\n",
    "ax1.yaxis.label.set_color(p1.get_color())\n",
    "ax2.yaxis.label.set_color(p2.get_color())\n",
    "\n",
    "ax1.tick_params(axis=\"x\", which=\"both\", direction=\"in\", top=True)\n",
    "ax1.tick_params(axis=\"y\", which=\"both\", direction=\"in\", colors=p1.get_color())\n",
    "ax2.tick_params(axis=\"y\", which=\"both\", direction=\"in\", colors=p2.get_color())\n",
    "\n",
    "ax2.spines[\"left\"].set_color(p1.get_color())\n",
    "ax2.spines[\"right\"].set_color(p2.get_color())\n",
    "\n",
    "error_opt = relative_error(np.log10(gcv.lambda_opt))\n",
    "ax1.set_title(\n",
    "    f\"$\\\\epsilon_\\\\mathrm{{rel}}(\\\\lambda_\\\\mathrm{{opt}})$ = {error_opt:.2%}, \"\n",
    "    + f\"$\\\\epsilon_\\\\mathrm{{rel}}(\\\\lambda_\\\\mathrm{{min}}) = ${error_min:.2%}\"\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. footbibliography::"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "cherab-phix-dev",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}