PRESS criterion#

Definition#

The Predicted Residual Error Sum of Squares (PRESS) criterion, also called ordinary cross-validation, is based on leave-one-out cross-validation and was proposed by Allen [1]. Let \(\mathbf{x}_\lambda^{(l)}\) be the regularized solution obtained when the \(l\text{-th}\) observation is omitted. The idea behind PRESS is that if \(\lambda\) is a good choice, then the \(l\text{-th}\) component \(\left(\mathbf{T}\mathbf{x}_\lambda^{(l)}\right)_l\) should be a good predictor of \(b_l\). The criterion therefore selects \(\lambda\) as the minimizer of the PRESS function \(\mathcal{P}(\lambda)\), defined by

\[\begin{equation} \mathcal{P}(\lambda) \equiv \sum_{l=1}^{M} \left[ \left( \mathbf{T}\mathbf{x}_\lambda^{(l)} \right)_l - b_l \right]^2. \end{equation}\]

Using the Sherman-Morrison-Woodbury formula, it can be rewritten without refitting for each omitted observation:

\[\begin{equation} \mathcal{P}(\lambda) = \| \mathbf{B}_\lambda \left( \mathbf{I} -\mathbf{A}_\lambda \right) \mathbf{b} \|_2^2, \end{equation}\]

where

\[\begin{split}\begin{align*} \mathbf{A}_\lambda &\equiv \mathbf{T}\left(\mathbf{T}^\mathsf{T}\mathbf{T} + \lambda\mathbf{H}\right)^{-1}\mathbf{T}^\mathsf{T},\\ \mathbf{B}_\lambda &\equiv \text{diag} \left( \cdots,\frac{1}{1 - a_{\lambda, ii}},\cdots \right),\\ a_{\lambda, ii} &\equiv \left(\mathbf{A}_\lambda\right)_{ii}. \end{align*}\end{split}\]
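The equivalence between the leave-one-out definition and the closed form can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names, the random test problem, and the choice \(\mathbf{H} = \mathbf{I}\) are assumptions made only for illustration, and the regularized solution is assumed to satisfy \(\left(\mathbf{T}^\mathsf{T}\mathbf{T} + \lambda\mathbf{H}\right)\mathbf{x}_\lambda = \mathbf{T}^\mathsf{T}\mathbf{b}\), consistent with the definition of \(\mathbf{A}_\lambda\).

```python
import numpy as np


def solve_regularized(T, b, H, lam):
    """Solve (T^T T + lam * H) x = T^T b (assumed form of the solution)."""
    return np.linalg.solve(T.T @ T + lam * H, T.T @ b)


def press_brute_force(T, b, H, lam):
    """PRESS from its definition: refit M times, omitting one observation each time."""
    M = T.shape[0]
    total = 0.0
    for l in range(M):
        keep = np.arange(M) != l  # mask that drops the l-th row
        x_l = solve_regularized(T[keep], b[keep], H, lam)
        total += (T[l] @ x_l - b[l]) ** 2
    return total


def press_closed_form(T, b, H, lam):
    """PRESS as || B (I - A) b ||^2, with B = diag(1 / (1 - a_ii))."""
    A = T @ np.linalg.solve(T.T @ T + lam * H, T.T)
    residual = b - A @ b
    return np.sum((residual / (1.0 - np.diag(A))) ** 2)


# Hypothetical test problem: M = 12 observations, N = 5 unknowns.
rng = np.random.default_rng(0)
M, N = 12, 5
T = rng.standard_normal((M, N))
b = rng.standard_normal(M)
H = np.eye(N)  # illustrative choice of regularization matrix
lam = 0.3

p_brute = press_brute_force(T, b, H, lam)
p_closed = press_closed_form(T, b, H, lam)
print(np.isclose(p_brute, p_closed))
```

The brute-force version needs \(M\) solves per value of \(\lambda\), whereas the closed form needs a single one, which is why the second form is the practical one when scanning over \(\lambda\).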

Using the series-expansion form of the solution, \(\mathcal{P}(\lambda)\) can be written as

\[\begin{equation} \mathcal{P}(\lambda) = \text{Coming soon...} \label{eq:PRESS_series} \end{equation}\]

Derivation of \eqref{eq:PRESS_series}#

Using the decomposed form of the solution, we have

\[\begin{split}\begin{align*} \mathbf{A}_\lambda \mathbf{b} &= \mathbf{T} \mathbf{x}_\lambda\\ &= \mathbf{T}\tilde{\mathbf{V}}\mathbf{F}_\lambda\mathbf{S}^{-1}\mathbf{U}^\mathsf{T}\mathbf{b}\\ &= \mathbf{U}\mathbf{S}\mathbf{F}_\lambda\mathbf{S}^{-1}\mathbf{U}^\mathsf{T}\mathbf{b}\quad(\because \mathbf{T}\tilde{\mathbf{V}} = \mathbf{U}\mathbf{S})\\ &= \mathbf{U}\mathbf{F}_\lambda\mathbf{U}^\mathsf{T}\mathbf{b}.\\ \therefore \mathbf{A}_\lambda &= \mathbf{U}\mathbf{F}_\lambda\mathbf{U}^\mathsf{T}. \end{align*}\end{split}\]
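The identity \(\mathbf{A}_\lambda = \mathbf{U}\mathbf{F}_\lambda\mathbf{U}^\mathsf{T}\) can also be checked numerically. The sketch below rests on assumptions that this section does not state: that \(\mathbf{H} = \mathbf{L}^\mathsf{T}\mathbf{L}\) with \(\mathbf{L}\) invertible, that \(\mathbf{U}\mathbf{S}\mathbf{V}^\mathsf{T}\) is the thin SVD of \(\mathbf{T}\mathbf{L}^{-1}\) with \(\tilde{\mathbf{V}} = \mathbf{L}^{-1}\mathbf{V}\), and that the filter factors are \(f_{\lambda,i} = s_i^2 / (s_i^2 + \lambda)\). If the decomposition used elsewhere in this documentation differs, the code needs to be adapted accordingly.

```python
import numpy as np

# Hypothetical test problem: M = 10 observations, N = 6 unknowns.
rng = np.random.default_rng(1)
M, N = 10, 6
T = rng.standard_normal((M, N))
lam = 0.5

# Assumption: H is symmetric positive definite, so H = L^T L with L invertible.
G = rng.standard_normal((N, N))
H = G.T @ G + N * np.eye(N)
# np.linalg.cholesky returns lower-triangular C with H = C C^T, hence L = C^T.
L = np.linalg.cholesky(H).T

# Assumption: thin SVD of T L^{-1} = U S V^T, and V_tilde = L^{-1} V.
U, s, Vt = np.linalg.svd(T @ np.linalg.inv(L), full_matrices=False)
V_tilde = np.linalg.solve(L, Vt.T)

# Assumption: filter factors f_i = s_i^2 / (s_i^2 + lam).
f = s**2 / (s**2 + lam)

# Influence matrix from its definition and from the decomposition.
A_direct = T @ np.linalg.solve(T.T @ T + lam * H, T.T)
A_svd = (U * f) @ U.T  # U F U^T, with F = diag(f)

# Diagonal entries a_ii = sum_j U_ij^2 f_j, without forming A explicitly.
diag_A = (U**2) @ f

relation_ok = np.allclose(T @ V_tilde, U * s)   # T V_tilde = U S
identity_ok = np.allclose(A_direct, A_svd)      # A = U F U^T
diag_ok = np.allclose(np.diag(A_direct), diag_A)
print(relation_ok, identity_ok, diag_ok)
```

Under these assumptions, only \(\mathbf{F}_\lambda\) depends on \(\lambda\), so the diagonal entries \(a_{\lambda, ii}\) needed for \(\mathbf{B}_\lambda\) can be re-evaluated for each \(\lambda\) from a single decomposition.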

Limitation#

Coming soon…

References#