Minimization and estimation of the variance of prediction errors for cross-validation designs 1/2

Description

10 years ago
We consider the mean prediction error of a classification or
regression procedure together with its cross-validation estimates, and
investigate the variance of such an estimate as a function of an
arbitrary cross-validation design. We decompose this variance into
a scalar product of coefficients and certain covariance
expressions, such that the coefficients depend solely on the
resampling design and the covariances depend solely on the data's
probability distribution. We rewrite this scalar product in a
form in which the initially large number of summands can gradually be
reduced to three, provided a quadratic approximation to the core
covariances holds. We give an analytical example in which this
quadratic approximation holds exactly. Moreover, in this example
we show that the leave-p-out estimator of the error depends on p
only through a constant and can therefore be written in a much
simpler form. Furthermore, contrary to a claim in the literature,
an unbiased estimator of the variance of K-fold cross-validation
does exist. As a consequence, we can show that Balanced Incomplete
Block Designs have smaller variance than K-fold cross-validation.
This property is confirmed on a real data example from the UCI
machine learning repository. Finally, we show how to find Balanced
Incomplete Block Designs in practice.
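To make the notion of a Balanced Incomplete Block Design concrete, the sketch below builds one small, well-known example (the Fano plane, a (7, 7, 3, 3, 1)-design), verifies the defining balance property, and turns its blocks into cross-validation splits by holding out each block as a test set. This is an illustrative construction only, not the search procedure the paper describes; the function names and the use of the Fano plane are my own choices.

```python
from itertools import combinations

# Fano plane: a (7, 7, 3, 3, 1)-BIBD over points 0..6 — seven blocks of
# size 3, each point lies in 3 blocks, and every pair of points appears
# together in exactly one block.
FANO_BLOCKS = [
    (0, 1, 2), (0, 3, 4), (0, 5, 6),
    (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5),
]

def is_bibd(blocks, v, k, lam):
    """Check that `blocks` over points 0..v-1 all have size k and that
    every pair of distinct points co-occurs in exactly `lam` blocks."""
    if any(len(set(b)) != k for b in blocks):
        return False
    pair_counts = {pair: 0 for pair in combinations(range(v), 2)}
    for b in blocks:
        for pair in combinations(sorted(b), 2):
            pair_counts[pair] += 1
    return all(count == lam for count in pair_counts.values())

def cv_splits_from_design(blocks, v):
    """Interpret each block as the held-out test set of one
    cross-validation split; its complement is the training set."""
    points = set(range(v))
    return [(sorted(points - set(b)), sorted(b)) for b in blocks]

print(is_bibd(FANO_BLOCKS, v=7, k=3, lam=1))  # True
for train, test in cv_splits_from_design(FANO_BLOCKS, v=7):
    print(train, test)
```

In this reading, the seven "points" are seven disjoint partitions of the data, so the design yields seven splits in which every pair of partitions is held out together equally often; this balance is what drives the variance comparison with K-fold cross-validation in the abstract.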
