Testing the additional predictive value of high-dimensional molecular data
14 years ago
Background: While high-dimensional molecular data such as
microarray gene expression data have been used for disease outcome
prediction or diagnosis purposes for about ten years in biomedical
research, the question of the additional predictive value of such
data given that classical predictors are already available has long
been under-considered in the bioinformatics literature. Results: We
suggest an intuitive permutation-based testing procedure for
assessing the additional predictive value of high-dimensional
molecular data. Our method combines two well-known statistical
tools: logistic regression and boosting regression. We give clear
advice for the choice of the only method parameter (the number of
boosting iterations). In simulations, our novel approach is found
to have very good power in different settings, e.g., few strong
predictors or many weak predictors. For illustrative purposes, it is
applied to two publicly available cancer data sets.
Conclusions: Our simple and computationally efficient approach can
be used to globally assess the additional predictive power of a
large number of candidate predictors given that a few clinical
covariates or a known prognostic index are already available. It is
implemented in the R package "globalboosttest", which is publicly
available from R-Forge and will be submitted to CRAN as soon as
possible.
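
The procedure described above (a logistic regression on the clinical covariates, followed by boosting on the molecular predictors, assessed by permutation) can be illustrated with a minimal sketch. This is not the "globalboosttest" API: the function names, the use of componentwise least-squares boosting with a fixed step size `nu`, and the in-sample negative log-likelihood as test statistic are all assumptions made for illustration. The clinical linear predictor is used as a fixed offset, and only the rows of the molecular matrix are permuted, so the link between clinical covariates and outcome is preserved under the null.

```python
import numpy as np

def fit_clinical_offset(Z, y, n_iter=25):
    """Logistic regression of y on clinical covariates Z (Newton-Raphson).
    Returns the fitted linear predictor, used later as a fixed offset."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    beta = np.zeros(Z1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z1 @ beta)))
        w = p * (1.0 - p)
        hessian = Z1.T @ (Z1 * w[:, None]) + 1e-8 * np.eye(Z1.shape[1])
        beta += np.linalg.solve(hessian, Z1.T @ (y - p))
    return Z1 @ beta

def boosted_loss(X, y, offset, mstop=50, nu=0.1):
    """Componentwise boosting with logistic loss on molecular data X,
    starting from the clinical offset. Returns the mean negative
    log-likelihood after mstop iterations (hypothetical test statistic)."""
    Xc = X - X.mean(axis=0)
    ss = (Xc ** 2).sum(axis=0)
    ss[ss == 0] = 1.0  # constant columns get coefficient 0 and are never useful
    f = offset.copy()
    for _ in range(mstop):
        resid = y - 1.0 / (1.0 + np.exp(-f))   # negative gradient of the loss
        coef = Xc.T @ resid / ss               # least-squares fit per predictor
        j = np.argmax(coef ** 2 * ss)          # predictor that fits resid best
        f = f + nu * coef[j] * Xc[:, j]
    p = np.clip(1.0 / (1.0 + np.exp(-f)), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def permutation_pvalue(X, Z, y, mstop=50, n_perm=99, seed=0):
    """Permutation p-value for the additional predictive value of X given Z."""
    rng = np.random.default_rng(seed)
    offset = fit_clinical_offset(Z, y)
    observed = boosted_loss(X, y, offset, mstop)
    permuted = np.array([
        boosted_loss(X[rng.permutation(len(y))], y, offset, mstop)
        for _ in range(n_perm)
    ])
    # Small losses indicate added value; count permutations at least as good.
    return (1 + np.sum(permuted <= observed)) / (1 + n_perm)
```

Here `mstop` plays the role of the single method parameter mentioned above (the number of boosting iterations); the default of 50 is arbitrary and not a recommendation taken from the source.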