Variable Selection Bias in Classification Trees Based on Imprecise Probabilities
Description
19 years ago
Classification trees based on imprecise probabilities extend
classical classification trees. The Gini index is the default
splitting criterion in classical classification trees, whereas
classification trees based on imprecise probabilities use an
extension of the Shannon entropy as the splitting criterion.
However, using these empirical entropy measures as split selection
criteria can bias variable selection, so that variables are
preferred for characteristics other than their information content.
This bias is not eliminated by the imprecise probability approach.
The source of the variable selection bias for the estimated Shannon
entropy is outlined, together with possible corrections. The
variable selection performance of the biased and corrected
estimators is evaluated in a simulation study. Additional results
from research on variable selection bias in classical
classification trees are incorporated; they suggest further
investigation of alternative split selection criteria in
classification trees based on imprecise probabilities.
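
To illustrate where such a bias can come from, the following is a minimal sketch and not the estimator or simulation design of the paper. It compares the empirical (plug-in) Shannon entropy with a bias-corrected version on small samples of a fair binary class. The abstract does not name the corrections it considers; the Miller-Madow term (K - 1) / (2n) used here is one standard correction and is an assumption, as are the function names, sample sizes, and repetition count.

```python
import math
import random
from collections import Counter


def plugin_entropy(labels):
    """Empirical (plug-in) Shannon entropy of a label sample, in nats."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(labels):
    """Plug-in entropy plus the Miller-Madow term (K - 1) / (2n).

    K is the number of classes actually observed in the sample.
    Assumption: this correction is chosen for illustration only.
    """
    n = len(labels)
    k_observed = len(set(labels))
    return plugin_entropy(labels) + (k_observed - 1) / (2 * n)


def mean_estimate(estimator, n, reps=5000, seed=0):
    """Average an estimator over repeated samples of a fair binary class."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.randint(0, 1) for _ in range(n)]
        total += estimator(sample)
    return total / reps


true_entropy = math.log(2)  # entropy of a fair binary class, in nats
for n in (5, 20, 50):
    plugin_bias = true_entropy - mean_estimate(plugin_entropy, n)
    corrected_bias = true_entropy - mean_estimate(miller_madow_entropy, n)
    print(f"n={n:3d}  plug-in bias={plugin_bias:.4f}  corrected bias={corrected_bias:.4f}")
```

The plug-in estimator underestimates the true entropy, and the shortfall grows as the sample size shrinks. A split criterion built on it can therefore rank variables differently simply because they are evaluated on different numbers of observations or categories, which is the kind of non-informative preference the abstract describes.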