The effects of pruning methods on the predictive accuracy of induced decision trees (Q2711697)
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | The effects of pruning methods on the predictive accuracy of induced decision trees | scientific article | |
Statements
25 April 2001
induction of decision trees
decision tree pruning
state space
cross-validation study
The effects of pruning methods on the predictive accuracy of induced decision trees (English)
Various heuristic methods have been proposed for the construction of a decision tree, the most widely known of which is the top-down approach. In top-down induction of decision trees three tasks can be identified: (1) the assignment of a class to each leaf; (2) the selection of splits according to a selection measure; (3) the decision whether to declare a node terminal or to continue splitting it. Methods that control the growth of a decision tree during its construction are called pre-pruning methods, while the others are called post-pruning methods.

Many post-pruning (or simply pruning) methods have been proposed in the literature, among them reduced error pruning, minimum error pruning, pessimistic error pruning, critical value pruning, cost-complexity pruning, and error-based pruning. A previous comparative study has already pointed out both their similarities and their differences and investigated the real effect of some of these methods on both the predictive accuracy and the size of the induced tree. In that study, optimally pruned trees were used to evaluate the maximum improvement produced by an ideal pruning algorithm.

The main purpose of this paper is to provide a further comparison of these pruning methods. The article presents a unifying framework according to which any pruning method can be defined as a four-tuple (space, operators, evaluation function, search strategy), so that the pruning process can be cast as an optimization problem. A new empirical analysis of the effect of post-pruning on both the predictive accuracy and the size of induced decision trees is reported. The experimental comparison of the pruning methods involves 14 data sets and is based on the cross-validation procedure. The results confirm most of the conclusions drawn in a previous comparison based on the holdout procedure.
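The four-tuple view of pruning described above can be made concrete with a small example. The following Python sketch is not the paper's implementation; all names (`Node`, `reduced_error_prune`, and so on) are illustrative assumptions. It casts reduced error pruning, one of the methods listed above, in the paper's terms: the space is the set of pruned subtrees, the single operator replaces an internal node by a leaf, the evaluation function is accuracy on a separate pruning set, and the search strategy is a single bottom-up pass.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A binary decision-tree node; leaves have feature=None and predict `klass`."""
    klass: int                          # majority class of the training examples at this node
    feature: Optional[int] = None       # index of the splitting feature (None for a leaf)
    threshold: float = 0.0              # split threshold: go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict(node: Node, x: List[float]) -> int:
    """Route an example down the (sub)tree and return the class of the leaf it reaches."""
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.klass


def accuracy(node: Node, X: List[List[float]], y: List[int]) -> float:
    """Evaluation function: accuracy of the subtree on the pruning examples that reach it."""
    if not X:
        return 1.0
    return sum(predict(node, x) == t for x, t in zip(X, y)) / len(X)


def reduced_error_prune(node: Node, X: List[List[float]], y: List[int]) -> Node:
    """Search strategy: one bottom-up pass over the space of pruned subtrees.
    The only operator replaces an internal node by a leaf labelled with its
    majority class; the move is kept whenever the evaluation function
    (accuracy on the pruning set) does not decrease."""
    if node.feature is None:            # already a leaf: nothing to prune
        return node
    # Partition the pruning examples that reach this node between its children.
    lX, ly, rX, ry = [], [], [], []
    for x, t in zip(X, y):
        if x[node.feature] <= node.threshold:
            lX.append(x); ly.append(t)
        else:
            rX.append(x); ry.append(t)
    node.left = reduced_error_prune(node.left, lX, ly)
    node.right = reduced_error_prune(node.right, rX, ry)
    # Pruning operator: candidate leaf carrying this node's majority class.
    leaf = Node(klass=node.klass)
    # Keep the simpler tree when it is at least as accurate on the pruning set.
    return leaf if accuracy(leaf, X, y) >= accuracy(node, X, y) else node


# Toy usage: a depth-1 tree pruned with a two-example pruning set.
tree = Node(klass=0, feature=0, threshold=0.5,
            left=Node(klass=0), right=Node(klass=1))
pruned = reduced_error_prune(tree, [[0.2], [0.8]], [0, 0])
print(pruned.feature is None)           # True: the split does not help on the pruning set
```

Ties are broken in favour of the smaller tree, a common convention for reduced error pruning; other methods mentioned above differ mainly in the evaluation function and search strategy slots of the same four-tuple.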