Features
CART uses an intuitive, Windows-based
interface, making it accessible to both technical and nontechnical users.
Underlying the "easy" interface, however, is a mature theoretical foundation
that distinguishes CART from other methodologies and other decision trees.
Salford Systems' CART is the only decision tree system based on the original
CART code developed by world-renowned Stanford University and University
of California at Berkeley statisticians; this code now includes enhancements
that were co-developed by Salford Systems and CART's originators.
Based on a decade of machine learning and statistical research, CART
provides stable performance and reliable results. Its proven methodology
is characterized by:
a reliable pruning strategy,
CART's developers determined
definitively that no stopping rule could be relied on to discover the optimal
tree, so they introduced the notion of over-growing trees and then pruning
back; this idea, fundamental to CART, ensures that important structure
is not overlooked by stopping too soon. Other decision tree techniques
use problematic stopping rules.
a powerful binary split search approach, and
CART's binary decision trees
are more sparing with data and detect more structure before too little
data is left for learning. Other decision tree approaches use multi-way
splits that fragment the data rapidly, making it difficult to detect rules
that require broad ranges of data to discover.
automatic self-validation procedures.
In the search for patterns
in databases, it is essential to avoid the trap of "overfitting," or finding
patterns that apply only to the training data. CART's embedded test
disciplines ensure that the patterns found will hold up when applied to
new data. Further, the testing and selection of the optimal tree
are an integral part of the CART algorithm. In other decision tree
techniques, testing is conducted after the fact and tree selection is left
up to the user.
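The over-grow-then-prune idea and the built-in test step described above can be illustrated with a minimal Python sketch. It is not Salford Systems' code: the `Node` class, the function names, and all error counts are hypothetical, and the pruning rule shown is the weakest-link (cost-complexity) rule from the published CART methodology, applied to a tiny hand-built tree. Each node stores how many training and held-out cases it would misclassify if it were a leaf.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    train_err: int                  # training cases misclassified if this node were a leaf
    test_err: int                   # held-out cases misclassified if this node were a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def leaves(t: Node) -> List[Node]:
    return [t] if t.is_leaf() else leaves(t.left) + leaves(t.right)


def internal_nodes(t: Node) -> List[Node]:
    return [] if t.is_leaf() else [t] + internal_nodes(t.left) + internal_nodes(t.right)


def link_strength(t: Node) -> float:
    """Training-error increase per leaf removed if the branch under t is collapsed."""
    branch = leaves(t)
    return (t.train_err - sum(leaf.train_err for leaf in branch)) / (len(branch) - 1)


def pruning_sequence(root: Node) -> List[Tuple[float, int, int]]:
    """Collapse the weakest link repeatedly (this mutates the tree).

    Returns one (alpha, leaf count, held-out error) tuple per nested subtree,
    starting with the full over-grown tree and ending with the root alone.
    """
    seq = [(0.0, len(leaves(root)), sum(leaf.test_err for leaf in leaves(root)))]
    while not root.is_leaf():
        weakest = min(internal_nodes(root), key=link_strength)
        alpha = link_strength(weakest)
        weakest.left = weakest.right = None   # prune: the node becomes a leaf
        seq.append((alpha, len(leaves(root)), sum(leaf.test_err for leaf in leaves(root))))
    return seq


# Hypothetical over-grown tree; every count below is made up for illustration.
tree = Node(40, 42,
            left=Node(15, 16, Node(4, 6), Node(5, 7)),
            right=Node(20, 19, Node(10, 12), Node(8, 11)))

sequence = pruning_sequence(tree)
best = min(sequence, key=lambda step: step[2])   # subtree with the lowest held-out error
print(sequence)
print("selected:", best)
```

With these made-up counts, the full four-leaf tree has the lowest training error, but the three-leaf subtree has the lowest held-out error (32 versus 36), so it is the one selected.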
In addition, CART accommodates many different types of real-world modeling
problems by providing a unique combination of automated solutions:
surrogate splitters intelligently handle missing values,
CART handles missing values
in the database by substituting "surrogate splitters," which are back-up
rules that closely mimic the action of primary splitting rules. The
surrogate splitter contains information that is typically similar to what
would be found in the primary splitter. Other products' approaches
treat all records with missing values as if the records all had the same
unknown value; with that approach all such "missings" are assigned to the
same bin. In CART, each record is processed using data specific to
that record; this allows records with different data patterns to be handled
differently, which results in a better characterization of the data.
adjustable misclassification penalties help avoid the most costly errors, and
CART can accommodate situations
in which some misclassifications, or cases that have been incorrectly classified,
are more serious than others. CART users can specify a higher penalty
for misclassifying certain data, and the software will steer the tree away
from that type of error. Further, when CART cannot guarantee a correct
classification, it will try to ensure that the error it does make is less
costly. If credit risk is classified as low, moderate, or high, for
example, it would be much more costly to classify a high risk person as
low risk than as moderate risk. Traditional data mining tools cannot
distinguish between these errors.
alternative splitting criteria make progress when other criteria fail.
CART includes seven single-variable
splitting criteria (Gini, symmetric Gini, twoing, ordered twoing,
and class probability for classification trees, and least squares and least
absolute deviation for regression trees) and one multi-variable splitting
criterion, the linear combinations method. The default Gini method
typically performs best, but, given specific circumstances, other methods
can generate more accurate models. CART's unique "twoing" procedure,
for example, is tuned for classification problems with many classes, such
as modeling which of 170 products would be chosen by a given consumer.
To deal more effectively with select data patterns, CART also offers splits
on linear combinations of continuous predictor variables.
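To make two of the classification criteria concrete, the following Python sketch computes the Gini improvement and the twoing value for a candidate binary split, using the textbook formulas. The function names and the toy labels are assumptions made for illustration; this is not how the CART product itself is implemented or invoked.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def class_probs(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Relative frequency of each class among the given labels."""
    n = len(labels)
    return {cls: count / n for cls, count in Counter(labels).items()}


def gini(labels: Sequence[Hashable]) -> float:
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in class_probs(labels).values())


def gini_improvement(left: Sequence[Hashable], right: Sequence[Hashable]) -> float:
    """Drop in Gini impurity when the parent node is split into left and right."""
    parent = list(left) + list(right)
    p_left = len(left) / len(parent)
    return gini(parent) - p_left * gini(left) - (1.0 - p_left) * gini(right)


def twoing(left: Sequence[Hashable], right: Sequence[Hashable]) -> float:
    """Twoing value: (pL * pR / 4) * (sum over classes of |p(j|L) - p(j|R)|) ** 2."""
    n = len(left) + len(right)
    p_left, p_right = len(left) / n, len(right) / n
    probs_l, probs_r = class_probs(left), class_probs(right)
    spread = sum(abs(probs_l.get(c, 0.0) - probs_r.get(c, 0.0))
                 for c in set(probs_l) | set(probs_r))
    return p_left * p_right / 4.0 * spread ** 2


# Toy labels (made up): one split separates the classes, the other does not.
clean_split = (["a", "a"], ["b", "b"])
useless_split = (["a", "b"], ["a", "b"])

for name, (left, right) in [("clean", clean_split), ("useless", useless_split)]:
    print(name, gini_improvement(left, right), twoing(left, right))
```

Both criteria score the clean split above the useless one, which scores zero under either measure. They can rank candidate splits differently once several classes are involved, which is the situation the twoing criterion is aimed at.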