Friday, March 9, 2012

Decision trees, how is Prediction probability calculated?

How is the value of Prediction Probability calculated in the context of decision trees?

The prediction probability for a particular state is the probability of that state in the tree node used for the prediction, i.e. Count(State) / Support_of_tree_node.

Please let me know if you have any additional questions.

|||

It is not clear to me how you calculate the prediction probability of a case.

Why would the count of a case be different from 1?

How do you calculate the support of the tree node?

Is it possible to get a reference to an article explaining those principles for the specific Microsoft decision tree implementation?

|||

Assume you have a model that predicts BikeBuyer based on a HomeOwnership (HO) flag.

Assume also that the tree has a split on HomeOwnership = true.

The support of the tree node is the number of training cases where HO=true.

The probability for BikeBuyer=1 in that node is

#(HO=true AND BikeBuyer=1) / #(HO=true)

In general, Probability (Attr=Value) in a tree node is:

#(NodeCondition & Attr=Value) / #(NodeCondition)

The actual value returned by the decision tree algorithm is adjusted with regard to the marginal probability of the node (number of training cases in the node / all training cases).
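As a rough illustration of the unadjusted ratio #(NodeCondition & Attr=Value) / #(NodeCondition), here is a minimal Python sketch. The training cases and the helper name `node_probability` are made up for the example, and the sketch does not attempt to reproduce the adjustment that the algorithm applies to the returned value.

```python
from collections import Counter

# Hypothetical training cases as (home_owner, bike_buyer) pairs.
cases = [
    (True, 1), (True, 1), (True, 0), (True, 1),
    (False, 0), (False, 1), (False, 0), (False, 0),
]


def node_probability(cases, node_condition, target_value):
    """Return (support, probability) of target_value inside one tree node.

    support     = number of training cases that satisfy the node condition
    probability = count of those cases with the target value / support
    This is the raw ratio only; no adjustment is applied.
    """
    in_node = [case for case in cases if node_condition(case)]
    support = len(in_node)
    if support == 0:
        return 0, 0.0
    counts = Counter(bike_buyer for _, bike_buyer in in_node)
    return support, counts[target_value] / support


# Node condition: HO = true. Target state: BikeBuyer = 1.
support, prob = node_probability(cases, lambda case: case[0] is True, 1)
print(support, prob)  # 4 0.75
```

With this toy data, four cases have HO=true and three of them have BikeBuyer=1, so the node support is 4 and the raw probability is 3/4.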

|||

1. Since we are speaking about cases, we are always talking about all attributes.

Do you refer to the leaf in the tree that led to the classification of the specific case and use only the attributes on the path to it, or do you use all attributes of the case?

2. Are you in fact saying that the predicted probability of a case (which usually has all the attributes) is calculated as the simple multiplication of the probabilities (assuming independence), or do you look for the joint distribution?

3. What is the parameter used for the ordering of the cases for the Lift chart?

|||

It is a bit confusing at predict time, but you need to realize that much of the consideration of all attributes is taken care of at model training time. Therefore, the attributes on the path are considered the ones that are important for the prediction. The predicted probability is therefore a simple ratio: the number of cases at the leaf that match the target state divided by the number of all cases at the leaf.

The predicted probability of a case, in naive terms, would be the simple multiplication of the probabilities, but that is not what the decision tree uses.

The lift chart is ordered by the predicted probability of the target state. If you do not specify a target, it is ordered by the predicted probability of the most probable state.
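To make the ordering concrete, here is a small Python sketch that sorts scored cases by the predicted probability of the target state and accumulates the share of actual positives captured. The scores, outcomes, and the function name `lift_points` are hypothetical, and this is only an approximation of the idea, not the product's own lift chart computation.

```python
# Hypothetical scored cases: (predicted probability of target state, actual outcome).
scored = [(0.20, 0), (0.75, 1), (0.40, 0), (0.90, 1), (0.75, 0), (0.10, 0)]


def lift_points(scored):
    """Order cases by predicted probability (highest first) and return
    the ordering plus (fraction of population, fraction of positives captured)."""
    ordered = sorted(scored, key=lambda s: s[0], reverse=True)
    total_positives = sum(actual for _, actual in ordered)
    points = []
    captured = 0
    for rank, (_, actual) in enumerate(ordered, start=1):
        captured += actual
        share = captured / total_positives if total_positives else 0.0
        points.append((rank / len(ordered), share))
    return ordered, points


ordered, points = lift_points(scored)
for (prob, actual), (population, share) in zip(ordered, points):
    print(f"p={prob:.2f} actual={actual} population={population:.2f} captured={share:.2f}")
```

In this toy data both positive cases are among the two highest-scored cases, so the curve reaches 100% of positives after one third of the population.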
