NEEMS Lecture: 5. Train the Next Action Classifier

In the previous section we covered some machine learning theory. This section walks through training our decision tree model.

First, the prepared_narratives created earlier in this lecture are split into a train_set and a test_set. The train_set is then split into features and labels: the features contain the previous and parent actions for each action, and the labels (the columns with the next_ prefix) indicate which action follows.
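The two splits described above can be sketched as follows. split_data_in_features_and_labels is the lecture's helper; the toy implementation, column names, and sample data here are assumptions made purely for illustration:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def split_data_in_features_and_labels(df):
    # Hypothetical sketch: label columns carry the next_ prefix,
    # everything else (current and parent action) is a feature.
    feature_cols = [c for c in df.columns if not c.startswith("next_")]
    label_cols = [c for c in df.columns if c.startswith("next_")]
    return df[feature_cols], feature_cols, df[label_cols], label_cols

# Toy stand-in for the prepared narratives
prepared_narratives = pd.DataFrame({
    "type_MovingToLocation":    [1, 0, 1, 0],
    "parent_MovingToOperate":   [0, 1, 1, 0],
    "next_LookingForSomething": [1, 0, 0, 1],
})

# First split: train set vs. test set
train_set, test_set = train_test_split(prepared_narratives, test_size=0.25,
                                       random_state=42)

# Second split: features vs. labels (next_ prefix)
train_set_features, train_set_features_cols, train_set_labels, train_set_labels_cols \
    = split_data_in_features_and_labels(train_set)
print(train_set_features_cols)
print(train_set_labels_cols)
```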

Since all the required functionality is already implemented, the test_set can be prepared in the same way as the train_set.

#TODO Split the test set into features and labels. Their variables should have the prefix test_set
test_set_features, test_set_features_cols, test_set_labels, test_set_labels_cols \
    = split_data_in_features_and_labels(test_set)
test_set_labels.head()

The parameters we speak of are the two range constructions, namely the 9 for max_depth and the 21 for max_leaf_nodes. Parameters that train the model to an F1 score above 0.9 can be found programmatically by increasing both values, since more depth and more leaf nodes result in a more precise model. A maximum depth of 14 and 26 leaf nodes are the lowest values that reach a score just above 0.9 (keep in mind that Python's range excludes its stop value).

parameters = {'max_depth':range(1,14,1), 'max_leaf_nodes': range(2,26,1)}
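These ranges are typically fed to an exhaustive grid search over decision tree candidates. A minimal sketch assuming scikit-learn, with toy data standing in for the narrative features and labels:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

parameters = {'max_depth': range(1, 14, 1), 'max_leaf_nodes': range(2, 26, 1)}

# Toy data standing in for the narrative features and labels
X = [[0, 0], [0, 1], [1, 0], [1, 1]] * 5
y = [0, 1, 1, 0] * 5

# Try every parameter combination, scored by F1
search = GridSearchCV(DecisionTreeClassifier(random_state=0), parameters,
                      scoring='f1_macro', cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 2))
```

GridSearchCV refits the best combination on the full training data, so search.best_estimator_ can be used directly for prediction afterwards.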

Now, when the tree model is exported, a .dot file is generated in the data directory to which this exercise was downloaded. Executing the shell command below the export then generates a PNG image.

# cd to the data directory, then execute this 
dot -Tpng tree.dot -o tree.png
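The export step itself is typically a call to scikit-learn's export_graphviz. A sketch with a toy model; the feature and class names here are placeholders for those of the actual classifier:

```python
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Toy model standing in for the trained next-action classifier
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Writes tree.dot; convert it with: dot -Tpng tree.dot -o tree.png
export_graphviz(clf, out_file="tree.dot",
                feature_names=["type_MovingToLocation",
                               "parent_MovingToOperate"],
                class_names=["NoAction", "LookingForSomething"],
                filled=True, rounded=True)
```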

Open the PNG file to see the decision tree of the model we just trained. With the parameters used above, the generated decision tree looks like this.

Zooming in on the root node we can see the first decisions made. The root contains the decision that gains the most information. First the type of the current action is determined, then the higher-order parent action. Together, both pieces of information lead to a concluded next action in one of the leaf nodes.

The most likely next action is determined by the highest entry in a node's value list. In the root node this is NoAction, since that is the best prediction so far, before any decisions have been made.
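The prediction rule just described, picking the class with the highest count in a node's value list, amounts to an argmax. The class names and counts below are made up for illustration:

```python
import numpy as np

# Hypothetical value list of a node: counts of training samples per class,
# in the same class order used throughout the tree
class_names = ["NoAction", "LookingForSomething", "OpeningADrawer", "ClosingADrawer"]
node_value = np.array([120, 40, 15, 10])

# The predicted class of a node is the one with the highest count
predicted = class_names[int(np.argmax(node_value))]
print(predicted)  # NoAction
```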

1.) Say the type of action we are currently predicting for is MovingToLocation (value above 0.5). From the root node we go down the right branch (false), since type_MovingToLocation <= 0.5 evaluates to false.
2a.) The next decision is based on whether the parent is MovingToOperate; let's say it is not (value below 0.5). Also note that the class entry on the node has changed to LookingForSomething, since this is the most likely next action so far. Because parent_MovingToOperate <= 0.5 evaluates to true, the branching continues to the left, finally hitting a leaf node.
3a.) The model predicts that when the type of action is MovingToLocation and its parent is not MovingToOperate, the most likely next action is LookingForSomething, since this is the only entry in the value list.

2b.) In a different scenario, parent_MovingToOperate is above 0.5, meaning the decision evaluates to false. We now hit a different leaf node to the right.
3b.) Even though the value list has multiple positive entries, the highest value belongs to ClosingADrawer, which would be the prediction in a scenario where the type is MovingToLocation and the parent is MovingToOperate. You can see that this prediction is not very confident, since there is at least one other high entry in the value list, representing the OpeningADrawer action.
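Such a walk through the tree can also be reproduced programmatically with decision_path, which lists the nodes a sample visits. A sketch on a toy tree; the sample encodes type_MovingToLocation = 1, parent_MovingToOperate = 0, and the training data is invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy training data: columns are type_MovingToLocation, parent_MovingToOperate
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["NoAction", "ClosingADrawer", "LookingForSomething", "ClosingADrawer"]
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Trace one sample from the root to its leaf
sample = [[1, 0]]
path = clf.decision_path(sample)
print("nodes visited:", path.indices.tolist())
print("prediction:", clf.predict(sample)[0])
```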

If you are interested in specific entries in the value list, the next chapter contains an illustration in which the values are represented as a confusion matrix. The labels of that confusion matrix are in the same order as the value list, so you can investigate which other classes could have been predicted aside from the highest value.

In the last section we will evaluate the trained decision tree model.

ease/machinelearning/classifier_training.1592820398.txt.gz · Last modified: 2020/06/22 10:06 by s_fuyedc
