Differences

This shows you the differences between two versions of the page.

--- ease:machinelearning:classifier_training [2020/06/22 10:06] – s_fuyedc
+++ ease:machinelearning:classifier_training [2020/06/22 11:46] (current) – s_fuyedc
@@ Line 13: / Line 13: @@
 </code>
-The parameters that we speak of are within the two //range// constructions, namely the 9 for the max_depth and 21 for max_leaf_notes. Finding parameters that train the model to an F1 value above 0.9 can be done programmatically by increasing both values, since more depth and nodes result in a more precise model. Taking a maximum depth of 14 and 26 leaf nodes are the lowest values that reach a score of just above 0.9.
+After preparing the test- and dataset, two suitable parameters need to be decided for, determining the depth and breadth of the decision tree. The parameters that we speak of are within the two //range// constructions, namely the 9 for the max_depth and 21 for max_leaf_notes. Finding parameters that train the model to an F1 value above 0.9 can be done programmatically by increasing both values, since more depth and nodes result in a more precise model. Taking a maximum depth of 14 and 26 leaf nodes is the lowest setting to reach a score of just above 0.9.
 <code python>
 parameters = {'max_depth':range(1,14,1), 'max_leaf_nodes': range(2,26,1)}
@@ Line 30: / Line 30: @@
 {{ :ease:tree_root.png |}}
-Depending on the highest entry in the //value// list in a node, the most likely //next// action is determined, which is //NoAction// in the root node, since this is the best prediction so far, without having made any decisions.
+Depending on the highest entry in the //value// list in a node, the most likely //next// action is determined, which is //NoAction// in the root node, since this is the best prediction so far, without having made any decisions. Also, the Gini score shows very well how certain a decision have been made at any point in the tree, where we begin with a pretty high score of 0.813 at the root.
 .) Say, the type of action we currently try to predict something for is //MovingToLocation// (value higher than 0.5). From the root node we would go down the right branch (false), since //type_MovingToLocation %%<=%% 0.5// is false.\\
 a.) The next decision to be done is based on whether the //parent// is //MovingToOperate// and let's say it is **not** (value **below** 0.5). Also recognize, that the //class// entry on the node changed to //LookingForSomething//, since this is the most likely //next// action so far. When //parent_MovingToOperate %%<=%% 0.5// evaluates to true the branching goes on to the left, finally hitting a leaf node.\\
-a.) The model predicts, when the type of action is //MovingToLocation// and its parent is **not** //MovingToOperate//, the most likely //next// action to follow will be //LookingForSomething// since this is the only entry in the //value// list.
+a.) The model predicts, when the type of action is //MovingToLocation// and its parent is **not** //MovingToOperate//, the most likely //next// action to follow will be //LookingForSomething// since this is the only entry in the //value// list. With a Gini score of 0.0 this decision is unlikely to be wrong, based the data we have.
 b.) In a different scenario the //parent_MovingToOperate// is **above** 0.5, meaning the decision would evaluate false. We now hit a different leaf node to the right.\\
-b.) Even though the //value// list has multiple positive entries, the highest value is //ClosingADrawer//, which would be the prediction in a scenario where //type// is //MovingToLocation// and //parent// is //MovingToOperate//. You can see that the prediction is not very accurate, since there is at least one other high entry in the //value// list, representing the //OpeningADrawer// action.
+b.) Even though the //value// list has multiple positive entries, the highest value is //ClosingADrawer//, which would be the prediction in a scenario where //type// is //MovingToLocation// and //parent// is //MovingToOperate//. You can see that with a Gini Score of 0.579 the prediction is not very certain, since there is at least one other high entry in the //value// list, representing the //OpeningADrawer// action. But with the features given, this is still our best prediction.
+Finding the right parameters is one of the hardest things when it comes to training a model. Achieving a Gini score of 0.0 in every single leaf might indicate that the model is heavily overfit. On the other hand, having a model too vague may result in generalized prediction with small variance. By reducing the feature space just right, the predictions will be certain enough for your purposes.
 If you are interested in specific entries in the //value// list, there will be an illustration in the next chapter, where the values are represented in a confusion matrix. The labels of said confusion matrix are in the same order as the //value// list, such that you can investigate what other classes could have been predicted aside from the highest value.
 In [[https://ease-crc.org/material/ease/machinelearning/classifier_evaluation|the last section]] we will evaluate the trained decision tree model.