====== Machine Learning through NEEMS ======
This tutorial gives an overview of the lecture about narrative-enabled episodic memories (NEEMs) by Sebastian Koralewski.

Nowadays machine learning is applicable to a large variety of cases. This lecture shows how to use recordings of activities performed by a robot in a kitchen environment to predict a likely course of action, based on the observed data and their probability of success, e.g. setting up a table for breakfast, or cleaning it up afterward.
===== Setup =====

The platform of choice is Jupyter Notebook, a handy framework that is ready to launch in any Python environment. Some example data is provided with this tutorial in CSV format; it contains records of robot activities, the so-called NEEMs.

Please follow the linked setup instructions to install and launch the Notebook. When the setup is finished, your default browser should open the Notebook, showing the filesystem of the repository.
//(Screenshot: the Jupyter Notebook toolbar)//

From left to right, the toolbar buttons have the following functions:
  * Save the notebook with all your changes to Exercise.ipynb on your machine.
  * Insert a new cell below the current one.
  * Cut, copy, and paste cells.
  * Moving cells up and down isn't necessary for this lecture and could break its logical structure.
  * Run a code block, either with this button or with //CTRL-Enter//.
  * Interrupt the kernel when a code block takes too long to execute; you can then change your code and re-run it.
  * Restart the kernel to reset all local variables and definitions from previously executed code blocks.
  * Restart the kernel and run all code blocks; this is helpful when picking up the lecture at a later point, when you have already implemented some of the functionality.
  * Markdown is the default language of choice for writing plain text; you don't need to change this dropdown.
  * Open the full palette of possible commands.

Hitting the TAB key while coding can be helpful for auto-completion.
===== NEEMS Lecture =====

The lecture is divided into six consecutive sections. Since the Jupyter Notebook is a Python program itself, it is important to work through the sections in the given order; if you spread the lecture over several sessions, remember to execute the code from previous sections before continuing.

=== Visualizing the Data ===

In the first section, we analyze the recorded narratives. The goal of this whole tutorial is to predict the next robot action, based on the likelihood of one action following another. Every entry in the knowledge log is one recorded action; it has entries for its duration (time to finish), its start and end time, its type, its parent action, and the previous and next actions. Feel free to look into the narratives' entries to get familiar with the data.

==== Solutions ====

When the exercise is opened freshly, remember to execute all the code snippets again, such that the variables and functions are defined and usable later in the lecture. Hit the button in the header of the lecture to execute all code blocks at once. If you get stuck somewhere in the lecture, feel free to ask the tutor or consider the following solutions.
| - | + | ||
| - | == 1. Data Analysis == | + | |
| - | + | ||
| - | <code python># | + | |
| - | narratives[header_names.NEXT].value_counts().plot.pie(figsize=(10, | + | |
| - | </ | + | |
| - | + | ||
| - | == 2.1 Filling Empty Cells == | + | |
| - | We are working on the // | + | |
| - | <code python># Print modified data | + | |
| - | fill_empty_cells(narratives) | + | |
| - | # Check if the code is working | + | |
| - | fill_empty_cells(narratives).isna().any() | + | |
| - | # If all entries tell ' | + | |
| - | + | ||
| - | This function takes care of null entries in the data, and replaces those entries with predefined values. | + | |
| - | <code python> | + | |
| - | # Solution | + | |
| - | def fill_empty_cells(data): | + | |
| - | filled_data = data.copy() | + | |
| - | + | ||
| - | filled_data[header_names.PARENT]= filled_data[header_names.PARENT].fillna(' | + | |
| - | #TODO Fill the rest of the remaining empty cells | + | |
| - | filled_data[header_names.NEXT]= filled_data[header_names.NEXT].fillna(' | + | |
| - | filled_data[header_names.PREVIOUS]= filled_data[header_names.PREVIOUS].fillna(' | + | |
| - | return filled_data | + | |
| - | </ | + | |
== 2.2 Transform Categorical Values to Numeric Values ==

One-hot encoding transforms our categorical data into columns of 0s and 1s, which makes it easier to work with. When you print the function output on the narratives data, scroll sideways to see that the table has expanded.
<code python>
# (Note: the prefix arguments are assumptions; they were truncated in this page.)
def transform_categorial_to_one_hot_encoded(data):
    encoded_data = data.copy()

    encoded_parent_data = pd.get_dummies(encoded_data[header_names.PARENT], prefix=header_names.PARENT)
    encoded_data = pd.concat([encoded_data, encoded_parent_data], axis=1)

    #TODO Transform the rest of the categorial features into one hot encoded features
    #Hint: NEXT must not be encoded
    encoded_type_data = pd.get_dummies(encoded_data[header_names.TYPE], prefix=header_names.TYPE)
    encoded_previous_data = pd.get_dummies(encoded_data[header_names.PREVIOUS], prefix=header_names.PREVIOUS)
    encoded_data = pd.concat([encoded_data,
                              encoded_type_data,
                              encoded_previous_data], axis=1)

    return encoded_data
</code>
== 2.3 Data Cleaning ==

For predicting which action follows another action, we don't need any of the initial columns, only the ones generated by one-hot encoding. //Remember to also remove the ID column!//
<code python>
def clean(data):
    cleaned_data = data.copy()

    #TODO Decide which columns are not required to be able to predict the next robot action
    #Hint: The NEXT column IS required.

    cols = [header_names.PARENT,
            header_names.TYPE,
            header_names.START_TIME,
            header_names.END_TIME,
            header_names.DURATION,
            header_names.PREVIOUS,
            header_names.ID]

    for col in cols:
        cleaned_data = cleaned_data.drop(col, axis=1)

    return cleaned_data
</code>
== 2.4 Data Preparation Pipeline ==

Simply apply the three functions above to the //narratives// data, in the order in which they were defined.

<code python>
def prepare_data(data):
    prepared_data = data.copy()

    prepared_data = fill_empty_cells(prepared_data)
    #TODO apply all preparation methods on prepared_data
    prepared_data = transform_categorial_to_one_hot_encoded(prepared_data)
    prepared_data = clean(prepared_data)

    return prepared_data
</code>
== 2.5 Prepared Data Evaluation ==

<code python>
#TODO store the prepared narratives in a prepared_narratives variable and evaluate them by printing them
prepared_narratives = prepare_data(narratives)
prepared_narratives
</code>
<code python>
#TODO verify that the prepared narratives do not have any empty cells
prepared_narratives.isna().any()
</code>
== 3. Brief Introduction to Decision Trees ==

The statistical model of choice is a decision tree. Such models have the advantage of being visually inspectable and comprehensible.

The first part of this section showcases what we have done so far on a simple example.

The chapter **3.2 Preview of a Trained Decision Tree** shows an example decision tree, based on the example data. It represents the decision about where to put an object. Each node of the tree contains four to five entries, which are explained below.
| - | + | ||
| - | {{ : | + | |
| - | + | ||
| - | // | + | |
| - | + | ||
| - | //gini = 0.719// gives the Gini Impurity value, which is calculated and explained in section 3.4 | + | |
| - | + | ||
| - | //samples = 8// tells the number of possible decisions for the whole scenario. In the root, we have all 8 entries of our example_data at our service. | + | |
| - | + | ||
| - | //value = [3, 2, 1, 2]// tells the sum of decidable goal locations over all objects, sorted like in the table above. [cupboard, dishwasher, drawer, fridge]. The further down we go along the tree, the more decisions have been made, and the fewer possibilities for a decision are left. So if the first node decides, the object is **not** milk, the predicted goal location will never be //fridge//, since no other object than milk is going to the fride, an therefore the last entry of this array of numbers is going to be 0 for all upcoming nodes. | + | |
| - | + | ||
| - | // | + | |
| - | + | ||
| - | **3.3 Classification and Regression Trees / 3.4 Gini Impurity** | + | |
| - | + | ||
| - | Gini Impurity, just like the //entropy// of a model, is a measurement for the likelihood of a new random variable to be incorrectly labeled, if it were to be randomly labeled according to the distribution of the labels. The higher the Gini impurity, the less likely successful labeling of a new variable is. | + | |
| - | + | ||
| - | The calculation of the Gini Impurity is shown in the lecture, as well as its implementation. For our example_data this value is at 0.71875, rounded to 0.719 as shown in the decision tree above. We will now find out how to use this calculation for our purposes. | + | |
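A minimal implementation of the Gini Impurity (equivalent math, not necessarily the lecture's exact code) reproduces the 0.71875 from above:

```python
def gini_impurity(counts):
    """Gini Impurity of a class distribution given as absolute counts."""
    total = sum(counts)
    # 1 minus the probability that two random draws agree on the label
    return 1.0 - sum((c / total) ** 2 for c in counts)

# The root node of the example tree holds value = [3, 2, 1, 2] over 8 samples
print(gini_impurity([3, 2, 1, 2]))  # 0.71875

# A pure node (only one class left) has impurity 0
print(gini_impurity([8, 0, 0, 0]))  # 0.0
```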
| - | + | ||
| - | **3.5 Cost Function / 3.6. Picking a Threshold / 3.7 Determining the Root node** | + | |
| - | + | ||
| - | The goal now is to calculate the impurity depending on which feature of the example_data is chosen as the root node of the decision tree. Therefore a cost function is implemented considering one feature (e.g. ' | + | |
| - | + | ||
| - | This cost function is applied to every feature and sorted ascending with their respective cost value. Determining the root node is essential for optimizing the model. A feature with especially high influence in the model is considered to gain a lot of information, | + | |
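A sketch of such a cost function, under the common CART convention (weighted Gini Impurity of the two child nodes; the lecture's exact formulation may differ). The split below follows the example above: asking "is the object milk?" puts the 2 fridge-bound milk samples on one side and the remaining 6 samples on the other:

```python
def gini_impurity(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def split_cost(left_counts, right_counts):
    """Weighted Gini Impurity of a binary split; lower cost = better root candidate."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n = n_left + n_right
    return (n_left / n) * gini_impurity(left_counts) \
         + (n_right / n) * gini_impurity(right_counts)

# left child: the 2 milk samples (both 'fridge'); right child: the other 6 samples
cost = split_cost([0, 0, 0, 2], [3, 2, 1, 0])
print(cost)  # lower than the root impurity of 0.71875, so the split is useful
```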
| - | + | ||
| - | **4. Additional Machine Learning Theory** | + | |
| - | + | ||
| - | Cross-validation is a technique of training a model, where training-set and testing-set are interchanged a couple of times, to potentially exclude valleys of falsely learned influence of features. | + | |
| - | + | ||
| - | Confusion Matrices illustrate how well the model labels the data correctly and incorrectly, | + | |
| - | + | ||
| - | Accuracy, Precision, and Recall are measurements of the quality of a model, just like confusion matrices. | + | |
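These measures can be computed by hand on a tiny made-up set of predictions (hypothetical action labels, not real NEEM output):

```python
y_true = ["Grasping", "Grasping", "Placing", "Placing", "Placing"]
y_pred = ["Grasping", "Placing",  "Placing", "Placing", "Grasping"]

def accuracy(t, p):
    # fraction of all predictions that are correct
    return sum(a == b for a, b in zip(t, p)) / len(t)

def precision(t, p, label):
    # of everything predicted as `label`, how much really is `label`?
    true_of_predicted = [a for a, b in zip(t, p) if b == label]
    return sum(a == label for a in true_of_predicted) / len(true_of_predicted)

def recall(t, p, label):
    # of everything that really is `label`, how much was found?
    pred_of_actual = [b for a, b in zip(t, p) if a == label]
    return sum(b == label for b in pred_of_actual) / len(pred_of_actual)

print(accuracy(y_true, y_pred))              # 3 of 5 correct -> 0.6
print(precision(y_true, y_pred, "Placing"))  # 2 of 3 "Placing" predictions were right
print(recall(y_true, y_pred, "Placing"))     # 2 of 3 actual "Placing" were found
```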
| - | + | ||
| - | **5. Train the Next Action Classifier** | + | |
| - | + | ||
| - | First, the prepared_narratives that were created earlier in this lecture, are split between train_set and test_set. Then the train_set is split into the features and labels, where the features contain the previous and parent actions for each action, and the labels contain the data about which action comes after another (with the //next_// prefix). | + | |
| - | + | ||
| - | Since all the functionality is already implemented, | + | |
| - | <code python> | + | |
| - | #TODO Split the test set into features and labels. Their variables should have the prefix test_set | + | |
| - | test_set_features, | + | |
| - | | + | |
| - | test_set_labels.head() | + | |
| - | </ | + | |
| - | + | ||
| - | The parameters that we speak of are within the two //range// constructions, | + | |
| - | <code python> | + | |
| - | parameters | + | |
| - | </ | + | |
Now if the tree model is exported, a .dot file should be generated in the //data// directory where this exercise was downloaded to. By executing the shell command below, a PNG image is generated.

<code bash>
# cd to the data directory, then execute this
dot -Tpng tree.dot -o tree.png
</code>
Open the PNG file to see the decision tree of the model we just trained. With the parameters used above, the generated decision tree looks like this.

//(Image: the full decision tree trained on the narratives data)//

Zooming in to the root node we can see the first decisions made. The root contains the decision that gains the most information. First the //type// of the current action is determined, then the higher-order //parent// action. Both pieces of information finally lead to a concluded //next// action in one of the leaf nodes.

//(Image: close-up of the root region of the decision tree)//

Depending on the highest entry in the //value// list of a node, the most likely //next// action is determined; it is shown as the node's class. An example walk through the tree could look like this:
| - | + | ||
| - | 1. Say, the type of action we currently try to predict something for is // | + | |
| - | + | ||
| - | 2a. The next decision to be done is based on whether the //parent// is // | + | |
| - | + | ||
| - | 3a. The model predicts, when the type of action is // | + | |
| - | + | ||
| - | + | ||
| - | 2b. In a different scenario the // | + | |
| - | + | ||
| - | 3b. Even though the //value// list has multiple positive entries, the highest value is // | + | |
| - | + | ||
| - | If you are interested in specific entries in the //value// list, there will be an illustration in the next chapter, where the values are represented in a confusion matrix. The labels of said confusion matrix are in the same order as the //value// list, such that you can investigate what other classes could have been predicted aside from the highest value. | + | |
| - | + | ||
| - | ** 6. Evaluate the Next Action Classifier ** | + | |
| - | Now it comes to evaluating what the tree model is capable of. The purpose of this model is to predict which action is the most likely to happen, depending on the previously performed action. Simply execute the code blocks to see the outcome. | + | |
| - | + | ||
| - | The first block shows a table that gives information about precision, recall, F1-Score, and support of the model. Read more about these terms further above in the lesson. | + | |
| - | + | ||
| - | The second code block is much more interesting. It generates a confusion matrix, showing for each action how often it was predicted successfully. In an optimal model, this matrix would only show entries on a diagonal line from top-left to bottom-right. If the confusion matrix is visualized without labels at the left and bottom side, check the code in the beginning and compare it with the solutions provided in this tutorial. Having the //NEXT// column removed is the most probable mistake. | + | |
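A confusion matrix of this kind can be built from two label series with ''pd.crosstab'' (toy labels below, not the real evaluation data):

```python
import pandas as pd

y_true = pd.Series(["Grasping", "Grasping", "Placing", "Navigating", "Placing"], name="true")
y_pred = pd.Series(["Grasping", "Placing",  "Placing", "Navigating", "Placing"], name="predicted")

# Rows: true classes; columns: predicted classes.
# A perfect model would only have non-zero entries on the diagonal.
confusion = pd.crosstab(y_true, y_pred)
print(confusion)
```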
| - | {{ : | + | This lecture is granulated into six consecutive sections. Since the Jupyter Notebook is a Python program itself, it is important to do each section in the given order. If you separate the lecture over several sessions, remember to execute the code from previous sections before continuing your work. |
| - | Along the rows we see the true classes, or what is expected | + | *[[ease: |
| + | *[[ease: | ||
| + | *[[ease: | ||
| + | *[[ease: | ||
| + | *[[ease: | ||
| + | *[[ease: | ||
There seems to be some confusion especially between two of the actions. To investigate, we can filter the narratives for exactly these two //next// actions and inspect their remaining columns:

<code python>
# (The concrete action names are truncated in this page; 'ActionA'/'ActionB' are placeholders.)
other_narratives = narratives[(narratives.next == 'ActionA') | (narratives.next == 'ActionB')]
other_narratives[[header_names.PARENT, header_names.PREVIOUS, header_names.TYPE]]
</code>

Apparently both actions mostly have the same parent and previous actions, as well as the same type, so the model has little information left to distinguish them.
ease/machinelearning.1592571391.txt.gz · Last modified: 2020/06/19 12:56 by s_fuyedc
