Decision Tree Learning - A Helpful Illustrated Guide in Python

This tutorial will show you everything you need to get started training your first models using decision tree learning in Python. To help you grasp this topic thoroughly, I attacked it from different perspectives: textual, visual, and audio-visual. So, let’s get started!

Why Decision Trees?

Deep learning has become the megatrend within artificial intelligence and machine learning. Yet, training large neural networks is not always the best choice. It’s the bazooka in machine learning, effective but not efficient.

In practice, a human often cannot tell why a neural network classifies an input one way or the other. It is a black box. Should you invest your money in a stock just because a neural network recommends it? If you do not know the basis of its decision, it is hard to trust its recommendations.

Many ML divisions in large companies must be able to explain the reasoning of their ML algorithms. Deep learning models make this hard, and this is where decision trees excel!

This is one reason for the popularity of decision trees. Decision trees are more human-friendly and intuitive. You know exactly how the decisions emerged. And you can even hand-tune the ML model if you want to.

The decision tree consists of branching nodes and leaf nodes. A branching node is a variable (also called feature) that is given as input to your decision problem. For each possible value of this feature, there is a child node.

A leaf node represents the predicted class given the feature values along the path from the root. Each leaf node has an associated probability, i.e., how often this particular combination of feature values occurred in the training data. Moreover, each leaf node has an associated class or output value: the prediction for any input that follows the branching nodes down to that leaf.
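To make the structure concrete, here is a minimal sketch of a hand-written tree in plain Python. The feature names, classes, and probabilities are made up for illustration, and the nested-dictionary layout is just one possible representation, not how sklearn stores its trees.

```python
# A hand-written decision tree as nested dictionaries (hypothetical toy values).
# Branching nodes hold a feature name and one child per feature value;
# leaf nodes hold the predicted class and how often it was seen in training.
toy_tree = {
    "feature": "likes_math",
    "children": {
        "yes": {"label": "computer science", "probability": 0.5},
        "no": {
            "feature": "likes_drawing",
            "children": {
                "yes": {"label": "art", "probability": 0.3},
                "no": {"label": "linguistics", "probability": 0.2},
            },
        },
    },
}


def classify(node, sample):
    """Walk from the root to a leaf, following the sample's feature values."""
    while "label" not in node:
        value = sample[node["feature"]]
        node = node["children"][value]
    return node["label"]


print(classify(toy_tree, {"likes_math": "no", "likes_drawing": "yes"}))
```

Each loop iteration answers one question and moves one level down, so a prediction costs at most as many lookups as the tree is deep.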

Video Decision Trees

I explain decision trees in this video:

In case you need to refresh your Python skills, feel free to deepen your Python code understanding with the Finxter web app.

Explanation Simple Example

You already know decision trees very well from your own experience. They represent a structured way of making decisions – each decision opening new branches. By answering a bunch of questions, you will finally land on the recommended outcome.

Here is an example:

Decision trees are used for classification problems such as “which subject should I study, given my interests?”. You start at the top. Now, you repeatedly answer questions (select the choices that describe your features best). Finally, you reach a leaf node of the tree. This is the recommended class based on your feature selection.

There are many nuances to decision tree learning. For example, in the above figure, the first question carries more weight than the last question. If you like maths, the decision tree will never recommend art or linguistics to you. This is useful because some features may be much more important for the classification decision than others. For example, a classification system that predicts your current health may use your sex (feature) to practically rule out many diseases (classes).
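One common way to put a number on "how important is this feature?" is information gain: the drop in label entropy after splitting on the feature. The sketch below uses made-up student data and hypothetical helper names (`entropy`, `information_gain`). It illustrates the idea and is not sklearn's internal code, which uses Gini impurity by default.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on feature."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[feature] for row in rows}:
        subset = [row[target] for row in rows if row[feature] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


# Hypothetical toy data: "likes_math" separates the classes, "likes_coffee" does not.
students = [
    {"likes_math": "yes", "likes_coffee": "yes", "subject": "computer science"},
    {"likes_math": "yes", "likes_coffee": "no", "subject": "computer science"},
    {"likes_math": "no", "likes_coffee": "yes", "subject": "art"},
    {"likes_math": "no", "likes_coffee": "no", "subject": "art"},
]

print(information_gain(students, "likes_math", "subject"))
print(information_gain(students, "likes_coffee", "subject"))
```

A feature with a high gain is a good candidate for a question near the top of the tree, while a feature with a gain close to zero tells you almost nothing about the class.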

Hence, the order of the decision nodes lends itself to performance optimizations: place the features at the top that have a high impact on the final classification. Decision tree learning will then aggregate the questions that do not have a high impact on the final classification as shown in the next graphic:

Suppose the full decision tree looks like the tree on the left. For any combination of features, there is a separate classification outcome (the tree leaves). However, some features may not give you any additional information with respect to the classification problem (e.g. the first “Language” decision node in the example). Decision tree learning would effectively get rid of these nodes for efficiency reasons. This is called “pruning”.
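The idea of dropping uninformative nodes can be sketched in a few lines of plain Python. The tree below and the `prune` helper are hypothetical and only handle the simplest case, where every answer to a question leads to the same class. Practical pruning methods, such as the cost-complexity pruning that sklearn exposes through the `ccp_alpha` parameter, are more involved.

```python
def prune(node):
    """Collapse branching nodes whose children all lead to the same class."""
    if "label" in node:  # leaf: nothing to prune
        return node
    children = {value: prune(child) for value, child in node["children"].items()}
    labels = {child.get("label") for child in children.values()}
    if len(labels) == 1 and None not in labels:
        # every answer leads to the same leaf, so the question is redundant
        return {"label": labels.pop()}
    return {"feature": node["feature"], "children": children}


# Hypothetical tree: the "likes_language" question does not change the outcome.
full_tree = {
    "feature": "likes_math",
    "children": {
        "yes": {
            "feature": "likes_language",
            "children": {
                "yes": {"label": "computer science"},
                "no": {"label": "computer science"},
            },
        },
        "no": {"label": "art"},
    },
}

pruned = prune(full_tree)
print(pruned)
```

The pruned tree gives the same answers as the full one but asks one question fewer on the "likes maths" branch.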

Decision Tree Code in Python

Here’s some code on how you can run a decision tree in Python using the sklearn library for machine learning:

## Dependencies
import numpy as np
from sklearn import tree


## Data: student scores in (math, language, creativity) --> study field
X = np.array([[9, 5, 6, "computer science"],
              [1, 8, 1, "literature"],
              [5, 7, 9, "art"]])


## One-liner
# X is an array of strings, so convert the score columns to numbers explicitly
Tree = tree.DecisionTreeClassifier().fit(X[:,:-1].astype(float), X[:,-1])

## Result & puzzle
student_0 = Tree.predict([[8, 6, 5]])
print(student_0)

student_1 = Tree.predict([[3, 7, 9]])
print(student_1)

The data in the code snippet describes three students with their estimated skill level (a score between 1 and 10) in the three areas math, language, and creativity. We also know the study subjects of these students. For example, the first student is highly skilled in maths and studies computer science. The second student is far more skilled in language than in the other two areas and studies literature. The third student scores high in creativity and studies art.

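The one-liner creates a new decision tree object and trains the model using the fit function on the labeled training data (the last column is the label). Internally, sklearn builds a binary tree: each branching node compares one of the features math, language, or creativity against a threshold. With three students in three different classes, two such splits are enough to separate the training data, so the fitted tree has two branching nodes and three leaves.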
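If you want to see which questions the fitted tree actually asks, sklearn can print the learned rules as text. The sketch below repeats the training data from above; the variable names `features`, `labels`, and `clf` are my own choice, and the exact rules printed may differ from run to run.

```python
import numpy as np
from sklearn import tree

# Same training data as in the snippet above
X = np.array([[9, 5, 6, "computer science"],
              [1, 8, 1, "literature"],
              [5, 7, 9, "art"]])
features = X[:, :-1].astype(float)  # score columns as numbers
labels = X[:, -1]                   # last column is the study field

clf = tree.DecisionTreeClassifier().fit(features, labels)

# Print the learned rules as indented text, one line per node
print(tree.export_text(clf, feature_names=["math", "language", "creativity"]))
print("nodes:", clf.tree_.node_count, "leaves:", clf.get_n_leaves())
```

Reading the printed rules from top to bottom shows which feature and threshold each branching node uses, which is exactly the kind of explanation a neural network cannot give you.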

When predicting the class of student_0 (math=8, language=6, creativity=5), the decision tree returns “computer science”. It has learned that this feature pattern (high, medium, medium) is an indicator of the first class. On the other hand, when asked for (3, 7, 9), the decision tree typically predicts “art” because it has learned that the score (low, medium, high) hints at the third class. On such a tiny data set, several splits are equally good, so the answer for this borderline student can vary between runs.

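Note that the algorithm is non-deterministic unless you fix the random seed. In other words, when executing the same code twice, different results may arise. This is common for machine learning algorithms that work with random generators. In this case, sklearn randomly permutes the features when it searches for the best split, and ties between equally good splits are resolved by that order, so the final decision tree may use the features in a different order. Passing a fixed random_state to DecisionTreeClassifier makes the result reproducible.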
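A minimal sketch of how to make the training repeatable, assuming you are happy to pick an arbitrary seed (the value 0 below has no special meaning). Features and labels are kept in separate arrays here for readability.

```python
import numpy as np
from sklearn import tree

# Scores in (math, language, creativity) and the matching study fields
X = np.array([[9, 5, 6], [1, 8, 1], [5, 7, 9]])
y = np.array(["computer science", "literature", "art"])

# random_state=0 is an arbitrary choice; any fixed integer makes runs repeatable
first = tree.DecisionTreeClassifier(random_state=0).fit(X, y)
second = tree.DecisionTreeClassifier(random_state=0).fit(X, y)

query = [[3, 7, 9]]
print(first.predict(query), second.predict(query))
```

Both models are trained with the same seed, so they print the same prediction for the query.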

Where to Go From Here?

Enough theory. Let’s get some practice!

Coders get paid six figures and more because they can solve problems more effectively using machine intelligence and automation.

To become more successful in coding, solve more real problems for real people. That’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

You build high-value coding skills by working on practical coding projects!

Do you want to stop learning with toy projects and focus on practical code projects that earn you money and solve real problems for people?

🚀 If your answer is YES!, consider becoming a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

If you just want to learn about the freelancing opportunity, feel free to watch my free webinar “How to Build Your High-Income Skill Python” and learn how I grew my coding business online and how you can, too—from the comfort of your own home.
