21.1 Introduction of Decision Trees


Have you ever noticed that your life already runs on a decision tree?

  • If it’s raining → take an umbrella

  • Else if it’s sunny → take sunglasses

  • Else → just pray and go out

Congratulations! You just built your first Decision Tree without knowing any machine learning.

In machine learning, computers do the same thing—except instead of umbrellas, they predict things like “Will the student pass?” or “Will we play tennis today?”

Imagine you are hungry.

  • If money > 200 → order pizza

  • Else if money > 50 → eat samosa

  • Else → drink water and sleep

This “food selection logic” is exactly how a Decision Tree works.

In this blog, we’ll see how machines make decisions step by step using a Decision Tree. Here is what we’ll cover:

  • Introduction

  • What is a Decision Tree?

  • Structure of a Decision Tree (root node, child node, leaf node)

  • How splitting works

  • Entropy / Gini (with intuition)

  • Example (Play Tennis)

  • Summary

What is a Decision Tree?

A Decision Tree is a supervised machine learning algorithm used for classification and regression.
It works by splitting data step by step based on conditions and finally giving an output.

It looks like a tree structure, where:

  • Each internal node represents a decision

  • Each branch represents an outcome of the decision

  • Each leaf node represents the final output

Real-Life Example

Consider deciding what a person is most likely doing (school, college, or work) based on their age:

  • If age ≤ 15 → Goes to School

  • If age > 15 and age ≤ 21 → Goes to College

  • Else → Working

This decision-making process can be represented using a decision tree.

Structure of a Decision Tree

(a) Root Node

  • The top-most node

  • Represents the first decision

  • There is only one root node

(b) Child Node (Decision Node)

  • Comes from the root or another child

  • Represents further decisions

  • There can be many child nodes

(c) Leaf Node

  • Represents the final output

  • Does not split further

  • All outputs come from leaf nodes

Important Points

  • A decision tree starts from the root node

  • It follows conditions until it reaches a leaf node

  • There can be many leaf nodes

  • There can be only one root node

  • Leaf nodes give the final result/output
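To make the root, child, and leaf roles concrete, here is a minimal sketch of the age example stored as a tree of nodes. The class names `Decision` and `Leaf`, their fields, and the `predict` helper are illustrative assumptions for this post, not part of any library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """A leaf node: holds the final output and never splits."""
    output: str


@dataclass
class Decision:
    """A root or child node: tests `age <= threshold` and branches."""
    threshold: int
    if_true: Union["Decision", Leaf]
    if_false: Union["Decision", Leaf]


# Hypothetical tree for the age example: one root, one child, three leaves.
root = Decision(
    threshold=15,
    if_true=Leaf("School"),
    if_false=Decision(
        threshold=21,
        if_true=Leaf("College"),
        if_false=Leaf("Working"),
    ),
)


def predict(node, age):
    """Start at the root and follow conditions until a leaf is reached."""
    while isinstance(node, Decision):
        node = node.if_true if age <= node.threshold else node.if_false
    return node.output


print(predict(root, 12))  # School
print(predict(root, 18))  # College
print(predict(root, 30))  # Working
```

Notice that every call enters through the single root and always ends at exactly one leaf, which is where the output comes from.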

Decision Tree as a Flow of If–Else Conditions

A decision tree works much like a chain of if–else statements in programming.

Example:

age = 18  # example input

# Each branch below is one path from the root to a leaf
if age <= 15:
    print("Person goes to school")
elif age > 15 and age <= 21:
    print("Person goes to college")
else:
    print("Person is working")

Splitting Criteria in Decision Trees

So far, we have understood that a decision tree makes decisions by asking questions at each node.

But an important question arises:
Which question should be asked first?
In other words, how does the decision tree decide which attribute to choose for splitting the data?

To answer this, decision trees use certain splitting criteria that help select the best attribute at each node.

What is Splitting?
Splitting is the process of dividing data at a node into smaller subsets based on a feature.

The goal of splitting is to make the child nodes purer, i.e., the data in each child node should belong mostly to one class.
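As a rough illustration of splitting, the sketch below divides a tiny made-up dataset on one feature and counts the classes that land in each child node. The rows, the feature values (`sunny`, `overcast`, `rainy`), and the helper `split_by_feature` are assumptions invented for this example only.

```python
from collections import Counter, defaultdict

# Made-up toy rows: (outlook, play) pairs, for illustration only.
rows = [
    ("sunny", "no"),
    ("sunny", "no"),
    ("overcast", "yes"),
    ("overcast", "yes"),
    ("rainy", "yes"),
    ("rainy", "no"),
]


def split_by_feature(rows):
    """Group the labels by feature value: one group per child node."""
    children = defaultdict(list)
    for value, label in rows:
        children[value].append(label)
    return dict(children)


children = split_by_feature(rows)
for value, labels in children.items():
    # Show how many rows of each class ended up in this child node.
    print(value, dict(Counter(labels)))
```

In this toy data the `sunny` and `overcast` children contain a single class each, so they are pure, while the `rainy` child is still mixed and would need a further split.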

Different decision tree algorithms use different splitting criteria to choose the best feature.

The commonly used splitting criteria are:

  • Entropy and Information Gain (ID3)

  • Gini Index (CART)

  • Gain Ratio (C4.5)
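For a first intuition, the two most common impurity measures can be computed in a few lines. This is a sketch based on the standard textbook definitions (entropy as -Σ p·log2(p), Gini as 1 - Σ p², and information gain as the parent's entropy minus the weighted entropy of the children). The function names and the sample labels are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy in bits: 0 for a pure node, 1 for a 50/50 two-class node."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index: 0 for a pure node, 0.5 for a 50/50 two-class node."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Made-up labels for illustration only.
parent = ["yes", "yes", "no", "no"]
pure_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]

print(entropy(parent))                          # 1.0
print(gini(parent))                             # 0.5
print(information_gain(parent, pure_split))     # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

A split that separates the classes perfectly gets the highest gain, while a split that leaves every child as mixed as the parent gets zero, which is why the tree prefers the first kind of question. Gain Ratio builds on the same idea by dividing the information gain by the split information, which penalizes features with many distinct values.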