21.3 Numerical example of decision tree using ID3

Suppose we have two candidate features, Feature 1 (F1) and Feature 2 (F2), for splitting a dataset of 14 samples (9 Yes / 5 No).
Feature 1 (F1)
Root Node: 9 Yes / 5 No
First Split (C1): 6 Yes / 2 No → More pure node
Second Split (C2): 3 Yes / 3 No → Highly impure node
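Step 1: Entropy of Root Node
Entropy formula:
$$H(S) = -\sum_{i} p_i \log_2 p_i$$
where p_i is the proportion of samples in S that belong to class i.
For the root (9Y / 5N):
$$H(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.4098 + 0.5305 = 0.9403 \approx 0.940$$
Root node is impure
Step 2: Entropy of Child Node C1 (6Y / 2N)
$$H(C1) = -\frac{6}{8}\log_2\frac{6}{8} - \frac{2}{8}\log_2\frac{2}{8} \approx 0.3113 + 0.5000 = 0.8113 \approx 0.811$$
More pure node
Step 3: Entropy of Child Node C2 (3Y / 3N)
$$H(C2) = -\frac{3}{6}\log_2\frac{3}{6} - \frac{3}{6}\log_2\frac{3}{6} = 0.5 + 0.5 = 1.000$$
Highly impure node (maximum entropy)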
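The three entropy values above can be checked with a few lines of Python. This is a minimal sketch: the helper name `entropy` and the list-of-counts input format are illustrative choices, not part of the original notes.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    # Classes with zero samples are skipped: 0 * log2(0) is treated as 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

root = entropy([9, 5])  # root node: 9 Yes / 5 No
c1 = entropy([6, 2])    # child C1:  6 Yes / 2 No
c2 = entropy([3, 3])    # child C2:  3 Yes / 3 No

print(f"root={root:.3f}  C1={c1:.3f}  C2={c2:.3f}")
# prints: root=0.940  C1=0.811  C2=1.000
```

A 50/50 node such as C2 always gives exactly 1 bit, which is the maximum for a two-class problem.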
Information gain
$$IG(S, A) = H(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} H(S_v)$$
Where:
S → Entire dataset (parent node)
A → Attribute / feature used for splitting
Values(A) → All possible values of attribute A
S_v → Subset of S where attribute A = v
|S| → Total number of samples
|S_v| → Number of samples in subset S_v
H(S) → Entropy of parent node
H(S_v) → Entropy of child node
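Step 4: Weighted Entropy After Split
C1 holds 8 of the 14 samples and C2 holds 6, with H(C1) ≈ 0.8113 and H(C2) = 1.000:
$$\sum_{v} \frac{|S_v|}{|S|} H(S_v) = \frac{8}{14}(0.8113) + \frac{6}{14}(1.000) \approx 0.4636 + 0.4286 = 0.8922 \approx 0.892$$
Step 5: Information Gain for Feature 1
$$IG(S, F1) = H(S) - \sum_{v} \frac{|S_v|}{|S|} H(S_v) \approx 0.9403 - 0.8922 = 0.0481 \approx 0.048$$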
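The weighted-entropy and information-gain steps for Feature 1 can be sketched in Python as follows. The function names `entropy` and `information_gain` are hypothetical helpers chosen for this illustration; each node is represented as a list of class counts [Yes, No].

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """IG = H(parent) - sum over children of (|child| / |parent|) * H(child)."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Feature 1: root 9Y/5N splits into C1 = 6Y/2N and C2 = 3Y/3N.
gain_f1 = information_gain([9, 5], [[6, 2], [3, 3]])
print(f"IG(Feature 1) = {gain_f1:.3f}")
# prints: IG(Feature 1) = 0.048
```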
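Feature 2 (F2)
Root (S): 9Y / 5N → Total = 14
Child C1: 5Y / 1N → Total = 6
Child C2: 4Y / 4N → Total = 8
Step 1: Entropy of Root Node
The root is the same 9Y / 5N node as for Feature 1:
$$H(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.9403 \approx 0.940$$
Step 2: Entropy of Child C1 (5Y / 1N)
$$H(C1) = -\frac{5}{6}\log_2\frac{5}{6} - \frac{1}{6}\log_2\frac{1}{6} \approx 0.2192 + 0.4308 = 0.6500$$
Step 3: Entropy of Child C2 (4Y / 4N)
$$H(C2) = -\frac{4}{8}\log_2\frac{4}{8} - \frac{4}{8}\log_2\frac{4}{8} = 0.5 + 0.5 = 1.000$$
Step 4: Weighted Entropy After Split
$$\sum_{v} \frac{|S_v|}{|S|} H(S_v) = \frac{6}{14}(0.6500) + \frac{8}{14}(1.000) \approx 0.2786 + 0.5714 = 0.8500$$
Step 5: Information Gain for Feature 2
$$IG(S, F2) \approx 0.9403 - 0.8500 = 0.0903 \approx 0.090$$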
Feature 1
Root: 9Y / 5N
Children:
C1 → 6Y / 2N
C2 → 3Y / 3N
Feature 2
Root: 9Y / 5N
Children:
C1 → 5Y / 1N
C2 → 4Y / 4N
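Comparison Table:
Feature      Children (C1, C2)    Weighted Entropy    Information Gain
Feature 1    6Y/2N, 3Y/3N         0.892               0.048
Feature 2    5Y/1N, 4Y/4N         0.850               0.090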
Final Conclusion (ID3 Decision):
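At every node, the ID3 algorithm selects the feature with the highest information gain; at the top of the tree, that feature becomes the root node.
Since IG(Feature 2) ≈ 0.090 > IG(Feature 1) ≈ 0.048,
Feature 2 is selected as the ROOT FEATURE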
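The selection step can be sketched as a comparison over all candidate features. This is an illustrative snippet, not a full ID3 implementation: the dictionary `splits` and the helper names are assumptions, and the counts are the ones used in this example.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = sum(parent)
    return entropy(parent) - sum(sum(ch) / n * entropy(ch) for ch in children)

root = [9, 5]  # 9 Yes / 5 No
splits = {
    "Feature 1": [[6, 2], [3, 3]],
    "Feature 2": [[5, 1], [4, 4]],
}

# Score every candidate feature, then keep the one with the largest gain.
gains = {name: information_gain(root, kids) for name, kids in splits.items()}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: IG = {gain:.3f}")
print("Selected root feature:", best)
# prints:
# Feature 1: IG = 0.048
# Feature 2: IG = 0.090
# Selected root feature: Feature 2
```

A complete ID3 implementation would repeat this scoring recursively inside each child node, using only the samples that reach that node.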
Where is ID3 Used? (Practical Applications)
ID3 is mainly used where the problem is a classification task and the data is
categorical. Because ID3 produces easy-to-understand decision trees, it is commonly
used in education and in real-life decision-making systems.
Education Domain
Predicting student performance (Pass / Fail)
Deciding eligibility for scholarships
Analyzing attendance vs result
Simple rules such as "IF attendance is high AND marks are good → Pass" make ID3 suitable here.
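A path through a learned tree reads directly as code. The sketch below encodes the single rule quoted above; the function name `predict_result`, the string values "high" and "good", and the fallback to "Fail" are assumptions made for illustration, not rules learned from real data.

```python
def predict_result(attendance: str, marks: str) -> str:
    """Hypothetical rule read off a decision tree path."""
    # Rule from the text: IF attendance is high AND marks are good -> Pass.
    if attendance == "high" and marks == "good":
        return "Pass"
    # Assumption: every other combination falls through to "Fail".
    return "Fail"

print(predict_result("high", "good"))  # prints: Pass
print(predict_result("low", "good"))   # prints: Fail
```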
Medical Diagnosis
Disease diagnosis based on symptoms (Yes / No type decisions)
Identifying risk levels (High / Medium / Low)
Doctors prefer interpretable models, and ID3 gives clear decision rules.
Weather-Based Decisions
Predicting whether an outdoor event should be conducted
Classic example: Play Tennis problem
Weather attributes are categorical (e.g., Sunny, Rainy, Windy), which makes them a natural fit for ID3.
Business Decision Making
Customer segmentation
Deciding loan approval (Approve / Reject)
Credit risk analysis
ID3 helps convert data into simple if–else rules for managers.
Marketing & Customer Analysis
Predicting whether a customer will buy a product
Identifying target customers based on preferences
Categorical customer data works well with ID3.
Limitations of ID3
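Biased toward attributes with many distinct values, because splitting on them tends to produce a higher information gain
Cannot directly handle continuous data; numeric attributes must be discretized first
Sensitive to noisy data
Can overfit the training data, since the basic algorithm has no pruning step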
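The first limitation can be illustrated with the same 14-sample root used earlier. Assume a hypothetical "sample ID" attribute that takes a different value for every row: each child then holds a single sample, every child is perfectly pure, and the information gain equals the full root entropy even though the attribute has no predictive value. The helper names below are illustrative.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a node, given its class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = sum(parent)
    return entropy(parent) - sum(sum(ch) / n * entropy(ch) for ch in children)

root = [9, 5]  # 9 Yes / 5 No

# Hypothetical ID-like attribute: 14 distinct values, one sample per child.
id_split = [[1, 0]] * 9 + [[0, 1]] * 5
gain_id = information_gain(root, id_split)

# Feature 2 from the worked example, for comparison.
gain_f2 = information_gain(root, [[5, 1], [4, 4]])

print(f"IG(sample ID) = {gain_id:.3f}")  # prints: IG(sample ID) = 0.940
print(f"IG(Feature 2) = {gain_f2:.3f}")  # prints: IG(Feature 2) = 0.090
```

Plain information gain would pick the ID-like attribute here, which is why many-valued attributes need extra care with ID3.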




