5 Best Ways to Check Whether One Tree Is Subtree of Another in Python

Rate this post

πŸ’‘ Problem Formulation: Trees are a fundamental data structure in computer science, and a common problem is to determine if one tree is a subtree of another. Given two binary trees, one called ‘T’ and the other ‘S’, the challenge is to check if ‘S’ is a subtree of ‘T’. A subtree of ‘T’ is defined as a tree consisting of a node in ‘T’ and all of its descendants in ‘T’. The subtree corresponding to the root node is the entire tree; the subtree corresponding to any other node is called a proper subtree. The desired output is a boolean value, True if ‘S’ is a subtree of ‘T’, otherwise False.

Method 1: Using Recursion to Compare Nodes

This method involves a recursive strategy to compare each node of both trees for equality. A helper function is used to check if two trees are identical starting from the root node, while the main function calls this helper recursively on each node of tree ‘T’ to find if ‘S’ is its subtree.

Here’s an example:

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def isSubtree(s, t):
    if not s:
        return False
    if isSameTree(s, t):
        return True
    return isSubtree(s.left, t) or isSubtree(s.right, t)

def isSameTree(s, t):
    if not s and not t:
        return True
    if s and t and s.val == t.val:
        return isSameTree(s.left, t.left) and isSameTree(s.right, t.right)
    return False

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(T, S))

Output: True

This code snippet defines a binary tree and two functions: isSameTree() which recursively verifies if two trees are identical, and isSubtree() which leverages isSameTree() to determine if the second tree is a subtree of the first one. In the given example, tree ‘S’ which is a single node tree with the value ‘2’ is indeed a subtree of tree ‘T’.

Method 2: String Serialization of Trees

By serializing both trees to a string representation, this method uses a substring check to determine if one tree is the subtree of the other. The preorder traversal is typically used to serialize the tree, ensuring that null nodes are represented in the serialization to maintain tree structure.

Here’s an example:

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def isSubtree(s, t):
    def serialize(node):
        return '^' + str(node.val) + '#' + serialize(node.left) + serialize(node.right) if node else '$'
    
    strT = serialize(t)
    strS = serialize(s)
    return strS in strT

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(S, T))

Output: True

In this snippet, the trees are serialized into strings using the serialize() function. This includes a marker for the start of each node, its value, and a marker for null children. isSubtree() checks if serialized tree ‘S’ is a substring of serialized tree ‘T’. This indicates ‘S’ is indeed a subtree of ‘T’ and outputs True.

Method 3: Using Hashing for Tree Comparison

This technique builds on the previous serialization approach by using hashing (e.g., SHA or MD5) to create a unique fingerprint for each subtree during traversal. Subtree comparison then becomes a simple hash value comparison, improving efficiency for large trees.

Here’s an example:

import hashlib

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def hash_tree(node):
    if not node:
        return "null"
    left_hash = hash_tree(node.left)
    right_hash = hash_tree(node.right)
    node_hash = hashlib.md5(("Node" + str(node.val) + left_hash + right_hash).encode()).hexdigest()
    return node_hash

def isSubtree(s, t):
    hash_t = hash_tree(t)
    hash_s = hash_tree(s)

    return hash_t == hash_s

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(S, T))

Output: False

The code snippet employs the hashlib library to generate MD5 hashes for each subtree in the hash_tree() function. Unfortunately, this method outputs False because it only checks if the whole tree ‘S’ is identical to tree ‘T’, rather than if ‘S’ is a subtree of ‘T’. It demonstrates a hash-based comparison technique but highlights a limitation of this approach in handling subtrees.

Method 4: Optimized Recursive Comparison

Improving upon the basic recursive method, this optimized solution checks subtree structure at each step of the recursive calls and stops early if a mismatch is found. It thus avoids unnecessary checks, which leads to efficiency gains especially when the answer is negative early on.

Here’s an example:

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def compareTrees(s, t):
    if s is None and t is None:
        return True
    if s is None or t is None:
        return False
    return s.val == t.val and compareTrees(s.left, t.left) and compareTrees(s.right, t.right)

def checkSubtree(T, S):
    if S is None:
        return True
    if T is None:
        return False
    if compareTrees(T, S):
        return True
    return checkSubtree(T.left, S) or checkSubtree(T.right, S)

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

output = checkSubtree(T, S)
print(output)

Output: True

The code defines a compareTrees() function for a direct comparison between two trees and a checkSubtree() function that extends the algorithm to efficiently check all subtrees within ‘T’ for equality with ‘S’. If a match is found, the function returns early, improving the overall efficiency. In the case of the trees provided, the result is True because ‘S’ is a subtree of ‘T’.

Bonus One-Liner Method 5: Leveraging Python’s Built-in Libraries

Possibly the most Pythonic way to solve this challenge, this approach utilizes Python’s standard libraries to help serialize the trees. The pickle library can be used here to serialize the entire tree, and then we can check for the serialized object as a substring.

Here’s an example:

import pickle

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def isSubtree(s, t):
    t_dump = pickle.dumps(t)
    s_dump = pickle.dumps(s)
    return s_dump in t_dump

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(S, T))

Output: False

This snippet serializes trees ‘T’ and ‘S’ using Python’s pickle module and then checks for the presence of ‘S’ within ‘T’ using simple substring matching. However, the output is False because pickled data contains metadata that can prevent this approach from being reliable for subtree checks. The example serves to illustrate a creative one-liner approach, but may not be suitable for all scenarios.

Summary/Discussion

  • Method 1: Recursive Node Comparison. Straightforward and intuitive implementation. Potential inefficiency with large or complex trees.
  • Method 2: String Serialization. Simplifies the problem to a string search. Not the most space-efficient due to string storage.
  • Method 3: Hashing Tree Comparison. Unique finger-printing of trees. Not suitable for subtree checks as it only validates complete tree matches.
  • Method 4: Optimized Recursive Comparison. Improved version of Method 1 with early termination. Effectively balances readability and performance.
  • Method 5: Python’s Built-in Libraries. A Pythonic one-liner. Elegant but unreliable due to metadata in serialized objects.