5 Best Ways to Check Whether One Tree Is Subtree of Another in Python

Rate this post

π‘ Problem Formulation: Trees are a fundamental data structure in computer science, and a common problem is to determine if one tree is a subtree of another. Given two binary trees, one called ‘T’ and the other ‘S’, the challenge is to check if ‘S’ is a subtree of ‘T’. A subtree of ‘T’ is defined as a tree consisting of a node in ‘T’ and all of its descendants in ‘T’. The subtree corresponding to the root node is the entire tree; the subtree corresponding to any other node is called a proper subtree. The desired output is a boolean value, `True` if ‘S’ is a subtree of ‘T’, otherwise `False`.

Method 1: Using Recursion to Compare Nodes

This method involves a recursive strategy to compare each node of both trees for equality. A helper function is used to check if two trees are identical starting from the root node, while the main function calls this helper recursively on each node of tree ‘T’ to find if ‘S’ is its subtree.

Here’s an example:

```class TreeNode:
def __init__(self, x):
self.val = x
self.left = None
self.right = None

def isSubtree(s, t):
if not s:
return False
if isSameTree(s, t):
return True
return isSubtree(s.left, t) or isSubtree(s.right, t)

def isSameTree(s, t):
if not s and not t:
return True
if s and t and s.val == t.val:
return isSameTree(s.left, t.left) and isSameTree(s.right, t.right)
return False

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(T, S))```

Output: `True`

This code snippet defines a binary tree and two functions: `isSameTree()` which recursively verifies if two trees are identical, and `isSubtree()` which leverages `isSameTree()` to determine if the second tree is a subtree of the first one. In the given example, tree ‘S’ which is a single node tree with the value ‘2’ is indeed a subtree of tree ‘T’.

Method 2: String Serialization of Trees

By serializing both trees to a string representation, this method uses a substring check to determine if one tree is the subtree of the other. The preorder traversal is typically used to serialize the tree, ensuring that null nodes are represented in the serialization to maintain tree structure.

Here’s an example:

```class TreeNode:
def __init__(self, x):
self.val = x
self.left = None
self.right = None

def isSubtree(s, t):
def serialize(node):
return '^' + str(node.val) + '#' + serialize(node.left) + serialize(node.right) if node else '\$'

strT = serialize(t)
strS = serialize(s)
return strS in strT

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(S, T))```

Output: `True`

In this snippet, the trees are serialized into strings using the `serialize()` function. This includes a marker for the start of each node, its value, and a marker for null children. `isSubtree()` checks if serialized tree ‘S’ is a substring of serialized tree ‘T’. This indicates ‘S’ is indeed a subtree of ‘T’ and outputs `True`.

Method 3: Using Hashing for Tree Comparison

This technique builds on the previous serialization approach by using hashing (e.g., SHA or MD5) to create a unique fingerprint for each subtree during traversal. Subtree comparison then becomes a simple hash value comparison, improving efficiency for large trees.

Here’s an example:

```import hashlib

class TreeNode:
def __init__(self, x):
self.val = x
self.left = None
self.right = None

def hash_tree(node):
if not node:
return "null"
left_hash = hash_tree(node.left)
right_hash = hash_tree(node.right)
node_hash = hashlib.md5(("Node" + str(node.val) + left_hash + right_hash).encode()).hexdigest()
return node_hash

def isSubtree(s, t):
hash_t = hash_tree(t)
hash_s = hash_tree(s)

return hash_t == hash_s

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(S, T))```

Output: `False`

The code snippet employs the `hashlib` library to generate MD5 hashes for each subtree in the `hash_tree()` function. Unfortunately, this method outputs `False` because it only checks if the whole tree ‘S’ is identical to tree ‘T’, rather than if ‘S’ is a subtree of ‘T’. It demonstrates a hash-based comparison technique but highlights a limitation of this approach in handling subtrees.

Method 4: Optimized Recursive Comparison

Improving upon the basic recursive method, this optimized solution checks subtree structure at each step of the recursive calls and stops early if a mismatch is found. It thus avoids unnecessary checks, which leads to efficiency gains especially when the answer is negative early on.

Here’s an example:

```class TreeNode:
def __init__(self, x):
self.val = x
self.left = None
self.right = None

def compareTrees(s, t):
if s is None and t is None:
return True
if s is None or t is None:
return False
return s.val == t.val and compareTrees(s.left, t.left) and compareTrees(s.right, t.right)

def checkSubtree(T, S):
if S is None:
return True
if T is None:
return False
if compareTrees(T, S):
return True
return checkSubtree(T.left, S) or checkSubtree(T.right, S)

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

output = checkSubtree(T, S)
print(output)```

Output: `True`

The code defines a `compareTrees()` function for a direct comparison between two trees and a `checkSubtree()` function that extends the algorithm to efficiently check all subtrees within ‘T’ for equality with ‘S’. If a match is found, the function returns early, improving the overall efficiency. In the case of the trees provided, the result is `True` because ‘S’ is a subtree of ‘T’.

Bonus One-Liner Method 5: Leveraging Python’s Built-in Libraries

Possibly the most Pythonic way to solve this challenge, this approach utilizes Python’s standard libraries to help serialize the trees. The `pickle` library can be used here to serialize the entire tree, and then we can check for the serialized object as a substring.

Here’s an example:

```import pickle

class TreeNode:
def __init__(self, x):
self.val = x
self.left = None
self.right = None

def isSubtree(s, t):
t_dump = pickle.dumps(t)
s_dump = pickle.dumps(s)
return s_dump in t_dump

# Example trees
T = TreeNode(1)
T.left = TreeNode(2)
T.right = TreeNode(3)
S = TreeNode(2)

print(isSubtree(S, T))```

Output: `False`

This snippet serializes trees ‘T’ and ‘S’ using Python’s `pickle` module and then checks for the presence of ‘S’ within ‘T’ using simple substring matching. However, the output is `False` because pickled data contains metadata that can prevent this approach from being reliable for subtree checks. The example serves to illustrate a creative one-liner approach, but may not be suitable for all scenarios.

Summary/Discussion

• Method 1: Recursive Node Comparison. Straightforward and intuitive implementation. Potential inefficiency with large or complex trees.
• Method 2: String Serialization. Simplifies the problem to a string search. Not the most space-efficient due to string storage.
• Method 3: Hashing Tree Comparison. Unique finger-printing of trees. Not suitable for subtree checks as it only validates complete tree matches.
• Method 4: Optimized Recursive Comparison. Improved version of Method 1 with early termination. Effectively balances readability and performance.
• Method 5: Python’s Built-in Libraries. A Pythonic one-liner. Elegant but unreliable due to metadata in serialized objects.