5 Best Ways to Find the Lowest Common Ancestor of a Binary Tree Using Python

💡 Problem Formulation: Given a binary tree and two nodes, the task is to find the lowest common ancestor (LCA). The LCA is the lowest node in the tree that has both given nodes as descendants (a node can be a descendant of itself). For instance, given a tree rooted at node 1 and the nodes 2 and 3, the LCA would be node 1 as it is the root of both nodes.

Method 1: Recursive Approach

This method uses recursion to traverse the binary tree and find the LCA. Starting at the root node, the function calls itself on the left and right subtrees. The LCA is found where the paths from root to the two given nodes diverge. The function signature can be def findLCA(root, node1, node2).

Here’s an example:

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def findLCA(root, p, q):
    if not root or root == p or root == q:
        return root
    left = findLCA(root.left, p, q)
    right = findLCA(root.right, p, q)
    if left and right:
        return root
    return left if left else right

# Example tree and nodes:
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
lca = findLCA(root, root.left, root.right)
print(f'The LCA is: {lca.val}')

The output of this code snippet:

The LCA is: 1

This code snippet defines a simple TreeNode class and a function findLCA() that uses recursion to determine the LCA of two nodes in a binary tree. It returns the root when either of the nodes is found, or when it reaches a null pointer, indicating the end of the branch. The answer is deduced by left-right traversal and by checking the availability of the nodes in the left and right subtrees.

Method 2: Iterative Approach

Unlike the recursive approach, the iterative approach avoids the potential for stack overflow on large trees by using a while loop. The function uses parent pointers to trace back from each node to the root, storing the path for one node. It then traces the second node and finds the common node in the path. Function signature can be def findLCAIterative(root, node1, node2).

Here’s an example:

# Assuming TreeNode class as defined in Method 1

def findLCAIterative(root, p, q):
    # Stack for tree traversal
    stack = [root]
    # Dictionary for parent pointers
    parent = {root: None}
    
    # Iterate until we find both the nodes p and q
    while p not in parent or q not in parent:
        node = stack.pop()
        if node.left:
            parent[node.left] = node
            stack.append(node.left)
        if node.right:
            parent[node.right] = node
            stack.append(node.right)
    
    # Ancestors set() for node p
    ancestors = set()
    
    # Process all ancestors for node p using parent pointers
    while p:
        ancestors.add(p)
        p = parent[p]
    
    # The first ancestor of q which appears in p's ancestor set() is their LCA
    while q not in ancestors:
        q = parent[q]
    return q

# Example usage
lca = findLCAIterative(root, root.left, root.right)
print(f'The LCA is: {lca.val}')

The output of this code snippet:

The LCA is: 1

This code snippet first builds a dictionary parent that keeps track of each node’s parent as we traverse the entire tree. It uses two main steps: finding all ancestors of one node by following parent pointers, and then following the parent pointers from the second node until we find a common ancestor.

Method 3: Path Comparison

The path comparison approach first finds the paths from the root to the two nodes and then compares these paths to find the last common node. This method doesn’t modify the tree or require additional pointers but does use additional space for the paths. Function signature could be def findLCAPathComparison(root, node1, node2).

Here’s an example:

# Assuming TreeNode class as defined in Method 1

def findPath(root, path, k):
    if root is None:
        return False
    # Append this node to the path
    path.append(root)
    # See if the k is same as root's key
    if root == k:
        return True
    # Check if k is found in left or right sub-tree
    if (root.left and findPath(root.left, path, k)) or (root.right and findPath(root.right, path, k)):
        return True
    # If not present in subtree rooted with root, remove root from path and return False
    path.pop()
    return False

def findLCAPathComparison(root, p, q):
    # Find paths from root to p and q
    # It will store the nodes in the paths
    path1, path2 = [], []
    if not findPath(root, path1, p) or not findPath(root, path2, q):
        return None
    # Compare the paths to get the first different value
    i = 0
    while i < len(path1) and i < len(path2) and path1[i] == path2[i]:
        i += 1
    return path1[i-1]

# Example usage
lca = findLCAPathComparison(root, root.left, root.right)
print(f'The LCA is: {lca.val}')

The output of this code snippet:

The LCA is: 1

This code snippet involves two main functions: findPath(), which records the path to a given node, and findLCAPathComparison(), which compares the paths found to determine the LCA. The paths are stored as lists and compared element-wise to find the first non-matching node, which indicates the split point and thus the LCA.

Method 4: Tarjan’s Off-line Lowest Common Ancestors Algorithm

Tarjan’s off-line lowest common ancestors algorithm is an advanced method that finds LCAs for multiple pairs of nodes simultaneously. It uses a union-find data structure to dynamically keep track of which sets of nodes are reachable from one another as the tree is traversed. The function signature would be def tarjansOLCA(root).

This code snippet is omitted as Tarjan’s algorithm is complex and would significantly exceed the space limits of this article.

Bonus One-Liner Method 5: Using Python Libraries

If you’re using a Python data structure library that supports binary trees, such as networkx, you may find built-in methods to determine the LCA. This approach is quick and practical if the overhead of additional dependencies is acceptable for your project. Function signature might be nx.lowest_common_ancestor(G, node1, node2), where G is a Graph representation of the tree.

Here’s an example:

import networkx as nx

# Assuming G is a networkx Graph that represents our tree
# and nodes are added with G.add_node() and G.add_edge()

lca = nx.lowest_common_ancestor(G, 2, 3)
print(f'The LCA is: {lca}')

The output of this code snippet:

The LCA is: 1

This code snippet demonstrates how you could leverage the powerful networx library to directly find the LCA of two nodes. Assuming G is a properly built Graph representation of your binary tree, finding the LCA is as simple as calling the method lowest_common_ancestor with the two target nodes as arguments.

Summary/Discussion

Method 1: Recursive Approach. Strengths: Intuitive and simple to implement. Weaknesses: May cause a stack overflow on very large trees.
Method 2: Iterative Approach. Strengths: More memory-efficient on large trees. Weaknesses: Slightly more complex than the recursive approach.
Method 3: Path Comparison. Strengths: No recursion or modification to the tree. Weaknesses: Additional space complexity due to path storage.
Method 4: Tarjan’s Off-line Lowest Common Ancestors Algorithm. Strengths: Efficient for multiple LCA queries. Weaknesses: Algorithm complexity may make it difficult to implement and understand.
Method 5: Using Python Libraries. Strengths: Fast to implement with less code. Weaknesses: Requires additional dependencies.