5 Best Ways to Sort Using a Binary Search Tree in Python

Rate this post

πŸ’‘ Problem Formulation: Sorting a list of numbers for better data organization and retrieval is a common problem in computer science. A binary search tree is a node-based binary tree data structure which has the following properties: The left subtree of a node contains only nodes with keys less than the node’s key. The right subtree of a node contains only nodes with keys greater than the node’s key. Both the left and right subtrees must also be binary search trees. This article will demonstrate how to sort a list of integers [3, 1, 2, 4] by inserting them into a binary search tree and then performing an in-order traversal to retrieve the elements in sorted order: [1, 2, 3, 4].

Method 1: Implementing the Binary Search Tree Class

This method involves creating a binary search tree class from scratch. It includes methods to insert nodes in a manner that maintains the binary search tree property, and an in-order traversal method to output the sorted elements. The strength of this method lies in its classic approach and clarity in teaching binary search tree concepts.

Here’s an example:

class Node:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.val = key

def insert(root, key):
    if root is None:
        return Node(key)
    else:
        if root.val < key:
            root.right = insert(root.right, key)
        else:
            root.left = insert(root.left, key)
    return root

def inorder_traversal(root, result=None):
    if result is None:
        result = []
    if root:
        inorder_traversal(root.left, result)
        result.append(root.val)
        inorder_traversal(root.right, result)
    return result

# Example usage:
tree_root = None
for value in [3, 1, 2, 4]:
    tree_root = insert(tree_root, value)
sorted_list = inorder_traversal(tree_root)
print(sorted_list)

Output: [1, 2, 3, 4]

This code defines a Node class to store values and a binary search tree, with the insert() and inorder_traversal() functions defining the binary search tree structure and its traversal, respectively. By using these, the numbers are sorted as they are added to the tree and retrieved in sorted order using in-order traversal.

Method 2: Using a Library – bintrees

For those who prefer working with libraries, bintrees is a Python package that provides several tree-based classes including binary search trees. This method takes advantage of pre-written, optimized code. Its convenience and speed are the significant benefits, making it particularly suitable for more extensive datasets or applications where development speed is crucial.

Here’s an example:

from bintrees import BinaryTree

def sort_with_bintrees(values):
    bt = BinaryTree(values)
    return list(bt)

sorted_list = sort_with_bintrees([(3, None), (1, None), (2, None), (4, None)])
print(sorted_list)

Output: [(1, None), (2, None), (3, None), (4, None)]

In the example above, we use the BinaryTree class from the bintrees library to instantiate a binary tree. The values are added as tuples with the structure (key, value). After constructing the tree, we convert it back into a list to retrieve the sorted keys.

Method 3: Augmenting Python’s Sorted Containers

Python’s Sorted Containers is a library that offers performance advantages and ease of use. This method involves sorting data by inserting it into a SortedList, which maintains the sort order. This is particularly good for situations where the dataset changes frequently, as the list remains sorted after each insert.

Here’s an example:

from sortedcontainers import SortedList

values = [3, 1, 2, 4]
sl = SortedList()
for value in values:
    sl.add(value)
sorted_list = list(sl)
print(sorted_list)

Output: [1, 2, 3, 4]

The SortedList object from the Sorted Containers library handles the sorting internally when new items are added. This makes it an extremely straightforward way to maintain and work with a sorted collection of items.

Method 4: Object-Oriented Approach with Custom Node Class

This object-oriented approach takes the custom class method in Method 1 a step further by encapsulating the insertion and traversal abilities inside the node class itself. This is an excellent demonstration of encapsulation principles in OOP, making the BST class more resilient and easier to manage.

Here’s an example:

class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

    def insert(self, key):
        if key < self.key:
            if self.left is None:
                self.left = BSTNode(key)
            else:
                self.left.insert(key)
        else:
            if self.right is None:
                self.right = BSTNode(key)
            else:
                self.right.insert(key)

    def in_order(self):
        if self.left is not None:
            yield from self.left.in_order()
        yield self.key
        if self.right is not None:
            yield from self.right.in_order()

root = BSTNode(3)
for key in [1, 2, 4]:
    root.insert(key)
sorted_list = list(root.in_order())
print(sorted_list)

Output: [1, 2, 3, 4]

The code snippet shows a BSTNode class with insert() and in_order() methods. The node is initialized with a key value, and has methods to insert new nodes and perform an in-order traversal (using Python’s yield for generator-based traversal), producing a sorted list of elements.

Bonus One-Liner Method 5: Using Python Generators

For Python enthusiasts who love concise code, generators offer an elegant one-liner solution. This method uses generator expressions to sort a list by constructing a binary search tree implicitly and performing an in-order traversal directly within the generator expression. This method compacts the traditional approach into a minimalistic form that can be written in just one line.

Here’s an example:

sorted_list = (x for x in sorted([3, 1, 2, 4]))
print(list(sorted_list))

Output: [1, 2, 3, 4]

While this code doesn’t show a binary search tree explicitly, it uses the built-in sorted() function, which implements a sort based on the Timsort algorithm, and then constructs a generator that yields the sorted results. For the purpose of a concise demonstration, it serves as a reminder that sometimes a direct solution is readily available in Python’s standard library.

Summary/Discussion

  • Method 1: Implementing the Binary Search Tree Class. This method is educational and great for understanding the inner workings of binary search trees. However, it can be overkill for small datasets and is not as optimized as library alternatives.
  • Method 2: Using a Library – bintrees. Offers optimized tree operations care of a well-maintained library. While it’s powerful and convenient, it introduces an external dependency and can be less transparent for learning purposes.
  • Method 3: Augmenting Python’s Sorted Containers. This method provides both performance and usability, scaling well with large and dynamic datasets. The trade-off is that this method does not explicitly use a binary search tree under the hood.
  • Method 4: Object-Oriented Approach with Custom Node Class. It promotes better encapsulation and code structure, improving maintainability. It is more complex than necessary for simple sorting tasks and requires understanding of OOP concepts.
  • Method 5: One-Liner Using Python Generators. This is the epitome of Pythonic simplicity and is unbeatable for brevity. It is, however, somewhat deceptive as no actual binary search tree is implemented here.