The Best-First Search Algorithm in Python

4.5/5 - (2 votes)

You can watch the slides as a GIF here:

And download the slides as PDF here.

What’s the Best-First Search Algorithm?

After several articles on uninformed search algorithms, we continue our journey to informed search algorithms. The first one in the series is the Best-First search algorithm.

In general, informed search algorithms use some kind of auxiliary information to guide their search strategy. Not being statically determined upfront makes them an interesting choice for a wide range of applications. However, their performance is greatly determined by the quality of auxiliary information, commonly known in computer science as heuristic function, h(vertex).

A best-first search algorithm in this article uses a greedy, hence best-first approach, where the next vertex choice is determined by its immediate value, rather than the overall quality of the path otherwise determined by the algorithm.

Before we’ll dive into the algorithm and the Python implementation, let’s first skim over some related graph tutorials you may enjoy and that may help your understanding!

This algorithm is part of our graph algorithm tutorials:

Each of these tutorial links opens in a new browser tab.

What’s the Purpose of Best-First Search?

Depending on the quality and type of heuristic function, the best-first search algorithm can behave as both the DFS (depth-first search algorithm) and BFS (breadth-first search algorithm). It can also switch between them and is more efficient than BFS and DFS.

Applications: Therefore, the best-first search algorithm shares the domain of application with both algorithms, among others, like finding connected components, performing topological sorting, finding the bridges of a graph, determining the closeness of any two vertices in a graph or a tree, and solving puzzles with a unique solution, such as labyrinths.

However, the best-first search algorithm is not optimal; it can get stuck in a loop or the worst case, even perform as a DFS.

Best-First Search Overview – How Does It Work?

The best-first search algorithm starts the graph traversal by marking the start vertex as visited, i.e. putting it in the dictionary and placing it into the priority queue of candidate vertices. We will use the term explored, which is synonymous with terms expanded or extended in other literature.

Vertex priority determines the next, best-first vertex to be explored. Then, the best and currently, the only vertex is chosen to be explored. The algorithm will check if the vertex corresponds to the entity being searched for (in our example below, this is commented as a trivial check).

  • If the entity being searched for is found, the algorithm will stop executing and it will return the corresponding vertex. 
  • Otherwise, the algorithm will loop through its neighboring, unvisited vertices and place them into the priority queue.

Once again, the cycle of choosing, exploring, and populating the priority queue continues, until the priority queue becomes exhausted. At that point, the best-first search algorithm stops its execution. Since the heuristic function greatly influences the algorithm performance, the function’s accuracy is crucial.

What Are Properties of Best-First Search?

The main property of the best-first search algorithm lies in its versatility, i.e., the fact that it can switch between the BFS and DFS approach of traversing the graph.

Its performance depends on the quality of the heuristic function, which in most cases represents the distance estimation from the goal vertex. The choice of heuristic function can influence the algorithm to find the shortest possible path to the goal vertex, to never complete the search, and everything in between these two extremes.

How is Best-First Search Implemented?

The implementation of our best-first search algorithm is achieved by function best_first() and a modification of the underlying class Graph.

The best_first() function takes three parameters:

  • The graph parameter takes an initialized Graph object (see the blog on the breadth-first search algorithm, the section on graphs).
  • The start_vertex parameter takes the starting vertex, which we choose freely (remember, a graph is not a tree, there is no absolute root).
  • The target parameter is the entity we want to find in the graph, enclosed in a vertex.

For a better understanding of the algorithm and its implementation, each step is precisely described in the code below.

First, we will take a look at the modifications (marked) of the Graph.Vertex subclass:

class Vertex:
    __slots__ = '_entity', '_h'

    def __init__(self, entity, h=0):
        self._entity = entity
        self._h = h

    # The real-world entity is represented by the Vertex object.
    def entity(self):
        return self._entity

    # The real-world entity has a heuristic function of h.
    def h(self):
        return self._h

    # We have to implement __hash__ to use the object as a dictionary key.
    def __hash__(self):
        return hash(id(self))

    def __lt__(self, other):
        return self.h() < other.h()

Special attribute __slots__ is extended by adding a second internal variable/function h via the parameter _h of the initialization method __init__.

The next important change refers to the introduction of the object comparison operator less than, < by implementing a special method __lt__. We require this method to enable the comparison of the Vertex objects in a priority queue.

With these changes in place, implementation of the core function, best_first() is:

def best_first(graph, start_vertex, target):
    # Create the priority queue for open vertices.
    vertices_pq = PriorityQueue()

    # Adds the start vertex to the priority queue.
    print(f'Visiting/queueing vertex {start_vertex.entity()}')
    print('Prioritized vertices (vertex, h(vertex)):',
          *((vert.entity(), vert.h()) for vert in vertices_pq.queue)
           , end=2 * '\n')

    # The starting vertex is visited first and has no leading edges.
    # If we did not put it into 'visited' in the first iteration,
    # it would end up in 'visited' during the second iteration, pointed to
    # by one of its children vertices as a previously unvisited vertex.
    visited[start_vertex] = None

    # Loops until the priority list gets empty.
    while not vertices_pq.empty():
        # Gets the vertex with the lowest cost.
        vertex = vertices_pq.get()
        print(f'Exploring vertex {vertex.entity()}')
        if vertex.entity() == target:
            return vertex
        # Examine each non-visited adjoining edge/vertex.
        for edge in graph.adjacent_edges(vertex):
            # Gets the second endpoint.
            v_2nd_endpoint = edge.opposite(vertex)

            if v_2nd_endpoint not in visited:
                # Adds the second endpoint to 'visited' and maps
                # the leading edge for the search path reconstruction.
                visited[v_2nd_endpoint] = edge

                print(f'Visiting/queueing vertex {v_2nd_endpoint.entity()}')
        print('Prioritized vertices (vertex, h(vertex)):',
              *((vert.entity(), vert.h()) for vert in vertices_pq.queue)
               , end=2 * '\n')
    return None

Before we can test the algorithm, we have to initialize a graph and build it by adding vertices and edges to it:

# Initializes an empty graph (object).
g = Graph()

# Loads the graph with the first ten vertices and arbitrary heuristics.
for i in range(10):
    g.add_vertex(i, i*2+1)

# Constructs the 'vertices' dictionary for a more
# convenient access during the graph construction.
vertices = {k.entity(): k for k in g.vertices()}

# Constructs an arbitrary graph from
# the existing vertices and edges.
g.add_edge(vertices[0], vertices[1])
g.add_edge(vertices[0], vertices[2])
g.add_edge(vertices[2], vertices[4])
g.add_edge(vertices[4], vertices[3])
g.add_edge(vertices[3], vertices[5])
g.add_edge(vertices[0], vertices[5])
g.add_edge(vertices[2], vertices[6])

# Initializes the search path and a dictionary of visited vertices.
path = []
visited = {}

Now that we have prepared everything, we can test best_first() and see how it works. Here is the part of the code that runs the algorithm, constructs the search path (if there is one), and shows in a step-by-step manner how it proceeds through the graph:

# Starts the search.
result = best_first(g, vertices[5], 6)

# If the entity is found...
if result is not None:
    # The search path ends with the found vertex (entity). 
    # Each vertex is a container for its real-world entity.
    path_vertex = result
    # The entity is added to the 'path'.
    # Constructs the rest of the search path (if it exists)...
    while True:
        # Gets a discovery edge leading to the vertex.
        path_edge = visited.get(path_vertex)
        # If the path vertex is the root, it has no discovery edge...
        if path_edge is None:
        # Otherwise, gets the second (parent vertex) endpoint.
        path_vertex = path_edge.opposite(path_vertex)
        # The entity is added to the 'path'.
    print('Search path found:', end=' ')
    # The path is reversed and starts with the root vertex.
    print(*reversed(path), sep=' -> ')
# Otherwise...
    print('\nEntity is not found')

The test run gave us the output:

Visiting/queueing vertex 5
Prioritized vertices (vertex, h(vertex)): (5, 11)

Exploring vertex 5
Visiting/queueing vertex 3
Visiting/queueing vertex 0
Prioritized vertices (vertex, h(vertex)): (0, 1) (3, 7)

Exploring vertex 0
Visiting/queueing vertex 1
Visiting/queueing vertex 2
Prioritized vertices (vertex, h(vertex)): (1, 3) (3, 7) (2, 5)

Exploring vertex 1
Prioritized vertices (vertex, h(vertex)): (2, 5) (3, 7)

Exploring vertex 2
Visiting/queueing vertex 4
Visiting/queueing vertex 6
Prioritized vertices (vertex, h(vertex)): (3, 7) (4, 9) (6, 13)

Exploring vertex 3
Prioritized vertices (vertex, h(vertex)): (4, 9) (6, 13)

Exploring vertex 4
Prioritized vertices (vertex, h(vertex)): (6, 13)

Exploring vertex 6
Search path found: 5 -> 0 -> 2 -> 6

Based on the output, we can see that the search started from vertex 5 and that the best_first() has found the entity vertex 6. The entire search path is also displayed, and we should note that the search path is the shortest one: 5 -> 0 -> 2 -> 6.

However, the path may not always be the shortest one, as we will demonstrate with the next example. Before we re-run the algorithm, we have to replace our (previously generated!) heuristic function values by explicit definition, thus forcing the algorithm to take a slight detour:

# Loads the graph with the first seven vertices and worse heuristics.
g.add_vertex(0, 3)
g.add_vertex(1, 6)
g.add_vertex(2, 4)
g.add_vertex(3, 1)
g.add_vertex(4, 2)
g.add_vertex(5, 7)
g.add_vertex(6, 5)

The re-run gave us the output:

Visiting/queueing vertex 5
Prioritized vertices (vertex, h(vertex)): (5, 7)

Exploring vertex 5
Visiting/queueing vertex 3
Visiting/queueing vertex 0
Prioritized vertices (vertex, h(vertex)): (3, 1) (0, 3)

Exploring vertex 3
Visiting/queueing vertex 4
Prioritized vertices (vertex, h(vertex)): (4, 2) (0, 3)

Exploring vertex 4
Visiting/queueing vertex 2
Prioritized vertices (vertex, h(vertex)): (0, 3) (2, 4)

Exploring vertex 0
Visiting/queueing vertex 1
Prioritized vertices (vertex, h(vertex)): (2, 4) (1, 6)

Exploring vertex 2
Visiting/queueing vertex 6
Prioritized vertices (vertex, h(vertex)): (6, 5) (1, 6)

Exploring vertex 6
Search path found: 5 -> 3 -> 4 -> 2 -> 6

After a re-run, we got a longer path to our solution without modifying the algorithm, but only by changing the heuristics values for our vertices. After our simple demonstration, we just noticed how sensitive the best-first algorithm is to the precision/selection of the heuristic function.

Efficiency Analysis Best-First Search

The algorithm’s worst-case time complexity is O(bd). It is determined by the heuristic function and the number of explored nodes, which increase exponentially with the depth of solution d over the branching factor b.

The algorithm’s worst-case space complexity is O(bd) with the depth of solution d over the branching factor b.

The best-first search algorithm is not optimal, as it can yield a search path longer than an optimal one. Other outcomes also include finding the shortest path, and never finding the path if the algorithm degenerates into a DFS and ends up in the infinite descent.

However, with careful selection of a heuristic function, predetermined by quality information about the problem being solved, the best-first search algorithm can be very efficient.


In this article, we learned about the best-first search algorithm.

  • First, we explained what a best-first search algorithm is.
  • Second, we took a look at what are its common purposes and applications.
  • Third, we went through an explanation of how the algorithm works.
  • Fourth, we examined the algorithm’s main properties.
  • Fifth, we went through the implementation of the algorithm, which is based on the Graph abstract data structure (for class implementation, see the blog on the breadth-first search algorithm and apply the changes to the Graph.Vertex subclass as indicated above). We also tested the algorithm by calling its main function, best_first(), and analyzed its steps of execution for the shortest and longest path scenarios.
  • Sixth, we analyzed the algorithm efficiency.

In the end, we concluded that the algorithm’s efficiency is not optimal, and if the solution exists, the best-first search algorithm will probably find it along the path determined by the heuristic function. The algorithm might also take a virtually infinite time in reaching the solution, but this behavior can be prevented by constructing the heuristic function using the relevant knowledge about the graph and vertice relationships.

Programmer Humor

Question: How did the programmer die in the shower? ☠️

Answer: They read the shampoo bottle instructions:
Lather. Rinse. Repeat.

Academy Course – Mastering the Top 10 Graph Algorithms

If you want to improve your fundamental computer science skills, there’s nothing more effective than studying algorithms.

To help you master the most important graph algorithms, we’ve just launched the “Top 10 Algorithms” course at the Finxter Computer Science Academy. This great course from Finxter Star Creator Matija ⭐ teaches you the most important graph algorithms such as BFS, DFS, A*, and Dijkstra. 

Understanding these algorithms will not only make you a better coder, but it’ll also lay a strong foundation on which you can build your whole career as a computer scientist.

Click the screenshot to find out more: