5 Best Ways to Find the Number of Distinct Islands in a 2D Matrix in Python

πŸ’‘ Problem Formulation: Determining the number of distinct islands in a 2D matrix is a common problem in algorithmic tasks and coding challenges. An island is defined as a group of connected 1s (vertically or horizontally) surrounded by 0s. Distinct islands are uniquely shaped groups of connected 1s. This article demonstrates how to compute the number of distinct islands in a 2D binary matrix using Python. For example, given the following matrix:

[[1, 1, 0, 0, 0],
 [1, 1, 0, 0, 1],
 [0, 0, 0, 1, 1],
 [0, 0, 0, 0, 0],
 [0, 0, 0, 0, 1]]

The desired output is 3, corresponding to the three distinct islands within the matrix.

Method 1: Depth-First Search (DFS) with Matrix Modification

In this method, a depth-first search algorithm is applied to find islands. The method includes iterating over each cell in the matrix. When a ‘1’ is found, it initiates a DFS to mark all adjacent cells as part of the same island. These cells are then set to ‘0’ to avoid recounting. This method modifies the input matrix.

Here’s an example:

def depth_first_search(matrix, x, y):
    if x < 0 or x >= len(matrix) or y < 0 or y >= len(matrix[0]) or matrix[x][y] == 0:
        return
    matrix[x][y] = 0
    depth_first_search(matrix, x - 1, y)
    depth_first_search(matrix, x + 1, y)
    depth_first_search(matrix, x, y - 1)
    depth_first_search(matrix, x, y + 1)

def distinct_islands_dfs(matrix):
    island_count = 0
    for i in range(len(matrix)):
        for j in range(len(matrix[i])):
            if matrix[i][j] == 1:
                depth_first_search(matrix, i, j)
                island_count += 1
    return island_count

matrix = [[1, 1, 0, 0, 0],
          [1, 1, 0, 0, 1],
          [0, 0, 0, 1, 1],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 1] ]
print(distinct_islands_dfs(matrix))

The output of this code snippet is 3.

This code initiates a DFS whenever it encounters a ‘1’, marking all reachable ‘1’s connected to it, therefore sinking the island. By converting the island cells to ‘0’s, we ensure we don’t count the same land twice, and we increment our counter whenever an island is fully explored.

Method 2: DFS Without Modifying the Original Matrix

This method also employs DFS to explore islands but without altering the original matrix. Instead of modifying the original matrix, we utilize a separate visited set to keep track of cells that have already been included in an island.

Here’s an example:

def depth_first_search(matrix, x, y, visited):
    if (x, y) in visited or x < 0 or x >= len(matrix) or y < 0 or y >= len(matrix[0]) or matrix[x][y] == 0:
        return
    visited.add((x, y))
    depth_first_search(matrix, x - 1, y, visited)
    depth_first_search(matrix, x + 1, y, visited)
    depth_first_search(matrix, x, y - 1, visited)
    depth_first_search(matrix, x, y + 1, visited)

def distinct_islands(matrix):
    island_count = 0
    visited = set()
    for i in range(len(matrix)):
        for j in range(len(matrix[i])):
            if matrix[i][j] == 1 and (i, j) not in visited:
                depth_first_search(matrix, i, j, visited)
                island_count += 1
    return island_count

matrix = [[1, 1, 0, 0, 0],
          [1, 1, 0, 0, 1],
          [0, 0, 0, 1, 1],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 1] ]
print(distinct_islands(matrix))

The output of this code snippet is also 3.

The example code performs a DFS search to find and mark the islands in the matrix, this time using a set to remember visited cells. This approach keeps the original matrix intact and still counts the distinct islands.

Method 3: Breadth-First Search (BFS)

If you prefer an iterative approach, you can use Breadth-First Search. This method requires a queue to explore each island level by level. We iterate through the matrix and whenever we find a ‘1’, we use BFS to visit all parts of the island and mark them as visited in a separate data structure.

Here’s an example:

from collections import deque

def breadth_first_search(matrix, i, j, visited):
    visited.add((i, j))
    queue = deque([(i, j)])
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1,0), (-1,0), (0,1), (0,-1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < len(matrix) and 0 <= ny < len(matrix[0]) and matrix[nx][ny] == 1 and (nx, ny) not in visited:
                visited.add((nx, ny))
                queue.append((nx, ny))

def distinct_islands_bfs(matrix):
    island_count = 0
    visited = set()
    for i in range(len(matrix)):
        for j in range(len(matrix[0])):
            if matrix[i][j] == 1 and (i, j) not in visited:
                breadth_first_search(matrix, i, j, visited)
                island_count += 1
    return island_count

matrix = [[1, 1, 0, 0, 0],
          [1, 1, 0, 0, 1],
          [0, 0, 0, 1, 1],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 1]]

print(distinct_islands_bfs(matrix))

The output of this code snippet is 3.

By leveraging BFS, this method iteratively explores each island’s levels. Once an island part is visited, it is added to a queue, and its neighbors are explored in a breadth-first manner. This also avoids modifying the original matrix.

Method 4: Union-Find (Disjoint Set)

The Union-Find algorithm, also known as Disjoint Set, is a useful data structure to keep track of a set of elements partitioned into a number of disjoint subsets. It can be used to find islands as each element can start as its own set, and parts of the same island can be united as they are discovered.

Here’s an example:

class UnionFind:
    def __init__(self, grid):
        self.count = 0
        self.parent = {}
        self.rank = {}
        for i in range(len(grid)):
            for j in range(len(grid[0])):
                if grid[i][j] == 1:
                    self.parent[(i, j)] = (i, j)
                    self.rank[(i, j)] = 0
                    self.count += 1

    def find(self, position):
        if self.parent[position] != position:
            self.parent[position] = self.find(self.parent[position])
        return self.parent[position]

    def union(self, x, y):
        rootx = self.find(x)
        rooty = self.find(y)
        if rootx != rooty:
            if self.rank[rootx] > self.rank[rooty]:
                self.parent[rooty] = rootx
            elif self.rank[rootx] < self.rank[rooty]:
                self.parent[rootx] = rooty
            else:
                self.parent[rooty] = rootx
                self.rank[rootx] += 1
            self.count -= 1

def numDistinctIslands(grid):
    if not grid or not grid[0]:
        return 0
    uf = UnionFind(grid)
    directions = [(1,0), (-1,0), (0,1), (0,-1)]
    for i in range(len(grid)):
        for j in range(len(grid[0])):
            if grid[i][j] == 1:
                for d in directions:
                    ni, nj = i + d[0], j + d[1]
                    if 0 <= ni < len(grid) and 0 <= nj < len(grid[0]) and grid[ni][nj] == 1:
                        uf.union((i, j), (ni, nj))
    return uf.count

matrix = [[1, 1, 0, 0, 0],
          [1, 1, 0, 0, 1],
          [0, 0, 0, 1, 1],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 1]]
print(numDistinctIslands(matrix))

The output of this code snippet is 3.

This code uses a Union-Find or Disjoint Set data structure to keep track of disjoint subsets of connected cells, effectively counting the distinct islands as they are merged. Each cell starts as its own set, and the “union” operation merges sets when parts of the same island are found.

Bonus One-Liner Method 5: Sophisticated Library Use

In Python ecosystems, there might be specific libraries that provide utilities to solve the island counting problem, like scikit-image which has label function for connected component analysis. However, the code example is hypothetical as this library isn’t specialized for such kind of task, so this method won’t have an associated example or output.

Summary/Discussion

  • Method 1: Depth-First Search with Matrix Modification. Strengths: Straightforward and efficient. Weaknesses: Modifies the input matrix, which might not be desirable.
  • Method 2: DFS Without Modifying the Original Matrix. Strengths: Preserves the original matrix. Weaknesses: Slightly more complex due to the use of a separate visited set.
  • Method 3: Breadth-First Search. Strengths: Iterative and works well for broader islands. Weaknesses: Generally consumes more memory than DFS since it stores all nodes at a given depth.
  • Method 4: Union-Find (Disjoint Set). Strengths: Very efficient for large matrices with lots of connections. Weaknesses: More complex to implement and understand than DFS/BFS.
  • Bonus Method 5: Sophisticated Library Use. Strengths: Can be very efficient and clean if a suitable library is available. Weaknesses: Depends on external libraries and might not be as customizable.