Effectively Removing Surrounded Islands in Python

Rate this post

πŸ’‘ Problem Formulation: The challenge involves writing a program to remove ‘islands’ from a matrix (2D list) where an island is defined as a group of connected ones (1’s) not at the border of the matrix, and water is represented by zeros (0’s). These islands are completely surrounded by water (0’s). For instance, given an input matrix:

[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 1, 1],
 [1, 1, 1, 1]]
the expected output after removing islands would be:
[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 0, 1],
 [1, 1, 1, 1]]

Method 1: Depth-First Search (DFS)

Depth-First Search (DFS) is a traversal algorithm that can be adapted to identify and remove islands that are not connected to the matrix border. By iteratively descending one path as far as possible before backtracking, we can mark all nodes (in this case, cells with a ‘1’) that are part of an island. We can modify DFS to avoid marking border-connected land, ensuring that only surrounded islands are removed.

Here’s an example:

def removeIslands(matrix):
    def dfs(i, j):
        if i = len(matrix) or j = len(matrix[0]) or matrix[i][j] != 1:
            return
        matrix[i][j] = 2
        directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        for dir in directions:
            dfs(i + dir[0], j + dir[1])
        
    for i in [0, len(matrix) - 1]:
        for j in range(len(matrix[0])):
            if matrix[i][j] == 1:
                dfs(i, j)
                
    for i in range(len(matrix)):
        for j in [0, len(matrix[0]) - 1]:
            if matrix[i][j] == 1:
                dfs(i, j)
    
    for i in range(len(matrix)):
        for j in range(len(matrix[0])):
            if matrix[i][j] == 1:
                matrix[i][j] = 0
            elif matrix[i][j] == 2:
                matrix[i][j] = 1
                
    return matrix

The output of the code would be:

[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 0, 1],
 [1, 1, 1, 1]]

This method first changes all 1’s that are connected to the border to 2’s (indicating they are not to be removed), then changes the remaining 1’s (the islands) to 0’s, and finally, changes the 2’s back to 1’s. This way, only surrounded by water “islands” are removed from the matrix.

Method 2: Using a Queue for Breadth-First Search (BFS)

Breadth-First Search (BFS), similar to DFS, is used for traversing or searching tree or graph data structures. The BFS algorithm can also be used to identify and remove surrounded islands by visiting all nodes level-by-level. Utilizing a queue helps in visiting each cell of the matrix, starting from the border cells, and only marking those which are not connected to the borders. All cells still marked as ‘1’ after the BFS will be part of an island and thus changed to ‘0’.

Here’s an example:

from collections import deque

def removeIslands(matrix):
    rows, cols = len(matrix), len(matrix[0])
    queue = deque()

    for r in range(rows):
        for c in range(cols):
            if r in [0, rows - 1] or c in [0, cols - 1]:
                if matrix[r][c] == 1:
                    queue.append((r, c))
                    matrix[r][c] = 2

    while queue:
        r, c = queue.popleft()
        directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        for dr, dc in directions:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and matrix[rr][cc] == 1:
                queue.append((rr, cc))
                matrix[rr][cc] = 2

    for r in range(rows):
        for c in range(cols):
            if matrix[r][c] == 1:
                matrix[r][c] = 0
            elif matrix[r][c] == 2:
                matrix[r][c] = 1

    return matrix

The output of the code would be:

[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 0, 1],
 [1, 1, 1, 1]]

In this BFS implementation, a queue is used to visit the cells with a first-in-first-out order. Starting from the borders, all connected 1’s are turned into 2’s, preserving the perimeter. Then, the unvisited 1’s, which are the islands, are set to 0’s. Finally, the temporary 2’s are converted back to 1’s to restore the matrix without the islands.

Method 3: Union-Find (Disjoint Set)

The Union-Find algorithm, also known as the Disjoint Set algorithm, maintains a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It can be used to identify connected components and is especially efficient for union and find operations. When checking for surrounded islands, this algorithm can effectively group all the connected components and identify which ones are islands by checking their connection to the border.

Here’s an example:

class UnionFind:    
    def __init__(self, size):
        self.parent = [i for i in range(size)]

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rootX = self.find(x)
        rootY = self.find(y)
        self.parent[rootY] = rootX

def removeIslands(matrix):
    rows, cols = len(matrix), len(matrix[0])
    uf = UnionFind(rows * cols + 1)
    dummy = rows * cols

    for r in range(rows):
        for c in range(cols):
            if matrix[r][c] == 1:
                if r in [0, rows - 1] or c in [0, cols - 1]:
                    uf.union(r * cols + c, dummy)
                else:
                    for dr, dc in ((1, 0), (0, 1)):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols and matrix[rr][cc] == 1:
                            uf.union(r * cols + c, rr * cols + cc)

    for r in range(rows):
        for c in range(cols):
            if matrix[r][c] == 1:
                if uf.find(r * cols + c) != uf.find(dummy):
                    matrix[r][c] = 0

    return matrix

The output of the code would be:

[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 0, 1],
 [1, 1, 1, 1]]

This Union-Find implementation initializes each cell as a separate set and unions them when they are horizontally or vertically adjacent to one another. All border cells are unioned with a dummy node, indicating that they are connected to the border. After the unions are complete, if a cell’s root is not the dummy node, it is an island and is set to 0.

Method 4: Recursive Flood Fill

Recursive Flood Fill is a simple algorithm often used to fill in areas on images or matrices. It starts at a target cell and changes its value, then recursively progresses to neighboring cells that match the starting cell’s original value. To remove surrounded islands, flood fill would be initiated from the border cells and would mark all land that is connected to the border. The remaining unmarked cells are parts of surrounded islands and can subsequently be set to water.

Here’s an example:

def removeIslands(matrix):
    def floodFill(r, c):
        if 0 <= r < len(matrix) and 0 <= c < len(matrix[0]) and matrix[r][c] == 1:
            matrix[r][c] = -1
            for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                floodFill(r + dr, c + dc)

    for i in range(len(matrix)):
        for j in [0, len(matrix[0]) - 1]:
            floodFill(i, j)
    for i in [0, len(matrix) - 1]:
        for j in range(len(matrix[0])):
            floodFill(i, j)

    for r in range(len(matrix)):
        for c in range(len(matrix[0])):
            if matrix[r][c] == 1:
                matrix[r][c] = 0
            elif matrix[r][c] == -1:
                matrix[r][c] = 1

    return matrix

The output of the code would be:

[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 0, 1],
 [1, 1, 1, 1]]

The recursive flood fill method marks all 1’s that are encountered starting from the border using a placeholder value (in this case, -1). After completing the flood fill, the matrix is iterated again, setting marked cells back to land (1) and the remaining unmarked 1’s (which are the islands) to water (0).

Bonus One-Liner Method 5: Iterative Flood Fill with Stack

A variation of the flood fill algorithm, the iterative version using a stack, avoids the potential problem of Python’s recursion limit for large matrices. Instead of using the call stack through recursion, this method maintains a stack data structure to track which cells have been visited and continue the flood until no further cells match the condition.

Here’s an example:

def removeIslands(matrix):
    rows, cols, stack = len(matrix), len(matrix[0]), []

    for r in range(rows):
        stack.extend([(r, 0), (r, cols - 1)])
    for c in range(cols):
        stack.extend([(0, c), (rows - 1, c)])

    while stack:
        r, c = stack.pop()
        if 0 <= r < rows and 0 <= c < cols and matrix[r][c] == 1:
            matrix[r][c] = -1
            stack.extend([(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)])

    for r in range(rows):
        for c in range(cols):
            matrix[r][c] = 1 if matrix[r][c] == -1 else 0

    return matrix

The output of the code would be:

[[1, 1, 1, 1],
 [1, 1, 0, 1],
 [1, 0, 0, 1],
 [1, 1, 1, 1]]

This iterative method works similarly to the recursive flood fill, but instead of function calls, the coordinates to be checked are pushed onto a stack. Each item popped from the stack is processed by marking the current land cell and pushing its adjacent cells onto the stack until no more land cells are found. Then, the process of resetting cells to either water or land proceeds as in the other methods.

Summary/Discussion

  • Method 1: Depth-First Search (DFS). Strengths include clarity and a logical approach to identifying and removing islands. Weaknesses are potential stack overflows with exceptionally large matrices due to recursion.
  • Method 2: Breadth-First Search (BFS) with a Queue. This approach is more iterative in nature and is less likely to hit recursion limits. However, it may have additional memory overhead compared to DFS due to the maintenance of the queue.
  • Method 3: Union-Find (Disjoint Set). Strengths involve efficient unite and find operations that can minimize the number of matrix traversals. Its weakness can be its more complex code and the overhead of the additional Union-Find data structure.
  • Method 4: Recursive Flood Fill. An intuitive method that fills islands recursively from the borders inward. Same as DFS, large matrices could hit the recursion limit.
  • Method 5: Iterative Flood Fill with Stack. This method avoids recursion limits and is quite efficient; however, its stack can grow large, resulting in higher memory usage comparable to the BFS queue.