5 Best Ways to Identify Colors in Images Using Python and OpenCV

💡 Problem Formulation: The challenge involves analyzing images to detect and identify colors accurately. An input image may contain various objects, and the desired output is information regarding the dominant colors present, with potential applications in image categorization, digital asset management, and visual search systems.

Method 1: Use of the inRange function for Color Detection

OpenCV’s inRange function builds a binary mask of the pixels whose channel values all fall between a lower and an upper bound. Applied to an image converted to the HSV color space, where hue is separated from saturation and brightness, it is particularly useful for highlighting a certain color or segmenting an image by color.


Here’s an example:

import cv2
import numpy as np

image = cv2.imread('image.jpg')
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Define the HSV range for blue (OpenCV stores 8-bit hue as 0-179, not 0-360)
lower_blue = np.array([110,50,50])
upper_blue = np.array([130,255,255])

# Threshold the image to get only blue colors
mask = cv2.inRange(hsv_image, lower_blue, upper_blue)

# Bitwise-AND mask and original image
result = cv2.bitwise_and(image, image, mask=mask)

cv2.imshow('image', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Output: A window displaying the image with only the blue regions kept; every other pixel is black.

This snippet loads the image, converts it to the HSV color space, and creates a mask that selects a range of blue hues. A bitwise AND with that mask then produces an output image that keeps only the pixels falling inside the specified blue range.
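To make the mask less of a black box, here is a minimal NumPy sketch of what the thresholding step computes, assuming the usual inRange behavior of an inclusive bounds check on every channel. The helper name in_range_mask and the tiny hand-written HSV array are illustrative assumptions, not part of OpenCV.

```python
import numpy as np

def in_range_mask(hsv_image, lower, upper):
    """Hypothetical NumPy stand-in for cv2.inRange on a 3-channel image."""
    # A pixel passes only if every channel lies within its inclusive bounds
    inside = np.all((hsv_image >= lower) & (hsv_image <= upper), axis=-1)
    # Like cv2.inRange, report 255 for matching pixels and 0 otherwise
    return inside.astype(np.uint8) * 255

# Tiny hand-written HSV image (1 row, 3 pixels): blue, green, very dark blue
hsv_image = np.array([[[120, 255, 255], [60, 255, 255], [120, 255, 30]]], dtype=np.uint8)
lower_blue = np.array([110, 50, 50])
upper_blue = np.array([130, 255, 255])

mask = in_range_mask(hsv_image, lower_blue, upper_blue)
coverage = np.count_nonzero(mask) / mask.size  # fraction of pixels in range
print(mask, coverage)
```

Only the first pixel passes: the green pixel fails the hue bound and the dark blue one fails the brightness bound. The coverage value is a quick way to report how much of an image a color occupies.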

Method 2: K-Means Clustering for Color Quantization

K-means clustering partitions data points into k groups around k center points. In the context of color identification, treating every pixel as a point in color space and clustering those points yields k representative colors, which serve as the dominant colors of the image.

Here’s an example:

import cv2
import numpy as np
from sklearn.cluster import KMeans

def find_dominant_colors(image, k=4):
    # Resize the image to reduce the computation time
    image = cv2.resize(image, (64, 64), interpolation = cv2.INTER_AREA)
    # Reshape the image to be a list of pixels
    pixels = image.reshape((-1, 3))
    
    # Perform KMeans
    # Fix n_init and random_state so repeated runs give the same clusters
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
    kmeans.fit(pixels)
    
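    # The cluster centers are the dominant colors, in OpenCV's BGR channel order
    colors = kmeans.cluster_centers_

    return colors.astype(int)

image = cv2.imread('image.jpg')
dominant_colors = find_dominant_colors(image)

print(dominant_colors)

Output: An array of k dominant color values. Because cv2.imread loads images in BGR order, each row reads blue, green, red; reverse the columns (dominant_colors[:, ::-1]) to get RGB.

The code resizes the image for faster computation, reshapes it into an array of pixels, and applies k-means clustering. The resulting cluster centers represent the dominant colors. They are returned in no particular order, so the first row is not necessarily the most common color.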
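To see what the clustering step is doing, and to rank the resulting colors by how much of the image they cover, here is a minimal NumPy sketch of plain k-means (Lloyd's iteration). It is a simplified illustration, not scikit-learn's implementation; the function name rank_dominant_colors, the fixed iteration count, and the synthetic pixel list are all assumptions, and it expects at least k distinct colors in the input.

```python
import numpy as np

def rank_dominant_colors(pixels, k, iterations=10, seed=0):
    """Plain k-means on an (n, 3) pixel array; returns centers sorted by share."""
    rng = np.random.default_rng(seed)
    pixels = pixels.astype(np.float64)
    # Start from k distinct colors so no cluster begins empty
    unique = np.unique(pixels, axis=0)
    centers = unique[rng.choice(len(unique), size=k, replace=False)]
    for _ in range(iterations):
        # Distance from every pixel to every center, shape (n, k)
        distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each center to the mean of the pixels assigned to it
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    # Rank clusters by the number of pixels they own, largest first
    counts = np.bincount(labels, minlength=k)
    order = counts.argsort()[::-1]
    return centers[order], counts[order] / len(pixels)

# Synthetic pixel list: 30 pixels of one color, 10 of another
pixels = np.array([[10, 20, 30]] * 30 + [[200, 210, 220]] * 10)
centers, shares = rank_dominant_colors(pixels, k=2)
print(centers.astype(int), shares)
```

With scikit-learn the same ranking can be obtained by counting the entries of kmeans.labels_ per cluster.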

Method 3: Color Histograms for Color Distribution Analysis

Color histograms are visual representations of the color distribution in an image. By plotting a histogram for each color channel, we can analyze the prominence of colors.

Here’s an example:

import cv2
from matplotlib import pyplot as plt

image = cv2.imread('image.jpg')
color = ('b','g','r')

# Plot histogram for each color channel
for i, col in enumerate(color):
    histogram = cv2.calcHist([image], [i], None, [256], [0, 256])
    plt.plot(histogram, color=col)

plt.xlim([0, 256])
plt.show()

Output: A plot displaying three color histograms (blue, green, red) for the image.

The code calculates a 256-bin histogram for each channel with cv2.calcHist and then uses matplotlib to plot the three curves. Peaks show which intensity values occur most often in each channel.
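Reading peaks off a plot by eye is imprecise, so the same information can also be pulled out numerically. The sketch below assumes an 8-bit BGR image like the one cv2.imread returns, and uses np.bincount as a NumPy equivalent of a 256-bin histogram. The function name channel_peaks and the synthetic image are made up for illustration.

```python
import numpy as np

def channel_peaks(image):
    """Return the most frequent intensity (0-255) of each BGR channel."""
    peaks = {}
    for i, name in enumerate(('b', 'g', 'r')):
        # One count per intensity value, i.e. a 256-bin histogram
        histogram = np.bincount(image[:, :, i].ravel(), minlength=256)
        peaks[name] = int(histogram.argmax())
    return peaks

# Synthetic 4x4 BGR image: orange everywhere except one black pixel
image = np.full((4, 4, 3), (0, 128, 255), dtype=np.uint8)
image[0, 0] = (0, 0, 0)
print(channel_peaks(image))
```

Keep in mind that per-channel peaks describe each channel separately; they do not say which blue, green, and red values occur together in the same pixel.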

Method 4: RGB Color Space Analysis

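Directly analyzing an image's raw color values gives us insight into the intensity of red, green, and blue in it. This method counts how often each distinct color triplet occurs in the pixel array. Note that OpenCV stores the channels in BGR order, so each triplet reads blue, green, red.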

Here’s an example:

import cv2
import numpy as np

image = cv2.imread('image.jpg')
# Reshape the image into a list of pixels and count how often each color occurs
unique_colors, counts = np.unique(image.reshape(-1, 3), axis=0, return_counts=True)

# Sort colors by occurrence
sorted_colors = unique_colors[counts.argsort()[::-1]]

print(sorted_colors[:5])  # Print top 5 colors

Output: An array of the five most frequent colors in the image, each in BGR order.

The code flattens the image array and uses numpy functions to find and sort the unique colors by their occurrence frequency, then prints out the top five most prominent colors.
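In photographs, nearly every pixel can be a slightly different color, so exact counts tend to be dominated by noise. One common workaround is to snap each channel to a coarse bucket before counting. The following is a rough sketch of that idea; the function name top_quantized_colors, the bucket width of 32, and the synthetic image are assumptions chosen for illustration.

```python
import numpy as np

def top_quantized_colors(image, step=32, top=3):
    """Count colors after snapping each channel to buckets of width `step`."""
    # Use a wider integer type so the arithmetic cannot wrap around in uint8
    pixels = image.reshape(-1, 3).astype(np.int32)
    # Replace every channel value by the center of its bucket
    quantized = (pixels // step) * step + step // 2
    colors, counts = np.unique(quantized, axis=0, return_counts=True)
    order = counts.argsort()[::-1][:top]
    return colors[order], counts[order]

# Synthetic 2x3 BGR image: four slightly different blues and two reds
image = np.array([[[250, 10, 5], [245, 12, 8], [240, 3, 1]],
                  [[252, 7, 2], [2, 8, 240], [9, 1, 230]]], dtype=np.uint8)
colors, counts = top_quantized_colors(image)
print(colors, counts)
```

Counted exactly, this image has six colors that each occur once. After quantization the four blues collapse into one bucket and the two reds into another, which matches how a person would describe the image.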

Bonus One-Liner Method 5: Simple Pixel-Counting

This rudimentary method counts the pixels whose color exactly matches a given target color.

Here’s an example:

import cv2
import numpy as np

image = cv2.imread('image.jpg')
target_color = np.array([0, 0, 255])  # Example target color: pure red (OpenCV uses BGR order)
matches = np.sum(np.all(image == target_color, axis=2))

print(matches)

Output: The number of pixels in the image that match the target color exactly.

The code snippet compares every pixel in the image to the target color and sums the matches, giving the total pixel count for that specific color.
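Exact matching rarely finds much in real photos, where compression and lighting shift pixel values by a few units. A possible variation is to accept pixels within a small tolerance of the target. This is only a sketch: the function name count_near_matches, the tolerance of 10, and the synthetic image are assumptions, and the target is given in OpenCV's BGR order.

```python
import numpy as np

def count_near_matches(image, target_bgr, tolerance=10):
    """Count pixels whose every channel is within `tolerance` of the target."""
    # Work in a signed type so the subtraction cannot wrap around in uint8
    target = np.asarray(target_bgr, dtype=np.int16)
    difference = np.abs(image.astype(np.int16) - target)
    return int(np.sum(np.all(difference <= tolerance, axis=2)))

# Synthetic 1x4 BGR image: pure red, two near-reds, and pure blue
image = np.array([[[0, 0, 255], [5, 3, 250], [0, 20, 255], [255, 0, 0]]], dtype=np.uint8)
red_bgr = (0, 0, 255)

exact = count_near_matches(image, red_bgr, tolerance=0)
near = count_near_matches(image, red_bgr, tolerance=10)
print(exact, near)
```

With a tolerance of 0 this behaves like the exact comparison. With a tolerance of 10 the second pixel also counts, while the third is still rejected because its green channel is 20 away from the target.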

Summary/Discussion

Method 1: Using inRange. Strengths: High precision for known color ranges. Weaknesses: The fixed thresholds must be tuned by hand and tend to break down when lighting conditions vary.
Method 2: K-Means Clustering. Strengths: Effective for identifying dominant colors. Weaknesses: Computationally expensive at high resolutions.
Method 3: Color Histograms. Strengths: Good for analyzing color distribution. Weaknesses: Histograms discard spatial information and treat each channel separately, so visually different images can produce similar histograms.
Method 4: RGB Color Space Analysis. Strengths: Simple and intuitive. Weaknesses: Can be noisy as every unique color is considered.
Bonus Method 5: Pixel-Counting. Strengths: Very simple to implement. Weaknesses: Only practical for identifying exact color matches, not shades or ranges.