5 Best Ways to Perform Image Translation Using OpenCV in Python

💡 Problem Formulation: In digital image processing, performing image translation refers to shifting an image along the x and y axes. Given an input image, the objective is to translate the image by a specified number of pixels horizontally and/or vertically and produce an output image. For example, if we input an image of a cat and desire to move it 50 pixels to the right and 30 pixels down, this operation will yield a new image with the cat repositioned accordingly.

Method 1: Using warpAffine for Manual Translation

OpenCV’s warpAffine() function allows users to apply an affine transformation to an image. Translation is a type of affine transformation. You define a transformation matrix specifying the x and y translations, and warpAffine() applies this matrix to produce the translated image.

Here’s an example:

import cv2
import numpy as np

# Load the image
image = cv2.imread('cat.jpg')

# Define the translation matrix
M = np.float32([[1, 0, 50], [0, 1, 30]])

# Perform the translation
translated_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Save the result
cv2.imwrite('translated_cat.jpg', translated_image)

In the above code, the output will be an image named ‘translated_cat.jpg’ that has been translated 50 pixels right and 30 pixels down.

The code snippet first reads the original image ‘cat.jpg’, creates a transformation matrix that moves the image by the specified amounts in the x (50 pixels) and y (30 pixels) directions, applies the matrix using warpAffine(), and then saves the translated image.

Method 2: Translation with Imutils Library

The Python library Imutils includes convenience functions for image processing, including translation. Imutils’ translate() function makes image translation straightforward by encapsulating the creation of the translation matrix and use of warpAffine().

Here’s an example:

import cv2
import imutils

# Load the image
image = cv2.imread('cat.jpg')

# Perform the translation
translated_image = imutils.translate(image, 50, 30)

# Save the result
cv2.imwrite('translated_cat.jpg', translated_image)

Here the output will be a translated image just like in Method 1, but with less code.

After loading the ‘cat.jpg’ image, the translate() function from Imutils is called with the image and the x, y translations. It internally calls warpAffine() to produce the translated image which is then saved. This method is simpler and requires less code than Method 1.

Method 3: Custom Function for Image Translation

Creating a custom function for image translation allows for reusability and abstraction of the translation process. This function takes an image and the x and y translations as parameters and returns the translated image by using OpenCV’s warpAffine() function.

Here’s an example:

import cv2
import numpy as np

def translate_image(image, x, y):
    M = np.float32([[1, 0, x], [0, 1, y]])
    return cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Load the image
image = cv2.imread('cat.jpg')
translated_image = translate_image(image, 50, 30)
cv2.imwrite('translated_cat.jpg', translated_image)

Your output is the consistently translated image, obtained via a reusable function.

The code snippet defines a function translate_image() that creates a translation matrix for the given x and y values and applies it to the input image. It then loads an image, calls the translation function, and saves the output.

Method 4: Affine Transformation Function

The affine transformation includes translation, rotation, scaling and more. By using OpenCV’s getAffineTransform() and warpAffine() in tandem, you can create more complex transformations that include translation as one of the transformations.

Here’s an example:

import cv2
import numpy as np

# Load the image
image = cv2.imread('cat.jpg')

# Coordinates of triangle points for the affine transformation
pt1 = np.float32([[50, 50], [200, 50], [50, 200]])
pt2 = np.float32([[10, 100], [200, 50], [100, 250]])

# Affine transformation matrix
M = cv2.getAffineTransform(pt1, pt2)

# Apply affine transformation
translated_image = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

# Save the result
cv2.imwrite('translated_cat.jpg', translated_image)

The output is an image with more complex transformation including translation.

This code snippet demonstrates a more intricate translation by specifying points before and after the transformation. It uses these points with getAffineTransform() to create a transformation matrix which is then applied to the image using warpAffine().

Bonus One-Liner Method 5: Using numpy slicing for Simple Translation

For translation to the right or down, numpy slicing can be used as an extremely simplistic approach, albeit with limitations such as the inability to translate left or up and loss of parts of the image that are shifted out of frame.

Here’s an example:

import cv2
import numpy as np

# Load the image
image = cv2.imread('cat.jpg')

# Perform a simple right-down translation using slicing
translated_image = np.zeros_like(image)
translated_image[30:, 50:] = image[:-30, :-50]

# Save the result
cv2.imwrite('translated_cat.jpg', translated_image)

With this approach, the resulting image is translated right by 50 pixels and down by 30 pixels.

This code snippet utilizes numpy arrays’ powerful slicing to shift the image. This simple translation technique offsets the image within its own array, leaving the top-left corner filled with zeros (black) due to the translation, and crops the part of the image that shifts beyond the original dimensions.

Summary/Discussion

Method 1: warpAffine for Manual Translation. This traditional method gives full control over the translation process. However, it requires manual creation of the transformation matrix.
Method 2: Translation with Imutils Library. This higher-level abstraction simplifies the process and reduces the amount of code. It’s straightforward but abstracts the underlying details, which can reduce flexibility.
Method 3: Custom Function for Image Translation. A custom function promotes code reuse and abstraction, making translating multiple images more efficient. The downside is the initial effort required to define the function.
Method 4: Affine Transformation Function. Using affine transformations allows the combination of multiple transformations, including translation, for complex effects. This method is powerful but also more complex to set up and understand.
Bonus Method 5: Using numpy slicing. Super simple and doesn’t require OpenCV, but is limited in functionality and results in lost image data when translating, since it effectively crops the image to fit the new position.