Convert Python NumPy Array to Raster: 5 Effective Methods

💡 Problem Formulation: Converting a NumPy array to a raster image is a common task in Geographic Information Systems (GIS), remote sensing data analysis, and image processing. In this article, we tackle the problem of turning a NumPy array representing spatial or image data into a raster file format such as GeoTIFF or PNG. The input will be a two-dimensional or multi-dimensional NumPy array, and the desired output is a raster file that can be visualized or further processed in a variety of desktop GIS and remote sensing software.

Method 1: Using GDAL

The Geospatial Data Abstraction Library (GDAL) is a powerful library for reading and writing raster and vector geospatial data formats. Using the Python bindings for GDAL, we can convert a NumPy array to various raster formats. GDAL is widely used for its broad format support and georeferencing capabilities.

Here’s an example:

from osgeo import gdal, osr
import numpy as np

# Define the data array
array = np.random.rand(100, 100)

# Create a driver object
driver = gdal.GetDriverByName('GTiff')

# Create a single-band raster with the same dimensions as the array.
# The band is Float32, so the float64 values are cast to float32 on write.
out_raster = driver.Create('output.tif', array.shape[1], array.shape[0], 1, gdal.GDT_Float32)

# Write the array to the raster
out_band = out_raster.GetRasterBand(1)
out_band.WriteArray(array)

# Set spatial reference
srs = osr.SpatialReference()
srs.ImportFromEPSG(4326)  # This is WGS84
out_raster.SetProjection(srs.ExportToWkt())

# Save and close datasets
out_band.FlushCache()
out_raster = None

The output file output.tif will be a GeoTIFF containing the array data.

This code snippet creates a raster file from a NumPy array using GDAL. The random array stands in for real data and is written to a new GeoTIFF file. The spatial reference is set to the WGS84 coordinate system, a common requirement for geospatial raster data. The snippet does not set a geotransform, so the file records a coordinate system but not where the pixels sit in it.
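To place the pixels on the map, GDAL uses a six-coefficient geotransform, applied with SetGeoTransform. The sketch below shows the arithmetic behind it in plain Python. The helper names make_geotransform and pixel_to_map are hypothetical, and the origin (-180, 90) and 1-degree pixel size are assumed values chosen for illustration.

```python
# Sketch: how a GDAL-style geotransform maps pixel indices to map coordinates.
# The origin and pixel size used below are assumed values, not from real data.

def make_geotransform(x_origin, y_origin, pixel_width, pixel_height):
    """Return a north-up geotransform tuple in GDAL's coefficient order."""
    # (top-left x, pixel width, row rotation, top-left y, column rotation, -pixel height)
    return (x_origin, pixel_width, 0.0, y_origin, 0.0, -pixel_height)


def pixel_to_map(gt, col, row):
    """Map the top-left corner of pixel (col, row) to map coordinates (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


gt = make_geotransform(-180.0, 90.0, 1.0, 1.0)
top_left = pixel_to_map(gt, 0, 0)          # corner of the first pixel
bottom_right = pixel_to_map(gt, 100, 100)  # far corner of a 100 x 100 raster
print(top_left, bottom_right)

# With the dataset from the example above still open, the tuple would be applied as:
# out_raster.SetGeoTransform(gt)
```

The negative last coefficient reflects that row indices grow downward while y coordinates grow northward in a north-up raster.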

Method 2: Using rasterio

rasterio simplifies the process of working with geospatial raster data. Built on top of GDAL, it handles reading, writing, and transforming data in an idiomatic Python manner. It’s an excellent choice for those who want the power of GDAL with a more approachable API.

Here’s an example:

import rasterio
from rasterio.transform import from_origin
import numpy as np

# Create a simple 2D numpy array with random data
array = np.random.rand(100, 100)

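# Define the affine transform (upper-left corner at x=-180, y=90; pixels 1 unit wide and high)
# and the metadata describing the output file
transform = from_origin(-180, 90, 1, 1)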
metadata = {
    'driver': 'GTiff',
    'height': array.shape[0],
    'width': array.shape[1],
    'count': 1,
    'dtype': 'float64',
    'crs': 'EPSG:4326',  # WGS84 geographic coordinates
    'transform': transform
}

# Write to a new raster file
with rasterio.open('output_rasterio.tif', 'w', **metadata) as dst:
    dst.write(array, 1)

The output file output_rasterio.tif will be a GeoTIFF containing the array data.

Using the rasterio library, the code snippet writes our NumPy array to a GeoTIFF. The transform object ties pixel positions to map coordinates, and the metadata dictionary supplies the coordinate reference system, data type, and dimensions.
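For multi-band data, rasterio expects arrays shaped (bands, rows, cols), while most image libraries use (rows, cols, bands). The following sketch rearranges the axes and derives the metadata from the array. The helper build_profile, the sample array, and the output filename are hypothetical, and the rasterio write is left as a comment so that the block runs with NumPy alone.

```python
import numpy as np


def build_profile(bands_first, crs="EPSG:4326"):
    """Build rasterio-style metadata for an array shaped (bands, rows, cols)."""
    count, height, width = bands_first.shape
    return {
        "driver": "GTiff",
        "height": height,
        "width": width,
        "count": count,
        "dtype": str(bands_first.dtype),
        "crs": crs,
    }


# A three-band array in the (rows, cols, bands) layout used by most image libraries
rgb = np.random.rand(100, 120, 3).astype("float32")

# Move the band axis to the front to get (bands, rows, cols)
bands_first = np.moveaxis(rgb, -1, 0)
profile = build_profile(bands_first)
print(bands_first.shape, profile["count"], profile["dtype"])

# Writing would then follow the example above (requires rasterio):
# import rasterio
# from rasterio.transform import from_origin
# profile["transform"] = from_origin(-180, 90, 1, 1)
# with rasterio.open("output_multiband.tif", "w", **profile) as dst:
#     dst.write(bands_first)  # a 3D array writes all bands at once
```

Deriving count, height, width, and dtype from the array itself avoids mismatches between the metadata and the data being written.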

Method 3: Using Matplotlib

Matplotlib, although primarily a plotting library, can be used to save arrays as raster images. This approach is best suited for quick visualization purposes and non-georeferenced image outputs such as PNG.

Here’s an example:

import matplotlib.pyplot as plt
import numpy as np

# Create a numpy array
array = np.random.rand(100, 100)

# Plot the array and remove axes
plt.imshow(array, cmap='gray')
plt.axis('off')

# Save the figure
plt.savefig('output_matplotlib.png', bbox_inches='tight', pad_inches=0)

The output is a PNG file output_matplotlib.png showing a rendered view of the array.

This code snippet renders a NumPy array as a grayscale image using Matplotlib and saves it as a PNG file. Because savefig saves the figure rather than the array itself, the pixel dimensions of the PNG depend on the figure size and DPI, not on the array shape. This is a good technique for generating a simple visualization without the overhead of geospatial metadata.
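When the image should contain exactly one pixel per array element, Matplotlib's imsave function may be a better fit than plotting and calling savefig. The sketch below is a minimal illustration. The temporary-directory output path is an assumption made so the example does not depend on the working directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import numpy as np

array = np.random.rand(100, 100)

# Hypothetical output location chosen for this sketch
out_path = os.path.join(tempfile.gettempdir(), "output_imsave.png")

# imsave applies the colormap and writes the array directly,
# without creating a figure or axes
plt.imsave(out_path, array, cmap="gray")

# Read the file back to check that the pixel dimensions match the array
saved = plt.imread(out_path)
print(saved.shape[:2])
```

The saved file holds colormapped pixel values rather than the original floats, so it remains a visualization and not a data format.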

Method 4: Using Pillow (PIL)

Pillow, the maintained fork of the Python Imaging Library (PIL), can create images directly from arrays. Pillow works well when simplicity is preferred over geospatial accuracy and the target is a standard image format.

Here’s an example:

from PIL import Image
import numpy as np

# Create a random numpy array
array = (np.random.rand(100, 100) * 255).astype('uint8')

# Create an image object
image = Image.fromarray(array)

# Save the image
image.save('output_pillow.png')

The output file output_pillow.png will be a non-georeferenced PNG image of the array.

This code snippet converts a two-dimensional NumPy array into a grayscale image. The array is first scaled to the 0 to 255 byte range and cast to uint8, then converted to an image object, which is saved as a PNG file.
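Multiplying by 255 works here because the random values already lie between 0 and 1. For data in other ranges, a min-max rescale is one common option. The sketch below uses a hypothetical helper named to_uint8 and made-up sample values, and it assumes the input contains no NaN values.

```python
import numpy as np


def to_uint8(array):
    """Linearly rescale an array to 0..255 and convert it to uint8."""
    data = np.asarray(array, dtype="float64")
    lo, hi = data.min(), data.max()
    if hi == lo:
        # A constant array has no range to stretch; return all zeros
        return np.zeros(data.shape, dtype="uint8")
    scaled = (data - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype("uint8")


# Made-up sample values standing in for real measurements
elevation = np.array([[120.0, 340.0], [560.0, 780.0]])
scaled = to_uint8(elevation)
print(scaled)
```

The resulting array can be passed to Image.fromarray in the same way as in the example above. Note that rescaling discards the original units, so the image is suitable for viewing but not for measurement.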

Bonus One-Liner Method 5: Using NumPy Only

A NumPy array can be written to disk with minimal code using only NumPy’s built-in I/O. The result is a raw binary dump rather than an image format: it carries no header, metadata, or formatting.

Here’s an example:

import numpy as np

# Create a numpy array
array = (np.random.rand(100, 100) * 255).astype('uint8')

# Save the array as a raw binary file
array.tofile('output_numpy.raw')

The output is a raw binary file output_numpy.raw.

This one-liner saves the array’s bytes exactly as they sit in memory. The file records neither the shape nor the dtype, so whoever reads it must already know both. It is the most direct way to dump array data to disk.
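Reading a raw file back illustrates what the missing metadata means in practice. In this sketch the dtype and shape are supplied by hand, and the temporary-directory path is an assumption made for illustration.

```python
import os
import tempfile

import numpy as np

array = (np.random.rand(100, 100) * 255).astype("uint8")

# Hypothetical output location chosen for this sketch
raw_path = os.path.join(tempfile.gettempdir(), "output_numpy.raw")
array.tofile(raw_path)

# The file is a flat run of bytes; the reader must supply dtype and shape
restored = np.fromfile(raw_path, dtype="uint8").reshape(100, 100)
print(np.array_equal(array, restored))
```

If the shape and dtype need to travel with the data, np.save and np.load store them in the header of a .npy file.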

Summary/Discussion

Method 1: GDAL. Supports comprehensive georeferencing. Requires external GDAL library. Best for serious GIS users.
Method 2: rasterio. Simplified API over GDAL. Good balance of usability and features. Requires learning the library’s conventions.
Method 3: Matplotlib. Great for visualization. Not suitable for geospatial data. Perfect for quick image dumps.
Method 4: Pillow (PIL). Easy to use for standard image formats. Lacks support for spatial data. Ideal for image processing tasks.
Method 5: NumPy Only. Quickest and simplest method. Stores no metadata, so the shape and dtype must be tracked separately. Best for raw binary data exchange.