5 Best Ways to Save a NumPy Array to a MAT File in Python

💡 Problem Formulation: When working with numerical data in Python, it’s common to use NumPy arrays for efficient storage and manipulation. However, you may need to save these arrays in MATLAB’s binary format (a MAT file) for interoperability or for analysis within MATLAB. This article presents several ways to do so, assuming we have a NumPy array numpy_array and the goal is to save it to a file called 'data.mat' in MATLAB’s format.

Method 1: Using SciPy’s savemat Function

This method uses the savemat() function from the SciPy library. It lives in the scipy.io module and provides a straightforward way to save one or more NumPy arrays to a MAT file. The function takes the file name and a dictionary whose keys are the variable names you want to use within MATLAB and whose values are the arrays to store.

Here’s an example:

import numpy as np
from scipy.io import savemat

numpy_array = np.array([[1, 2], [3, 4]])
savemat('data.mat', {'array': numpy_array})

Output: A MAT file named 'data.mat' that contains the NumPy array is created in the current directory.

This code creates a NumPy array and saves it to a MAT file using the SciPy’s savemat() function. The array is stored with the name ‘array’ inside the MAT file, allowing it to be accessed by that variable name in MATLAB.
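Because savemat() takes a dictionary, several arrays can go into one file. The following is a minimal sketch, assuming SciPy is installed; the variable names "matrix" and "vector" and the temporary file location are illustrative choices, not part of the original example. It also reads the file back with scipy.io.loadmat() to check the stored shapes.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Two illustrative arrays; the names "matrix" and "vector" are arbitrary.
matrix = np.array([[1, 2], [3, 4]])
vector = np.arange(3.0)

# Write to a temporary directory so the sketch does not clutter the working directory.
path = os.path.join(tempfile.mkdtemp(), "data.mat")

# oned_as="column" stores the 1-D vector as a 3x1 column instead of the default 1x3 row.
# do_compression=True compresses the variables inside the file.
savemat(path, {"matrix": matrix, "vector": vector},
        oned_as="column", do_compression=True)

# Read the file back to check what was stored.
loaded = loadmat(path)
print(loaded["matrix"].shape)  # (2, 2)
print(loaded["vector"].shape)  # (3, 1)
```

Note that MAT files have no true 1-D arrays, which is why the vector comes back as a two-dimensional column.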

Method 2: Using the Mat4py Library

If you’re looking for a method that doesn’t rely on SciPy, the Mat4py library is an alternative. This library lets you create and read MAT files in pure Python, without any dependencies of its own. Its function mat4py.savemat() similarly accepts the file name and a dictionary, but the values must be plain Python types such as nested lists rather than NumPy arrays.

Here’s an example:

import numpy as np
import mat4py

numpy_array = np.array([[5, 6], [7, 8]], dtype=float)
mat4py.savemat('data.mat', {'array': numpy_array.tolist()})

Output: A MAT file named 'data.mat' containing the NumPy array's values, written from a nested list.

The Mat4py library requires the NumPy array to be converted to a nested list using the tolist() method before saving. This code snippet saves the list representation of the array to 'data.mat'.
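When a dictionary holds several NumPy values, converting each one by hand gets tedious. The helper below is a hypothetical sketch (to_plain is not part of Mat4py or NumPy) that recursively turns arrays and NumPy scalars into plain Python objects. It runs with NumPy alone; the final mat4py.savemat() call is left as a comment on the assumption that Mat4py may not be installed.

```python
import numpy as np


def to_plain(value):
    """Recursively convert NumPy arrays and scalars to plain Python objects."""
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, dict):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    return value


# Illustrative data: one 2-D array and one NumPy scalar.
data = {
    "array": np.array([[5, 6], [7, 8]], dtype=float),
    "scale": np.float64(0.5),
}
payload = to_plain(data)
print(payload)  # {'array': [[5.0, 6.0], [7.0, 8.0]], 'scale': 0.5}

# With Mat4py installed, the converted dictionary could then be written with:
# mat4py.savemat('data.mat', payload)
```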

Method 3: Using HDF5 Storage Format

MATLAB’s version 7.3 MAT-file format is based on HDF5, and MATLAB can also read plain HDF5 files natively. NumPy arrays can therefore be saved in HDF5 format using the h5py library. While the result is not strictly a MAT file, MATLAB can read it with its HDF5 functions such as h5read. This method is beneficial for larger datasets and provides advanced features such as compression.

Here’s an example:

import h5py
import numpy as np

numpy_array = np.random.rand(1000, 1000)
with h5py.File('data.h5', 'w') as file:
    file.create_dataset('array', data=numpy_array)

Output: An HDF5 file named 'data.h5' that contains the NumPy array.

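This code snippet uses the h5py library to save a NumPy array into an HDF5 file. In MATLAB, read it with h5read('data.h5', '/array') rather than load, because the file is plain HDF5 and lacks the MAT-file header. Keep in mind that MATLAB reports HDF5 dimensions in reverse order, so the array appears transposed unless you transpose it on one side. The format handles multidimensional arrays efficiently and is a good choice for large datasets.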
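The compression feature and the dimension-order difference can be sketched as follows, assuming h5py is installed. The small 3x4 array, the gzip level, and the temporary path are illustrative choices. Writing the transpose is one way to make MATLAB's h5read return the array in its original orientation.

```python
import os
import tempfile

import h5py
import numpy as np

numpy_array = np.arange(12, dtype=np.float64).reshape(3, 4)
path = os.path.join(tempfile.mkdtemp(), "data.h5")

with h5py.File(path, "w") as file:
    # Store the transpose so that MATLAB, which reverses the HDF5
    # dimension order, sees a 3x4 matrix again.
    file.create_dataset("array", data=numpy_array.T,
                        compression="gzip", compression_opts=4)

with h5py.File(path, "r") as file:
    stored = file["array"]
    stored_shape = stored.shape              # shape as stored on disk
    stored_compression = stored.compression  # name of the compression filter
    restored = stored[()].T                  # undo the transpose in Python

print(stored_shape)        # (4, 3)
print(stored_compression)  # gzip
```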

Method 4: Manual Conversion and Saving with NumPy I/O

It’s also possible to write a .mat file by hand, using NumPy only to produce the raw bytes. This process involves arranging the array in a MATLAB-compatible binary layout (column-major data preceded by a header describing its type, shape, and name) and then writing the result to the file as bytes. Note that this method requires deep knowledge of MATLAB’s MAT file format and is not recommended for beginners.

Here’s an example:

# This is a mock code snippet to illustrate the concept, not a functional example.
import numpy as np

# Function to manually convert the NumPy array:
def convert_to_mat(numpy_array):
    # Convert numpy_array to MATLAB's MX format (conceptual).
    # Write array as bytes into the .mat file (conceptual).
    pass

numpy_array = np.eye(3)
convert_to_mat(numpy_array)

Output: A MAT file is ideally created, but this is a conceptual example.

This code snippet is purely conceptual and intended to illustrate the idea of manual conversion. Writing a functional implementation would involve detailed understanding of the MAT file structure and is beyond common use cases.
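For a sense of what a working version involves, here is a minimal sketch that targets the older and much simpler Level 4 MAT-file layout rather than the current Level 5 format. It assumes the Level 4 layout as described in MathWorks' MAT-file format documentation: a 20-byte header of five 32-bit integers, the null-terminated variable name, then the data in column-major order. The function name write_mat_v4 is hypothetical, and the sketch handles only real 2-D double matrices.

```python
import os
import struct
import tempfile

import numpy as np


def write_mat_v4(path, name, array):
    """Write one real 2-D array as a Level 4 MAT-file (little-endian double)."""
    matrix = np.atleast_2d(np.asarray(array, dtype="<f8"))
    if matrix.ndim != 2:
        raise ValueError("Level 4 MAT-files hold 2-D matrices only")
    rows, cols = matrix.shape
    name_bytes = name.encode("ascii") + b"\x00"  # name is stored null-terminated
    # Header fields: type (0 = little-endian, double, full matrix), rows,
    # columns, imaginary flag (0 = real), name length including terminator.
    header = struct.pack("<5i", 0, rows, cols, 0, len(name_bytes))
    with open(path, "wb") as handle:
        handle.write(header)
        handle.write(name_bytes)
        handle.write(matrix.tobytes(order="F"))  # column-major element order


path = os.path.join(tempfile.mkdtemp(), "data_v4.mat")
write_mat_v4(path, "array", np.eye(3))
print(os.path.getsize(path))  # 98 bytes: 20 (header) + 6 (name) + 72 (data)
```

Even this simplified variant shows why the library-based methods are preferable: complex values, other data types, and higher-dimensional arrays each need extra handling.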

Bonus One-Liner Method 5: Direct MATLAB Engine Execution

The MATLAB Engine API for Python provides an interface for running MATLAB code directly from Python scripts. Once the engine is installed (it requires a local MATLAB installation), you can pass the NumPy array to MATLAB’s workspace and save it using MATLAB’s save() function.

Here’s an example:

import matlab.engine
import numpy as np

numpy_array = np.eye(3)
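eng = matlab.engine.start_matlab()
# matlab.double expects a nested list, so convert the array first.
eng.workspace['my_array'] = matlab.double(numpy_array.tolist())
eng.save('data.mat', 'my_array', nargout=0)
eng.quit()  # shut down the MATLAB session when finished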

Output: A MAT file named 'data.mat' with the variable 'my_array' saved.

By using the MATLAB Engine, the NumPy array numpy_array is converted to a MATLAB double type, assigned to the variable 'my_array' in the MATLAB workspace, and then saved to a MAT file. This approach is very straightforward for users who have MATLAB installed.

Summary/Discussion

  • Method 1: Using SciPy’s savemat. Straightforward, widely used, requires SciPy. Most direct method for .mat file generation.
  • Method 2: Using the Mat4py Library. Pure Python approach, good for non-SciPy users, requires manual array conversion to lists. Useful for simpler integration without extra dependencies.
  • Method 3: Using HDF5 Storage Format. Ideal for large data sets, supports advanced features, not a .mat file but MATLAB compatible. Best for datasets where performance is a concern.
  • Method 4: Manual Conversion and Saving. Provides deep control, not user-friendly, requires in-depth understanding. An advanced method for those needing fine-grained control over the file format.
  • Bonus One-Liner Method 5: Direct MATLAB Engine Execution. Seamless integration with MATLAB, requires MATLAB installation, simplest for existing MATLAB users. Optimal for those working simultaneously with Python and MATLAB.