df, the goal is to import it into MATLAB, resulting in an equivalent MATLAB table or structure.
Method 1: Direct Transfer Using MATLAB’s Engine for Python
MATLAB’s Engine API for Python allows for direct execution of MATLAB code from Python. This method involves initiating a MATLAB session from within Python, and then pushing the DataFrame directly to the MATLAB workspace. It requires the MATLAB Engine API for Python to be installed.
Here’s an example:
import pandas as pd
import matlab.engine
# Start MATLAB engine
eng = matlab.engine.start_matlab()
# Create a pandas DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Convert DataFrame to MATLAB data structure and assign to variable in MATLAB
eng.workspace['my_matlab_table'] = matlab.double(df.values.tolist())
eng.eval('my_matlab_table = array2table(my_matlab_table);', nargout=0)
eng.eval('my_matlab_table.Properties.VariableNames = {\'A\', \'B\'};', nargout=0)
# Close MATLAB engine
eng.quit()
Output in MATLAB: A MATLAB table with columns A and B, populated with the values from the pandas DataFrame.
This code starts a MATLAB engine session and converts the pandas DataFrame to a list of lists, which is passed to MATLAB as a double array. The array is then converted into a table and its columns are named to match the DataFrame. Because matlab.double holds only numeric data, the code works as written only for all-numeric DataFrames.
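The conversion step above can be separated from the MATLAB session so that it can be checked on its own. The following is a minimal sketch, assuming an all-numeric DataFrame; the helper name dataframe_to_engine_payload is hypothetical and not part of pandas or the MATLAB Engine API.

```python
import pandas as pd

def dataframe_to_engine_payload(df):
    """Return (column_names, rows) as plain Python types for a numeric DataFrame.

    Hypothetical helper for illustration; astype(float) raises an error
    if a column cannot be converted to floating point.
    """
    numeric = df.astype(float)
    names = [str(col) for col in numeric.columns]
    rows = numeric.to_numpy().tolist()  # list of row lists
    return names, rows

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
names, rows = dataframe_to_engine_payload(df)
print(names)
print(rows)

# With a MATLAB session available, the payload could be pushed like this
# (not executed here, since it needs a MATLAB installation):
# import matlab.engine
# eng = matlab.engine.start_matlab()
# eng.workspace['data'] = matlab.double(rows)
# eng.workspace['names'] = names  # a Python list of str becomes a cell array
# eng.eval("my_matlab_table = array2table(data, 'VariableNames', names);", nargout=0)
# eng.quit()
```

Passing the column names as a variable avoids hard-coding them inside the eval string, which helps when the DataFrame has more than a couple of columns.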
Method 2: Using the scipy.io.savemat Function
The scipy.io.savemat function lets you save Python objects as MATLAB .mat files. This method writes the DataFrame’s data to disk so that it can be loaded into MATLAB later, either manually or from code. It’s a good fit when a MATLAB session isn’t directly accessible from Python.
Here’s an example:
import pandas as pd
from scipy.io import savemat
# Create a pandas DataFrame with some example data
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Convert the DataFrame to a dictionary with variable names as keys
mdict = {'my_matlab_table': df.to_numpy()}
# Save the dictionary as a .mat file
savemat("dataFrame.mat", mdict)
Output: A ‘dataFrame.mat’ file containing the matrix my_matlab_table that can be loaded into MATLAB.
After turning the DataFrame into a NumPy array, the array is placed into a dictionary with the desired MATLAB variable name as the key. The savemat function from scipy.io then saves this dictionary as a MATLAB file, which can later be loaded in MATLAB. Note that df.to_numpy() keeps only the values, so the column names A and B are not stored in the file.
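If the column names should travel with the data, one option is to save a dictionary of columns, which savemat writes as a MATLAB struct. This is a sketch rather than the only way to do it; the helper name save_columns_as_struct, the variable name my_matlab_struct, and the file name are assumptions made for illustration. Column names must be valid MATLAB field names for this to work.

```python
import os
import tempfile

import pandas as pd
from scipy.io import loadmat, savemat

def save_columns_as_struct(df, path, var_name="my_matlab_struct"):
    """Save each DataFrame column as a field of one MATLAB struct (hypothetical helper)."""
    columns = {str(name): df[name].to_numpy() for name in df.columns}
    # oned_as="column" stores 1-D arrays as column vectors
    savemat(path, {var_name: columns}, oned_as="column")

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataFrame_struct.mat")
    save_columns_as_struct(df, path)
    # Read the file back in Python to check what was written
    loaded = loadmat(path, simplify_cells=True)["my_matlab_struct"]

print(sorted(loaded.keys()))

# In MATLAB, the struct could then be turned into a table, for example:
# S = load('dataFrame_struct.mat');
# T = struct2table(S.my_matlab_struct);
```

Reading the file back with loadmat is only a sanity check on the Python side; the MATLAB lines in the comments are the intended consumer.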
Method 3: Using HDF5 Storage Format
The Hierarchical Data Format version 5 (HDF5) is a versatile data model that can store complex data relationships and is accessible from both Python and MATLAB. Pandas provides support for HDF5 via the HDFStore class. Once saved in this format, the data can be easily read in MATLAB.
Here’s an example:
import pandas as pd
# Create pandas DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Save it to an HDF5 file
df.to_hdf('data.h5', key='df', mode='w')
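# In MATLAB, inspect the file layout, then read the values with h5read:
# h5disp('data.h5')
# S = h5read('data.h5', '/df/block0_values')
Output: An HDF5 file ‘data.h5’ that contains the DataFrame’s data, which can be accessed in MATLAB.
The DataFrame is saved as an HDF5 file directly from pandas (this requires the PyTables package), without first converting it to a dictionary. pandas writes its own layout inside the file: with the default fixed format, the key df becomes a group whose datasets hold the values and the axis labels, so MATLAB sees individual datasets rather than a ready-made table. The path /df/block0_values is what the fixed format typically produces for a single-dtype DataFrame; h5disp shows the actual layout of your file. MATLAB’s h5read is used here because the older hdf5read function is no longer recommended. Since MATLAB stores arrays in column-major order, the values may come back transposed. HDF5 is convenient for large datasets.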
Method 4: CSV File Interchange
Exporting a pandas DataFrame to a CSV file is one of the simplest and most reliable methods of transferring data to MATLAB. Both pandas and MATLAB have robust CSV I/O functions, making it easy to export data from pandas and then import it into MATLAB.
Here’s an example:
import pandas as pd
# Create pandas DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Save DataFrame to CSV
df.to_csv('dataframe.csv', index=False)
# In MATLAB, you would use:
# T = readtable('dataframe.csv')
Output: A CSV file ‘dataframe.csv’ that can be read into MATLAB as a table.
The DataFrame is converted to a CSV file, ignoring index labels to keep the data clean. MATLAB reads this CSV seamlessly into a table structure with the readtable function. This is a cross-platform and easily understood method for data exchange.
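Dates and missing values are where CSV interchange usually needs extra care. The sketch below writes timestamps in one explicit format and missing numbers as NaN; the column names when and value are made up for illustration, and the text is written to an in-memory buffer so the result can be inspected without touching the disk.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "when": pd.to_datetime(["2024-01-01 12:00:00", "2024-01-02 13:30:00"]),
    "value": [1.5, float("nan")],
})

buffer = io.StringIO()
df.to_csv(
    buffer,
    index=False,
    date_format="%Y-%m-%d %H:%M:%S",  # one unambiguous timestamp format
    na_rep="NaN",                     # write missing numbers as the text NaN
)
csv_text = buffer.getvalue()
print(csv_text)

# In practice, pass a file name instead of the buffer, then in MATLAB:
# T = readtable('dataframe.csv')
```

Whether readtable detects the first column as a datetime depends on its import options, so checking the variable types of T after import is a sensible step.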
Bonus One-Liner Method 5: Convert to JSON String and Load in MATLAB
The JSON format is lightweight and widely used for data interchange. Pandas can export a DataFrame to a JSON-formatted string, which MATLAB can interpret and convert into its own data structure using the jsondecode function.
Here’s an example:
import pandas as pd
# Create a pandas DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Convert the DataFrame to a JSON string
json_str = df.to_json()
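# In MATLAB, once the string is available there (for example, saved to a file and read with fileread):
# data = jsondecode(json_str)
Output: A JSON-formatted string that can be decoded in MATLAB into an equivalent data structure.
In Python, the DataFrame is converted into a JSON string using the DataFrame’s to_json method. The string then has to be handed to MATLAB, for example through a file, where jsondecode converts it into a struct (not a table). With the default orientation, the JSON is nested by column and then by index label; passing orient='records' yields one object per row, which is usually easier to turn into a table. This method is quick for small datasets and works across platforms.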
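As a sketch of the row-oriented variant, the example below uses orient="records" and parses the string back with Python’s json module to show its shape. The file name in the comments is hypothetical, and the MATLAB lines are an assumption about how the result might be consumed rather than a tested recipe.

```python
import json

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# One JSON object per row instead of the default column-oriented nesting
json_str = df.to_json(orient="records")
records = json.loads(json_str)
print(json_str)

# To hand the string to MATLAB, it could be written to a file:
# with open("dataframe.json", "w") as f:
#     f.write(json_str)
# and then, in MATLAB:
# data = jsondecode(fileread('dataframe.json'));
# T = struct2table(data);
```

An array of objects with identical fields decodes to a struct array in MATLAB, which is why the row-oriented form maps more naturally onto a table.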
Summary/Discussion
- Method 1: Direct Transfer Using MATLAB’s Engine for Python. Suitable for real-time data exchange. Requires a local MATLAB installation and the MATLAB Engine API for Python.
- Method 2: Using the scipy.io.savemat Function. Good for offline transfers; doesn’t require a live MATLAB session. The .mat file must be loaded on the MATLAB side, and column names are not kept when only the NumPy array is saved.
- Method 3: Using HDF5 Storage Format. Ideal for large, complex datasets. Requires understanding of the HDF5 format in both Python and MATLAB, including the layout pandas writes inside the file.
- Method 4: CSV File Interchange. Most reliable for simple data tables. Readable by humans and machines, but accurate typing (e.g., for dates) may require extra attention in MATLAB.
- Bonus Method 5: Convert to JSON String and Load in MATLAB. Fast for smaller datasets. Limited by JSON’s ability to represent complex or very large datasets.
