5 Best Ways to Convert Python Bytes to Excel

πŸ’‘ Problem Formulation: Converting Python bytes objects to Excel format is a common task when dealing with binary Excel files read in from a source such as a web service, file system, or database blob. Input typically consists of bytes representing an Excel file, and the desired output is a working Excel file that can be manipulated, analyzed, and saved in the .xlsx or .xls format.

Method 1: Using openpyxl for .xlsx Files

An efficient method to handle Excel files in Python is to use the openpyxl library, which is specifically designed to work with .xlsx files. It allows you to load a bytes object directly into an openpyxl workbook, which can then be edited and saved as an Excel file.

Here’s an example:

from openpyxl import load_workbook
from io import BytesIO

excel_bytes = b'...'  # Your bytes object here
wb = load_workbook(filename=BytesIO(excel_bytes))
wb.save("output.xlsx")

The output of this code will be an Excel file named ‘output.xlsx’ generated in the current working directory.

This snippet begins by importing load_workbook from openpyxl and BytesIO from io, which is used to convert the bytes object into a file-like object that openpyxl can understand. The wb.save method is then called to save the workbook as an actual .xlsx file.

Method 2: Using xlrd and xlwt for .xls Files

For handling older Excel .xls files, the combination of xlrd and xlwt libraries is a preferred choice. xlrd allows for reading in the binary Excel file and xlwt can be used to write it out after manipulation.

Here’s an example:

import xlrd
import xlwt
from io import BytesIO

excel_bytes = b'...'  # Your bytes object here
book = xlrd.open_workbook(file_contents=excel_bytes)
wb = xlwt.Workbook()

# Assume you only want to copy the first sheet
sheet = book.sheet_by_index(0)
ws = wb.add_sheet(sheet.name)

for row in range(sheet.nrows):
    for col in range(sheet.ncols):
        ws.write(row, col, sheet.cell_value(row, col))

wb.save("output.xls")

The output will be an Excel file named ‘output.xls’ in the .xls format.

This code uses xlrd.open_workbook with the file_contents parameter to read the bytes. It then loops through the rows and columns of the first sheet, copying each cell’s value to a new sheet created with xlwt. Finally, it saves the workbook in the .xls format using wb.save.

Method 3: Using pandas for Both .xlsx and .xls Files

The pandas library can be used to convert a bytes object containing an Excel file into a DataFrame and then save it back to an Excel file. This method allows for quick handling of both .xlsx and .xls file formats and provides a high-level interface for manipulating the data.

Here’s an example:

import pandas as pd
from io import BytesIO

excel_bytes = b'...'  # Your bytes object here
df = pd.read_excel(BytesIO(excel_bytes))

# Manipulate your DataFrame as needed here

df.to_excel("output.xlsx", index=False)

The output would be an ‘output.xlsx’ file containing the DataFrame’s contents.

The pandas function read_excel accepts a file-like object, which is provided by wrapping the bytes in a BytesIO. After optional manipulation of the DataFrame, df.to_excel is used to save the data back to an Excel file.

Method 4: Using pyexcel for a Format-Agnostic Approach

For a more format-agnostic approach, the pyexcel library can be used. It provides a single API to read, manipulate, and write data in different Excel file formats. This method abstracts away the details of which library to use for a particular file format.

Here’s an example:

import pyexcel as p

excel_bytes = b'...'  # Your bytes object here
sheet = p.get_sheet(file_type='xlsx', file_content=excel_bytes)

# Manipulate your sheet as needed here

sheet.save_as("output.xlsx")

The output is an Excel file by the name ‘output.xlsx’.

The pyexcel.get_sheet function detects the file type and reads the bytes content. The resulting sheet object can be easily manipulated and then saved into an Excel file using sheet.save_as.

Bonus One-Liner Method 5: Quick Save with openpyxl

For those looking for a quick one-liner to save a bytes object as an Excel file using openpyxl, this method comes in handy when no data manipulation is needed.

Here’s an example:

from openpyxl import load_workbook
from io import BytesIO

# A one-liner to load and immediately save the bytes object as an Excel file
load_workbook(filename=BytesIO(b'...')).save("output.xlsx")

Output: An Excel file ‘output.xlsx’.

This concise code does everything in a single line by chaining method calls. load_workbook loads the bytes object as a workbook, and .save saves it directly to a file called ‘output.xlsx’.

Summary/Discussion

  • Method 1: openpyxl. Best for .xlsx files. It does not support the older .xls format.
  • Method 2: xlrd and xlwt. Tailored for .xls files. It does not directly support .xlsx files and might require additional steps for complex Excel files.
  • Method 3: pandas. Versatile and powerful data manipulation. Can be overkill for simple tasks and requires pandas installed.
  • Method 4: pyexcel. Format-agnostic and straightforward. It may not offer as much control over specific Excel features as the other libraries.
  • Method 5: openpyxl one-liner. Quick and convenient. Not suitable when data needs to be manipulated before saving.