5 Best Ways to Convert CSV to XLSX in Python

💡 Problem Formulation: You have a CSV file containing data that you wish to convert into XLSX (Excel) format, a typical task in data analysis and reporting. The input is a standard CSV file with comma-separated values. The desired output is an XLSX file that preserves the tabular structure and is ready for further data manipulation or visual presentation.

Method 1: Using Pandas

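Pandas is a powerful Python data analysis toolkit that simplifies data manipulation. This method reads the CSV file into a DataFrame and then exports that DataFrame to an XLSX file. It infers column types automatically and gives fine control over the Excel output through parameters such as sheet_name. Note that the whole file is loaded into memory, and that writing XLSX files requires an Excel engine such as openpyxl or XlsxWriter to be installed alongside Pandas.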

Here’s an example:

import pandas as pd

# Load the CSV data into a DataFrame
df = pd.read_csv('data.csv')

# Save the DataFrame to an Excel file
df.to_excel('data.xlsx', index=False)

The output is a file named ‘data.xlsx’ containing the organized data from ‘data.csv’.

This snippet first loads the CSV data into a DataFrame object using pd.read_csv(). Then, it writes the DataFrame to an Excel file using df.to_excel(), omitting the index column for a cleaner spreadsheet.
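Type inference is convenient, but it can also change data: an identifier such as 007 would be read as the number 7. The following is a minimal sketch, not the only way to handle this, that reads every column as text and names the output sheet. The function name csv_to_xlsx_as_text and the default sheet name "Data" are illustrative assumptions, and the sketch assumes openpyxl is installed as the Excel engine.

```python
import pandas as pd


def csv_to_xlsx_as_text(csv_path, xlsx_path, sheet_name="Data"):
    """Convert a CSV file to XLSX while keeping every value as text.

    Hypothetical helper for illustration; the name is not part of Pandas.
    """
    # dtype=str disables type inference, so "007" stays "007"
    df = pd.read_csv(csv_path, dtype=str)

    # sheet_name controls the worksheet tab label in the output file
    df.to_excel(xlsx_path, index=False, sheet_name=sheet_name)
```

Calling csv_to_xlsx_as_text('data.csv', 'data.xlsx') would perform the same conversion as above, at the cost of storing numbers as text in the spreadsheet.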

Method 2: Using openpyxl and csv

The openpyxl library allows you to directly work with Excel files. Paired with the csv module, you can read a CSV file row by row and write it into an Excel file. This method is suited for those who prefer working closer to the actual Excel file structure.

Here’s an example:

import csv
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# newline='' lets the csv module handle line endings itself
with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        ws.append(row)

wb.save('data.xlsx')

The output is ‘data.xlsx’, which contains the CSV data.

This code creates a new Workbook object, reads the CSV file, appends each row to the active worksheet, and saves the workbook as an XLSX file.
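One detail worth knowing: csv.reader returns every value as a string, so numbers written this way end up as text cells in Excel. The sketch below shows one possible way to convert numeric-looking values before writing them. The helper names coerce_cell and typed_rows are hypothetical, UTF-8 encoding is an assumption, and the conversion is a rough heuristic: values such as "007" become 7 and "nan" becomes a float, which may not be what you want for identifiers.

```python
import csv


def coerce_cell(value):
    """Return an int or float if the text parses as one, else the original string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def typed_rows(csv_path, encoding="utf-8"):
    """Yield each CSV row as a list with numeric-looking cells converted."""
    # newline="" lets the csv module handle line endings itself
    with open(csv_path, newline="", encoding=encoding) as f:
        for row in csv.reader(f):
            yield [coerce_cell(cell) for cell in row]
```

In the openpyxl example, looping over typed_rows('data.csv') and passing each row to ws.append() would store numbers as numeric cells instead of text.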

Method 3: Using XlsxWriter

XlsxWriter is a Python module for writing files in the XLSX file format. It can be used to write text, numbers, formulas, and hyperlinks to multiple worksheets, and it supports features such as formatting.

Here’s an example:

import csv
import xlsxwriter

# Create a workbook and add a worksheet
workbook = xlsxwriter.Workbook('data.xlsx')
worksheet = workbook.add_worksheet()

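# Open the CSV file (newline='' lets the csv module handle line endings)
with open('data.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    # Write each value to the cell at row r, column c
    for r, row in enumerate(reader):
        for c, col in enumerate(row):
            worksheet.write(r, c, col)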

# Close the XLSX file
workbook.close()

The output is ‘data.xlsx’, containing the data from ‘data.csv’.

In this example, xlsxwriter is used to create a new XLSX file. The CSV file is read row by row, and each cell is written into the Excel file with the worksheet.write() method.
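Since formatting is XlsxWriter's main selling point, here is a hedged sketch of how the conversion might be extended with a bold, frozen header row. It also passes the strings_to_numbers workbook option so that numeric-looking CSV values are stored as numbers rather than text. The function name csv_to_formatted_xlsx is hypothetical, and the sketch assumes the first CSV row is a header and the file is UTF-8 encoded.

```python
import csv

import xlsxwriter


def csv_to_formatted_xlsx(csv_path, xlsx_path):
    """Convert a CSV file to XLSX with a bold, frozen header row (illustrative helper)."""
    # strings_to_numbers makes write() convert numeric-looking strings to numbers
    workbook = xlsxwriter.Workbook(xlsx_path, {"strings_to_numbers": True})
    worksheet = workbook.add_worksheet()
    header_format = workbook.add_format({"bold": True})

    with open(csv_path, newline="", encoding="utf-8") as csvfile:
        for r, row in enumerate(csv.reader(csvfile)):
            if r == 0:
                # Assumption: the first row holds the column names
                worksheet.write_row(r, 0, row, header_format)
            else:
                worksheet.write_row(r, 0, row)

    # Keep the header row visible while scrolling
    worksheet.freeze_panes(1, 0)
    workbook.close()
```

Here write_row() replaces the inner loop by writing a whole list of values starting at the given cell.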

Method 4: Using pyexcelerate

PyExcelerate is a Python library optimized for writing large datasets to Excel files quickly. This approach is a good fit for performance-critical applications.

Here’s an example:

from pyexcelerate import Workbook
import csv

# Read the CSV file into a list of lists (all values are read as strings)
with open('data.csv', 'r', newline='') as f:
    data = list(csv.reader(f))

# Write data to Excel
wb = Workbook()
ws = wb.new_sheet("Sheet 1", data=data)
wb.save('data.xlsx')

The output is an XLSX file called ‘data.xlsx’ populated with the CSV data.

This snippet uses the csv module to read the entire CSV into memory as a list of lists, which pyexcelerate then writes to a new sheet in a single call. Because all rows are held in memory at once, very large files need enough RAM to hold them.
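For truly large inputs, keep in mind that a single XLSX worksheet holds at most 1,048,576 rows. The sketch below, using only the standard library, shows one way to split a CSV into chunks that each fit on one sheet. The function name read_in_chunks is hypothetical, UTF-8 encoding is an assumption, and the header row is not repeated on later chunks.

```python
import csv
from itertools import islice

EXCEL_MAX_ROWS = 1_048_576  # row limit of a single XLSX worksheet


def read_in_chunks(csv_path, max_rows=EXCEL_MAX_ROWS):
    """Yield lists of at most max_rows CSV rows (illustrative helper)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        while True:
            # Take the next max_rows rows from the reader
            chunk = list(islice(reader, max_rows))
            if not chunk:
                break
            yield chunk
```

Each chunk could then be passed to its own sheet, for example with wb.new_sheet(f"Sheet {i}", data=chunk) inside an enumerate() loop over read_in_chunks('data.csv', ...).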

Bonus One-Liner Method 5: Using Pandas

For Pythonistas looking for a succinct and effective one-liner to perform the conversion, Pandas offers a compact solution.

Here’s an example:

import pandas as pd

pd.read_csv('data.csv').to_excel('data.xlsx', index=False)

The output is an XLSX file named ‘data.xlsx’ with data from ‘data.csv’.

Apart from the import, this single line of code is a condensed version of Method 1. It chains the two calls and skips the intermediate variable, which suits quick scripts and minimal codebases.

Summary/Discussion

  • Method 1: Pandas. Ideal for data analysis tasks. Supports a multitude of features for CSV parsing and Excel export. May be overkill for simple tasks.
  • Method 2: openpyxl with csv. Provides detailed control over the Excel file creation process. Slightly lower level, good for customization. More verbose than other methods.
  • Method 3: xlsxwriter. Well suited to complex Excel file creation with formatting and additional features. May be slower than pyexcelerate on very large datasets.
  • Method 4: pyexcelerate. Best suited for performance-sensitive applications with large volumes of data. Lacks some advanced features available in other libraries.
  • Method 5: One-liner Pandas. Quick and easy for simple scripts. Leaves little room for customization during the conversion.