5 Best Ways to Convert CSV to XLS in Python

πŸ’‘ Problem Formulation: Python developers often need to convert data between different formats for analysis, sharing, or data manipulation purposes. Specifically, converting from CSV (Comma Separated Values) to XLS (Excel Spreadsheet) is a common task. A user might have a file ‘data.csv’ with rows of comma-separated values that they want to convert to ‘data.xls’ to leverage Excel’s formatting and formula features.

Method 1: Using pandas

Pandas is a powerful data analysis and manipulation library in Python. It allows easy reading of a CSV file into a DataFrame and then exporting that DataFrame to an XLS file. This method is efficient and convenient for those already using pandas for data analysis.

Here’s an example:

import pandas as pd

df = pd.read_csv('data.csv')
df.to_excel('data.xls', index=False)

Output: A new file ‘data.xls’ is created with the content of ‘data.csv’.

This snippet uses pandas to read ‘data.csv’ into a DataFrame object and then saves it to an Excel file ‘data.xls’ without the DataFrame’s index. It’s a direct and straightforward way to perform the conversion.

Method 2: Using openpyxl and csv modules

Openpyxl is a library to read/write Excel 2010 xlsx/xlsm files. You can read a CSV file with Python’s built-in csv module and write the contents to an Excel file with openpyxl. This is a good choice for more control over the Excel file creation.

Here’s an example:

import csv
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

with open('data.csv', 'r') as f:
    for row in csv.reader(f):
        ws.append(row)

wb.save('data.xlsx')

Output: A file ‘data.xlsx’ is created which contains the data from ‘data.csv’.

This code uses the csv module to read the CSV file and openpyxl to create and populate an Excel workbook, then saves it as ‘data.xlsx’. The csv module does the reading line by line, and openpyxl appends each row to the workbook.

Method 3: Using xlwt

xlwt is a library for writing data and formatting information to older XLS files (Excel 97-2003 format). It’s suitable when you need compatibility with older versions of Excel.

Here’s an example:

import xlwt
import csv

wb = xlwt.Workbook()
ws = wb.add_sheet('Sheet1')

with open('data.csv', 'rt') as f:
    reader = csv.reader(f)
    for r_idx, row in enumerate(reader):
        for c_idx, val in enumerate(row):
            ws.write(r_idx, c_idx, val)

wb.save('data.xls')

Output: An ‘data.xls’ file compatible with Excel 97-2003 is created with the CSV data.

The xlwt library is used here to create a new spreadsheet, then it iterates through the rows of the CSV file and writes each value into the correct cell location in this spreadsheet before saving it as ‘data.xls’.

Method 4: Using pyexcelerate

Pyexcelerate is a library optimized for speed which provides the capability of writing to XLSX files. It is particularly useful when dealing with large data sets that need to be written to Excel.

Here’s an example:

from pyexcelerate import Workbook
import csv

data = []
with open('data.csv', 'r') as f:
    for row in csv.reader(f):
        data.append(row)

wb = Workbook()
ws = wb.new_sheet('Sheet1', data=data)
wb.save('data.xlsx')

Output: A ‘data.xlsx’ is generated, which is optimized for performance with large datasets.

The code reads the data from ‘data.csv’ using the csv module, stores it in a list, and then creates a new sheet in an Excel workbook with this data using pyexcelerate. Afterward, the workbook is saved as ‘data.xlsx’.

Bonus One-Liner Method 5: Using Python’s shell command execution

This approach leverages a one-liner shell command to convert CSV to XLS using the in2csv tool from the csvkit, combined with Python’s subprocess module for execution. However, this method requires csvkit to be installed on the system.

Here’s an example:

import subprocess
subprocess.run(["in2csv", "data.csv", "|", "ssconvert", "-", "data.xls"])

Output: A new file ‘data.xls’ is created from ‘data.csv’ using command-line tools.

This one-liner uses the subprocess module to run in2csv, converting ‘data.csv’ to an Excel-readable format and piping it into ssconvert to generate ‘data.xls’.

Summary/Discussion

  • Method 1: Using pandas. Best for those using pandas in data-centric projects. Strengths: Very simple, part of the broader data manipulation workflow. Weaknesses: Requires pandas which might be heavy for simple tasks.
  • Method 2: Using openpyxl and csv modules. Good for detailed control over Excel file output. Strengths: More control over Excel features. Weaknesses: Requires openpyxl, slightly more complex.
  • Method 3: Using xlwt. Suitable for legacy Excel file format support. Strengths: Compatibility with older files. Weaknesses: Limited to .xls, not .xlsx format.
  • Method 4: Using pyexcelerate. Optimal for performance with large datasets. Strengths: Fast performance for large datasets. Weaknesses: Lesser known, additional dependency.
  • Method 5: One-Liner with Shell Command Execution. Quick for command-line enthusiasts. Strengths: Fast for small files, uses powerful command-line tools. Weaknesses: Requires csvkit, less portable across different environments.