5 Best Ways to Perform Arithmetic Operations in Excel Files Using openpyxl in Python

💡 Problem Formulation: You have an Excel file and need to perform arithmetic on its data programmatically. For instance, given two columns holding 'income' and 'expenses', you might want to calculate 'profit' by subtracting expenses from income and write the result to a new column in the same file.

Method 1: Basic Cell Operations

Openpyxl lets you manipulate Excel files by treating each cell as an object. You can perform arithmetic operations by accessing the value property of cells, allowing you to directly read, compute, and write values in an Excel sheet.

Here's an example:

from openpyxl import load_workbook

# Load an existing workbook
wb = load_workbook('financials.xlsx')
sheet = wb.active

# Calculate profit for each data row (row 1 is assumed to hold headers)
for row in range(2, sheet.max_row + 1):
    income = sheet.cell(row=row, column=1).value
    expenses = sheet.cell(row=row, column=2).value
    # Skip rows with empty cells; None - None would raise a TypeError
    if income is None or expenses is None:
        continue
    profit = income - expenses
    sheet.cell(row=row, column=3).value = profit

# Save the workbook
wb.save('financials_updated.xlsx')

Output: An Excel file with a new column that contains the profit values for each row.

This code snippet assumes you have a workbook named 'financials.xlsx' with a header in row 1, income in the first column, and expenses in the second column. It iterates over the data rows, subtracts expenses from income, and writes the result to the third column.
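The same read, compute, write pattern extends to other operators. The sketch below builds a small workbook in memory (the column layout and figures are made-up sample data, not taken from a real file) and adds a profit margin column, guarding against empty cells and division by zero.

```python
from openpyxl import Workbook

# Build a small in-memory sheet; the figures are illustrative sample data
wb = Workbook()
sheet = wb.active
sheet.append(['Income', 'Expenses', 'Profit', 'Margin'])
sheet.append([1000, 500])
sheet.append([1500, 800])
sheet.append([0, 200])       # zero income: margin is undefined
sheet.append([None, 300])    # missing income: row is skipped

for row in range(2, sheet.max_row + 1):
    income = sheet.cell(row=row, column=1).value
    expenses = sheet.cell(row=row, column=2).value
    if income is None or expenses is None:
        continue  # leave the result cells empty
    profit = income - expenses
    sheet.cell(row=row, column=3).value = profit
    # Division needs its own guard; subtraction does not
    if income != 0:
        sheet.cell(row=row, column=4).value = profit / income
```

Leaving the margin cell empty for zero income is one possible choice; writing a placeholder string would work too, at the cost of mixing types in the column.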

Method 2: Using Formulas

Openpyxl supports writing Excel formulas into cells. Instead of calculating the result in Python, you can directly write a formula into a cell which Excel will compute when the file is opened.

Here's an example:

from openpyxl import load_workbook

wb = load_workbook('financials.xlsx')
sheet = wb.active

# Insert formula to calculate profit in Excel
for row in range(2, sheet.max_row + 1):
    sheet.cell(row=row, column=3).value = f'=A{row}-B{row}'

wb.save('financials_with_formulas.xlsx')

Output: An Excel file with formulas in a new column that calculate profit when the file is opened in Excel.

This snippet writes a formula into each cell of the third column. Openpyxl stores the formula as text and does not evaluate it; Excel computes the result when the file is opened.
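To see what Python gets back from a formula cell, the following sketch saves a workbook to an in-memory buffer and reloads it twice. The sample values are made up, and the buffer stands in for a file on disk.

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

wb = Workbook()
ws = wb.active
ws.append(['Income', 'Expenses', 'Profit'])
ws.append([1000, 500, '=A2-B2'])

# Save to an in-memory buffer instead of a file on disk
buffer = BytesIO()
wb.save(buffer)

# Default load: the cell holds the formula text
buffer.seek(0)
formula_value = load_workbook(buffer).active['C2'].value

# data_only=True: the cell holds the value cached by the last application
# that calculated the sheet; openpyxl never calculates, so nothing is cached
buffer.seek(0)
cached_value = load_workbook(buffer, data_only=True).active['C2'].value

print(formula_value)  # =A2-B2
print(cached_value)   # None
```

If the computed number is needed in Python, either calculate it in Python as in Method 1, or open and save the file in Excel first so that a cached value exists.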

Method 3: Batch Operations for Performance

When dealing with large files, writing whole rows in a streaming fashion can significantly improve performance. Openpyxl's write-only mode lets you append complete rows and writes them out as it goes instead of keeping the whole sheet in memory, which is more efficient than writing cell by cell.

Here's an example:

from openpyxl import Workbook

# Write-only mode streams rows to the file instead of holding the sheet in memory
wb = Workbook(write_only=True)
sheet = wb.create_sheet()

# A list of tuples with financial data
data = [(1000, 500), (1500, 800), (2000, 1200)]

# Compute profit and append each record as one complete row.
# Plain values can be appended directly; WriteOnlyCell (from openpyxl.cell)
# is only needed when a cell carries styling or a comment.
for income, expenses in data:
    profit = income - expenses
    sheet.append([income, expenses, profit])

wb.save('financials_batched.xlsx')

Output: An Excel file created with batch inserts for faster performance.

This code example demonstrates how to perform arithmetic operations on a set of data and write the results along with the original data to an Excel file in a batch process for better performance.
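For data that does not fit comfortably in a list, a generator can feed a write-only sheet one row at a time. This is a minimal sketch: profit_rows is a hypothetical helper, the generated records are synthetic, and an in-memory buffer replaces the output file so the result can be read back for a quick check.

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

def profit_rows(records):
    """Yield (income, expenses, profit) rows one at a time."""
    for income, expenses in records:
        yield (income, expenses, income - expenses)

# Synthetic data source; a generator keeps only one record in memory
records = ((1000 + i, 500) for i in range(1000))

wb = Workbook(write_only=True)
sheet = wb.create_sheet('Financials')
sheet.append(['Income', 'Expenses', 'Profit'])
for row in profit_rows(records):
    sheet.append(row)

buffer = BytesIO()
wb.save(buffer)

# Read the result back to confirm what was written
buffer.seek(0)
check = load_workbook(buffer)['Financials']
print(check.max_row)      # 1001
print(check['C2'].value)  # 500
```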

Method 4: Cell References and Incrementing

When working with complex Excel files, you'll often need to navigate through the worksheets by referring to cell positions and incrementing them. Openpyxl allows for easy cell reference manipulation, making it simple to perform arithmetic operations across varying ranges.

Here's an example:

from openpyxl import load_workbook
from openpyxl.utils import get_column_letter

wb = load_workbook('financials.xlsx')
sheet = wb.active

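# Convert 1-based column indices to letters once, outside the loop
income_col = get_column_letter(1)    # 'A'
expenses_col = get_column_letter(2)  # 'B'
profit_col = get_column_letter(3)    # 'C'

# Build each reference as column letter + row number, e.g. 'C2'
for row in range(2, sheet.max_row + 1):
    income_cell = f'{income_col}{row}'
    expenses_cell = f'{expenses_col}{row}'
    profit_cell = f'{profit_col}{row}'
    sheet[profit_cell] = f'={income_cell}-{expenses_cell}'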

wb.save('financials_dynamic_references.xlsx')

Output: An Excel file with dynamic cell references used to calculate profit.

This snippet uses openpyxl's get_column_letter utility, which converts a 1-based column index to its letter, to build cell references dynamically. This is especially useful in complex worksheets where column positions are computed at run time rather than hard-coded.
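Reference building is easiest to reuse when wrapped in small helpers. The sketch below uses get_column_letter and column_index_from_string from openpyxl.utils; the helper names row_sum_formula and next_column are hypothetical and exist only for illustration.

```python
from openpyxl.utils import get_column_letter, column_index_from_string

def row_sum_formula(first_col, last_col, row):
    """Return a SUM formula over one row, given 1-based column indices."""
    start = f'{get_column_letter(first_col)}{row}'
    end = f'{get_column_letter(last_col)}{row}'
    return f'=SUM({start}:{end})'

def next_column(letter):
    """Return the letter of the column to the right of `letter`."""
    return get_column_letter(column_index_from_string(letter) + 1)

print(row_sum_formula(2, 5, 7))  # =SUM(B7:E7)
print(next_column('Z'))          # AA
```

Working with indices and converting to letters at the last moment avoids hand-written letter arithmetic, which breaks once a sheet extends past column Z.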

Bonus One-Liner Method 5: List Comprehension

Python's list comprehensions can be combined with Openpyxl for a concise way to calculate and write results, particularly when working with data that's already loaded in memory.

Here's an example:

from openpyxl import Workbook

# Financial data already loaded in memory
data = [(1000, 500), (1500, 800), (2000, 1200)]
wb = Workbook()
ws = wb.active

# Build rows with profit appended using a list comprehension, then write them
rows = [row + (row[0] - row[1],) for row in data]
for row in rows:
    ws.append(row)

wb.save('financials_list_comprehension.xlsx')

Output: An Excel file with calculated profits using a Python list comprehension.

This code uses a list comprehension to iterate over the dataset, compute profit for each tuple, and append the extended rows to a new Excel file.
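A comprehension works just as well when the numbers come from a worksheet rather than a Python list. The sketch below is one way to do it, assuming a header in row 1 and income and expenses in columns A and B; the sample values are the same made-up figures as above.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(['Income', 'Expenses'])
for record in [(1000, 500), (1500, 800), (2000, 1200)]:
    ws.append(record)

# values_only=True yields plain tuples instead of Cell objects
profits = [
    income - expenses
    for income, expenses in ws.iter_rows(min_row=2, max_col=2, values_only=True)
]

# Write the results back into column C, next to the source data
for row_index, profit in enumerate(profits, start=2):
    ws.cell(row=row_index, column=3, value=profit)

print(profits)  # [500, 700, 800]
```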

Summary/Discussion

  • Method 1: Basic Cell Operations. Simple and intuitive. Can become slow with very large datasets.
  • Method 2: Using Formulas. Leverages Excel's own computational capabilities. The result is not available in Python, because openpyxl does not evaluate formulas.
  • Method 3: Batch Operations for Performance. Best for large datasets. Less flexible than writing cell by cell, since rows can only be appended in order.
  • Method 4: Cell References and Incrementing. Great for complex sheets. Requires more careful handling of cell references.
  • Method 5: List Comprehension. Quick and pythonic. Only efficient for datasets already in memory.