Python Working with Excel – Getting Started

To provide a solid understanding of data manipulation with Excel, the Finxter Academy has created a 6-part series to take the user from beginner to advanced. This article assumes that Microsoft Excel is available on the computer.

Quick Intro

In most cases, working with large amounts of data requires some manipulation to make it more readable for the general user. How do you accomplish this?  Look no further! Python’s openpyxl library provides the tools to produce a well-formed Excel spreadsheet complete with:

  • custom heading(s) and/or sub-heading(s),
  • formatted column(s) and/or row(s),
  • sorted data,
  • simple/complex calculations, and much, much more.

Excel spreadsheets are user-friendly and do not require extensive knowledge of this software to perform the task(s) at hand.

Below are the contents of a text file. This file contains a sample representation of famous Jazz Musicians with fictitious Net Worths. 

First,Last,Birth Year,Birth State,Birth Country,Instrument,Est. Worth
Miles,Davis,1926,Illinois,USA,Trumpet,1343500
Duke,Ellington,1899,Washington,USA,Piano,1911400
John,Coltrane,1926,North  Carolina,USA,Saxophone,2575500
Dizzy,Gilespie,1917,South Carolina,USA,Trumpet,3690300
Ella,Fitzgerald,1917,Virgina,USA,Singer,8275780
Count,Basie,1904,New Jersey,USA,Piano,5145800
Fats,Waller,1904,New York,USA,Piano,3250930
Oscar,Peterson,1925,Montreal,CDA,Piano,7670945
Charlie,Parker,1920,Kansas City,USA,Saxophone,4375395
Billie,Holiday,1915,Philadelphia,USA,Singer,1560454

How to Save Data to CSV in Python?

A text file in its current format is not able to be saved directly to Excel. There is an in-between step: save the text to a CSV file.

To accomplish this for the text output above, perform the following steps:

  • Using the mouse, highlight all the Jazz Musicians (including the header row).
  • On the keyboard, press the CTRL+C key combination (copy data to clipboard).
  • Create an empty CSV file (use a text editor such as Notepad).
  • With the new CSV file open and active, press the CTRL+V key combination (paste data from clipboard).
  • Save the file as “j-greats.csv” to the current working directory.

Python Libraries to Work with Excel

Before any data manipulation can occur, two new libraries will require installation. The first library (openpyxl) provides Python access to Excel. The second library (pandas) enables the conversion from CSV to Excel.

  1. The openpyxl library
  2. The pandas library

To install these libraries, navigate to an IDE terminal. At the command prompt ($), execute the code below for each installation. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

Install openpyxl:

$ pip install openpyxl

Hit the <Enter> key on the keyboard to start the installation process.

Install pandas:

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

If the installations are successful, a message displays in the terminal indicating the same.

How to Convert and Save a CSV to an Excel File in Python?

Before moving ahead, the CSV saved earlier requires conversion to Excel. The steps for this conversion are in the lines of code below.

import pandas as pd
read_file = pd.read_csv ('j-greats.csv')
read_file.to_excel ('j-greats.xlsx', index=False, header=True)
  • Line [1] imports the pandas library and assigns pd as a reference to this library.
  • Line [2] reads the contents of the CSV file and saves the contents to read_file.
  • Line [3] passes three parameters to the  to_excel() method.
  1. The first parameter is a filename (any-file-name.xlsx) and is required. For this example, the filename is j-greats.xlsx.
  2. The second parameter (index=True/False/None) is not required. If this parameter is left out, the default value is None/False. None/True means no additional column on the left with row (index) numbers are added. If True, the column displays.
  3. The third parameter (header=True/False) is not required. If this parameter is left out, the header row automatically displays.

If this code runs successfully, the current working directory now contains the j-greats.xlsx file.

Output (depending on parameters passed)

output Line [3] Possible Parameters: index=False/None, header=True

output Line [3] Possible Parameters: index=True, header=False

How to Load an Excel Workbook in Python?

To load an existing Workbook, run the code below.

  • Line [1] imports the load_workbook() method from the openpyxl library. This library is required to load the Workbook.
  • Line [2] assigns file_name as a reference to the existing j-greats.xlsx file.
  • Line [3] Loads in the Workbook (includes all associated Worksheets) and assigns the contents to w_book.
  • Line [4] closes the Workbook.

Running this code provides no visible output. However, the Workbook and Worksheet(s) are now accessible.

from openpyxl import load_workbook
file_name = 'j-greats.xlsx'
w_book   = load_workbook(filename=file_name)
w_book.close()

How to Determine the Active Openpyxl Worksheet – Single Worksheet

Each time a Workbook initially loads, the active Worksheet is, by default, assigned as the left-most Worksheet (tab 1).

Upon the creation of j-greats.xlsx, the following transpires:

  • Excel creates a Workbook named j-greats.xlsx.
  • Excel adds a Worksheet for this Workbook, Sheet1.
  • This Worksheet becomes the Active Worksheet.
  • Contents of CSV writes to Worksheet Sheet1.
  • The Workbook was saved and placed, for this example, in the current working directory.

To confirm the Active Worksheet, run the following code.

w_sheets = w_book.active
print(w_sheets.title)
w_book.close()

Output

Sheet1

Summary

In this article, we learned how to:

  • Create CSV file from plain text,
  • Install the required libraries,
  • Convert CSV to Excel,
  • Load a Workbook and,
  • Determine the Active Worksheet.

Next Up

Part 2 in the Working with Excel series will focus on how to:

  • Access Worksheet Data using Cell References
  • Access Worksheet Data via Iteration
  • Add Data to Worksheet
  • Modify Data on Worksheet
  • Delete Data from Worksheet