5 Best Ways to Convert Excel Integer to Date in Python

💡 Problem Formulation: When dealing with Excel data in Python, it’s common to encounter date values represented as serial date integers. Excel stores each date as the number of days since a base date so that dates can be used in calculations. In this article, we’ll go over converting such an integer into a Python datetime object. For example, an input of 44022 should convert to an output representing the date 2020-07-10.

Method 1: Using xlrd Library

This method uses the xlrd library, specifically its xldate_as_tuple function, which converts an Excel serial date (a float or integer) into a (year, month, day, hour, minute, second) tuple that the datetime constructor accepts. The second argument is the workbook's date mode: 0 for the 1900-based date system, 1 for the 1904-based one. The xlrd library is particularly useful when working with the older .xls format.

Here’s an example:

import xlrd
from datetime import datetime

excel_integer = 44022
tuple_date = xlrd.xldate_as_tuple(excel_integer, 0)
date_object = datetime(*tuple_date[:3])

print(date_object)

Output:

2020-07-10 00:00:00

This snippet converts the Excel date integer 44022 to a tuple using the xlrd.xldate_as_tuple function, then builds a datetime object from the first three elements of that tuple: the year, month, and day.

Method 2: Using Pandas to_datetime with origin

Pandas provides a convenient function, to_datetime, that can convert a single integer or a whole Series of integers to dates. By setting the origin and unit parameters, we specify that the integers count days from 1899-12-30, the effective day zero of Excel’s 1900 date system.

Here’s an example:

import pandas as pd

excel_integer = 44022
date_object = pd.to_datetime(excel_integer, origin='1899-12-30', unit='D')

print(date_object)

Output:

2020-07-10 00:00:00

The code uses Pandas to_datetime to handle the conversion. The origin='1899-12-30' argument supplies Excel’s effective starting point, and unit='D' indicates that the number represents days.
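Because to_datetime also accepts a Series, a whole column of serial numbers can be converted in one call. The following is a minimal sketch; the Series named serials and its values are hypothetical stand-ins for data read from a worksheet.

```python
import pandas as pd

# Hypothetical column of Excel serial numbers (illustrative values only).
serials = pd.Series([44022, 44023, 44197])

# Interpret every value as a count of days elapsed since 1899-12-30.
dates = pd.to_datetime(serials, unit="D", origin="1899-12-30")

print(dates)
```

The result is a datetime64 Series, so the usual .dt accessors (for example dates.dt.year) are available afterwards.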

Method 3: Using openpyxl Library

Openpyxl is another library for reading and writing Excel files. It provides a function from_excel that can be used to convert the Excel date serial number to a datetime object directly.

Here’s an example:

from openpyxl.utils.datetime import from_excel

excel_integer = 44022
date_object = from_excel(excel_integer)  # already a datetime object

print(date_object)

Output:

2020-07-10 00:00:00

This example uses openpyxl.utils.datetime.from_excel, which returns a datetime object directly, so no tuple unpacking or extra datetime call is needed.

Method 4: Manual Conversion

A manual conversion can be achieved using datetime and a reference start date matching Excel’s base date, then adding the Excel integer as a timedelta.

Here’s an example:

from datetime import datetime, timedelta

excel_integer = 44022
start_date = datetime(1899, 12, 30)  # Excel's base date
date_object = start_date + timedelta(days=excel_integer)

print(date_object)

Output:

2020-07-10 00:00:00

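This snippet manually calculates the date by adding the Excel integer, interpreted as a number of days, to the base date using timedelta. The base is 1899-12-30 rather than 1900-01-01 because Excel counts 1900-01-01 as day 1 and also counts a nonexistent 1900-02-29, so the result is correct for serial numbers from 61 (1900-03-01) onward.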
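The same arithmetic extends to serial numbers with a fractional part, which Excel uses for the time of day, and to workbooks that use the 1904 date system. The sketch below is one way to wrap this up; the function name excel_serial_to_datetime and the date1904 flag are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Day zero for each of Excel's two date systems.
EPOCH_1900 = datetime(1899, 12, 30)  # correct for serials from 61 upward
EPOCH_1904 = datetime(1904, 1, 1)    # workbooks saved with the 1904 system


def excel_serial_to_datetime(serial, date1904=False):
    """Convert an Excel serial (int or float) to a datetime.

    Hypothetical helper: the name and the date1904 flag are illustrative.
    """
    epoch = EPOCH_1904 if date1904 else EPOCH_1900
    # The integer part counts days; the fractional part is the time of day.
    return epoch + timedelta(days=serial)


print(excel_serial_to_datetime(44022))                 # 2020-07-10 00:00:00
print(excel_serial_to_datetime(44022.75))              # 2020-07-10 18:00:00
print(excel_serial_to_datetime(42560, date1904=True))  # 2020-07-10 00:00:00
```

The two systems differ by 1462 days, which is why 42560 in a 1904-based workbook denotes the same day as 44022 in a 1900-based one.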

Bonus One-Liner Method 5: Using Dateutil Parser

The dateutil library is a powerful extension to the standard datetime module. Its parse function has no notion of Excel serial numbers, so passing it the string '44022' does not yield the expected date. It can, however, parse Excel’s base date from a string, and dateutil’s relativedelta then adds the serial number as days in a single expression.

Here’s an example:

from dateutil.parser import parse
from dateutil.relativedelta import relativedelta

excel_integer = 44022
date_object = parse('1899-12-30') + relativedelta(days=excel_integer)

print(date_object)

Output:

2020-07-10 00:00:00

This example uses dateutil.parser.parse to read Excel’s base date from a string and relativedelta to add the serial number as a count of days. Note that the integer itself is never passed to parse.

Summary/Discussion

  • Method 1: Using xlrd Library. Well suited to older Excel files. Recent xlrd releases read only the legacy .xls format. It requires installing an additional library.
  • Method 2: Using Pandas to_datetime with origin. Utilizes the robust and widely used Pandas library. Works well for converting a whole series of numbers at once. However, it is a heavy dependency if Pandas is not already being used for other purposes.
  • Method 3: Using openpyxl Library. Best suited for interacting with newer Excel files (.xlsx). Does not require manual calculations. However, like the other library-based methods, it adds an external dependency.
  • Method 4: Manual Conversion. Does not depend on external libraries. It’s straightforward, but the 1899-12-30 base absorbs Excel’s leap year bug only for serial numbers of 61 and above; smaller values need separate handling.
  • Bonus One-Liner Method 5: Using Dateutil Parser. Simple and concise. The dateutil parser is very versatile, but it cannot interpret a serial number by itself, so the base date must be supplied explicitly, and it is overkill unless dateutil is already a dependency.
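The leap year caveat noted for Method 4 can be made concrete with a short sketch. It assumes the 1900 date system, and the function name excel_1900_serial_to_datetime is hypothetical. Excel treats serial 60 as 1900-02-29, a day that never existed, so serials below 60 need a base date one day later than the usual 1899-12-30.

```python
from datetime import datetime, timedelta


def excel_1900_serial_to_datetime(serial):
    """Hypothetical helper covering the low end of the 1900 date system."""
    if int(serial) == 60:
        # Excel shows serial 60 as 1900-02-29, which is not a real date.
        raise ValueError("serial 60 is Excel's nonexistent 1900-02-29")
    # Before the phantom leap day the count is one day short, so the
    # base date shifts from 1899-12-30 to 1899-12-31.
    base = datetime(1899, 12, 31) if serial < 60 else datetime(1899, 12, 30)
    return base + timedelta(days=serial)


print(excel_1900_serial_to_datetime(1))      # 1900-01-01 00:00:00
print(excel_1900_serial_to_datetime(59))     # 1900-02-28 00:00:00
print(excel_1900_serial_to_datetime(61))     # 1900-03-01 00:00:00
print(excel_1900_serial_to_datetime(44022))  # 2020-07-10 00:00:00
```

For data that only contains modern dates, this branch never triggers and the plain Method 4 arithmetic is sufficient.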