Problem Formulation and Solution Overview
To make it more interesting, we have the following running scenario:
β₯οΈ Info: Are you AI curious but you still have to create real impactful projects? Join our official AI builder club on Skool (only $5): SHIP! - One Project Per Month
CSV File Snippet
| id | fname | lname | fee | charge_date |
| 31250 | Sammy | Drover | 11.98 | 11/21/22 |
| 31251 | Winnie | Dixie | 12.98 | 12/23/22 |
| 31252 | Marty | Smith | 11.98 | 09/14/22 |
| 31253 | Chad | Jones | 12.98 | 06/18/22 |
Preparation
import pandas as pd
After importing the Pandas library, this library is referenced by calling the shortcode (pd).
Method 1: Use Timedelta()
This example uses the timedelta() class which allows you to define a specific time interval, such as a day, and add it to a datetime expression.
df = pd.read_csv('checkers_users.csv')
df['charge_date'] = df['charge_date'].astype('datetime64[ns]')
df['charge_date'] = df.charge_date + pd.Timedelta(days=3)
print(df)The above code reads in the checkers_users.csv file into a DataFrame df.
The following line converts the DataFrame column charge_date into a datetime format. Then, three (3) days are added to the charge_date for each column entry (pd.Timedelta(days=3)). The results save back to df['charge_date'] and are output to the terminal.
| id | fname | lname | fee | charge_date |
| 31250 | Sammy | Drover | 11.98 | 2022-11-24 |
| 31251 | Winnie | Dixie | 12.98 | 2022-12-26 |
| 31252 | Marty | Smith | 11.98 | 2022-09-17 |
| 31253 | Chad | Jones | 12.98 | 2022-06-21 |
Method 2: Use to_datetime() and timedelta()
This example uses to_datetime() and timedelta() to add three (3) days to each Date entry in a DataFrame Column.
import time
from datetime import datetime, timedelta, date, time
df = pd.read_csv('checkers_users.csv')
df['charge_date'] = (pd.to_datetime(df['charge_date']) + timedelta(3))
print(df)The above code imports the time module and others from the datetime library.
Then, the checkers_users.csv file is read into a DataFrame df.
The following line adds three (3) days to the charge_date for each column entry (timedelta(3)). The results save back to df['charge_date'] and are output to the terminal.
| id | fname | lname | fee | charge_date |
| 31250 | Sammy | Drover | 11.98 | 2022-11-24 |
| 31251 | Winnie | Dixie | 12.98 | 2022-12-26 |
| 31252 | Marty | Smith | 11.98 | 2022-09-17 |
| 31253 | Chad | Jones | 12.98 | 2022-06-21 |
Method 3: Use to_datetime() and apply()
This example uses to_datetime() and apply() to add three (3) days to each Date entry in a DataFrame Column.
df = pd.read_csv('checkers_users.csv')
df['new_charge_date'] = pd.to_datetime(df['charge_date']).apply(pd.DateOffset(3))
print(df)The above code reads in the checkers_users.csv file into a DataFrame df.
The following line adds three (3) days to the charge_date for each column entry (apply(pd.DateOffset(3)). The results save to a new DataFrame column df['new_charge_date'] and are output to the terminal.
| id | fname | lname | fee | charge_date | new_charge_date |
| 31250 | Sammy | Drover | 11.98 | 11/21/22 | 2022-11-24 |
| 31251 | Winnie | Dixie | 12.98 | 12/23/22 | 2022-12-26 |
| 31252 | Marty | Smith | 11.98 | 09/14/22 | 2022-09-17 |
| 31253 | Chad | Jones | 12.98 | 06/18/22 | 2022-06-21 |
β‘Warning: As of this writing using Python 3.10.4, the following message appears when running this code:
FutureWarning: DateOffset.call is deprecated and will be removed in a future version
Method 4: Use to_datetime() and DateOffset()
This example uses to_datetime() and DateOffset() to add three (3) days to each Date entry in a DataFrame Column.
df = pd.read_csv('checkers_users.csv')
df['charge_date'] = df['charge_date'].astype('datetime64[ns]')
df['charge_date'] = pd.to_datetime(df['charge_date']) + pd.DateOffset(days=3)
print(df)The above code reads in the checkers_users.csv file into a DataFrame df.
The following line converts the DataFrame column charge_date into a datetime format. Then, three (3) days are added to the charge_date for each column entry (pd.DateOffset(days=3)). The results save back to df['charge_date'] and are output to the terminal.
| id | fname | lname | fee | charge_date | new_charge_date |
| 31250 | Sammy | Drover | 11.98 | 11/21/22 | 2022-11-24 |
| 31251 | Winnie | Dixie | 12.98 | 12/23/22 | 2022-12-26 |
| 31252 | Marty | Smith | 11.98 | 09/14/22 | 2022-09-17 |
| 31253 | Chad | Jones | 12.98 | 06/18/22 | 2022-06-21 |
Bonus: Date Change Based on Condition
CheckersTV has decided to change the customers’ bill day out three (3) days only if they fall within a specific range.
import pandas as pd
import numpy as np
df = pd.read_csv('checkers_users.csv')
df['charge_date'] = df['charge_date'].astype('datetime64[ns]')
count = 0
for c in df['charge_date']:
chrg_day = int(c.strftime("%d"))
if (chrg_day >= 1 and chrg_day <= 25):
df.iat[count, 4] = pd.to_datetime(c) + pd.DateOffset(days=3)
count += 1β¨ Finxter Challenge!
Add code to fix the year so this column isn’t
updated on a yearly basis.
Summary
This article has provided four (4) ways to find the most common element in a Pandas DataFrame column to select the best fitting for your coding requirements.
Good Luck & Happy Coding!
Programmer Humor – Blockchain
