Pandas DataFrame Arithmetic Operators – Part 1

The Pandas DataFrame has several binary operator methods. When applied to a DataFrame, these methods combine two DataFrames and return a new DataFrame with the appropriate result.

This is Part 1 of a series on Pandas DataFrame operators.


Preparation

Before any data manipulation can occur, one (1) new library will require installation.

  • The Pandas library enables access to/from a DataFrame.

To install this library, open an IDE terminal and run the command below at the command prompt. The terminal used in this example shows a dollar sign ($) as its prompt; your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.


Feel free to view the PyCharm installation guide for the required library.


Add the following import to the top of each code snippet so that the code in this article runs error-free.

import pandas as pd

DataFrame Add

The add() method adds the value passed as the other parameter to each element of the DataFrame.

The syntax for this method is as follows:

DataFrame.add(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.
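
The axis parameter matters most when other is a list rather than a single number. The sketch below uses a small made-up DataFrame (the column names a and b and all values are assumptions for illustration only) to show how the same list is matched against the columns or against the rows.

```python
import pandas as pd

# Hypothetical 2x2 DataFrame, used only to illustrate `axis`
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

# axis='columns' (the default): one list value per column
# column a gets +100, column b gets +200
by_columns = df.add([100, 200], axis='columns')

# axis='index': one list value per row
# row 0 gets +100, row 1 gets +200
by_index = df.add([100, 200], axis='index')

print(by_columns)
print(by_index)
```

The list must have as many values as there are columns (axis='columns') or rows (axis='index'), otherwise pandas cannot line the values up.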

For this example, we have three levels (junior, middle, and senior) of Real Estate base commissions, each with three sub-levels. It is the end of the year, and the Agency has decided to increase base commissions by one (1) across the board.

Code – Example 1

agents = {'junior':  [0.5, 0.7, 0.8],
          'middle': [1.2, 1.3, 1.7],
          'senior':  [2.5, 1.9, 3.5]}

df = pd.DataFrame(agents)
result = df.add(1)
print(result)	
  • Line [1] creates a Dictionary called agents containing base commission rates for each level and sub-level.
  • Line [2] creates a DataFrame from this Dictionary and assigns this to df.
  • Line [3] adds 1 (other parameter) to each base commission and saves to the result variable.
  • Line [4] outputs the result to the terminal.

Output:

Formula Example: (junior) 0.5 + 1 = 1.5

   junior  middle  senior
0     1.5     2.2     3.5
1     1.7     2.3     2.9
2     1.8     2.7     4.5

Note: Another way to perform this operation is to use: df + n. The result is identical.
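
As a quick check of the note above, the following sketch compares the method form and the operator form on the same commission data. The variable names via_method and via_operator are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'junior': [0.5, 0.7, 0.8],
                   'middle': [1.2, 1.3, 1.7],
                   'senior': [2.5, 1.9, 3.5]})

via_method = df.add(1)    # method form
via_operator = df + 1     # operator form

# equals() compares shape, labels, and every element
print(via_method.equals(via_operator))
```

The method form is mainly useful when the extra parameters (axis, level, fill_value) are needed; the operator form does not accept them.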

The add() method can also apply a different amount to each element by passing a second DataFrame as the other parameter. This example contains a second Dictionary (craise) with raises.

Code – Example 2

agents = {'junior':  [0.5, 0.7, 0.8],
          'middle':  [1.2, 1.3, 1.7],
          'senior':   [2.5, 1.9, 3.5]}

craise = {'junior':   [1.1, 1.2, 1.3],
          'middle':   [2.4, 2.5, 2.6],
          'senior':   [3.7, 3.8, 3.9]}

df1 = pd.DataFrame(agents)
df2 = pd.DataFrame(craise)
result = df1.add(df2)
print(result)
  • Line [1] creates a Dictionary called agents containing base commission rates for each level and sub-level.
  • Line [2] creates a Dictionary called craise containing the raises to be applied.
  • Line [3-4] creates DataFrames from the Dictionaries listed above.
  • Line [5] applies the craise DataFrame (df2) to the agents DataFrame (df1).
  • Line [6] outputs the result to the terminal.

Output:

Formula Example: (agents middle) 1.2 + (craise middle) 2.4 = 3.6

   junior  middle  senior
0     1.6     3.6     6.2
1     1.9     3.8     5.7
2     2.1     4.3     7.4
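
When one of the two DataFrames has a missing value, the fill_value parameter decides whether that cell stays missing. Below is a minimal sketch with a shortened, hypothetical version of the data above in which one raise is missing; the names base and raises are assumptions for illustration.

```python
import numpy as np
import pandas as pd

base = pd.DataFrame({'junior': [0.5, 0.7],
                     'middle': [1.2, 1.3]})

# Hypothetical raises: the second 'middle' raise is missing
raises = pd.DataFrame({'junior': [1.0, 1.0],
                       'middle': [2.0, np.nan]})

plain = base.add(raises)                  # missing raise -> missing result
filled = base.add(raises, fill_value=0)   # missing raise treated as 0

print(plain)
print(filled)
```

With fill_value=0, the second middle commission keeps its base value (1.3) instead of becoming NaN.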

Related Tutorial: The Python Addition Operator

DataFrame Subtract

The sub() method subtracts the value passed as the other parameter from each element of the DataFrame.

The syntax for this method is as follows:

DataFrame.sub(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

For this example, we have two Real Estate Agents. Our goal is to determine how many more houses and condos Agent 1 sold than Agent 2 in San Diego’s three (3) real estate boroughs.

agent1 = pd.DataFrame({'homes-sold':   [31, 55, 48],
                       'condos-sold':  [13, 12, 14]})
agent2 = pd.DataFrame({'homes-sold':  [1, 1, 7],
                       'condos-sold':  [2, 5, 13]})
result = agent1.sub(agent2)
print(result)
  • Line [1] creates a DataFrame called agent1 containing the total number of houses and condos agent1 sold.
  • Line [2] creates a DataFrame called agent2 containing the total number of houses and condos agent2 sold.
  • Line [3] subtracts these two DataFrames (element by element) and saves the output to the result variable.
  • Line [4] outputs the result to the terminal.

Output:

According to the results, Agent 1 sold more properties in the three (3) boroughs than Agent 2.

Formula Example: (agent1 homes-sold) 31 – (agent2 homes-sold) 1 = 30

   homes-sold  condos-sold
0          30           11
1          54            7
2          41            1

Note: Another way to perform this operation is to use: agent1 - agent2. The result is identical.
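
The claim that Agent 1 outsold Agent 2 in every borough can be checked in code. The sketch below reuses the two DataFrames from this example; the variable names diff, same, and agent1_ahead_everywhere are chosen here for illustration.

```python
import pandas as pd

agent1 = pd.DataFrame({'homes-sold':  [31, 55, 48],
                       'condos-sold': [13, 12, 14]})
agent2 = pd.DataFrame({'homes-sold':  [1, 1, 7],
                       'condos-sold': [2, 5, 13]})

diff = agent1.sub(agent2)

# The operator form gives the same DataFrame
same = diff.equals(agent1 - agent2)

# True only if every difference is positive
agent1_ahead_everywhere = bool((diff > 0).all().all())

# Total lead per property type across the three boroughs
totals = diff.sum()

print(same, agent1_ahead_everywhere)
print(totals)
```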

Related Tutorial: The Python Subtraction Operator

DataFrame Multiply

The mul() method multiplies each element of the DataFrame by the value passed as the other parameter.

The syntax for this method is as follows:

DataFrame.mul(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

For this example, the base commission increases for all staff members of Rivers Clothing.

Code – DataFrame 1

df = pd.DataFrame({'Alice': [1.1],
                   'Bob':   [1.8],
                   'Cindy': [1.6]})

result = df.mul(2)
print(result)
  • Line [1] creates a DataFrame containing the staff’s base commission.
  • Line [2] multiplies the base commission by two (2) and saves it to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Formula Example: (Alice) 1.1 * 2 = 2.2

   Alice  Bob  Cindy
0    2.2  3.6    3.2

Note: Another way to perform this operation is to use: df * n. The result is identical.

For this example, a new staff member joins Rivers Clothing. No base commission for the new hire is assigned.

Code – DataFrame 2

df = pd.DataFrame({'Alice': [1.1],
                   'Bob':   [1.8],
                   'Cindy': [1.6],
                   'Micah': None})

result = df.mul(2, fill_value=1.0)
print(result)
  • Line [1] creates a DataFrame containing the staff’s current base commission, including the new hire Micah.
  • Line [2] multiplies the current commission by two (2) after replacing any None values with the fill_value (1.0).
  • Line [3] outputs the result to the terminal.

Output:

Formula Example: (Micah) 1.0 (fill_value) * 2 = 2.0

   Alice  Bob  Cindy  Micah
0    2.2  3.6    3.2    2.0

Note: The operator form df * n has no fill_value option, so it cannot substitute a default for missing values. Use mul() when a default is required.
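
To see the effect of fill_value, the sketch below compares the operator form with the method form. It is an illustration under one assumption: the missing commission is written as np.nan in a list (rather than None) so that the Micah column is numeric.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Alice': [1.1],
                   'Bob':   [1.8],
                   'Cindy': [1.6],
                   'Micah': [np.nan]})   # assumption: NaN instead of None

with_operator = df * 2                    # Micah stays missing
with_method = df.mul(2, fill_value=1.0)   # Micah: 1.0 * 2 = 2.0

print(with_operator)
print(with_method)
```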

Related Tutorial: The Python Multiplication Operator

DataFrame Division

The div() method divides each element of the DataFrame by the value passed as the other parameter.

The syntax for this method is as follows:

DataFrame.div(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

For this example, Rivers Clothing is having a sale on a few of its clothing items.

df = pd.DataFrame({'Tops': [15, 20, 25],
                   'Coats': [36, 88, 89],
                   'Pants':    [21, 56, 94]})

result = df.div(2)
print(result)
  • Line [1] creates a DataFrame containing the items going on sale.
  • Line [2] divides each price by two (2) and saves it to the result variable.
  • Line [3] outputs the result to the terminal.

Output:

Formula Example: 15 / 2 = 7.5

   Tops  Coats  Pants
0   7.5   18.0   10.5
1  10.0   44.0   28.0
2  12.5   44.5   47.0

Note: Another way to perform this operation is to use: df / n. The result is identical.

Related Tutorial: The Python Division Operator

DataFrame True Division

The truediv() method divides each element of the DataFrame by the value passed as the other parameter. It performs the same floating-point division as div().

The syntax for this method is as follows:

DataFrame.truediv(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

For this example, Rivers Clothing is having a sale on all of its clothing items. Not all items have prices.

Code – Example 1

df = pd.DataFrame({'Tops':    [15, 20, 25],
                   'Coats':   [36, 88, 89],
                   'Pants':   [21, 56, 94],
                   'Tanks':   [11, 10, None],
                   'Sweats':  [27, None, 35]})

index_ = ['Small', 'Medium', 'Large']
df.index = index_

result = df.truediv(other=2, fill_value=5)
print(result)
  • Line [1] creates a DataFrame containing the items going on sale. Not all items have prices.
  • Line [2-3] sets the index for the DataFrame.
  • Line [4] does the following:
    • Using fill_value, replaces any None values with 5.
    • Divides each price by the other parameter (2).
    • Saves the result to the result variable.
  • Line [5] outputs the result to the terminal.

Output:

Formula Example: 15 / (other) 2 = 7.5

        Tops  Coats  Pants  Tanks  Sweats
Small    7.5   18.0   10.5    5.5    13.5
Medium  10.0   44.0   28.0    5.0     2.5
Large   12.5   44.5   47.0    2.5    17.5
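
In pandas, div() and truediv() both perform floating-point (true) division, as does the / operator. The sketch below checks this on two of the columns above; the variable names a, b, c, and all_equal are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Tops':  [15, 20, 25],
                   'Coats': [36, 88, 89]},
                  index=['Small', 'Medium', 'Large'])

a = df.div(2)        # method form used in the previous section
b = df.truediv(2)    # method form used in this section
c = df / 2           # operator form

all_equal = a.equals(b) and b.equals(c)
print(all_equal)
```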

This example applies a different divisor to each row (size) by passing a list as the other parameter.

Code – Example 2

df = pd.DataFrame({'Tops':    [15, 20, 25],
                   'Coats':   [36, 88, 89],
                   'Pants':   [21, 56, 94],
                   'Tanks':   [11, 10, None],
                   'Sweats':  [27, None, 35]})

index_ = ['Small', 'Medium', 'Large']
df.index = index_

result = df.truediv(other=[.1, .2, .3], axis=0, fill_value=.1).apply(lambda x:round(x,2))
print(result)
  • Line [1] creates a DataFrame containing the items going on sale. Not all items have prices.
  • Line [2-3] sets the index for the DataFrame.
  • Line [4] does the following:
    • Assigns a list of values to other, one for each row of the DataFrame.
    • Sets axis to 0 (index), so each list value is matched to a row.
    • Passes .1 as the fill_value for any None values.
    • Divides each price by the list value for its row.
    • Rounds the output to two (2) decimal places where applicable.
    • Saves the result to the result variable.
  • Line [5] outputs the result to the terminal.

Output:

Formula Example: 15 / (other) .1 = 150

          Tops   Coats   Pants   Tanks  Sweats
Small   150.00  360.00  210.00  110.00  270.00
Medium  100.00  440.00  280.00   50.00    0.50
Large    83.33  296.67  313.33    0.33  116.67
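
The rounding step can also be written with the DataFrame's own round() method instead of apply() with a lambda. The sketch below is one possible alternative, not a required change; it uses only the columns without missing values (an assumption made to keep the comparison simple), and the names divisors, result_apply, and result_round are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Tops':  [15, 20, 25],
                   'Coats': [36, 88, 89],
                   'Pants': [21, 56, 94]},
                  index=['Small', 'Medium', 'Large'])

divisors = [.1, .2, .3]   # one divisor per row: Small, Medium, Large

# Rounding as written in the example above
result_apply = df.truediv(divisors, axis=0).apply(lambda x: round(x, 2))

# Rounding with DataFrame.round()
result_round = df.truediv(divisors, axis=0).round(2)

print(result_apply.equals(result_round))
print(result_round)
```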

Related Tutorial: The Python True Division Operator

DataFrame Floor Division

The floordiv() method divides each element of the DataFrame by the value passed as the other parameter and rounds the result down to the nearest whole number.

The syntax for this method is as follows:

DataFrame.floordiv(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

This example uses the same DataFrame as above for Rivers Clothing.

df = pd.DataFrame({'Tops':    [15, 20, 25],
                   'Coats':   [36, 88, 89],
                   'Pants':   [21, 56, 94],
                   'Tanks':   [11, 10, None],
                   'Sweats':  [27, None, 35]})

index_ = ['Small', 'Medium', 'Large']
df.index = index_

result = df.floordiv(2, fill_value=5)
print(result)
  • Line [1] creates a DataFrame containing the items going on sale. Not all items have prices.
  • Line [2-3] sets the index for the DataFrame.
  • Line [4] does the following:
    • Using fill_value, replaces any None values with 5.
    • Divides each value by 2 and rounds the result down (floor).
    • Saves the result to the result variable.
  • Line [5] outputs the result to the terminal.

Output:

Formula Example: 15 // (other) 2 = 7

        Tops  Coats  Pants  Tanks  Sweats
Small      7     18     10    5.0    13.0
Medium    10     44     28    5.0     2.0
Large     12     44     47    2.0    17.0
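
Rounding down means rounding toward negative infinity, which only becomes visible with negative values. The sketch below uses a hypothetical single-column DataFrame (the column name x and its values are assumptions for illustration) to contrast floordiv() with truediv().

```python
import pandas as pd

# Hypothetical values: one positive, one negative
df = pd.DataFrame({'x': [7, -7]})

floored = df.floordiv(2)   # 7 // 2 = 3, -7 // 2 = -4
exact = df.truediv(2)      # 7 / 2 = 3.5, -7 / 2 = -3.5

print(floored)
print(exact)
```

Note that -3.5 is rounded down to -4, not truncated to -3.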

Related Tutorial: The Python Floor Division Operator

DataFrame Mod

The mod() method returns the remainder left after dividing each element of the DataFrame by the value passed as the other parameter.

The syntax for this method is as follows:

DataFrame.mod(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

This example is a small representation of the available clothing items for Rivers Clothing.

df = pd.DataFrame({'Tops':    [15, 20, 25],
                   'Coats':   [36, 88, 89],
                   'Pants':   [21, 56, 94]})

index_ = ['Small', 'Medium', 'Large']
df.index = index_

result = df.mod(3)
print(result)
  • Line [1] creates a DataFrame containing a few items of Rivers Clothing.
  • Line [2-3] sets the index for the DataFrame.
  • Line [4] performs the modulo operator on each element of the DataFrame and saves it to the result variable.
  • Line [5] outputs the result to the terminal.

Output:

Formula Example: (tops medium) 20 % 3 = 2

        Tops  Coats  Pants
Small      0      0      0
Medium     2      1      2
Large      1      2      1
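
The remainder and the floor-division quotient fit together: multiplying the quotient by the divisor and adding the remainder gives back the original value. The sketch below checks this on the DataFrame above; the variable names quotient, remainder, and rebuilt are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Tops':  [15, 20, 25],
                   'Coats': [36, 88, 89],
                   'Pants': [21, 56, 94]},
                  index=['Small', 'Medium', 'Large'])

quotient = df.floordiv(3)    # whole-number part of the division
remainder = df.mod(3)        # what is left over

# quotient * 3 + remainder should reproduce the original values
rebuilt = quotient.mul(3).add(remainder)
print(rebuilt.equals(df))
```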

Related Tutorial: The Python Modulo Operator

DataFrame Pow

The pow() method raises each element of the DataFrame to the power passed as the other parameter.

The syntax for this method is as follows:

DataFrame.pow(other, axis='columns', level=None, fill_value=None)
Parameter | Description
other | A scalar or a multi-element data structure, such as a list, Series, or DataFrame.
axis | Whether to match other against the index (0 or 'index') or the columns (1 or 'columns'). The default is 'columns'.
level | An integer or a label. Broadcasts across the specified level, matching the index values on the MultiIndex level passed.
fill_value | Replaces missing (NaN) values with this value before any computation occurs. If the data in both corresponding locations is missing, the result is missing.

For this example, we have stock prices taken three times/day: Morning, Mid-Day, and Evening.

Code – Example 1

df1 = pd.DataFrame({'Stock-A':  [9, 21.4, 20.4],
                    'Stock-B':   [8.7, 8.7, 8.8],
                    'Stock-C':   [21.3, 22.4, 26.5]})

df2 = pd.DataFrame({'Stock-A':  [1, 2, 2],
                    'Stock-B':   [3, 4, 5],
                    'Stock-C':   [2, 3, 1]})

result = df1.pow(df2).apply(lambda x:round(x,2))
print(result)
  • Line [1] creates a DataFrame (df1) containing Stock Prices for three stocks, three times/day.
  • Line [2] creates a DataFrame (df2) containing the exponents to apply element-wise to df1 using pow().
  • Line [3] raises each element of df1 to the corresponding power in df2 and rounds the results to two (2) decimal places.
  • Line [4] outputs the result to the terminal.

Output:

Formula Example: (Stock-A Mid-Day) 21.4 ** 2 = 457.96

   Stock-A   Stock-B   Stock-C
0     9.00    658.50    453.69
1   457.96   5728.98  11239.42
2   416.16  52773.19     26.50
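
The other parameter of pow() can also be a single number, in which case every element is raised to the same power. The sketch below uses a hypothetical DataFrame (the column name x and its values are assumptions for illustration) to show a square, the equivalent ** operator, and a square root written as the power 0.5.

```python
import pandas as pd

# Hypothetical values chosen so the results are easy to verify by hand
df = pd.DataFrame({'x': [4.0, 9.0]})

squared = df.pow(2)            # 4.0 ** 2 = 16.0, 9.0 ** 2 = 81.0
squared_operator = df ** 2     # operator form of the same calculation
roots = df.pow(0.5)            # square root: 2.0 and 3.0

print(squared.equals(squared_operator))
print(roots)
```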

Related Tutorial: The Python pow() Function