Preparation
Before any data manipulation can occur, two (2) new libraries will require installation.
- The Pandas library enables access to/from a DataFrame.
- The NumPy library supports multi-dimensional arrays and matrices in addition to a collection of mathematical functions.
To install these libraries, navigate to an IDE terminal and execute the commands below at the command prompt. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.
$ pip install pandas
Hit the <Enter> key on the keyboard to start the installation process.
$ pip install numpy
Hit the <Enter> key on the keyboard to start the installation process.
If the installations were successful, a message displays in the terminal indicating the same.
Feel free to view the PyCharm installation guide for the required libraries.
Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.
import pandas as pd
import numpy as np
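As a quick sanity check, the short sketch below imports both libraries and prints their versions. It assumes the installation steps above have finished; the version numbers you see depend on your environment.

```python
import pandas as pd
import numpy as np

# If either import fails, that library is not installed for the
# interpreter running this script.
print('pandas:', pd.__version__)
print('numpy:', np.__version__)
```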
DataFrame replace()
The replace() method substitutes matching values in a DataFrame/Series with the value you assign. The replacement is dynamic: values are matched by their content, wherever they occur in the object.
💡 Note: The .loc/.iloc indexers differ from replace() in that they require a specific location (a label or a position) for the value(s) to change.
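To make the difference in the note concrete, here is a minimal sketch that performs the same change three ways. The Series and the variable names (by_position, by_mask, by_value) are illustrative assumptions, not part of the pandas API.

```python
import pandas as pd

grades = pd.Series([55, 64, 52, 76, 49])

# .iloc needs the position of the value to change (here, position 4).
by_position = grades.copy()
by_position.iloc[4] = 51

# .loc needs a label or a Boolean mask that pinpoints the value.
by_mask = grades.copy()
by_mask.loc[by_mask == 49] = 51

# replace() only needs the old value and the new value.
by_value = grades.replace(49, 51)

print(by_position.equals(by_value))
print(by_mask.equals(by_value))
```

All three approaches produce the same Series here, but only replace() works without knowing where the value 49 is stored.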
The syntax for this method is as follows:
DataFrame.replace(to_replace=None, value=None, 
                  inplace=False, limit=None, 
                  regex=False, method='pad')

| Parameter | Description |
|---|---|
| to_replace | Determines how to locate the values to replace. The accepted forms are: (1) a numeric value, string, or regex; (2) a list of strings, regexes, or numeric values; (3) a dictionary, including a nested dictionary keyed by column. A value must exactly match to_replace to be changed. |
| value | The value that replaces any values that match. |
| inplace | If set to True, the changes apply to the original DataFrame/Series. If False, the changes apply to a new DataFrame/Series. By default, False. |
| limit | The maximum number of elements to backward/forward fill. |
| regex | If True, to_replace is interpreted as a regular expression. A regex pattern can also be passed here directly, in which case to_replace must be None. Matches are replaced with the value parameter. |
| method | The fill method to use when to_replace is a scalar, list, or tuple and value is None. The available options are pad, ffill, bfill, or None. Both limit and method are deprecated as of pandas 2.1. |
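The following sketch illustrates the list and dictionary forms of to_replace, along with the inplace parameter. The sample grades and the variable names (by_list, by_dict, working) are assumptions chosen for illustration.

```python
import pandas as pd

grades = pd.Series([55, 64, 52, 76, 49])

# List form: the lists are paired by position (49 -> 51, 52 -> 60).
by_list = grades.replace([49, 52], [51, 60])

# Dictionary form: each key is replaced with its value.
by_dict = grades.replace({49: 51, 52: 60})

# inplace=True changes the object itself instead of building a new one.
working = grades.copy()
working.replace(49, 51, inplace=True)

print(by_list.tolist())
print(by_dict.tolist())
print(working.tolist())
print(grades.tolist())  # the original is untouched by the calls above
```

The list and dictionary forms give the same result; the dictionary form tends to be easier to read when several values change at once.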
Possible Errors Raised
| Error | When Does It Occur? |
|---|---|
| AssertionError | If regex is not a Boolean (True/False) and the to_replace parameter is not None. |
| TypeError | If to_replace is not in a valid format, such as: (1) it is not a scalar, an array-like, a dictionary, or None; (2) to_replace is a dictionary and the value parameter is not a list, dictionary, ndarray, or Series; (3) multiple Boolean or date objects are replaced and to_replace does not match the type of the value being replaced. |
| ValueError | If a list/ndarray is passed to both to_replace and value, and the two are not the same length. |
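As a small illustration of the ValueError case, the sketch below passes two lists of different lengths and catches the resulting error. The sample data and the caught flag are assumptions for demonstration; the exact wording of the error message may vary between pandas versions.

```python
import pandas as pd

grades = pd.Series([55, 64, 52, 76, 49])

caught = False
try:
    # Two values to find, but only one replacement supplied.
    grades.replace([49, 52], [51])
except ValueError as err:
    caught = True
    print('ValueError:', err)
```

Wrapping the call in try/except like this is one way to fail gracefully when the replacement lists are built from user input.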
The examples below show how versatile the replace() method is. We recommend you spend some time reviewing the code and output.
In this example, we have five (5) grades for a student. Notice that one (1) grade is a failing grade. To rectify this, run the following code:
Code – Example 1
grades = pd.Series([55, 64, 52, 76, 49])
result = grades.replace(49, 51)
print(result)
- Line [1] creates a Series from a list of grades and saves it to grades.
- Line [2] replaces the failing grade of 49 with a passing grade of 51. The output saves to result.
- Line [3] outputs the result to the terminal.
Output
| 0 | 55 | 
| 1 | 64 | 
| 2 | 52 | 
| 3 | 76 | 
| 4 | 51 | 
| dtype: int64 | 
This example shows a DataFrame of three (3) product lines for Rivers Clothing. They want the price of 11.35 changed to 12.95. Run the code below to change the pricing.
Code – Example 2
df = pd.DataFrame({'Tops':     [10.12, 12.23, 11.35],
                   'Tanks':    [11.35, 13.45, 14.98],
                   'Sweats':  [11.35, 21.85, 35.75]})
result = df.replace(11.35, 12.95)
print(result)
- Line [1] creates a DataFrame from a dictionary of lists and saves it to df.
- Line [2] replaces each occurrence of the value 11.35 with 12.95. The output saves to result.
- Line [3] outputs the result to the terminal.
Output
| | Tops | Tanks | Sweats |
|---|---|---|---|
| 0 | 10.12 | 12.95 | 12.95 | 
| 1 | 12.23 | 13.45 | 21.85 | 
| 2 | 12.95 | 14.98 | 35.75 | 
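If only one product line should change, a nested dictionary can limit the replacement to a single column. The sketch below reuses the Rivers Clothing data and, as an assumption for illustration, updates the price in the Tops column only.

```python
import pandas as pd

df = pd.DataFrame({'Tops':   [10.12, 12.23, 11.35],
                   'Tanks':  [11.35, 13.45, 14.98],
                   'Sweats': [11.35, 21.85, 35.75]})

# Outer key: the column to search. Inner dictionary: old value -> new value.
tops_only = df.replace({'Tops': {11.35: 12.95}})
print(tops_only)
```

Here the 11.35 entries in Tanks and Sweats keep their original price, because only the Tops column is searched.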
This example shows a DataFrame with two (2) teams. Each team contains three (3) members. This code replaces the name of one (1) member on each team with the word quit.
Code – Example 3
df = pd.DataFrame({'Team-1': ['Barb', 'Todd', 'Taylor'],
                   'Team-2': ['Arch', 'Bart', 'Alex']})
result = df.replace(to_replace=r'^Bar.$', value='quit', regex=True)
print(result)
- Line [1] creates a DataFrame from a dictionary of lists and saves it to df.
- Line [2] replaces any value that starts with Bar and contains exactly one (1) additional character (.). Each match changes to the word quit. The output saves to result.
- Line [3] outputs the result to the terminal.
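For the team data in this example, the names Barb and Bart match the pattern, so both become quit while the other names stay as they are. The sketch below checks this and also shows an equivalent call that passes the pattern through the regex parameter instead of to_replace. The variable names (via_to_replace, via_regex) are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({'Team-1': ['Barb', 'Todd', 'Taylor'],
                   'Team-2': ['Arch', 'Bart', 'Alex']})

# Pattern passed as to_replace, with regex=True switching on regex matching.
via_to_replace = df.replace(to_replace=r'^Bar.$', value='quit', regex=True)

# Same pattern passed directly through the regex parameter.
via_regex = df.replace(regex=r'^Bar.$', value='quit')

print(via_to_replace)
print(via_to_replace.equals(via_regex))
```

Because the pattern is anchored with ^ and $, a longer name such as Barbara would not match and would be left unchanged.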
