# 5 Best Ways to Replace NaN with Zero and Fill Positive Infinity Values in Python

π‘ Problem Formulation: Data processing in Python often requires handling of missing (NaN) or infinite (inf) values. Specifically, we may need to replace ‘NaN’ with 0 and set ‘inf’ to a finite value, such as the maximum float value, for computational purposes. For example, given an input like `[NaN, 1, inf]`, the desired output would be `[0, 1, MAX_FLOAT]`.

## Method 1: Using numpy’s nan_to_num()

This method leverages the numpy library, whose `nan_to_num()` function replaces ‘NaN’ with 0, positive ‘inf’ with the largest finite value of the array’s dtype, and negative ‘inf’ with the most negative finite value. It is vectorized, so it is efficient and well suited to numpy arrays and to the numeric data held in pandas dataframes.

Here’s an example:

```python
import numpy as np

arr = np.array([np.nan, 1, np.inf])
new_arr = np.nan_to_num(arr)

print(new_arr)
```

Output:
`[0.00000000e+000 1.00000000e+000 1.79769313e+308]`

This code snippet creates a numpy array with ‘NaN’ and ‘inf’ values. By invoking `np.nan_to_num()` on this array, ‘NaN’ is replaced with 0, and ‘inf’ is replaced with the largest finite value of the array’s dtype, here `np.finfo(np.float64).max`, which equals Python’s `sys.float_info.max`. By default the function returns a new array and leaves `arr` unchanged.
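If the default replacements are not what you want, `np.nan_to_num()` also accepts explicit values. The following is a minimal sketch that assumes NumPy 1.17 or later, where the `nan`, `posinf`, and `neginf` keyword arguments are available; the cap of `1e6` is an arbitrary value chosen purely for illustration.

```python
import numpy as np

arr = np.array([np.nan, 1.0, np.inf, -np.inf])

# Defaults: NaN -> 0.0, +inf -> largest finite float64, -inf -> most negative finite float64
default_clean = np.nan_to_num(arr)

# Explicit replacements (illustrative values, not recommendations)
capped = np.nan_to_num(arr, nan=0.0, posinf=1e6, neginf=-1e6)

print(default_clean[2] == np.finfo(np.float64).max)  # True
print(capped.tolist())  # [0.0, 1.0, 1000000.0, -1000000.0]
```

Passing explicit values is useful when the float64 maximum would overflow in later arithmetic, since adding or multiplying near that limit quickly produces ‘inf’ again.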

## Method 2: List Comprehension with math.isinf()

List comprehension offers a Pythonic and readable approach to iterate over a list and replace ‘NaN’ and ‘inf’ values. Using the `math` module’s `isnan()` and `isinf()` functions inside a list comprehension, one can handle these values in a plain list without any third-party library.

Here’s an example:

```python
import math
import sys

lst = [float('nan'), 1, float('inf')]
# NaN -> 0, infinity -> sys.maxsize, everything else unchanged
new_lst = [0 if math.isnan(x) else (sys.maxsize if math.isinf(x) else x) for x in lst]

print(new_lst)
```

Output:
`[0, 1, 9223372036854775807]`

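The code uses list comprehension to iterate over each element. The ternary conditional expression inside replaces ‘NaN’ with 0 and ‘inf’ with `sys.maxsize`, a large integer (2**63 - 1 on 64-bit builds) used here as a stand-in for the maximum float value. Note that `sys.maxsize` is an `int` and far smaller than the true float limit `sys.float_info.max`, and that `math.isinf()` is also true for negative infinity, so `-inf` would be replaced with the same positive value.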
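If you would rather keep the result as a float and leave negative infinity alone, the logic can be moved into a small helper. This is only a sketch: the name `clean_value` and its `pos_fill` parameter are hypothetical, and `sys.float_info.max` is swapped in for `sys.maxsize` as the replacement value.

```python
import math
import sys

def clean_value(x, pos_fill=sys.float_info.max):
    """Return 0.0 for NaN, pos_fill for +inf, and x unchanged otherwise."""
    if math.isnan(x):
        return 0.0
    # math.isinf() is True for both +inf and -inf, so check the sign as well
    if math.isinf(x) and x > 0:
        return pos_fill
    return x

lst = [float('nan'), 1, float('inf'), float('-inf')]
new_lst = [clean_value(x) for x in lst]

print(new_lst)  # [0.0, 1, 1.7976931348623157e+308, -inf]
```

Keeping the check in a named function makes the comprehension shorter and lets you change the fill value in one place.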

## Method 3: pandas.DataFrame.replace()

For data scientists working with pandas dataframes, `pandas.DataFrame.replace()` is a natural choice. It swaps a list of values for a matching list of replacements, so ‘NaN’ and ‘inf’ can both be handled in a single call.

Here’s an example:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [np.nan, 1, np.inf]})
df.replace([np.nan, np.inf], [0, np.finfo('float32').max], inplace=True)

print(df)
```

Output (values of column `A` for rows 0 to 2; pandas may display them in scientific notation):
`0.0`, `1.0`, `3.4028235e+38`

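In this snippet, a pandas dataframe is created and the `replace()` method is used to swap ‘NaN’ with 0 and ‘inf’ with the maximum value representable by a 32-bit float. Negative infinity is not in the list of values to replace, so it would be left untouched. Since the column itself is stored as float64, `np.finfo(np.float64).max` can be used instead if the larger limit is wanted.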
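A variation that avoids `inplace=True` and works across several columns is to chain `fillna()` with `replace()`. The sketch below assumes a float64 dataframe; the second column `B` and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 1.0, np.inf], 'B': [2.0, -np.inf, np.nan]})

float64_max = np.finfo(np.float64).max

# fillna() handles NaN; replace() then swaps only positive infinity
clean = df.fillna(0).replace(np.inf, float64_max)

print(clean['A'].tolist())  # [0.0, 1.0, 1.7976931348623157e+308]
print(clean['B'].tolist())  # [2.0, -inf, 0.0]
```

Because both calls return new objects, the original `df` is left unchanged, which makes the cleaning step easier to test and to undo.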

## Method 4: Using Conditional Expressions

Conditional expressions are a more general Python feature that allows for inline replacements based on a condition. This method can be used with any iterable and is suitable for situations where numpy or pandas are not used.

Here’s an example:

```python
import sys

seq = [float('nan'), 1, float('inf')]
# x != x is True only for NaN; x == float('inf') matches positive infinity only
new_seq = [0 if x != x else (sys.float_info.max if x == float('inf') else x) for x in seq]

print(new_seq)
```

Output:
`[0, 1, 1.7976931348623157e+308]`

Each element in `seq` is inspected: if it is ‘NaN’ (not equal to itself), it is replaced with 0; if it is ‘inf’, it is replaced with `sys.float_info.max`. Otherwise, it remains unchanged.
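The self-inequality test works because NaN is the only float value that compares unequal to itself. As a rough sketch of how the same conditions apply to any iterable, here is a generator version; the function name `replace_nan_and_posinf` is hypothetical.

```python
import sys

def replace_nan_and_posinf(values):
    """Yield cleaned values from any iterable (hypothetical helper name)."""
    pos_inf = float('inf')
    for x in values:
        if x != x:            # only NaN compares unequal to itself
            yield 0
        elif x == pos_inf:    # matches +inf only; -inf is left alone
            yield sys.float_info.max
        else:
            yield x

nan = float('nan')
print(nan != nan)  # True
print(tuple(replace_nan_and_posinf((nan, 1, float('inf')))))  # (0, 1, 1.7976931348623157e+308)
print(list(replace_nan_and_posinf(x / 2 for x in range(3))))  # [0.0, 0.5, 1.0]
```

The input can be a tuple, a list, or another generator, and no imports beyond `sys` are needed.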

## Bonus One-Liner Method 5: Using lambda and map()

If you prefer functional programming, Python’s `map()` function with a lambda can be used to replace ‘NaN’ and ‘inf’ in an elegant one-liner. It’s concise but might be less readable to those not familiar with functional paradigms.

Here’s an example:

```python
import math
import sys

data = [float('nan'), 1, float('inf')]
# map() applies the lambda lazily; list() collects the results
clean_data = list(map(lambda x: 0 if math.isnan(x) else (sys.maxsize if math.isinf(x) else x), data))

print(clean_data)
```

Output:
`[0, 1, 9223372036854775807]`

The lambda function within `map()` transforms each item in the `data` list using the same logic as in Method 2, with ‘NaN’ becoming 0 and ‘inf’ becoming `sys.maxsize`.
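One practical property of `map()` is that it returns a lazy iterator, so values are cleaned only as they are consumed. The sketch below illustrates this with a named function in place of the lambda; `readings()` is a hypothetical endless data source invented for the demonstration, and `sys.float_info.max` is used here instead of `sys.maxsize`.

```python
import math
import sys
from itertools import islice

def clean(x):
    """Replace NaN with 0 and +inf with the largest finite float."""
    return 0 if math.isnan(x) else (sys.float_info.max if x == math.inf else x)

def readings():
    """Hypothetical endless data source used only for this demonstration."""
    while True:
        yield float('nan')
        yield 1
        yield float('inf')

cleaned = map(clean, readings())       # nothing is computed yet
first_four = list(islice(cleaned, 4))  # only four values are ever cleaned

print(first_four)  # [0, 1, 1.7976931348623157e+308, 0]
```

Wrapping the call in `list()`, as in the one-liner above, forces the whole input to be processed at once, which is fine for small lists but not for streams.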

## Summary/Discussion

• Method 1: numpy’s nan_to_num(). Strengths: Fast and vectorized, a natural fit for numpy arrays and the numeric data behind pandas objects. Weaknesses: Requires numpy; plain Python lists are accepted, but the result is always a numpy array.
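• Method 2: List Comprehension with math.isinf(). Strengths: Pythonic and clear syntax, no external libraries required. Weaknesses: Slower than vectorized numpy on very large lists; `math.isinf()` also matches negative infinity, and `sys.maxsize` is an integer rather than a float limit.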
• Method 3: pandas.DataFrame.replace(). Strengths: Designed for dataframes, powerful for data manipulation. Weaknesses: Only suitable for pandas dataframes, not regular lists.
• Method 4: Using Conditional Expressions. Strengths: General Python feature, usable with any iterables. Weaknesses: Can become complex for readers unfamiliar with ternary expressions.
• Bonus Method 5: Using lambda and map(). Strengths: Elegant one-liner, functional programming style. Weaknesses: May be harder to understand, less readable for some programmers.