Pandas – How to Find DataFrame Row Indices with NaN or Null Values

Problem Formulation and Solution Overview: This article shows you how to find the indices of DataFrame rows that contain NaN or Null (empty) values in Python. To make it more interesting, we have the following scenario: Rivers Clothing has given you a CSV file that requires a clean-up to make it usable for Payroll and Data Analysis. …
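
As a rough illustration of the idea in this teaser, the following minimal sketch collects the index labels of rows that hold at least one missing value. The sample data and column names are hypothetical stand-ins, not the Rivers Clothing file from the article, and this is only one of several possible approaches.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "salary": [52000.0, np.nan, 61000.0, 48000.0],
})

# isna() flags every missing cell; any(axis=1) reduces that to one flag per row.
rows_with_nan = df.index[df.isna().any(axis=1)].tolist()
print(rows_with_nan)
```

Rows 1 and 2 are reported here because each contains one missing cell.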

How to Apply a Function to Each Cell in a Pandas DataFrame?

Problem Formulation: Given the following DataFrame df: 💬 Challenge: How to apply a function f to each cell in the DataFrame? For example, you may want to apply a function that replaces all odd values with the value 'odd'. Solution: DataFrame applymap(). The Pandas DataFrame df.applymap() method returns a new DataFrame where the function f …
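
A minimal sketch of the odd-value example, assuming a small hypothetical DataFrame of integers (the article's own DataFrame is not shown in this excerpt). pandas 2.1 and later deprecate applymap() in favor of DataFrame.map(), so the sketch picks whichever cell-wise method is available.

```python
import pandas as pd

# Hypothetical data standing in for the DataFrame from the article.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

def f(value):
    """Replace odd numbers with the string 'odd'; keep even numbers."""
    return "odd" if value % 2 else value

# Use DataFrame.map where it exists (pandas >= 2.1), else fall back to applymap.
cellwise = df.map if hasattr(df, "map") else df.applymap
result = cellwise(f)
print(result)
```

The original df is left unchanged; result is a new DataFrame with f applied to every cell.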

Python Convert Fixed Width File to CSV

What is a Fixed-Width File? 💡 Definition: A fixed-width text file contains data that is structured in rows and columns. Each row contains one data entry consisting of multiple values (one value per column). Each column has a fixed width, i.e., the same number of characters, restricting the maximum data size per column. Example of …
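
To make the definition concrete, here is a small sketch that slices fixed-width lines by column width and writes them out as CSV with the standard library. The column widths and sample records are assumptions chosen for illustration; a real file needs its own layout.

```python
import csv
import io

# Hypothetical layout: three columns that are 10, 10, and 3 characters wide.
WIDTHS = [10, 10, 3]
records = [("Alice", "Berlin", "34"), ("Bob", "Paris", "27")]

# Build sample fixed-width text by padding each value to its column width.
fixed_width_text = "\n".join(
    "".join(value.ljust(width) for value, width in zip(record, WIDTHS))
    for record in records
)

def split_fixed_width(line, widths):
    """Cut one line into fields by width and strip the padding."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
for line in fixed_width_text.splitlines():
    writer.writerow(split_fixed_width(line, WIDTHS))

csv_text = out.getvalue()
print(csv_text)
```

pandas also offers read_fwf() for reading fixed-width files, which may be more convenient when the data is headed into a DataFrame anyway.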

How to Convert Tab-Delimited File to CSV in Python?

The easiest way to convert a tab-delimited values (TSV) file to a comma-separated values (CSV) file is to use the following three lines of code: import pandas as pd; df = pd.read_csv('my_file.txt', sep='\t', header=None); df.to_csv('my_file.csv', header=None). We'll explain this and other approaches in more detail next; scroll down to Method 3 for this exact method. Problem …
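
For comparison, a standard-library sketch of the same conversion, which may or may not match one of the article's other methods. It uses in-memory text in place of the files named above, and the sample data is hypothetical.

```python
import csv
import io

# Hypothetical tab-delimited input; one field contains a comma on purpose.
tsv_text = "name\tcity\nAlice\tBerlin, DE\n"

# Read with a tab delimiter, then write with the default comma delimiter.
reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(reader)

csv_text = out.getvalue()
print(csv_text)
```

The csv writer quotes fields that contain commas, so the column structure survives the conversion. Note that DataFrame.to_csv() also writes the row index as a first column unless index=False is passed.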

Python CSV to UTF-8

This article concerns the conversion and handling of CSV file formats in combination with the UTF-8 encoding standard. 💡 The Unicode Transformation Format 8-Bit (UTF-8) is a variable-width character encoding used for electronic communication. UTF-8 can encode more than 1 million (more or less weird) characters using one to four bytes per character. Example UTF-8 …
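
The variable width can be checked directly in Python. The sketch below first measures the encoded length of a few sample characters, then re-encodes a small CSV snippet to UTF-8; the Latin-1 source encoding and the sample text are assumptions made for illustration.

```python
# Encoded length in bytes for characters of increasing code point.
samples = ["A", "é", "€", "😀"]
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
print(byte_lengths)  # {'A': 1, 'é': 2, '€': 3, '😀': 4}

# Hypothetical CSV content that arrives encoded as Latin-1 (an assumption).
latin1_bytes = "name,city\nJosé,Málaga\n".encode("latin-1")

# Decode with the source encoding, then encode the same text as UTF-8.
text = latin1_bytes.decode("latin-1")
utf8_bytes = text.encode("utf-8")
print(len(latin1_bytes), len(utf8_bytes))
```

The UTF-8 version is two bytes longer because each of the two accented letters takes two bytes instead of one.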