Pandas DataFrame Methods [Cheat Sheet]


The following table provides an overview of Pandas DataFrame methods, with a one-line description of each.


Method | Description
df.abs() | Return a Series/DataFrame with absolute numeric value of each element.
df.add_prefix() | Prefix labels with string prefix.
df.add_suffix() | Suffix labels with string suffix.
df.align() | Align two objects on their axes with the specified join method.
df.all() | Return whether all elements are True, potentially over an axis.
df.any() | Return whether any element is True, potentially over an axis.
df.append() | Append rows of other to the end of caller, returning a new object (removed in pandas 2.0; use pd.concat() instead).
df.asfreq() | Convert time series to specified frequency.
df.asof() | Return the last row(s) without any NaNs before where.
df.assign() | Assign new columns to a DataFrame.
df.at_time() | Select values at particular time of day (e.g., 9:30AM).
df.backfill() | Synonym for DataFrame.fillna with method='bfill' (deprecated in pandas 2.0; use df.bfill() instead).
df.between_time() | Select values between particular times of the day (e.g., 9:00-9:30 AM).
df.clip() | Trim values at input threshold(s).
df.compare() | Compare to another DataFrame and show the differences.
df.corr() | Compute pairwise correlation of columns, excluding NA/null values.
df.corrwith() | Compute pairwise correlation with the rows or columns of another Series or DataFrame.
df.count() | Count non-NA cells for each column or row.
df.cov() | Compute pairwise covariance of columns, excluding NA/null values.
df.cummax() | Return cumulative maximum over a DataFrame or Series axis.
df.cummin() | Return cumulative minimum over a DataFrame or Series axis.
df.cumprod() | Return cumulative product over a DataFrame or Series axis.
df.cumsum() | Return cumulative sum over a DataFrame or Series axis.
df.describe() | Generate descriptive statistics.
df.diff() | First discrete difference of element.
df.drop_duplicates() | Return DataFrame with duplicate rows removed.
df.droplevel() | Return Series/DataFrame with requested index / column level(s) removed.
df.drop() | Drop specified labels from rows or columns.
df.dropna() | Remove missing values.
df.duplicated() | Return boolean Series denoting duplicate rows.
df.equals() | Test whether two objects contain the same elements.
df.eval() | Evaluate a string describing operations on DataFrame columns.
df.explode() | Transform each element of a list-like to a row, replicating index values.
df.fillna() | Fill NA/NaN values using the specified method.
df.filter() | Subset the dataframe rows or columns according to the specified index labels.
df.first_valid_index() | Return index for first non-NA value, or None if no non-NA value is found.
df.first() | Select initial periods of time series data based on a date offset (deprecated in pandas 2.1).
df.from_dict() | Construct DataFrame from dict of array-like or dicts (class method, typically called as pd.DataFrame.from_dict()).
df.from_records() | Convert structured or record ndarray to DataFrame (class method, typically called as pd.DataFrame.from_records()).
df.head() | Return the first n rows.
df.interpolate() | Fill NaN values using an interpolation method.
df.isna() | Detect missing values.
df.join() | Join columns of another DataFrame.
df.kurtosis() | Return unbiased kurtosis over requested axis.
df.last_valid_index() | Return index for last non-NA value, or None if no non-NA value is found.
df.last() | Select final periods of time series data based on a date offset (deprecated in pandas 2.1).
df.mad() | Return the mean absolute deviation of the values over the requested axis (removed in pandas 2.0).
df.max() | Return the maximum of the values over the requested axis.
df.mean() | Return the mean of the values over the requested axis.
df.median() | Return the median of the values over the requested axis.
df.melt() | Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
df.merge() | Merge DataFrame or named Series objects with a database-style join.
df.min() | Return the minimum of the values over the requested axis.
df.mode() | Get the mode(s) of each element along the selected axis.
df.nlargest() | Return the first n rows ordered by columns in descending order.
df.notna() | Detect existing (non-missing) values.
df.nsmallest() | Return the first n rows ordered by columns in ascending order.
df.pad() | Synonym for DataFrame.fillna with method='ffill' (deprecated in pandas 2.0; use df.ffill() instead).
df.pct_change() | Percentage change between the current and a prior element.
df.pivot_table() | Create a spreadsheet-style pivot table as a DataFrame.
df.pivot() | Return reshaped DataFrame organized by given index/column values.
df.plot.area() | Draw a stacked area plot.
df.plot.bar() | Vertical bar plot.
df.plot.barh() | Make a horizontal bar plot.
df.plot.box() | Make a box plot of the DataFrame columns.
df.plot.density() | Generate Kernel Density Estimate plot using Gaussian kernels.
df.plot.hexbin() | Generate a hexagonal binning plot.
df.plot.hist() | Draw one histogram of the DataFrame’s columns.
df.plot.pie() | Generate a pie plot.
df.plot() | Make plots of the DataFrame through its PlotAccessor.
df.prod() | Return the product of the values over the requested axis.
df.quantile() | Return values at the given quantile over requested axis.
df.rank() | Compute numerical data ranks (1 through n) along axis.
df.reorder_levels() | Rearrange index levels using input order. May not drop or duplicate levels.
df.replace() | Replace values given in to_replace with value.
df.resample() | Resample time-series data.
df.reset_index() | Reset the index, or a level of it.
df.round() | Round a DataFrame to a variable number of decimal places.
df.sample() | Return a random sample of items from an axis of object.
df.set_axis() | Assign desired index to given axis.
df.set_index() | Set the DataFrame index using existing columns.
df.shift() | Shift index by desired number of periods with an optional time freq.
df.slice_shift() | Equivalent to shift without copying data (removed in pandas 2.0; use df.shift() instead).
df.sort_index() | Sort object by labels (along an axis).
df.sort_values() | Sort by the values along either axis.
df.squeeze() | Squeeze 1 dimensional axis objects into scalars.
df.stack() | Stack the prescribed level(s) from columns to index.
df.swapaxes() | Interchange axes and swap values axes appropriately (deprecated in pandas 2.1; use df.transpose() instead).
df.swaplevel() | Swap levels i and j in a MultiIndex.
df.transpose() | Transpose index and columns.
df.take() | Return the elements in the given positional indices along an axis.
df.to_gbq() | Write a DataFrame to a Google BigQuery table.
df.to_clipboard() | Copy object to the system clipboard.
df.sparse.to_coo() | Return the contents of the frame as a sparse SciPy COO matrix.
df.to_csv() | Write object to a comma-separated values (csv) file.
df.to_dict() | Convert the DataFrame to a dictionary.
df.to_excel() | Write object to an Excel sheet.
df.to_feather() | Write a DataFrame to the binary Feather format.
df.to_hdf() | Write the contained data to an HDF5 file using HDFStore.
df.to_html() | Render a DataFrame as an HTML table.
df.to_json() | Convert the object to a JSON string.
df.to_latex() | Render object to a LaTeX tabular, longtable, or nested table/tabular.
df.to_markdown() | Print DataFrame in Markdown-friendly format.
df.to_parquet() | Write a DataFrame to the binary parquet format.
df.to_period() | Convert DataFrame from DatetimeIndex to PeriodIndex.
df.to_pickle() | Pickle (serialize) object to file.
df.to_records() | Convert DataFrame to a NumPy record array.
df.to_sql() | Write records stored in a DataFrame to a SQL database.
df.to_stata() | Export DataFrame object to Stata dta format.
df.to_string() | Render a DataFrame to a console-friendly tabular output.
df.to_timestamp() | Cast to DatetimeIndex of timestamps, at beginning of period.
df.to_xarray() | Return an xarray object from the pandas object.
df.to_xml() | Render a DataFrame to an XML document.
df.truncate() | Truncate a Series or DataFrame before and after some index value.
df.tz_convert() | Convert tz-aware axis to target time zone.
df.tz_localize() | Localize tz-naive index of a Series or DataFrame to target time zone.
df.unstack() | Pivot a level of the (necessarily hierarchical) index labels.
df.update() | Modify in place using non-NA values from another DataFrame.
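
Some of the methods listed above (append, backfill, pad, mad) are no longer available or are discouraged in newer pandas releases. The following is a minimal sketch of equivalent calls that should work across pandas 1.x and 2.x. The sample frames `df` and `extra` are hypothetical and exist only for illustration, and the hand-computed mean absolute deviation is one possible substitute for `mad()`, not an official replacement.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [4.0, 5.0, None]})
extra = pd.DataFrame({"a": [7.0], "b": [8.0]})

# Instead of df.append(extra): concatenate the rows into a new DataFrame.
combined = pd.concat([df, extra], ignore_index=True)

# Instead of df.backfill() / df.pad(): fill gaps from the next / previous row.
back_filled = df.bfill()
forward_filled = df.ffill()

# Instead of df.mad(): mean absolute deviation per column, computed by hand.
# NaN values are skipped by mean() by default.
mad = (df - df.mean()).abs().mean()

print(combined)
print(back_filled)
print(forward_filled)
print(mad)
```

Note that `bfill()` leaves a trailing NaN in place when there is no later value to copy from, just as `ffill()` leaves a leading NaN untouched.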

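As a further illustration of the reshaping and selection entries in the table, here is a small sketch that combines `melt()`, `pivot()`, `nlargest()`, and `first_valid_index()`. The frames `wide` and `gaps` and their column names are made up for this example.

```python
import pandas as pd

# Hypothetical wide-format data, used only for illustration.
wide = pd.DataFrame({
    "city": ["Oslo", "Rome"],
    "2020": [10, 20],
    "2021": [11, 21],
})

# melt(): unpivot from wide to long, keeping "city" as the identifier column.
long = wide.melt(id_vars="city", var_name="year", value_name="value")

# pivot(): reshape the long frame back to one row per city, one column per year.
restored = long.pivot(index="city", columns="year", values="value")

# nlargest(): the two rows with the highest "value", in descending order.
top = long.nlargest(2, "value")

# first_valid_index() / last_valid_index(): labels of the first and last
# rows that contain a non-NA value.
gaps = pd.DataFrame({"x": [float("nan"), 2.0, 3.0, float("nan")]})
first_label = gaps.first_valid_index()
last_label = gaps.last_valid_index()

print(long)
print(restored)
print(top)
print(first_label, last_label)
```
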
Reference: