The following table gives an overview of pandas DataFrame methods, with a short description of what each one does.
Method | Description |
df.abs() | Return a Series/DataFrame with absolute numeric value of each element. |
df.add_prefix() | Prefix labels with the string prefix. |
df.add_suffix() | Suffix labels with the string suffix. |
df.align() | Align two objects on their axes with the specified join method. |
df.all() | Return whether all elements are True, potentially over an axis. |
df.any() | Return whether any element is True, potentially over an axis. |
df.append() | Append rows of other to the end of caller, returning a new object. Deprecated in pandas 1.4 and removed in 2.0; use pd.concat() instead. |
df.asfreq() | Convert time series to specified frequency. |
df.asof() | Return the last row(s) without any NaNs before where. |
df.assign() | Assign new columns to a DataFrame. |
df.at_time() | Select values at particular time of day (e.g., 9:30AM). |
df.backfill() | Synonym for DataFrame.fillna with method='bfill'. Deprecated since pandas 2.0 in favor of df.bfill(). |
df.between_time() | Select values between particular times of the day (e.g., 9:00-9:30 AM). |
df.clip() | Trim values at input threshold(s). |
df.compare() | Compare to another DataFrame and show the differences. |
df.corr() | Compute pairwise correlation of columns, excluding NA/null values. |
df.corrwith() | Compute pairwise correlation between rows or columns of the DataFrame and those of another Series or DataFrame. |
df.count() | Count non-NA cells for each column or row. |
df.cov() | Compute pairwise covariance of columns, excluding NA/null values. |
df.cummax() | Return cumulative maximum over a DataFrame or Series axis. |
df.cummin() | Return cumulative minimum over a DataFrame or Series axis. |
df.cumprod() | Return cumulative product over a DataFrame or Series axis. |
df.cumsum() | Return cumulative sum over a DataFrame or Series axis. |
df.describe() | Generate descriptive statistics. |
df.diff() | First discrete difference of element. |
df.drop_duplicates() | Return DataFrame with duplicate rows removed. |
df.droplevel() | Return Series/DataFrame with requested index / column level(s) removed. |
df.drop() | Drop specified labels from rows or columns. |
df.dropna() | Remove missing values. |
df.duplicated() | Return boolean Series denoting duplicate rows. |
df.equals() | Test whether two objects contain the same elements. |
df.eval() | Evaluate a string describing operations on DataFrame columns. |
df.explode() | Transform each element of a list-like to a row, replicating index values. |
df.fillna() | Fill NA/NaN values using the specified method. |
df.filter() | Subset the dataframe rows or columns according to the specified index labels. |
df.first_valid_index() | Return index for first non-NA value, or None if no non-NA value is found. |
df.first() | Select initial periods of time series data based on a date offset. |
df.from_dict() | Construct DataFrame from dict of array-like or dicts. |
df.from_records() | Convert structured or record ndarray to DataFrame. |
df.head() | Return the first n rows. |
df.interpolate() | Fill NaN values using an interpolation method. |
df.isna() | Detect missing values. |
df.join() | Join columns of another DataFrame. |
df.kurtosis() | Return unbiased kurtosis over requested axis. |
df.last_valid_index() | Return index for last non-NA value, or None if no non-NA value is found. |
df.last() | Select final periods of time series data based on a date offset. |
df.mad() | Return the mean absolute deviation of the values over the requested axis. Deprecated in pandas 1.5 and removed in 2.0. |
df.max() | Return the maximum of the values over the requested axis. |
df.mean() | Return the mean of the values over the requested axis. |
df.median() | Return the median of the values over the requested axis. |
df.melt() | Unpivot a DataFrame from wide to long format, optionally leaving identifiers set. |
df.merge() | Merge DataFrame or named Series objects with a database-style join. |
df.min() | Return the minimum of the values over the requested axis. |
df.mode() | Get the mode(s) of each element along the selected axis. |
df.nlargest() | Return the first n rows ordered by columns in descending order. |
df.notna() | Detect existing (non-missing) values. |
df.nsmallest() | Return the first n rows ordered by columns in ascending order. |
df.pad() | Synonym for DataFrame.fillna with method='ffill'. Deprecated since pandas 2.0 in favor of df.ffill(). |
df.pct_change() | Percentage change between the current and a prior element. |
df.pivot_table() | Create a spreadsheet-style pivot table as a DataFrame. |
df.pivot() | Return reshaped DataFrame organized by given index/column values. |
df.plot.area() | Draw a stacked area plot. |
df.plot.bar() | Vertical bar plot. |
df.plot.barh() | Make a horizontal bar plot. |
df.plot.box() | Make a box plot of the DataFrame columns. |
df.plot.density() | Generate Kernel Density Estimate plot using Gaussian kernels. |
df.plot.hexbin() | Generate a hexagonal binning plot. |
df.plot.hist() | Draw one histogram of the DataFrame’s columns. |
df.plot.pie() | Generate a pie plot. |
df.plot() | Make plots of a Series or DataFrame (the PlotAccessor that the df.plot.* methods above belong to). |
df.prod() | Return the product of the values over the requested axis. |
df.quantile() | Return values at the given quantile over requested axis. |
df.rank() | Compute numerical data ranks (1 through n) along axis. |
df.reorder_levels() | Rearrange index levels using input order. May not drop or duplicate levels. |
df.replace() | Replace values given in to_replace with value. |
df.resample() | Resample time-series data. |
df.reset_index() | Reset the index, or a level of it. |
df.round() | Round a DataFrame to a variable number of decimal places. |
df.sample() | Return a random sample of items from an axis of object. |
df.set_axis() | Assign desired index to given axis. |
df.set_index() | Set the DataFrame index using existing columns. |
df.shift() | Shift index by desired number of periods with an optional time freq. |
df.slice_shift() | Equivalent to shift without copying data. Deprecated in pandas 1.2 and removed in 2.0; use df.shift() instead. |
df.sort_index() | Sort object by labels (along an axis). |
df.sort_values() | Sort by the values along either axis. |
df.squeeze() | Squeeze 1 dimensional axis objects into scalars. |
df.stack() | Stack the prescribed level(s) from columns to index. |
df.swapaxes() | Interchange axes and swap values axes appropriately. |
df.swaplevel() | Swap levels i and j in a MultiIndex. |
df.transpose() | Transpose index and columns. |
df.take() | Return the elements in the given positional indices along an axis. |
df.to_gbq() | Write a DataFrame to a Google BigQuery table. |
df.to_clipboard() | Copy object to the system clipboard. |
df.sparse.to_coo() | Return the contents of the frame as a sparse SciPy COO matrix. |
df.to_csv() | Write object to a comma-separated values (csv) file. |
df.to_dict() | Convert the DataFrame to a dictionary. |
df.to_excel() | Write object to an Excel sheet. |
df.to_feather() | Write a DataFrame to the binary Feather format. |
df.to_hdf() | Write the contained data to an HDF5 file using HDFStore. |
df.to_html() | Render a DataFrame as an HTML table. |
df.to_json() | Convert the object to a JSON string. |
df.to_latex() | Render object to a LaTeX tabular, longtable, or nested table/tabular. |
df.to_markdown() | Print DataFrame in Markdown-friendly format. |
df.to_parquet() | Write a DataFrame to the binary parquet format. |
df.to_period() | Convert DataFrame from DatetimeIndex to PeriodIndex. |
df.to_pickle() | Pickle (serialize) object to file. |
df.to_records() | Convert DataFrame to a NumPy record array. |
df.to_sql() | Write records stored in a DataFrame to a SQL database. |
df.to_stata() | Export DataFrame object to Stata dta format. |
df.to_string() | Render a DataFrame to a console-friendly tabular output. |
df.to_timestamp() | Cast to DatetimeIndex of timestamps, at beginning of period. |
df.to_xarray() | Return an xarray object from the pandas object. |
df.to_xml() | Render a DataFrame to an XML document. |
df.truncate() | Truncate a Series or DataFrame before and after some index value. |
df.tz_convert() | Convert tz-aware axis to target time zone. |
df.tz_localize() | Localize tz-naive index of a Series or DataFrame to target time zone. |
df.unstack() | Pivot a level of the (necessarily hierarchical) index labels. |
df.update() | Modify in place using non-NA values from another DataFrame. |
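
To see how a few of the methods in the table fit together, here is a minimal sketch. The sample data and the variable names (`unique`, `filled`, `with_abs`, `wettest`, `long_form`, `records`) are made up for illustration and are not part of the table above. Only methods that exist in both older and current pandas releases are used.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Cairo", "Lima"],
        "temp": [-3.0, 19.5, None, 19.5],
        "rain": [12, 0, 1, 0],
    }
)

# drop_duplicates(): the second "Lima" row repeats the first, so it is removed.
unique = df.drop_duplicates()

# fillna(): replace the missing temperature with a fixed value.
filled = unique.fillna({"temp": 0.0})

# assign(): add a column derived from an existing one; abs() drops the sign.
with_abs = filled.assign(abs_temp=filled["temp"].abs())

# nlargest(): keep the two rows with the highest rainfall.
wettest = with_abs.nlargest(2, "rain")

# melt(): unpivot the numeric columns from wide to long format.
long_form = unique.melt(id_vars="city", value_vars=["temp", "rain"])

# to_dict(): convert the selected columns to a list of plain dictionaries.
records = wettest[["city", "rain"]].to_dict(orient="records")
print(records)
```

Each call returns a new object rather than modifying `df`, which is why the intermediate results are stored under separate names. This is a common pattern, though the methods can also be chained in one expression.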
Reference: