Before any data manipulation can occur, one (1) new library will require installation.
- The Pandas library enables access to/from a DataFrame.
To install this library, navigate to an IDE terminal. At the command prompt ($), execute the code below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.
💡 Note: The pytz library comes packaged with pandas and does not require a separate installation. However, this library is needed for the tz_localize() and tz_convert() methods to work.
$ pip install pandas
Press the <Enter> key on the keyboard to start the installation process.
If the installation was successful, a message displays in the terminal indicating the same.
Feel free to view the PyCharm installation guide for the required library.
Add the following code to the top of each code snippet. This snippet will allow the code in this article to run error-free.
import pandas as pd
import pytz
The resample() method is useful for changing the frequency of time-series data, for example by grouping minute-level records into hourly totals.
This DataFrame/Series must contain a datetime-like index (for example, a DatetimeIndex, PeriodIndex, or TimedeltaIndex), or the caller must pass a date-like series/index to the on or level parameter.
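The two ways of satisfying this requirement can be sketched as follows. The column names (date, sold) and the values are made up for illustration, and the rule "60min" is used here as a spelling of an hourly rule that should work across pandas versions.

```python
import pandas as pd

# Hypothetical data: four readings taken 30 minutes apart.
stamps = pd.date_range("2022-01-01 09:00", periods=4, freq="30min")
sales = pd.DataFrame({"date": stamps, "sold": [1, 2, 3, 4]})

# Option 1: move the datetime column into the index, then resample.
by_index = sales.set_index("date").resample("60min").sum()

# Option 2: keep the default integer index and name the datetime column via `on`.
by_column = sales.resample("60min", on="date").sum()

print(by_index)
print(by_column)
```

Both calls should produce the same hourly totals; the second simply avoids changing the index beforehand.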
The syntax for this method is as follows:
DataFrame.resample(rule, axis=0, closed=None, label=None, convention='start', kind=None, loffset=None, base=None, on=None, level=None, origin='start_day', offset=None)
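|Parameter|Description|
|rule|This parameter is the offset (string/object) representing a target conversion.|
|axis|If zero (0) or index is selected, apply to each column. Default 0. If one (1) apply to each row.|
|closed|This parameter determines which side of the bin interval is closed. Default 'left' for all frequency offsets except 'M', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W', which default to 'right'.|
|label|This parameter determines which bin edge to label the bucket with. Default 'left' for all frequency offsets except 'M', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W', which default to 'right'.|
|convention|This parameter applies to a PeriodIndex only and controls whether to use the start or end of the rule. The available options are 'start', 'end', 's', or 'e'. Default 'start'.|
|kind|This parameter is 'timestamp' or 'period' and converts the resulting index to a DatetimeIndex or a PeriodIndex, respectively.|
|loffset|Not in use since v1.1.0. Add this to the index after resample() has taken place.|
|base|Not in use since v1.1.0. Use offset or origin instead.|
|on|If a DataFrame, the datetime-like column to use instead of the index for resampling.|
|level|A datetime level in a MultiIndex to use for resampling.|
|origin|The timestamp to adjust the grouping. The origin time-zone must match the index. If a string, one of the following: 'epoch', 'start', 'start_day', 'end', or 'end_day'.|
|offset|This parameter is the offset timedelta added to the origin.|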
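To see how closed and label change the result, here is a minimal sketch on a made-up Series with one value every 30 minutes. The data is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical data: 09:00=1, 09:30=2, 10:00=3, 10:30=4, 11:00=5
idx = pd.date_range("2022-01-01 09:00", periods=5, freq="30min")
sold = pd.Series([1, 2, 3, 4, 5], index=idx)

# Default for an hourly rule: bins are [start, end) and labeled by the left edge.
left = sold.resample("60min").sum()

# Bins become (start, end] and are labeled by the right edge.
right = sold.resample("60min", closed="right", label="right").sum()

print(left)   # totals 3, 7, 5 labeled 09:00, 10:00, 11:00
print(right)  # totals 1, 5, 9 labeled 09:00, 10:00, 11:00
```

With closed='right', the 10:00 reading moves from the 10:00 bucket into the one ending at 10:00, which is why the totals shift even though the labels look the same.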
Rivers Clothing is having a 3-hour blow-out sale for a new line they have introduced: scarves. This example resamples the sales data and adds up the total number of scarf sales per hour.
df = pd.read_csv('rivers.csv', parse_dates=['date'], index_col=['date'])
print(df)
result = df.resample('1H').sum()
print(result)
- Line [1] reads in a CSV file, parses the date column, and sets this column as the index. The output saves to df.
- Line [2] outputs the DataFrame to the terminal.
- Line [3] resamples the data by grouping the total scarf sales by the hour. The output saves to result.
- Line [4] outputs the result to the terminal.
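The contents of rivers.csv are not shown here, so the steps above can be tried with a stand-in. The sketch below assumes a hypothetical file layout with a date column and a sold column, reads it from an in-memory string, and uses the rule "60min" as an hourly rule that should behave the same as '1H' on versions that accept it.

```python
import io

import pandas as pd

# Hypothetical stand-in for rivers.csv; the real columns and values may differ.
csv_text = """date,sold
2022-01-01 13:05,3
2022-01-01 13:40,5
2022-01-01 14:10,2
2022-01-01 14:55,4
2022-01-01 15:30,6
"""

# Parse the date column and use it as the index, as in the example above.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# Group the rows into hourly buckets and total the sales in each one.
result = df.resample("60min").sum()
print(result)
```

Each of the three sale hours ends up as one row, labeled with the start of the hour.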