37. Time Series in Pandas and Python
By Bernd Klein. Last modified: 26 Apr 2023.
Introduction
The next chapter of our Pandas tutorial deals with time series. A time series is a series of data points which are listed (or indexed) in time order. Usually, a time series is a sequence of values taken at equally spaced points in time. Any measured data that is connected with the time of its measurement can be seen as a time series. Measurements can be taken irregularly, but in most cases time series have a fixed frequency. This means that data is measured in a regular pattern, for example every 5 milliseconds, every 10 seconds, or every hour. Time series are often plotted as line charts.
In this chapter of our tutorial on Python with Pandas, we will introduce the tools Pandas provides for dealing with time series. You will learn how to cope with large time series and how to modify time series.
Before you continue reading, it might be useful to go through our tutorial on the standard Python modules dealing with time processing, i.e. datetime, time and calendar.
Time Series in Pandas and Python
We can define a Pandas Series whose index consists of time stamps:
import numpy as np
import pandas as pd
from datetime import datetime, timedelta as delta
ndays = 10
start = datetime(2017, 3, 31)
dates = [start - delta(days=x) for x in range(0, ndays)]
values = [25, 50, 15, 67, 70, 9, 28, 30, 32, 12]
ts = pd.Series(values, index=dates)
ts
OUTPUT:
2017-03-31    25
2017-03-30    50
2017-03-29    15
2017-03-28    67
2017-03-27    70
2017-03-26     9
2017-03-25    28
2017-03-24    30
2017-03-23    32
2017-03-22    12
dtype: int64
Let's check the type of the newly created time series:
type(ts)
OUTPUT:
pandas.core.series.Series
What does the index of a time series look like? Let's see:
ts.index
OUTPUT:
DatetimeIndex(['2017-03-31', '2017-03-30', '2017-03-29', '2017-03-28', '2017-03-27', '2017-03-26', '2017-03-25', '2017-03-24', '2017-03-23', '2017-03-22'], dtype='datetime64[ns]', freq=None)
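Because the index is a DatetimeIndex, values can be looked up by date. The following is a minimal sketch and not part of the original example: it rebuilds the series from above, sorts it chronologically with sort_index, and then selects a single day and a range of days with loc. The names ts_sorted, single and window are chosen for illustration only.

```python
import pandas as pd
from datetime import datetime, timedelta

# Rebuild the series from above (dates run backwards from 2017-03-31)
start = datetime(2017, 3, 31)
dates = [start - timedelta(days=x) for x in range(10)]
values = [25, 50, 15, 67, 70, 9, 28, 30, 32, 12]
ts = pd.Series(values, index=dates)

# Sort chronologically so that date slices behave as expected
ts_sorted = ts.sort_index()

# Look up a single day by its time stamp
single = ts_sorted.loc[pd.Timestamp("2017-03-25")]

# Slice a range of days; both end points are included
window = ts_sorted.loc["2017-03-24":"2017-03-26"]

print(single)
print(window)
```

Sorting first is a deliberate choice here: slicing by date strings is easiest to reason about when the index is in chronological order.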
We will now create another time series:
values2 = [32, 54, 18, 61, 72, 19, 21, 33, 29, 17]
ts2 = pd.Series(values2, index=dates)
It is possible to use arithmetic operations on time series like we did with other series. We can for example add the two previously created time series:
ts + ts2
OUTPUT:
2017-03-31     57
2017-03-30    104
2017-03-29     33
2017-03-28    128
2017-03-27    142
2017-03-26     28
2017-03-25     49
2017-03-24     63
2017-03-23     61
2017-03-22     29
dtype: int64
We can also calculate the arithmetic mean of the two series, i.e. the mean of the two values for each date:
(ts + ts2) / 2
OUTPUT:
2017-03-31    28.5
2017-03-30    52.0
2017-03-29    16.5
2017-03-28    64.0
2017-03-27    71.0
2017-03-26    14.0
2017-03-25    24.5
2017-03-24    31.5
2017-03-23    30.5
2017-03-22    14.5
dtype: float64
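As with other series, the indices don't have to be the same. Pandas aligns the values by their index labels, and every date that occurs in only one of the two series gets the value NaN in the result: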
import pandas as pd
from datetime import datetime, timedelta as delta
ndays = 10
start = datetime(2017, 3, 31)
dates = [start - delta(days=x) for x in range(0, ndays)]
start2 = datetime(2017, 3, 26)
dates2 = [start2 - delta(days=x) for x in range(0, ndays)]
values = [25, 50, 15, 67, 70, 9, 28, 30, 32, 12]
values2 = [32, 54, 18, 61, 72, 19, 21, 33, 29, 17]
ts = pd.Series(values, index=dates)
ts2 = pd.Series(values2, index=dates2)
ts + ts2
OUTPUT:
2017-03-17     NaN
2017-03-18     NaN
2017-03-19     NaN
2017-03-20     NaN
2017-03-21     NaN
2017-03-22    84.0
2017-03-23    93.0
2017-03-24    48.0
2017-03-25    82.0
2017-03-26    41.0
2017-03-27     NaN
2017-03-28     NaN
2017-03-29     NaN
2017-03-30     NaN
2017-03-31     NaN
dtype: float64
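If the NaN values are not wanted, one option is the add method of Series with its fill_value parameter. The sketch below assumes that a missing measurement should count as 0, which may or may not be appropriate for your data. The name total is ours.

```python
import pandas as pd
from datetime import datetime, timedelta

ndays = 10
dates = [datetime(2017, 3, 31) - timedelta(days=x) for x in range(ndays)]
dates2 = [datetime(2017, 3, 26) - timedelta(days=x) for x in range(ndays)]
ts = pd.Series([25, 50, 15, 67, 70, 9, 28, 30, 32, 12], index=dates)
ts2 = pd.Series([32, 54, 18, 61, 72, 19, 21, 33, 29, 17], index=dates2)

# Assumption: a date that is missing in one series contributes 0
total = ts.add(ts2, fill_value=0)
print(total)
```

With this approach the result covers all 15 dates, and the dates shared by both series carry the same sums as before.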
Create Date Ranges
The date_range function of the pandas module can be used to generate a DatetimeIndex:
import pandas as pd
index = pd.date_range('12/24/1970', '01/03/1971')
index
OUTPUT:
DatetimeIndex(['1970-12-24', '1970-12-25', '1970-12-26', '1970-12-27', '1970-12-28', '1970-12-29', '1970-12-30', '1970-12-31', '1971-01-01', '1971-01-02', '1971-01-03'], dtype='datetime64[ns]', freq='D')
We passed a start and an end date to date_range in our previous example. It is also possible to pass only a start or only an end date to the function. In this case, we have to specify the number of periods to generate by setting the keyword parameter 'periods':
index = pd.date_range(start='12/24/1970', periods=4)
print(index)
OUTPUT:
DatetimeIndex(['1970-12-24', '1970-12-25', '1970-12-26', '1970-12-27'], dtype='datetime64[ns]', freq='D')
index = pd.date_range(end='12/24/1970', periods=3)
print(index)
OUTPUT:
DatetimeIndex(['1970-12-22', '1970-12-23', '1970-12-24'], dtype='datetime64[ns]', freq='D')
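The three ways of calling date_range are interchangeable as long as they describe the same days. The following sketch, with the illustrative names by_start_end, by_start and by_end, builds the same four-day index in each of the three ways and compares the results:

```python
import pandas as pd

# Start and end date
by_start_end = pd.date_range(start='12/24/1970', end='12/27/1970')

# Start date and number of periods
by_start = pd.date_range(start='12/24/1970', periods=4)

# End date and number of periods
by_end = pd.date_range(end='12/27/1970', periods=4)

print(by_start_end.equals(by_start))
print(by_start.equals(by_end))
```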
We can also create date ranges which consist only of business days, for example, by setting the keyword parameter 'freq' to the string 'B':
index = pd.date_range('2017-04-07', '2017-04-13', freq="B")
print(index)
OUTPUT:
DatetimeIndex(['2017-04-07', '2017-04-10', '2017-04-11', '2017-04-12', '2017-04-13'], dtype='datetime64[ns]', freq='B')
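To see which weekdays were generated, we can ask the index for the day names. This is a small sketch using the day_name method and the dayofweek attribute of DatetimeIndex. Note that 'B' only leaves out Saturdays and Sundays; public holidays are not taken into account. The name weekend is ours.

```python
import pandas as pd

index = pd.date_range('2017-04-07', '2017-04-13', freq="B")

# Print each date together with the name of its weekday
for day, name in zip(index, index.day_name()):
    print(day.date(), name)

# dayofweek counts from Monday=0 to Sunday=6
weekend = index.dayofweek >= 5
print(weekend.any())
```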
In the following example, we create an index which contains the month ends between two dates. We can see that 2016 was a leap year, because February ends on the 29th:
index = pd.date_range('2016-02-25', '2016-07-02', freq="M")
index
OUTPUT:
DatetimeIndex(['2016-02-29', '2016-03-31', '2016-04-30', '2016-05-31', '2016-06-30'], dtype='datetime64[ns]', freq='M')
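For comparison, the following sketch generates the month starts for the same two dates with the alias 'MS'. For the month ends it passes an offset object, pd.offsets.MonthEnd(), instead of an alias string, which date_range also accepts as 'freq'. The names month_starts and month_ends are chosen for illustration.

```python
import pandas as pd

# First day of every month between the two dates
month_starts = pd.date_range('2016-02-25', '2016-07-02', freq="MS")

# Last day of every month, given as an offset object instead of a string
month_ends = pd.date_range('2016-02-25', '2016-07-02',
                           freq=pd.offsets.MonthEnd())

print(month_starts)
print(month_ends)
```

Both indices contain five dates, but the first month start is the 1st of March, because the 1st of February lies before the start date.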
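Other frequency aliases are listed in the following table. Note that pandas 2.2 and later deprecate several of them in favor of new spellings, for example "ME" instead of "M", "h" instead of "H" and "min" instead of "T":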
Alias | Description |
---|---|
B | business day frequency |
C | custom business day frequency (experimental) |
D | calendar day frequency |
W | weekly frequency |
M | month end frequency |
BM | business month end frequency |
MS | month start frequency |
BMS | business month start frequency |
Q | quarter end frequency |
BQ | business quarter end frequency |
QS | quarter start frequency |
BQS | business quarter start frequency |
A | year end frequency |
BA | business year end frequency |
AS | year start frequency |
BAS | business year start frequency |
H | hourly frequency |
T | minutely frequency |
S | secondly frequency |
L | milliseconds |
U | microseconds |
The weekly frequency can be anchored to a specific weekday. The alias 'W-Mon', for example, generates every Monday in the given range:
index = pd.date_range('2017-02-05', '2017-04-13', freq="W-Mon")
index
OUTPUT:
DatetimeIndex(['2017-02-06', '2017-02-13', '2017-02-20', '2017-02-27', '2017-03-06', '2017-03-13', '2017-03-20', '2017-03-27', '2017-04-03', '2017-04-10'], dtype='datetime64[ns]', freq='W-MON')
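To get a feeling for the aliases from the table, it helps to generate a few dates with each of them. The following sketch does this for 'D', 'W', 'MS' and 'QS', always starting on the 1st of January 2017, which was a Sunday. The dictionary examples is only a helper for this illustration.

```python
import pandas as pd

examples = {}
for alias in ["D", "W", "MS", "QS"]:
    # Three consecutive dates for every alias, starting on 2017-01-01
    examples[alias] = pd.date_range("2017-01-01", periods=3, freq=alias)
    print(alias, list(examples[alias].strftime("%Y-%m-%d")))
```

The plain alias 'W' is anchored on Sundays, so in this sketch the weekly dates start on the start date itself.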