Time Series Analysis & Visualization in Python
Time series data consists of sequential data points recorded over time and is used in industries like finance, pharmaceuticals, social media and research. Analyzing and visualizing this data helps us find trends and seasonal patterns for forecasting and decision-making. In this article, we will look at time series analysis and visualization in depth.
What is Time Series Data Analysis?
Time series data analysis involves studying data points collected in chronological order to identify trends, patterns and other behaviors. This helps extract actionable insights and supports accurate forecasting and decision-making.
Key Concepts in Time Series Analysis
- Trend: It represents the general direction in which a time series is moving over an extended period. It checks whether the values are increasing, decreasing or staying relatively constant.
- Seasonality: Seasonality refers to repetitive patterns or cycles that occur at regular intervals within a time series corresponding to specific time units like days, weeks, months or seasons.
- Moving average: It is used to smooth out short-term fluctuations and highlight longer-term trends or patterns in the data.
- Noise: It represents the irregular and unpredictable components in a time series that do not follow a pattern.
- Differencing: It subtracts each value from the value a specified number of steps earlier, which helps remove trends. By default the interval is 1, but other values can be specified.
- Stationarity: A stationary time series is one whose statistical properties, such as mean, variance and autocorrelation, remain constant over time.
- Order: The order of differencing refers to the number of times the time series data needs to be differenced to achieve stationarity.
- Autocorrelation: Autocorrelation is a statistical method used in time series analysis to quantify the degree of similarity between a time series and a lagged version of itself.
- Resampling: Resampling is a technique in time series analysis that is used for changing the frequency of the data observations.
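Several of these concepts can be sketched on a small synthetic series. The data below is hypothetical, generated only for illustration: an upward trend plus a weekly cycle plus random noise.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: trend + weekly seasonality + noise
rng = np.random.default_rng(0)
idx = pd.date_range("2023-01-01", periods=56, freq="D")
trend = np.arange(56) * 0.5                           # trend component
seasonal = 3 * np.sin(2 * np.pi * np.arange(56) / 7)  # weekly seasonality
noise = rng.normal(0, 0.5, 56)                        # irregular component
series = pd.Series(trend + seasonal + noise, index=idx)

diff1 = series.diff()                  # first-order differencing (removes the trend)
ma7 = series.rolling(window=7).mean()  # 7-day moving average (smooths the noise)
lag7_corr = series.autocorr(lag=7)     # autocorrelation at the seasonal lag

print(round(lag7_corr, 2))
```

Because the series repeats every 7 steps (and trends upward), the autocorrelation at lag 7 comes out strongly positive, which is exactly the signature seasonality leaves in an ACF plot.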
Types of Time Series Data
Time series data can be classified into two types:
- Continuous Time Series: Data recorded at regular intervals with a continuous range of values like temperature, stock prices, sensor data, etc.
- Discrete Time Series: Data with distinct values or categories recorded at specific time points like counts of events, categorical statuses, etc.
Visualization Approaches
- Use line plots or area charts for continuous data to highlight trends and fluctuations.
- Use bar charts or histograms for discrete data to show frequency or distribution across categories.
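A minimal sketch of both approaches, using hypothetical temperature readings (continuous) and daily event counts (discrete):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data for illustration only
idx = pd.date_range("2024-01-01", periods=10, freq="D")
temperature = pd.Series([21.5, 22.1, 20.8, 19.9, 21.0,
                         22.7, 23.4, 22.9, 21.8, 20.5], index=idx)
event_counts = pd.Series([3, 5, 2, 4, 6, 1, 0, 3, 5, 2], index=idx)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(temperature.index, temperature.values)   # line plot for continuous data
ax1.set_title("Continuous: temperature")
ax2.bar(event_counts.index, event_counts.values)  # bar chart for discrete counts
ax2.set_title("Discrete: event counts")
plt.tight_layout()
plt.show()
```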
Practical Time Series Visualization with Python
We will be using the stock dataset which you can download from here. Let's implement this step by step:
Step 1: Installing and Importing Libraries
We will be using the NumPy, Pandas, Seaborn and Matplotlib libraries.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.stattools import adfuller
Step 2: Loading the Dataset
Here we will load the dataset and use the parse_dates and index_col parameters so that the Date column becomes a DatetimeIndex.
df = pd.read_csv("/content/stock_data.csv",
parse_dates=True,
index_col="Date")
df.head()
Output:

Step 3: Cleaning of Data
We will drop columns from the dataset that are not important for our visualization.
df.drop(columns='Unnamed: 0', inplace=True)
df.head()
Output:

Step 4: Plotting High Stock Prices
Since the High column contains continuous data, we will use a line graph to visualize it.
- sns.lineplot(data=df, x=df.index, y='High', label='High Price', color='blue'): Plots High prices over time using the datetime index on x-axis.
sns.set(style="whitegrid")
plt.figure(figsize=(12, 6))
sns.lineplot(data=df, x=df.index, y='High', label='High Price', color='blue')
plt.xlabel('Date')
plt.ylabel('High')
plt.title('Share Highest Price Over Time')
plt.show()
Output:

Step 5: Resampling Data
To better understand the trend of the data we will use resampling, which provides a clearer view of trends and patterns when dealing with daily data.
- df_resampled = df.resample('M').mean(numeric_only=True): Resamples data to monthly frequency and calculates the mean of all numeric columns for each month.
df_resampled = df.resample('M').mean(numeric_only=True)
sns.set(style="whitegrid")
plt.figure(figsize=(12, 6))
sns.lineplot(data=df_resampled, x=df_resampled.index, y='High', label='Month Wise Average High Price', color='blue')
plt.xlabel('Date (Monthly)')
plt.ylabel('High')
plt.title('Monthly Resampling Highest Price Over Time')
plt.show()
Output:

Step 6: Detecting Seasonality with Autocorrelation
We will detect Seasonality using the autocorrelation function (ACF) plot. Peaks at regular intervals in the ACF plot suggest the presence of seasonality.
if 'Date' not in df.columns:
    print("'Date' is already the index or not present in the DataFrame.")
else:
    df.set_index('Date', inplace=True)
fig, ax = plt.subplots(figsize=(12, 6))
plot_acf(df['Volume'], lags=40, ax=ax)
plt.xlabel('Lag')
plt.ylabel('Autocorrelation')
plt.title('Autocorrelation Function (ACF) Plot')
plt.show()
Output:

There is no seasonality in our data.
Step 7: Testing Stationarity with ADF test
We will perform the ADF test to formally test for stationarity.
from statsmodels.tsa.stattools import adfuller
result = adfuller(df['High'])
print('ADF Statistic:', result[0])
print('p-value:', result[1])
print('Critical Values:', result[4])
Output:

- Based on the ADF statistic and the p-value we fail to reject the null hypothesis: the data does not appear to be stationary according to the Augmented Dickey-Fuller test.
- This suggests that differencing or other transformations may be needed to achieve stationarity before applying certain time series models.
Step 8: Differencing to Achieve Stationarity
Differencing involves subtracting the previous observation from the current observation to remove trends or seasonality.
df['high_diff'] = df['High'].diff()
plt.figure(figsize=(12, 6))
plt.plot(df['High'], label='Original High', color='blue')
plt.plot(df['high_diff'], label='Differenced High', linestyle='--', color='green')
plt.legend()
plt.title('Original vs Differenced High')
plt.show()
Output:

Step 9: Smoothing Data with Moving Average
- df['High'].rolling(window=window_size).mean(): computes the moving (rolling) average of the High column, replacing each point with the mean of the preceding window of observations so that short-term fluctuations are smoothed out and the longer-term trend stands out.
window_size = 120
df['high_smoothed'] = df['High'].rolling(window=window_size).mean()
plt.figure(figsize=(12, 6))
plt.plot(df['High'], label='Original High', color='blue')
plt.plot(df['high_smoothed'], label=f'Moving Average (Window={window_size})', linestyle='--', color='orange')
plt.xlabel('Date')
plt.ylabel('High')
plt.title('Original vs Moving Average')
plt.legend()
plt.show()
Output:

This calculates the moving average of the High column with a window size of 120 observations, creating a smoother curve in the high_smoothed series. The plot compares the original High values with the smoothed version.
Step 10: Original Data vs Differenced Data
Printing the original and differenced data side by side we get:
df_combined = pd.concat([df['High'], df['high_diff']], axis=1)
print(df_combined.head())
Output:
The high_diff column represents the differences between consecutive High values. The first value of high_diff is NaN because there is no previous value to subtract. Since there is a NaN value, we will drop it and proceed with our test:
df.dropna(subset=['high_diff'], inplace=True)
df['high_diff'].head()
Output:
After that if we conduct the ADF test:
from statsmodels.tsa.stattools import adfuller
result = adfuller(df['high_diff'])
print('ADF Statistic:', result[0])
print('p-value:', result[1])
print('Critical Values:', result[4])
Output:
- Based on the ADF statistic and the p-value we reject the null hypothesis and conclude that the differenced series is stationary.
Mastering these visualization and analysis techniques is an important step in working effectively with time-dependent data.
You can download the source code from here.