Unexpected change in predictions on close to identical data

The monthly forecast changes by tens of percent when the training data changes by a relative amount of only about 10^-11.

I wanted to predict the next 12 months of a monthly time series from a 48-month history.
The values range in magnitude from 10^6 to 10^9 and have 6 digits after the decimal point.

Reducing the numerical precision of the data by one decimal digit barely changes the values, yet it unexpectedly changes the forecast by tens of percent. The Prophet model appears far too sensitive to the numerical precision of its input, and I see no reason for such behavior.

Here is a table summarizing the observations:

Change in numerical data precision	Resulting relative difference in the data	Change in the forecast
From 6 decimals to 5 decimals	at most 10^-12	upward change: +12% on average, up to +24%
From 5 decimals to 4 decimals	at most 10^-11	downward change: -9% on average, up to -15%
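
For context on the "relative difference in the data" column, here is a minimal sketch of how such a figure can be measured. The helper name `max_relative_rounding_error` and the sample values are hypothetical (they are not taken from Series_6-decimals.csv); they only mimic the stated magnitude range of 10^6 to 10^9 with 6 decimals.

```python
def max_relative_rounding_error(values, decimals):
    """Largest relative change caused by rounding each value to `decimals` places."""
    return max(abs(round(v, decimals) - v) / abs(v) for v in values)


# Hypothetical sample values, NOT the real series from the issue.
sample = [1234567.123456, 98765432.654321, 987654321.123456]

err_5 = max_relative_rounding_error(sample, 5)  # 6 decimals -> 5 decimals
err_4 = max_relative_rounding_error(sample, 4)  # 6 decimals -> 4 decimals

print("max relative change when rounding to 5 decimals:", err_5)
print("max relative change when rounding to 4 decimals:", err_4)
```

Rounding to d decimals moves a value by at most 0.5 * 10^-d, so for values of at least 10^6 the relative change stays below 10^-11 for d = 5 and below 10^-10 for d = 4. That is many orders of magnitude smaller than the forecast changes reported in the table.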

Steps to reproduce the bug:
Python 3.10.12
dependencies =
prophet==1.1.7

Run the following Python code to reproduce the results (with the input file Series_6-decimals.csv):

import pandas as pd
from prophet import Prophet
import os

filename = "Series_6-decimals.csv"

params = {
    'growth': 'linear',
    'seasonality_mode': 'multiplicative',
    'changepoint_prior_scale': 0.05,
    'seasonality_prior_scale': 10.0,
    'weekly_seasonality': False,
    'daily_seasonality': False,
    'yearly_seasonality': 'auto',
}


df = pd.read_csv(os.path.join(os.path.dirname(__file__), filename), sep=';')
df_5 = df.round(5)
df_4 = df.round(4)

print("maximum absolute difference between df_5 and df:", max(abs(df_5['y']-df['y']))) # maximum absolute difference
print("maximum absolute relative difference between df_5 and df:", max(abs((df_5['y']-df['y'])/df['y']))) # maximum absolute relative difference
print("maximum absolute difference between df_4 and df:", max(abs(df['y']-df_4['y']))) # maximum absolute difference
print("maximum absolute relative difference between df_4 and df:", max(abs((df_4['y']-df['y'])/df['y']))) # maximum absolute relative difference


m = Prophet(**params)
m.fit(df, seed=1000)
future = m.make_future_dataframe(periods = 12, freq='MS', include_history=False)
forecast = m.predict(future)
y = forecast[['yhat']].values
print("sum of 'y' forecast:", y.sum()) # 2062620171.0109873
print("1st point of 'y' forecast:", y[0]) # 61602096.36060803


m = Prophet(**params)
m.fit(df_5, seed=1000)
future = m.make_future_dataframe(periods = 12, freq='MS', include_history=False)
forecast = m.predict(future)
y_5 = forecast[['yhat']].values
print("sum of 'y_5' forecast:", y_5.sum()) # 2305764926.4040384 --> FIXME : +12% compared to the sum of the original forecast ?!
print("1st point of 'y_5' forecast:", y_5[0]) # 76086131.36506273 --> FIXME : +23% compared to the original y[0] ?!


m = Prophet(**params)
m.fit(df_4, seed=1000)
future = m.make_future_dataframe(periods = 12, freq='MS', include_history=False)
forecast = m.predict(future)
y_4 = forecast[['yhat']].values
print("sum of 'y_4' forecast: ", y_4.sum()) # 2096602967.4861932
print("1st point of 'y_4' forecast: ", y_4[0]) # 64983152.06722656


print("relative difference in sum of predictions between y_5 and y in %:", (y_5.sum()-y.sum())/y.sum()*100, "%") # +11.8%
print("relative difference in sum of predictions between y_4 and y_5 in %:", (y_4.sum()-y_5.sum())/y_5.sum()*100, "%") # -9.1%
print("relative difference in prediction of 1st point between y_5 and y in %:",  (y_5[0]-y[0])/y[0]*100, "%") # +23.5%
print("relative difference in prediction of 1st point between y_4 and y_5 in %:", (y_4[0]-y_5[0])/y_5[0]*100, "%") # -14.6%
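
The percentages quoted in the comments above can be rechecked without rerunning Prophet. The sketch below takes the forecast sums and first points reported in the script's comments as plain numbers; the helper name `relative_change_pct` is introduced here for illustration only.

```python
def relative_change_pct(new, old):
    """Relative change from `old` to `new`, expressed in percent."""
    return (new - old) / old * 100


# Values copied from the comments in the reproduction script above.
sum_6, sum_5, sum_4 = 2062620171.0109873, 2305764926.4040384, 2096602967.4861932
first_6, first_5, first_4 = 61602096.36060803, 76086131.36506273, 64983152.06722656

sum_5_vs_6 = relative_change_pct(sum_5, sum_6)        # 5 decimals vs 6 decimals
sum_4_vs_5 = relative_change_pct(sum_4, sum_5)        # 4 decimals vs 5 decimals
first_5_vs_6 = relative_change_pct(first_5, first_6)
first_4_vs_5 = relative_change_pct(first_4, first_5)

print(f"sum of forecast, 5 vs 6 decimals: {sum_5_vs_6:+.1f}%")
print(f"sum of forecast, 4 vs 5 decimals: {sum_4_vs_5:+.1f}%")
print(f"first point, 5 vs 6 decimals: {first_5_vs_6:+.1f}%")
print(f"first point, 4 vs 5 decimals: {first_4_vs_5:+.1f}%")
```

These figures agree with the +11.8%, -9.1%, +23.5% and -14.6% quoted in the script, so the reported percentages are consistent with the reported forecast values.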

Unexpected change in predictions on close to identical data #2676

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Unexpected change in predictions on close to identical data #2676

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions