Time series forecasting in Power BI (PBI) is based on a rule-of-thumb smoothing technique called exponential smoothing (ES). ES assigns exponentially decreasing weights to observations, from the newest to the oldest, so the most recent values influence the forecast most. ES can also be used for time series with trend and seasonality. It is usually used for short-term forecasts, because longer-term forecasts made with this technique can be quite unreliable. Collectively, the methods are sometimes referred to as ETS models, for their explicit modeling of Error, Trend, and Seasonality.
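
To make the weighting concrete, here is a minimal sketch that computes a simple exponential smoothing level in two ways: with the usual recursion, and as an explicit weighted sum of the observations. It is not Power BI's internal implementation. The function names, the sample series, the choice of the first observation as the starting level, and `alpha = 0.5` are all illustrative assumptions.

```python
def ses_level(series, alpha):
    """Recursive simple exponential smoothing; returns the final level.

    The level starts at the first observation (one common convention,
    assumed here for illustration).
    """
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ses_weights(n, alpha):
    """Weights implied by the recursion, ordered oldest to newest.

    The newest observation gets alpha, the one before it alpha * (1 - alpha),
    and so on; the starting value carries the leftover (1 - alpha) ** (n - 1).
    """
    weights = [alpha * (1 - alpha) ** k for k in range(n - 1)]  # newest first
    weights.append((1 - alpha) ** (n - 1))  # weight on the starting value
    return weights[::-1]


series = [10.0, 12.0, 11.0, 15.0, 14.0]  # made-up data
alpha = 0.5
weights = ses_weights(len(series), alpha)
weighted_sum = sum(w * y for w, y in zip(weights, series))
final_level = ses_level(series, alpha)
```

Both routes give the same number, and the weights sum to one. A larger `alpha` concentrates the weight on recent observations; a smaller one spreads it further back in time.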

**Types of exponential smoothing models in PBI**

- Simple exponential smoothing: uses a weighted moving average with exponentially decreasing weights
- Holt’s trend-corrected double exponential smoothing: usually more reliable than the single procedure for data that shows a trend
- Triple exponential smoothing: usually more reliable for parabolic trends or for data that shows both trend and seasonality
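
The difference between the single and double procedures can be sketched in a few lines. The code below is a hand-written illustration, not the statsmodels or Power BI implementation. The function names, the initialisation (first observation as level, first difference as trend), and the smoothing parameters are assumptions chosen for the example.

```python
def simple_es_forecast(series, alpha, horizon):
    """Simple exponential smoothing: every future step gets the last level."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def holt_forecast(series, alpha, beta, horizon):
    """Holt's double exponential smoothing with an additive trend.

    The initial level is the first observation and the initial trend is the
    first difference; other initialisations are possible.
    """
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Extrapolate the final level along the final trend.
    return [level + (h + 1) * trend for h in range(horizon)]


trending = [2.0 * t + 1.0 for t in range(10)]  # made-up series: 1, 3, ..., 19
flat = simple_es_forecast(trending, alpha=0.5, horizon=3)
linear = holt_forecast(trending, alpha=0.5, beta=0.3, horizon=3)
```

On this steadily rising series the simple forecast is a flat line that lags behind the latest observation, while Holt's forecast continues the slope. Triple exponential smoothing adds a third, seasonal component that is updated in the same spirit; it is omitted here to keep the sketch short.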


**Handling the missing values**

In some cases, your timeline might be missing some historical values. Does this pose a problem?

Not usually – the forecasting chart can automatically fill in some values to provide a forecast. If the total number of missing values is less than 40% of the total number of data points, the algorithm will perform linear interpolation prior to performing the forecast.

If more than 40% of your values are missing, try to fill in more data, or perhaps aggregate values into larger time units, to ensure that a more complete data series is available for analysis.
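
The rule described above can be sketched as follows. This is an illustration of the idea only, not Power BI's code. The function name, the use of `None` to mark a missing value, and the decision to leave leading and trailing gaps unfilled are assumptions of this example.

```python
def fill_missing(values, max_missing_share=0.4):
    """Linearly interpolate None gaps when few enough values are missing.

    Returns a new list, or raises ValueError when the share of missing values
    reaches the threshold (the 40% figure quoted in the text). Leading and
    trailing gaps stay as None because there is nothing to interpolate
    between; that choice is an assumption of this sketch.
    """
    if len(values) == 0:
        return []
    missing = sum(v is None for v in values)
    if missing >= max_missing_share * len(values):
        raise ValueError("too many missing values; add data or aggregate")

    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of neighbouring known points and fill the gap.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Made-up monthly series with 3 of 8 values missing (37.5%).
monthly_sales = [100.0, None, 120.0, None, None, 150.0, 160.0, 170.0]
completed = fill_missing(monthly_sales)
```

When the function raises, the advice in the text applies: collect more data, or aggregate to a coarser time unit (for example, daily values into weekly totals) so that fewer periods are empty.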

Reference code (Python, using statsmodels):

```python
import json

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
from statsmodels.tsa.holtwinters import SimpleExpSmoothing, Holt

# In a Jupyter notebook, also run: %matplotlib inline
plt.style.use('Solarize_Light2')

# Download the series. The slice strips the wrapper around the JSON payload
# (the first 18 characters and the final one) before parsing.
# Note: this endpoint may no longer be reachable.
r = requests.get('https://datamarket.com/api/v1/list.json?ds=22qx')
jobj = json.loads(r.text[18:-1])
data = jobj[0]['data']
df = pd.DataFrame(data, columns=['time', 'data']).set_index('time')
df.index = pd.to_datetime(df.index)

# Hold out the last 10 observations for testing.
train = df.iloc[100:-10, :]
test = df.iloc[-10:, :]

# Fit on a plain array, so fitted values and forecasts are NumPy arrays.
model = SimpleExpSmoothing(np.asarray(train['data']))

# Forecast one value per test observation so the plot lengths match.
fit1 = model.fit()  # smoothing level chosen by the optimizer
pred1 = fit1.forecast(len(test))
fit2 = model.fit(smoothing_level=.2)
pred2 = fit2.forecast(len(test))
fit3 = model.fit(smoothing_level=.5)
pred3 = fit3.forecast(len(test))

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(train.index[150:], train.values[150:])
ax.plot(test.index, test.values, color="gray")
for p, f, c in zip((pred1, pred2, pred3), (fit1, fit2, fit3), ('#ff7823', '#3c763d', 'c')):
    ax.plot(train.index[150:], f.fittedvalues[150:], color=c)
    ax.plot(test.index, p, label="alpha=" + str(f.params['smoothing_level'])[:3], color=c)
plt.title("Simple Exponential Smoothing")
plt.legend()
```

```python
# Holt's linear trend method, fit on the same training data as above.
# Recent statsmodels releases name the trend parameter smoothing_trend;
# older releases used smoothing_slope (also as the key in fit.params).
model = Holt(np.asarray(train['data']))

fit1 = model.fit(smoothing_level=.3, smoothing_trend=.05)
pred1 = fit1.forecast(len(test))
fit2 = model.fit(optimized=True)  # both parameters chosen by the optimizer
pred2 = fit2.forecast(len(test))
fit3 = model.fit(smoothing_level=.3, smoothing_trend=.2)
pred3 = fit3.forecast(len(test))

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(train.index[150:], train.values[150:])
ax.plot(test.index, test.values, color="gray")
for p, f, c in zip((pred1, pred2, pred3), (fit1, fit2, fit3), ('#ff7823', '#3c763d', 'c')):
    ax.plot(train.index[150:], f.fittedvalues[150:], color=c)
    label = ("alpha=" + str(f.params['smoothing_level'])[:4]
             + ", beta=" + str(f.params['smoothing_trend'])[:4])
    ax.plot(test.index, p, label=label, color=c)
plt.title("Holt's Exponential Smoothing")
plt.legend()
```

**Evaluating the Forecast**

Hindcasting and adjusting confidence intervals are two good ways to evaluate the quality of the forecast.

Hindcasting, in which the model forecasts a past period whose actual values are already known, is one way to verify whether the model is doing a good job. If the observed value doesn’t exactly match the predicted value, it does not mean the forecast is all wrong; instead, consider both the amount of variation and the direction of the trend line. Predictions are a matter of probability and estimation, so a predicted value that is close to, but not exactly the same as, the real value can be a better indicator of prediction quality than an exact match. In general, when a model too closely mirrors the values and trends within the input dataset, it might be overfitted, meaning it likely won’t provide good predictions on new data.
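
A hindcast can be sketched by holding back the most recent observations, forecasting them from the earlier ones, and scoring the result. The example below uses a hand-written simple exponential smoothing forecast and the mean absolute error. The function names, the sample data, the holdout size, and the choice of error measure are assumptions made for illustration; Power BI's own hindcast may be computed differently.

```python
def ses_forecast(history, alpha, horizon):
    """Simple exponential smoothing forecast: the final level, repeated."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def hindcast_mae(series, holdout, alpha):
    """Forecast the last `holdout` points from the earlier ones and score them.

    Returns the predictions and their mean absolute error against the
    held-back actual values.
    """
    history, actual = series[:-holdout], series[-holdout:]
    predicted = ses_forecast(history, alpha, holdout)
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return predicted, sum(errors) / len(errors)


demand = [20.0, 22.0, 21.0, 23.0, 22.0, 24.0, 23.0, 25.0]  # made-up data
predicted, mae = hindcast_mae(demand, holdout=2, alpha=0.5)
```

A small error relative to the typical variation in the series suggests the model tracks the data reasonably well. An error of exactly zero on every hindcast point would be a reason to check for overfitting rather than a reason for confidence.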

You are the best judge of how reliable the input data is, and what the real range of possible predictions might be.

Credit: Data Science Central By: Vigneswaran S