Time series forecasting for Dow Jones Industrial Average (using Facebook Prophet)

In this post, we try to examine the usefulness of Facebook Prophet in forcasting stock market indices.

library(xts)
library(quantmod)
library(prophet)
library(ggplot2)
library(forecast)
library(lubridate)

Read historical data for Dow Jones Industrial Average (adjusted closing prices)

setwd("F:/My Files/R Studio/DOW S&P500")
dow_data=read.csv('dow_data_v3.csv',header = T,sep = ',')
head(dow_data)
tail(dow_data)

Standardize the dates and prepare data for conversion to time series xts format, using from year 2008 onward

dow_data$Date=dmy(dow_data$X) #Standardize the dates using lubridate package

df=data.frame(dow_data$DJI) #Prepare time series format
rownames(df)=dow_data$Date
colnames(df)=c('DJI')

df_xts=as.xts(df) #Convert to xts format
df_xts=df_xts['2008/'] #extract data frm 2008 onward

Plot the chart series

We observe an upward multiplicative (exponential) trend. As prophet library makes time series forecast based on additive regression trend model, we need to do a log transformation to linearize the data using log transformation.

chartSeries(df_xts,
            theme=chartTheme('white'))

Prepare dataframe with variables ds and y assigned for prophet model, perform log transformation

df=data.frame(df_xts)
df$ds=rownames(df)
ds=df$ds
y=df$DJI
df=data.frame(ds,y)
df$ds=as.Date(df$ds)

y=log(df$y)
df=data.frame(ds,y)
df$ds=as.Date(df$ds)

qplot(ds,y,data=df,
      main='Dow Jones Industrial Average in log scale')

Split the data to training and test set

Let's predict and validate for the last 252 trading days

training_length=length(df$y)-252
test_start=training_length+1

df_training=df[1:training_length,]
df_test=df[test_start:length(df$y),]

Specify the forecasting model, and make furture prediction using training data

m=prophet()
m=fit.prophet(m,df_training) 

future=make_future_dataframe(m,periods=252)
forecast=predict(m,future)

Forecast Components

We observe a long term upward trend, as well as seasonality, where the index peaks in the months of May and August and bottoms out in the months of March and November. There is no weekly pattern being observed in the above plots.

prophet_plot_components(m,forecast)

Model Performance

We observe an overall linear upward trend, with R-squared at 0.9825. Up to 98.25% of the variation in predicted values can be explained by the variation in actual values in the prediction model. However, this measures the goodness-of-fit and does not provide information on the accuracy of the model.

pred=forecast$yhat
actuals=df$y
plot(actuals,pred) 
abline(lm(pred~actuals),col='red',lwd=2)
summary(lm(pred~actuals))

Plot for forecast and residuals

#Create dataframe for residuals
residuals_m=df$y-forecast$yhat
df_residuals=data.frame(df$ds,residuals_m)
colnames(df_residuals)=c('ds','residuals') 

#Forecast plot
plot(m,forecast)

#Residual plot
qplot(ds,residuals,data=df_residuals,
      main='Plot of residuals in log scale')+
  geom_vline(xintercept = as.numeric(ymd(df_test[1,1])), 
             color = "blue")

Model accuracy

We notice that the accuracy metrics of the model deteriorates when we compare the validation set against the training set, suggestive of overfitting. This could be due to the highly volatile nature of the stock index, where the variance of the residuals are not constant, rendering the model which uses the least squares method less accurate. Facebook Prophet is not that useful for predicting stock index in times of high volatility, also, the accuracy for forecasts deteriorates as we project further into the future (medium to long-term).

#Performance metrics for training set
predicted_train=forecast$yhat[1:(length(forecast$yhat)-252)]
actuals_train=df_training$y
round(accuracy(predicted_train,actuals_train),2)

#Performance metrics for test/validation set
predicted_v=forecast$yhat[(length(forecast$yhat)-252+1):length(forecast$yhat)]
actuals_v=df_test$y
round(accuracy(predicted_v,actuals_v),2)

Accuracy of model in actual scale (test data)

round(accuracy(exp(predicted_v),exp(actuals_v),2))

Let's rescale back our data and do a plot

df$dowjones=exp(df$y)
df$predicted=exp(forecast$yhat)

df_residuals$Residuals=df$dowjones-df$predicted

ggplot(df, aes(ds)) + 
  geom_line(aes(y = dowjones, color='dow')) + 
  geom_line(aes(y = predicted, color='predicted'))+
  geom_vline(xintercept = as.numeric(ymd(df_test[1,1])), 
             color = "blue")

qplot(ds,Residuals,data=df_residuals,
      main='Plot of residuals')+
  geom_vline(xintercept = as.numeric(ymd(df_test[1,1])), 
             color = "blue")

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
R code		R code
data		data
pic		pic
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Time series forecasting for Dow Jones Industrial Average (using Facebook Prophet)

Read historical data for Dow Jones Industrial Average (adjusted closing prices)

Standardize the dates and prepare data for conversion to time series xts format, using from year 2008 onward

Plot the chart series

Prepare dataframe with variables ds and y assigned for prophet model, perform log transformation

Split the data to training and test set

Specify the forecasting model, and make furture prediction using training data

Forecast Components

Model Performance

Plot for forecast and residuals

Model accuracy

Accuracy of model in actual scale (test data)

Let's rescale back our data and do a plot

About

Releases

Packages

Languages

Royston-Soh/dow-facebook-prophet

Folders and files

Latest commit

History

Repository files navigation

Time series forecasting for Dow Jones Industrial Average (using Facebook Prophet)

Read historical data for Dow Jones Industrial Average (adjusted closing prices)

Standardize the dates and prepare data for conversion to time series xts format, using from year 2008 onward

Plot the chart series

Prepare dataframe with variables ds and y assigned for prophet model, perform log transformation

Split the data to training and test set

Specify the forecasting model, and make furture prediction using training data

Forecast Components

Model Performance

Plot for forecast and residuals

Model accuracy

Accuracy of model in actual scale (test data)

Let's rescale back our data and do a plot

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages