This is part one of a series on ARIMA forecasting for the STOXX50 index. This part covers some exploratory data analysis: preparing R with the required packages, importing the data, and converting the data to a time series.

Load all required packages:

library(ggplot2)
library(forecast)
library(tseries)
library(tidyverse)
library(rio)
library(readxl)
library(tidyquant)

Import data from the locally downloaded file:

data.price<-import("C:/Users/ashiq/Desktop/pilot Study/price 1.csv")
head(data.price)

Now convert the date column to the Date class. The dates must already be in yyyy-mm-dd format in the source file; otherwise you need to pass a format string to as.Date() or use a different function:

data.price$DATES=as.Date(data.price$Date)
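
If the dates in the file are not in yyyy-mm-dd format, as.Date() takes a format argument. Here is a minimal sketch, assuming day/month/year strings; the sample values are made up for illustration and are not taken from the price file:

```r
# Hypothetical date strings in dd/mm/yyyy form (illustrative, not from the price file)
raw_dates <- c("02/01/2019", "03/01/2019", "04/01/2019")

# Tell as.Date() how to parse them: %d = day, %m = month, %Y = four-digit year
parsed_dates <- as.Date(raw_dates, format = "%d/%m/%Y")

# Confirm the result is a Date vector rather than character
class(parsed_dates)
```

A string that does not match the format comes back as NA, so checking anyNA() on the result is a quick way to spot parsing problems.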

Plot the STOXX50 to show the pattern in the data over time:

ggplot(data.price,aes(DATES, STOXX50))+geom_line()+ scale_x_date(date_labels = "%b %y", date_breaks = "6 months")+ylab("Daily STOXX50 Price")+xlab("")


Plot month over month to see the range and outliers (this assumes data.price has a Month column):

ggplot(data.price,aes(DATES,STOXX50))+geom_point(color="navyblue")+facet_wrap(~Month)+geom_line()+ylab("Daily STOXX50 Price")+xlab("")
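
The facet above needs a Month column in data.price. If the imported file does not already contain one, it can be derived from the dates. Here is a sketch using a small made-up data frame as a stand-in for data.price (the name demo and its values are hypothetical):

```r
# Stand-in for data.price with a few illustrative rows (values are made up)
demo <- data.frame(
  DATES   = as.Date(c("2019-01-15", "2019-02-15", "2019-03-15")),
  STOXX50 = c(3050, 3200, 3380)
)

# Two-digit month number as a factor whose levels follow calendar order,
# so facet_wrap(~Month) would lay the panels out from January to December
demo$Month <- factor(format(demo$DATES, "%m"), levels = sprintf("%02d", 1:12))

demo
```

Using the month number rather than the month name keeps the result independent of the system locale.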


Create a time series object based on STOXX50 to pass to tsclean():

count_TSObject=ts(data.price[,c("STOXX50")])

data.price$clean_count=tsclean(count_TSObject) # tsclean() identifies and replaces outliers and imputes missing values, if any exist.

# Graph cleaned data
ggplot()+geom_line(data = data.price,aes(x=DATES, y=clean_count))+ylab("Cleaned Count")+ scale_x_date(date_labels = "%b %y", date_breaks = "6 months")
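
To see what tsclean() does before trusting it on real prices, it helps to run it on a small series where the problems are known in advance. Here is a sketch on made-up data with one artificial spike and one artificial gap (the name toy and its values are hypothetical):

```r
library(forecast)

# Made-up series: a smooth baseline with a little noise
set.seed(1)
toy <- 100 + sin(seq_len(60) / 5) + rnorm(60, sd = 0.2)
toy[20] <- 500   # artificial outlier
toy[40] <- NA    # artificial missing value

# Same call pattern as above: wrap in ts(), then clean
toy_clean <- tsclean(ts(toy))

# Compare the original and cleaned values at the two altered positions
cbind(original = toy[c(20, 40)], cleaned = toy_clean[c(20, 40)])
```

The spike should be pulled back toward its neighbours and the gap filled in, while the rest of the series stays as it was.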

The cleaned daily data still has a lot of variance and volatility in it, so I then computed both a weekly and a monthly moving average to smooth it:

data.price$cnt_ma=ma(data.price$clean_count,order = 7) # weekly moving average (MA) of the cleaned series (outliers removed)
data.price$cnt_ma30=ma(data.price$clean_count,order = 30) # monthly moving average (MA) of the cleaned series (outliers removed)
ggplot()+geom_line(data = data.price,aes(x=DATES,y=clean_count, colour="STOXX50"))+geom_line(data = data.price,aes(x=DATES,y=cnt_ma, colour="Weekly Moving Average"))+geom_line(data = data.price,aes(x=DATES, y=cnt_ma30, colour="Monthly Moving Average"))+ylab("STOXX50")+ scale_x_date(date_labels = "%b %y", date_breaks = "6 months")
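
To make the smoothing concrete: a centered moving average of odd order k replaces each point with the mean of itself and its (k-1)/2 neighbours on each side, so the first and last (k-1)/2 values have no full window and come back as NA. Here is a sketch in base R on a made-up vector. As far as I understand, ma() with an odd order does the same calculation, but treat that equivalence as an assumption:

```r
# Made-up series for illustration
x <- c(10, 12, 11, 13, 15, 14, 16, 18, 17, 19)

# Centered 7-point moving average with equal weights of 1/7.
# stats::filter is spelled out because dplyr (loaded via tidyverse) masks filter().
ma7 <- stats::filter(x, rep(1 / 7, 7), sides = 2)

# Positions 1-3 and 8-10 lack a full 7-point window and are NA
ma7

# Position 4 is the plain mean of the first seven values
mean(x[1:7])
```

This is also why the moving average lines in the plot start a little after, and stop a little before, the raw series.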

