How to convert dataframe into time series?
I have one csv file in which I have 2 closing prices of stock(on daily basis)
Dates Bajaj_close Hero_close
3/14/2013 1854.8 1669.1
3/15/2013 1850.3 1684.45
3/18/2013 1812.1 1690.5
3/19/2013 1835.9 1645.6
3/20/2013 1840 1651.15
3/21/2013 1755.3 1623.3
3/22/2013 1820.65 1659.6
3/25/2013 1802.5 1617.7
3/26/2013 1801.25 1571.85
3/28/2013 1799.55 1542
I want to convert above data into time series format. (start date is 3/14/2013
and end date is 3/13/2015
) I have tried this but its giving me some weird output
values <- bajaj_hero[, -1] (excluded first column i.e date in real dataset)
bajaj_hero_timeseries <- ts(values,start=c(2013,1),end=c(2015,3),frequency=365)
Output is:
Bajaj_close Hero_close
2013.000 1854.80 1669.10
2013.003 1850.30 1684.45
2013.005 1812.10 1690.50
2013.008 1835.90 1645.60
2013.011 1840.00 1651.15
2013.014 1755.30 1623.30
2013.016 1820.65 1659.60
2013.019 1802.50 1617.70
2013.022 1801.25 1571.85
Solution 1:
R has multiple ways of represeting time series. Since you're working with daily prices of stocks, you may wish to consider that financial markets are closed on weekends and business holidays so that trading days and calendar days are not the same. However, you may need to work with your times series in terms of both trading days and calendar days. For example, daily returns are calculated from sequential daily closing prices regardless of whether a weekend intervenes. But you may also want to do calendar-based reporting such as weekly price summaries. For these reasons the xts package, an extension of zoo, is commonly used with financial data in R. An example of how it could be used with your data follows.
Assuming the data shown in your example is in the dataframe df
library(xts)
stocks <- xts(df[,-1], order.by=as.Date(df[,1], "%m/%d/%Y"))
#
# daily returns
#
returns <- diff(stocks, arithmetic=FALSE ) - 1
#
# weekly open, high, low, close reports
#
to.weekly(stocks$Hero_close, name="Hero")
which gives the output
Hero.Open Hero.High Hero.Low Hero.Close
2013-03-15 1669.1 1684.45 1669.1 1684.45
2013-03-22 1690.5 1690.50 1623.3 1659.60
2013-03-28 1617.7 1617.70 1542.0 1542.00
Solution 2:
Input. We will start with the text of the input shown in the question since the question did not provide the csv input:
Lines <- "Dates Bajaj_close Hero_close
3/14/2013 1854.8 1669.1
3/15/2013 1850.3 1684.45
3/18/2013 1812.1 1690.5
3/19/2013 1835.9 1645.6
3/20/2013 1840 1651.15
3/21/2013 1755.3 1623.3
3/22/2013 1820.65 1659.6
3/25/2013 1802.5 1617.7
3/26/2013 1801.25 1571.85
3/28/2013 1799.55 1542"
zoo. "ts"
class series normally do not represent date indexes but we can create a zoo series that does (see zoo package):
library(zoo)
z <- read.zoo(text = Lines, header = TRUE, format = "%m/%d/%Y")
Alternately, if you have already read this into a data frame DF
then it could be converted to zoo as shown on the second line below:
DF <- read.table(text = Lines, header = TRUE)
z <- read.zoo(DF, format = "%m/%d/%Y")
In either case above z
ia a zoo series with a "Date"
class time index. One could also create the zoo series, zz
, which uses 1, 2, 3, ... as the time index:
zz <- z
time(zz) <- seq_along(time(zz))
ts. Either of these could be converted to a "ts"
class series:
as.ts(z)
as.ts(zz)
The first has a time index which is the number of days since the Epoch (January 1, 1970) and will have NAs for missing days and the second will have 1, 2, 3, ... as the time index and no NAs.
Monthly series. Typically "ts"
series are used for monthly, quarterly or yearly series. Thus if we were to aggregate the input into months we could reasonably represent it as a "ts"
series:
z.m <- as.zooreg(aggregate(z, as.yearmon, mean), freq = 12)
as.ts(z.m)
Solution 3:
Late to the party, but the tsbox package is designed to perform conversions like this. To convert your data into a ts
-object, you can do:
dta <- data.frame(
Dates = c("3/14/2013", "3/15/2013", "3/18/2013", "3/19/2013"),
Bajaj_close = c(1854.8, 1850.3, 1812.1, 1835.9),
Hero_close = c(1669.1, 1684.45, 1690.5, 1645.6)
)
dta
#> Dates Bajaj_close Hero_close
#> 1 3/14/2013 1854.8 1669.10
#> 2 3/15/2013 1850.3 1684.45
#> 3 3/18/2013 1812.1 1690.50
#> 4 3/19/2013 1835.9 1645.60
library(tsbox)
ts_ts(ts_long(dta))
#> Time Series:
#> Start = 2013.1971293045
#> End = 2013.21081883954
#> Frequency = 365.2425
#> Bajaj_close Hero_close
#> 2013.197 1854.8 1669.10
#> 2013.200 1850.3 1684.45
#> 2013.203 NA NA
#> 2013.205 NA NA
#> 2013.208 1812.1 1690.50
#> 2013.211 1835.9 1645.60
It automatically parses the dates, detects the frequency and makes the missing values at the weekends explicit. With ts_<class>
, you can convert the data to any other time series class.
Solution 4:
With library fpp
, you can easily create time series with date format:
time_ser=ts(data,frequency=4,start=c(1954,2))
here we start at the 2nd quarter of 1954 with quarter fequency.