5.1 R: Time Series Objects
As you’ve seen, R has a special format (an object class) for time series data called a ts object. If the data are not already in that format, you can create a ts object with the ts() function. Besides the data, it requires two pieces of information.
The first is frequency. The name is a bit of a misnomer because it does not refer to the number of cycles per unit of time but rather to the number of observations (samples) per cycle. We typically work with one day or one year as the cycle. So, if the data were collected each hour of the day, then frequency = 24. If the data were collected annually, frequency = 1; quarterly data should have frequency = 4; monthly data should have frequency = 12; weekly data should have frequency = 52 (an approximation, since a year is slightly longer than 52 weeks).
The second piece of information is start, which specifies the time of the first observation as a pair (cycle, position within the cycle). In most use cases, it is (day, hour), (year, month), (year, quarter), etc. So, for example, if the data were collected monthly beginning in November of 1969, then frequency = 12 and start = c(1969, 11). If the data were collected annually, you specify start as a scalar (e.g., start = 1991) and omit frequency (i.e., R will set frequency = 1 by default).
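As a minimal sketch of the two cases just described, the following creates a monthly and an annual ts object. The values are simulated placeholders, and the object names (x, monthly, annual) are chosen only for illustration.

```r
# Simulated placeholder values (not data from the text)
set.seed(1)
x <- rnorm(36)

# Monthly series whose first observation is November 1969
monthly <- ts(x, start = c(1969, 11), frequency = 12)
start(monthly)      # 1969 11
end(monthly)        # 1972 10
frequency(monthly)  # 12

# Annual series: scalar start, frequency omitted (defaults to 1)
annual <- ts(x[1:10], start = 1991)
frequency(annual)   # 1
```

The helper functions start(), end(), and frequency() are a quick way to confirm that the time index was set up as intended.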
This is a useful format for us because plot() applied to a ts object (which dispatches to plot.ts()) automatically labels the x-axis according to time. Additionally, there are special functions that work on ts objects, such as decompose(), which splits the series into trend, seasonal, and random (error) components that can then be plotted.
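The sketch below illustrates plotting and decomposition. It uses AirPassengers, a monthly ts object that ships with R's datasets package; that choice of data is an assumption made for this example and does not come from the text.

```r
# AirPassengers is a built-in monthly ts object (chosen here for illustration)
ap <- AirPassengers

# The x-axis is labeled by time automatically
plot(ap)

# decompose() estimates the trend, seasonal, and random components
parts <- decompose(ap)
names(parts)

# Plotting the result draws the observed series and each component
plot(parts)
```

Note that decompose() itself only computes the components; the figure comes from calling plot() on its result.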
If you have multiple characteristics or variables measured over time, you can combine them in one ts object by taking either the intersection (only the overlapping periods) with ts.intersect() or the union (all times) of the series with ts.union().
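To make the difference concrete, here is a small sketch with two made-up quarterly series that overlap only in 2001. The names a, b, both, and either are hypothetical.

```r
# Two quarterly series with partially overlapping time ranges (made-up values)
a <- ts(1:8,     start = c(2000, 1), frequency = 4)  # 2000 Q1 to 2001 Q4
b <- ts(101:108, start = c(2001, 1), frequency = 4)  # 2001 Q1 to 2002 Q4

# Intersection: only the quarters present in both series (2001 Q1 to 2001 Q4)
both <- ts.intersect(a, b)
dim(both)    # 4 2

# Union: every quarter in either series, padded with NA where one is missing
either <- ts.union(a, b)
dim(either)  # 12 2
```

Both functions expect the series to share the same frequency.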
As you’ve seen above, it may also be useful to have data in a data.frame format instead of a ts object if you want to use ggplot() or lm(). You should be familiar with both data formats so that you can go back and forth as necessary.
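One possible way to move between the two formats is sketched below, again using the built-in AirPassengers series as an assumed example. The column names (time, passengers, month) are chosen for illustration.

```r
ap <- AirPassengers  # built-in monthly ts object, used here for illustration

# ts -> data.frame: time() gives decimal years, cycle() the position in the cycle
df <- data.frame(
  time       = as.numeric(time(ap)),
  passengers = as.numeric(ap),
  month      = factor(as.integer(cycle(ap)))
)

# The data frame can now be passed to lm() (or to ggplot())
fit <- lm(passengers ~ time, data = df)

# data.frame -> ts: supply the start and frequency again
back <- ts(df$passengers, start = start(ap), frequency = frequency(ap))
```

Converting to a data frame drops the time attributes, which is why start and frequency have to be supplied again when going back to a ts object.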