5.1 R: Time Series Objects

As you’ve seen, R has a special format (an object class) for time series data called a ts object. If your data are not already in that format, you can create a ts object with the ts() function. Besides the data, you should supply two pieces of information.

The first is frequency. The name is a bit of a misnomer because it does not refer to the number of cycles per unit of time but rather the number of observations/samples per cycle.

We typically work with one day or one year as the cycle. So, if the data were collected each hour of the day, then frequency = 24. If the data were collected annually, frequency = 1; quarterly data should have frequency = 4; monthly data should have frequency = 12; weekly data should have frequency = 52 (an approximation, since a year is slightly longer than 52 weeks).

The second piece of information is start, and it specifies the time of the first observation in terms of (cycle, position within the cycle). In most use cases, it is (day, hour), (year, month), (year, quarter), etc. So, for example, if the data were collected monthly beginning in November of 1969, then frequency = 12 and start = c(1969, 11). If the data were collected annually, you specify start as a scalar (e.g., start = 1991) and can omit frequency (R sets frequency = 1 by default).
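To make the two arguments concrete, here is a minimal sketch. The values in y are simulated purely for illustration (they are not from any dataset in this text), and the object names monthly and annual are placeholders.

```r
# Hypothetical data: 24 simulated values standing in for real measurements
set.seed(1)
y <- rnorm(24)

# Monthly series whose first observation is November 1969
monthly <- ts(y, frequency = 12, start = c(1969, 11))

start(monthly)      # first time point as (year, month)
end(monthly)        # last time point as (year, month)
frequency(monthly)  # observations per cycle

# Annual series: scalar start, frequency left at its default of 1
annual <- ts(y[1:5], start = 1991)
frequency(annual)
```

With 24 monthly observations starting in November 1969, the series should run through October 1971, which you can confirm with end(monthly).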

This is a useful format for us because calling plot() on a ts object (which dispatches to plot.ts()) automatically labels the x-axis according to time. Additionally, there are special functions that work on ts objects, such as decompose(), which computes a basic decomposition of the series into trend, seasonal, and random (error) components; calling plot() on the result visualizes them.
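As a brief illustration of plotting and decomposition, the sketch below uses AirPassengers, a monthly ts object that ships with base R. That choice of dataset is an assumption made for this example; any ts object with frequency greater than 1 and at least two full cycles should work the same way.

```r
# AirPassengers is a built-in monthly ts object (chosen here only as an example)
x <- AirPassengers

# plot() dispatches to plot.ts(), so the x-axis is labelled in years
plot(x)

# decompose() returns the components; it is additive unless type = "multiplicative"
parts <- decompose(x)
names(parts)

# Plotting the result shows the observed, trend, seasonal, and random panels
plot(parts)
```

Note that decompose() estimates the trend with a moving average, so the trend and random components contain NA values at the beginning and end of the series.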

If you have multiple characteristics or variables measured over time, you can combine them in one ts object, keeping either the intersection (only the overlapping periods) with ts.intersect() or the union (all times, with NA where a series has no observation) with ts.union().
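The difference between the two functions is easiest to see with two short series that only partly overlap. The quarterly series a and b below are made up for this sketch.

```r
# Two hypothetical quarterly series with partly overlapping time spans
a <- ts(1:8,     frequency = 4, start = c(2000, 1))  # 2000 Q1 through 2001 Q4
b <- ts(101:108, frequency = 4, start = c(2001, 1))  # 2001 Q1 through 2002 Q4

# Intersection: keeps only the quarters observed in both series
both <- ts.intersect(a, b)

# Union: keeps every quarter, filling in NA where a series is unobserved
all_times <- ts.union(a, b)

dim(both)
dim(all_times)
all_times
```

Here the two series share only the four quarters of 2001, so both has four rows, while all_times spans 2000 Q1 through 2002 Q4 with NA padding on either side.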

As you’ve seen above, it may also be useful to have data in a data.frame() format instead of a ts object if you want to use ggplot() or lm(). You should be familiar with both data formats to go back and forth as necessary.
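One possible way to move between the two formats is sketched below, again assuming the built-in AirPassengers series as example data. The column names time, month, and passengers are arbitrary choices for this illustration.

```r
# Example monthly ts object (built-in dataset, used here as an assumption)
x <- AirPassengers

# ts -> data.frame: time() gives decimal years, cycle() gives position in the cycle
df <- data.frame(
  time       = as.numeric(time(x)),
  month      = as.integer(cycle(x)),
  passengers = as.numeric(x)
)
head(df)

# The data frame can now be passed to lm() (or ggplot())
fit <- lm(passengers ~ time, data = df)

# data.frame -> ts: reuse the original start and frequency
back <- ts(df$passengers, start = start(x), frequency = frequency(x))
```

Keeping the decimal-year column makes it straightforward to use time as a predictor in lm() or as the x aesthetic in ggplot().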