Overview of Date and Date-Time Objects in R

There are several ways to represent a time index (sequence of dates or date-times) in R. Table 1 summarizes the main time index classes available in R.

Class | Package | Description
chron | chron | Represents calendar dates and times within the day as the (signed) number of days, including the fraction of a day, since the beginning of 1970, stored as a numeric vector. Does not handle time zones.
Date | base | Represents calendar dates as the number of days since 1970-01-01.
yearmon | zoo | Represents monthly data. Internally it holds the data as the year plus 0 for January, 1/12 for February, 2/12 for March, and so on, so that its internal representation matches that of the ts class with frequency = 12.
yearqtr | zoo | Represents quarterly data. Internally it holds the data as the year plus 0 for Quarter 1, 1/4 for Quarter 2, and so on, so that its internal representation matches that of the ts class with frequency = 4.
POSIXct | base | Represents calendar dates and times within the day as the (signed) number of seconds since the beginning of 1970, stored as a numeric vector. Supports time zone specifications (e.g., GMT, PST, EST).
POSIXlt | base | Represents local dates and times within the day as a named list of vectors of date-time components.
timeDate (S4) | timeDate | The Rmetrics timeDate S4 class follows the conventions of the ISO 8601 standard as well as the ANSI C and POSIX standards. Beyond these standards, Rmetrics adds the "Financial Center" concept, which makes it possible to handle data records collected in different time zones and combine them so that the time stamps are always correct with respect to your own financial center or, alternatively, to GMT. timeDate is almost compatible with the timeDate class in Tibco's S-PLUS.

The base R Date class handles dates (without times) and is the recommended class for representing financial data observed on discrete dates without regard to the time of day (e.g., daily closing prices). The base R POSIXct and POSIXlt classes handle dates and times with control for time zones. They are the recommended classes for representing dates associated with financial data observed at particular times within a day (e.g., prices or quotes observed during trading hours). The chron class is similar but is used less often than the POSIXt classes. The yearmon and yearqtr classes from the zoo package are convenient for representing regularly spaced monthly and quarterly data, respectively, when it is not necessary to specify exactly when during the month or quarter the data are observed. The Rmetrics timeDate class is an S4 class very similar to the S-PLUS timeDate class, is based on the POSIX standards, and is used throughout the Rmetrics suite of packages.
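To make the difference between the base R classes concrete, the following minimal sketch compares their internal representations. The variable names (d, dt, lt) are chosen here for illustration only, and the time zone is fixed to UTC so that the result does not depend on the local machine.

```r
# Compare how base R stores a Date, a POSIXct and a POSIXlt value.
d  <- as.Date("1970-01-02")
dt <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")

unclass(d)       # number of days since 1970-01-01
as.numeric(dt)   # number of seconds since 1970-01-01 00:00:00 UTC

# POSIXlt holds the same instant as a list of date-time components.
lt <- as.POSIXlt(dt)
lt$year + 1900   # years are stored relative to 1900
lt$mon + 1       # months are stored as 0-11
lt$mday          # day of the month
```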

Throughout this tutorial, I will use the following R options

options(digits = 4, width = 70)

The Date Class (base R)

Use the Date class to represent a time index involving only dates, not times within a day. By default, the Date class represents dates internally as the number of days since January 1, 1970. You create Date objects from a character string representing a date using the as.Date() function. The default format is "YYYY/m/d" or "YYYY-m-d", where YYYY represents the four-digit year, m the month number and d the day of the month. For example,

my.date = as.Date("1970/1/1")
my.date
## [1] "1970-01-01"
class(my.date)
## [1] "Date"
as.numeric(my.date)
## [1] 0
myDates = c("2013-12-19", "2003-12-20")
as.Date(myDates)
## [1] "2013-12-19" "2003-12-20"

Use the format argument to specify the input format of the date if it is not in the default format

as.Date("1/1/1970", format = "%m/%d/%Y")
## [1] "1970-01-01"
as.Date("January 1, 1970", format = "%B %d, %Y")
## [1] "1970-01-01"
as.Date("01JAN70", format = "%d%b%y")
## [1] "1970-01-01"

Notice that the output format is always "YYYY-mm-dd" regardless of the input format. To change the displayed output format of a date, use the format() function

format(my.date, "%b %d, %Y")
## [1] "Jan 01, 1970"

Some date formats provide insufficient information to be unambiguously represented as a Date object. For example,

as.Date("Jan 1970", format = "%b %Y")
## [1] NA
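One possible workaround, sketched here under the assumption that the first day of the month is an acceptable stand-in for the missing day, is to paste a day onto the string before parsing. The names month.string and first.of.month are illustrative. Note that %b matches month names in the current locale, so this sketch assumes an English (or C) locale.

```r
# "Jan 1970" carries no day, so supply one explicitly before parsing.
# Assumption: the first of the month is a suitable placeholder day.
month.string   <- "Jan 1970"
first.of.month <- as.Date(paste("01", month.string), format = "%d %b %Y")
first.of.month
```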

Table 2 below gives the standard date format codes.

Code | Value | Example
%d | Day of the month (decimal number) | 23
%m | Month (decimal number) | 11
%b | Month (abbreviated) | Jan
%B | Month (full name) | January
%y | Year (2 digit) | 90
%Y | Year (4 digit) | 1990

Recall that dates are recorded internally as the (integer) number of days since 1970-01-01. As a result, you can also create a Date object from integer data. One way to convert an integer variable to a Date object is to assign its class with the class() function

my.date = 0
class(my.date) = "Date"
my.date
## [1] "1970-01-01"

Another way is to use the as.Date() function, supplying the optional origin argument when the origin date differs from the default 1970-01-01. For example, to determine the date that is 32500 days after 1900-01-01 use

as.Date(32500, origin = as.Date("1900-01-01"))
## [1] "1988-12-25"
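As a quick sanity check on the origin argument, the conversion can be reversed by subtracting the origin from the resulting date. This is only a sketch; the names origin, offset and d are illustrative.

```r
# Convert a day offset to a Date, then recover the offset again.
origin <- as.Date("1900-01-01")
offset <- 32500
d <- as.Date(offset, origin = origin)
d
# Subtracting the origin gives back the offset in days.
as.numeric(d - origin)
```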

Extracting Information from Date objects

Consider the Date object

my.date
## [1] "1970-01-01"

Suppose I want to extract the year component from this object as a character string or as an integer. I can do this using the format() function

myYear = format(my.date, "%Y")
myYear
## [1] "1970"
class(myYear)
## [1] "character"
as.numeric(myYear)
## [1] 1970
as.numeric(format(my.date, "%Y"))
## [1] 1970

By specifying different format codes in the format() function, I can extract other components of the date such as the month or day.
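For instance, a minimal sketch using the %m, %d and %y codes on an illustrative date (chosen here to match the example values in the table of format codes) might look like this:

```r
# Extract individual components of a Date with format codes.
d <- as.Date("1990-11-23")
as.numeric(format(d, "%m"))   # month as a number
as.numeric(format(d, "%d"))   # day of the month as a number
format(d, "%y")               # two-digit year, returned as a character string
```

As with the year, format() always returns a character string, so wrap the call in as.numeric() when a number is needed.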

Additionally, the weekdays(), months(), quarters() and julian() functions can be used to extract specific components of Date objects

weekdays(my.date)
## [1] "Thursday"
months(my.date)
## [1] "January"
quarters(my.date)
## [1] "Q1"
julian(my.date, origin = as.Date("1900-01-01"))
## [1] 25567
## attr(,"origin")
## [1] "1900-01-01"

Manipulating Date Objects

Having a numeric representation for dates allows for some simple date arithmetic. For example,

my.date
## [1] "1970-01-01"
my.date + 1
## [1] "1970-01-02"
my.date - 1
## [1] "1969-12-31"
my.date + 31
## [1] "1970-02-01"

Logical comparisons can also be made

my.date
## [1] "1970-01-01"
my.date1 = as.Date("1980-01-01")
my.date1 > my.date
## [1] TRUE
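Because comparisons are vectorized, they can also be used to subset a vector of dates. The following sketch uses made-up dates and an illustrative cutoff:

```r
# Keep only the dates that fall strictly after a cutoff date.
dates  <- as.Date(c("1969-06-30", "1970-01-01", "1975-03-15", "1980-01-01"))
cutoff <- as.Date("1970-01-01")
after.cutoff <- dates[dates > cutoff]
after.cutoff
# range() returns the earliest and latest dates.
date.range <- range(dates)
date.range
```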

Subtracting two Date objects creates a difftime object and shows the number of days between the two dates

diff.date = my.date1 - my.date
diff.date
## Time difference of 3652 days
class(diff.date)
## [1] "difftime"
as.numeric(diff.date)
## [1] 3652
my.date + diff.date
## [1] "1980-01-01"
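If the difference is needed in units other than days, the base R difftime() function accepts a units argument. A minimal sketch, with illustrative variable names:

```r
# Express the time between two dates in weeks instead of days.
start <- as.Date("1970-01-01")
end   <- as.Date("1980-01-01")
span.weeks <- difftime(end, start, units = "weeks")
span.weeks
as.numeric(span.weeks)                    # plain number of weeks
as.numeric(end - start, units = "days")   # plain number of days
```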

Creating Date Sequences

Very often sequences of dates are required in the construction of time series objects. The base R function seq() (with method function seq.Date() for objects of class Date) can create many types of date sequences. The arguments to seq.Date() are

args(seq.Date)
## function (from, to, by, length.out = NULL, along.with = NULL, 
##     ...) 
## NULL

where from specifies the starting date, to specifies the ending date and by specifies the increment of the sequence. The by increment is a character string containing one of "day", "week", "month", "quarter" or "year"; it can be preceded by a (positive or negative) integer and a space, or followed by "s". For example, to create a sequence of Date objects spaced two months apart, starting on 1993-03-01 and ending on 2003-03-01, use

my.dates = seq(as.Date("1993/3/1"), as.Date("2003/3/1"), "2 months")
head(my.dates)
## [1] "1993-03-01" "1993-05-01" "1993-07-01" "1993-09-01" "1993-11-01"
## [6] "1994-01-01"
tail(my.dates)
## [1] "2002-05-01" "2002-07-01" "2002-09-01" "2002-11-01" "2003-01-01"
## [6] "2003-03-01"

Alternatively, use

my.dates = seq(from = as.Date("1993/3/1"), by = "2 months", length.out = 61)
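The other forms of the by argument work the same way. The sketch below, using an arbitrary start date, shows a weekly sequence, a sequence that steps backward in time, and a sequence with a fixed end date:

```r
start <- as.Date("2014-01-01")
# Weekly steps forward.
weekly <- seq(start, by = "week", length.out = 4)
weekly
# Daily steps backward, using a negative increment.
backward <- seq(start, by = "-1 day", length.out = 3)
backward
# Every three days up to a fixed end date.
every3 <- seq(start, as.Date("2014-01-10"), by = "3 days")
every3
```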

The seq() function can also be used to determine the date that is a specified number of days, weeks, months or years from a given date. For example, to find the date that is 5 months away from today's date use

Sys.Date()
## [1] "2014-05-13"
seq(from = Sys.Date(), by = "5 months", length.out = 2)[2]
## [1] "2014-10-13"

While the above is a clever solution, it is not very intuitive. The lubridate package, described later on, provides a much easier solution.
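In the meantime, the seq() idiom can be wrapped in a small helper to make the intent clearer. The function add.months() below is hypothetical (it is not part of base R or any package) and is only a sketch: it handles a single date, and start dates near the end of a month can roll over into the following month.

```r
# Hypothetical helper: shift a single Date by n months using seq().
add.months <- function(date, n) {
  seq(from = date, by = paste(n, "months"), length.out = 2)[2]
}

add.months(as.Date("2014-05-13"), 5)    # five months later
add.months(as.Date("2014-05-13"), -5)   # five months earlier
```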

Plotting Date Objects

Given a data set of Date objects, it is possible to graphically summarize the distribution of dates using the hist() function. For example, the following code simulates 500 random dates between 2013-01-01 and 2014-01-01 and plots a histogram summarizing the number of dates within each month

rint = round(runif(500) * 365)
startDate = as.Date("2013-01-01")
myDates = startDate + rint
head(myDates)
## [1] "2013-01-30" "2013-09-06" "2013-08-31" "2013-01-18" "2013-07-10"
## [6] "2013-01-06"
hist(myDates, breaks = "months", freq = TRUE, main = "Distribution of Dates by Month", 
    col = "slateblue1", xlab = "", format = "%b %Y", las = 2)

[Figure: histogram "Distribution of Dates by Month"]
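The counts behind such a histogram can also be tabulated directly. The sketch below is a variation on the simulation, not the original code: it fixes a seed (an arbitrary choice, made only for reproducibility) and uses sample() so that all dates stay within 2013. The names simDates and counts are illustrative.

```r
# Simulate 500 dates in 2013 and count how many fall in each month.
set.seed(123)  # assumption: any fixed seed would do
startDate <- as.Date("2013-01-01")
simDates  <- startDate + sample(0:364, 500, replace = TRUE)
counts    <- table(format(simDates, "%Y-%m"))
counts
```

Using the "%Y-%m" format keeps the month labels in chronological order and independent of the locale.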