Automating update of the Christiano, Motto and Rostagno (2014) database for the United-States

Our purpose is to write a program to automatically update a database similar to the one used in the bayesian estimation of the DSGE model developed in Christiano et al. (2014) (hereafter CMR) for the US.

Historical data (since the 80’s)

We need 12 series for the estimation of the CMR model:

  1. GDP
  2. Deflator of GDP
  3. Consumption
  4. Investment
  5. Deflator of investment
  6. Wages
  7. Hours worked
  8. Loans to non financial corporations
  9. Short-term interest rate
  10. Net worth
  11. Credit spread
  12. Term premium

To those 12 series we added 2 others:

  1. Loans to households and non-profit institutions serving households
  2. House prices

To create this database, many sources are used (BEA, FED’s Board of Governors, OECD, BIS, BLS). Mostly, we take data directly from DBnomics. For hours and wages, which come from BLS, and for Wilshire index, we use data from FRED (Saint-Louis Fed) using the plugin function FredR from Janko Cizel, which is updated.

The DBnomics API can be accessed through R thanks to the RSDMX package written by Blondel (2016). All the following code is written in R, thanks to the [@RCT16] and the RStudioTeam (2016).

Special case of consumption

Private consumption is a special case : the series of consumption in level provided by the Bureau of Economic Analysis do not exist before 1999, we need to use the series of growth rates available before 1999 to deduce past consumption levels. As in Christiano et al. (2014), aggregate consumption is the sum of Non Durable Goods and Services. Durable Goods are associated with Investment. The following figure shows the three deduced series of consumption.

rawdata <- rbind(  plugin_dbnomics("BEA/data/nipa-section2-20306-q/DNDGRX1.Q", "conso_nd"), # Real Personal Consumption Expenditures: Nondurable Goods, unit : billions 2009 $, SA annual rate
                   plugin_dbnomics("BEA/data/nipa-section2-20306-q/DSERRX1.Q", "conso_s"), # Real Personal Consumption Expenditures: Services, unit : billions 2009 $, SA annual rate
                   plugin_dbnomics("BEA/data/nipa-section2-20306-q/DDURRX1.Q", "conso_d"), # Real Personal Consumption Expenditures: Durable Goods, unit : billions 2009 $, SA annual rate
                   plugin_dbnomics("BEA/data/nipa-section2-20301-q/DNDGRL1.Q", "dlconso_nd"), # Real Personal Consumption Expenditures: Goods: Nondurable Goods, SA annual rate
                   plugin_dbnomics("BEA/data/nipa-section2-20301-q/DSERRL1.Q", "dlconso_s"), # Real Personal Consumption Expenditures: Services, SA annual rate
                   plugin_dbnomics("BEA/data/nipa-section2-20301-q/DDURRL1.Q", "dlconso_d") # Real Personal Consumption Expenditures: Goods: Durable Goods, SA annual rate
)

var_conso <- c("conso_nd", "conso_d", "conso_s")
dlvar_conso <- paste0("dl", var_conso)

val99_conso <- rawdata %>%
  filter(var %in% var_conso, date == "1999-01-01") %>%
  transmute(val99 = value,
            var_conso = var,
            dlvar_conso = factor(paste0("dl", var)))

rawdata %>%
  filter(var %in% dlvar_conso, date <= "1999-01-01") %>% 
  full_join(val99_conso, by = c("var" = "dlvar_conso")) %>%
  group_by(var) %>%
  arrange(desc(date)) %>% 
  mutate(value = val99 / lag(cumprod((1 + value/100)^(1/4)))) %>%
  ungroup() %>%
  transmute(value, var=var_conso, date) -> conso

rawdata <- bind_rows(rawdata, conso) %>% 
  filter(!(is.na(value) | var %in% c("dlconso_nd","dlconso_d","dlconso_s")))

plot_conso <- rawdata %>%
  mutate(date = as.Date(date),
         var = ifelse(var == "conso_nd", "Nondurable Goods",
                      ifelse(var == "conso_s","Services","Durable Goods")))

p <- ggplot(plot_conso,aes(date,value))+
  geom_line(colour=blueObsMacro)+
  facet_wrap(~var,ncol=3,scales = "free_y")+
  scale_x_date(expand = c(0.01,0.01)) +
  theme + xlab(NULL) + ylab(NULL)+
  theme(strip.text=element_text(size=10),
        axis.text=element_text(size=9))+
  ggtitle("Real Personal Consumption Expenditures")
p

plot of chunk unnamed-chunk-2

rawdata <- rbind(rawdata,
                 plugin_dbnomics("BEA/data/nipa-section1-10106-q/A191RX1.Q", "gdp"), 
                 # Real Gross Domestic Product, unit : billions 2009 $, SA annual rate
                 plugin_dbnomics("BEA/data/nipa-section1-10109-q/A191RD3.Q", "defgdp"), 
                 # Gross Domestic Product: Implicit Price Deflator, SA
                 plugin_dbnomics("FED/data/H15/129.FF.O", "shortrate", TRUE), 
                 # Effective Federal Funds Rate, NSA (source: Fed)
                 plugin_dbnomics("OECD/data/MEI/USA.IRLTLT01.ST.Q", "longrate"), 
                 # Long-Term Government Bond Yields: 10-year: Main (Including Benchmark) for the United States, NSA (source: OECD, MEI)
                 plugin_dbnomics("OECD/data/MEI/USA.LFWA64TT.STSA.Q", "pop"), 
                 # unit: 1000 person (source: OECD, MEI)
                 plugin_dbnomics("BIS/data/CNFS/Q.US.N.A.M.XDC.A", "loans_nfc"), 
                 # Quarterly - United States - Non-financial corporations - All sectors - Market value - Domestic currency (incl. conversion to current currency made using a fix parity) - Adjusted for breaks
                 plugin_dbnomics("BIS/data/CNFS/Q.US.H.A.M.XDC.A", "loans_hh"), 
                 #Quarterly - United States - Households & NPISHs - All sectors - Market value - Domestic currency (incl. conversion to current currency made using a fix parity) - Adjusted for breaks
                 plugin_dbnomics("BIS/data/PP-SS/Q.US.N.628", "houseprice"), 
                 #Quarterly - United States - Households & NPISHs - All sectors - Market value - Domestic currency (incl. conversion to current currency made using a fix parity) - Adjusted for breaks
                 data.frame(fred$series.observations(series_id = 'GPDIC1'), var="inves"), 
                 # real, unit : billions 2009 $, SA annual rate
                 data.frame(fred$series.observations(series_id = 'A006RD3Q086SBEA'), var="definves"), 
                 # Gross private domestic investment (implicit price deflator), SA
                 data.frame(fred$series.observations(series_id = 'HOANBS'), var="hours"), 
                 # unit : 2009=100 Nonfarm Business Sector: Hours of All Persons, SA (source: BLS)
                 data.frame(fred$series.observations(series_id = 'COMPNFB'), var="wage"), 
                 # unit : 2009=100, SA (source: BLS)
                 data.frame(fred$series.observations(series_id = 'BAA', frequency="q", aggregation='avg'), var="riskrate"), 
                 # Moody's Seasoned Baa Corporate Bond Yield, NSA (source: Fed)
                 data.frame(fred$series.observations(series_id = 'WILL5000IND', frequency="q", aggregation='avg'), var="networth")
                 # Wilshire 5000 Total Market Index, NSA (source: Wilshire)
               ) %>%
  filter(year(date) >= 1980) %>% 
  transmute(time = as.Date(date), 
              values = as.numeric(value), 
              var)
rawdata %>%
  spread(key = var, value = values) %>%
  write.csv(file = "US_CMR_rawdata.csv", row.names= FALSE)

Raw data can be downloaded directly from this link.

Final database and normalization

We can check the last date available for each variable.

maxDate <- rawdata %>% 
  group_by(var) %>% 
  summarize(maxdate=max(time)) %>% 
  arrange(maxdate)
kable(maxDate)
var maxdate
conso_d 2016-10-01
conso_nd 2016-10-01
conso_s 2016-10-01
defgdp 2016-10-01
gdp 2016-10-01
loans_hh 2016-10-01
loans_nfc 2016-10-01
definves 2017-04-01
hours 2017-04-01
houseprice 2017-04-01
inves 2017-04-01
longrate 2017-04-01
pop 2017-04-01
wage 2017-04-01
networth 2017-07-01
riskrate 2017-07-01
shortrate 2017-07-01
minmaxDate <- min(maxDate$maxdate)
rawdata %<>% filter(time <= minmaxDate)

So we filter the database until 2016 Q4.

Then data are normalized by capita and price if needed. Eventually we have the 14 series : the 12 series similar to CMR plus loans to households and house price series.

US_CMR_data <- rawdata %>%
  spread(key = var, value = values) %>% 
  transmute(time=time,
            gdp_rpc=1e+9*gdp/(1000*pop),
            conso_rpc=1e+9*(conso_nd+conso_s)/(1000*pop),
            inves_rpc=1e+9*(inves+conso_d)/(1000*pop),
            defgdp = defgdp/100,
            wage_rph=wage/defgdp,
            hours_pc=1e+9*hours/(1000*pop),
            pinves_defl=definves/defgdp,
            loans_nfc_rpc=1e+9*loans_nfc/(1000*pop)/defgdp,
            loans_hh_rpc=1e+9*loans_hh/(1000*pop)/defgdp,
            houseprice_defl=houseprice/defgdp,
            networth_rpc=1e+9*networth/(1000*pop)/defgdp,
            re=shortrate/100,
            slope=(longrate - shortrate)/100,
            creditspread=(riskrate - longrate)/100)

plot_US_CMR_data <- US_CMR_data %>% 
  gather(var, values, - time) %>%
  mutate(time = as.Date(time),
         var = as.factor(var))
levels(plot_US_CMR_data$var)<-listVar

p <- ggplot(plot_US_CMR_data,aes(time,values))+
  geom_line(colour=blueObsMacro)+
  facet_wrap(~var,ncol=3,scales = "free_y")+
  scale_x_date(expand = c(0.01,0.01)) +
  theme + xlab(NULL) + ylab(NULL)+
  theme(strip.text=element_text(size=10),
        axis.text=element_text(size=9))+
  ggtitle("CMR data for the US")
p

plot of chunk unnamed-chunk-7

US_CMR_data %>%
mutate(time=gsub(" ","",as.yearqtr(as.Date(time)))) %>%
  na.omit() %>%
  write.csv("US_CMR_data.csv", row.names=FALSE)

You can also download ready-to-use (normalized) data for the estimation on Dynare here.

Appendix

To download data, we make a call to the DBnomics API. Data is downloaded in SDMX format then coerced into the right format for R.

plugin_dbnomics <- function(series, var_name, average = FALSE) {
  url <- paste0("https://api.db.nomics.world/api/v1/sdmx/", series)
  data_sd <- readSDMX(url)
  df <- as_tibble(data_sd)

  data <- df %>%
    select(TIME_PERIOD, OBS_VALUE) %>%
    rename(date= TIME_PERIOD,
           value = OBS_VALUE) %>%
    mutate(var = var_name,
      value = as.numeric(value),
      realtime_start = "2016-05-27",
      realtime_end = "2016-05-27")
if (average) {
  data %<>%
    mutate(quarter = as.Date(as.yearqtr(date, format="%Y-%m-%d"))) %>%
    group_by(quarter) %>%
    summarize(count = n(),
      value =ifelse(count==3,mean(value),NA)) %>%
    mutate(quarter = as.character(quarter),
           var = var_name,
           realtime_start = "2016-05-27",
           realtime_end = "2016-05-27") %>%
    select(quarter, value, var, realtime_start,realtime_end) %>%
    rename(date= quarter)

} else if (nchar(data$date)[1] == 7) {
  data %<>%
    mutate(date = as.character(as.Date(as.yearqtr(data$date,format="%Y-Q%q"))))
} else {
    data %<>%
    mutate(date = as.character(as.Date(as.yearqtr(data$date))))
}
  return(data)
}

Bibliography

Emmanuel Blondel. rsdmx: Tools for Reading SDMX Data and Metadata. 2016. R package version 0.5-3. URL: https://CRAN.R-project.org/package=rsdmx.

L Christiano, Roberto Motto, and Massimo Rostagno. Risk shocks. American Economic Review, 104(1):27–65, 2014. URL: http://www.aeaweb.org/articles.php?doi=10.1257/aer.104.1.27, doi:10.1257/aer.104.1.27. 1 2

RStudio Team. RStudio: Integrated Development Environment for R. RStudio, Inc., Boston, MA, 2016. URL: http://www.rstudio.com/.

Comments

comments powered by Disqus