Automating update of the Christiano, Motto and Rostagno (2014) database for the United States

Our purpose is to write a program that automatically updates a database similar to the one used in the Bayesian estimation of the DSGE model developed in [@Chri14a] (hereafter CMR) for the United States.

We need 12 series for the estimation of the CMR model:

  1. GDP
  2. Deflator of GDP
  3. Consumption
  4. Investment
  5. Deflator of investment
  6. Wages
  7. Hours worked
  8. Loans to non-financial corporations
  9. Short-term interest rate
  10. Net worth
  11. Credit spread
  12. Term premium

To those 12 series we added 2 others:

  1. Loans to households and non-profit institutions serving households
  2. House prices

To create this database, we use several sources (BEA, BIS, BLS, OECD). Most data are taken directly from DBnomics through the rdbnomics package. For the Wilshire 5000 Total Market Index and Moody's Seasoned Baa Corporate Bond Yield, we fetch data from FRED (Federal Reserve Bank of St. Louis) using the FredR package by Janko Cizel. All the following code is written in R, thanks to [@Rct16] and [@RStu16].
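Before running the chunks below, the following packages are assumed to be loaded (a minimal setup sketch; the plotting theme `theme`, the colour `blueObsMacro`, and the label list `listVar` used later are defined elsewhere in the original post):

```r
# Packages assumed throughout the post
library(rdbnomics)  # rdb(): fetch series from DBnomics
library(tidyverse)  # dplyr, tidyr, ggplot2, stringr pipelines
library(magrittr)   # %<>% assignment pipe
library(lubridate)  # year(), quarter(), yq(), ymd()
library(zoo)        # as.yearqtr()
library(knitr)      # kable()
library(FredR)      # FRED API client
```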

Raw data from BEA, BIS, BLS and OECD

We use DBnomics to retrieve data from BEA, BIS, BLS and OECD.

df <- rdb(ids=c("BEA/NIPA-T10106/A191RX-Q",
                "BEA/NIPA-T10109/A191RD-Q",
                "BEA/NIPA-T10106/A006RX-Q",
                "BEA/NIPA-T10109/A006RD-Q",
                "BIS/CNFS/Q.US.N.A.M.XDC.A",
                "BIS/CNFS/Q.US.H.A.M.XDC.A",
                "BIS/PP-SS/Q.US.N.628",
                "BLS/pr/PRS85006033",
                "BLS/pr/PRS85006103",
                "OECD/MEI/USA.IRLTLT01.ST.Q",
                "OECD/MEI/USA.LFWA64TT.STSA.Q")) %>% 
  mutate(series_name=case_when(str_detect(series_code,"RD-") ~ paste("Deflator,",series_name),
                               str_detect(series_code,"RX-") ~ paste("Real,",series_name),
                               str_detect(series_code,"Q.US.N.A.M.XDC.A") ~ paste("Loans to non-financial corporations,",series_name),
                               str_detect(series_code,"Q.US.H.A.M.XDC.A") ~ paste("Loans to households and NPISHs,",series_name),
                               str_detect(series_code,"Q.US.N.628") ~ paste("Property prices,",series_name),
                               TRUE ~ series_name)) %>% 
  select(var_name=series_name,
         var_code=series_code,
         value,
         period)

df %<>% 
  mutate(var_code=
           case_when(var_code=="A191RX-Q" ~ "gdp",
                     var_code=="A006RX-Q" ~ "inves",
                     var_code=="A191RD-Q" ~ "defgdp",
                     var_code=="A006RD-Q" ~ "definves",
                     var_code=="Q.US.H.A.M.XDC.A" ~ "loans_hh",
                     var_code=="Q.US.N.A.M.XDC.A" ~ "loans_nfc",
                     var_code=="Q.US.N.628" ~ "houseprice",
                     var_code=="PRS85006033" ~ "hours",
                     var_code=="PRS85006103" ~ "wage",
                     var_code=="USA.LFWA64TT.STSA.Q" ~ "pop",
                     var_code=="USA.IRLTLT01.ST.Q" ~ "longrate"))

We create a quarterly short-term interest rate series by taking the mean of the monthly federal funds rate series.

shortrate <- 
  rdb(ids="FED/H15/129.FF.O") %>% 
  mutate(period=paste(year(period),quarter(period),sep="-")) %>% 
  group_by(period) %>% 
  summarise(value=mean(value)) %>% 
  mutate(var_code="shortrate",
         var_name="Monthly – Federal funds – Overnight",
         period=yq(period))

Special case of consumption

Private consumption is a special case: the consumption series in levels provided by the Bureau of Economic Analysis do not exist before 1999, so we use the growth-rate series available before 1999 to deduce past consumption levels. As in [@Chri14a], aggregate consumption is the sum of nondurable goods and services, while durable goods are associated with investment. The following figure shows the three deduced consumption series.

conso_level <- 
  rdb(ids=c("BEA/NIPA-T20306/DDURRX-Q",
            "BEA/NIPA-T20306/DNDGRX-Q",
            "BEA/NIPA-T20306/DSERRX-Q")) %>% 
  select(period,
         value,
         var_name=Concept)

conso_rate <- 
  rdb(ids=c("BEA/NIPA-T20301/DDURRL-Q",
            "BEA/NIPA-T20301/DNDGRL-Q",
            "BEA/NIPA-T20301/DSERRL-Q")) %>% 
  select(period,
         value,
         var_name=Concept)

conso_level_99 <-
  conso_level %>% 
  filter(period=="1999-01-01")

conso <-
  conso_rate %>% 
  filter(period <= "1999-01-01") %>% 
  full_join(conso_level_99,by="var_name") %>% 
  group_by(var_name) %>% 
  arrange(desc(period.x)) %>% 
  mutate(value = value.y / lag(cumprod((1 + value.x/100)^(1/4)))) %>%
  ungroup() %>% 
  transmute(period=period.x,
            var_name,
            value) %>% 
  na.omit() %>% 
  bind_rows(conso_level) %>% 
  filter(period >= "1980-01-01")
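The back-casting step above divides the known 1999Q1 level by the cumulated quarterly growth factors implied by the annualized growth rates. With hypothetical toy numbers (the level and rates below are illustrative, not BEA data), the arithmetic is:

```r
# Toy illustration of the back-casting formula used above.
level_1999q1 <- 100
# Annualized growth rates (%) for 1999Q1, 1998Q4, 1998Q3,
# ordered by descending period as in arrange(desc(period.x))
rates <- c(4, 2, 3)
# Convert annualized rates to quarterly growth factors
factors <- (1 + rates / 100)^(1 / 4)
# Level at t-1 = level at t divided by the growth factor of quarter t,
# hence the cumulative product when walking further back
past_levels <- level_1999q1 / cumprod(factors)
# past_levels[1] is the 1998Q4 level, past_levels[2] the 1998Q3 level, ...
```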

ggplot(conso,aes(period,value))+
  geom_line(colour=blueObsMacro)+
  facet_wrap(~var_name,ncol=3,scales = "free_y")+
  scale_x_date(expand = c(0.01,0.01)) +
  theme + xlab(NULL) + ylab(NULL)+
  ggtitle("Real Personal Consumption Expenditures")

(Figure: Real Personal Consumption Expenditures, one panel per series: Durable goods, Nondurable goods, Services)

conso %<>% 
  mutate(
    var_code=case_when(
      var_name=="Durable goods" ~ "conso_d",
      var_name=="Nondurable goods" ~ "conso_nd",
      var_name=="Services" ~ "conso_s"
    ),
    var_name=paste("Real Personal Consumption Expenditures,",var_name))

Financial data from FRED
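The `fred` object used below is a FredR client; it must be created beforehand with a FRED API key. A minimal setup sketch (the `FRED_API_KEY` environment-variable name is an assumption, substitute your own key):

```r
# Create the FRED client used below (requires a free FRED API key).
# FRED_API_KEY is an assumed environment-variable name.
api.key <- Sys.getenv("FRED_API_KEY")
fred <- FredR::FredR(api.key)
```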

fred_data <- 
  bind_rows(data.frame(fred$series.observations(series_id = 'BAA',
                                                frequency="q", 
                                                aggregation='avg'), 
                       var_name="Moody's Seasoned Baa Corporate Bond Yield",
                       var_code="riskrate"), 
            data.frame(fred$series.observations(series_id = 'WILL5000IND', 
                                                frequency="q", 
                                                aggregation='avg'), 
                       var_name="Wilshire 5000 Total Market Index",
                       var_code="networth")) %>% 
  transmute(period=ymd(date),
            value=as.numeric(value),
            var_code,var_name)

Final database and normalization

rawdata <- 
  bind_rows(conso,df,shortrate,fred_data) %>%
  filter(year(period) >= 1980)
var_names <- unique(rawdata$var_name)
var_names <- gsub("Expenditures,.*","",var_names) %>% unique()

We can check the last date available for each variable.

maxDate <- 
  rawdata %>% 
  group_by(var_code) %>% 
  summarize(maxdate=max(period)) %>% 
  arrange(maxdate)
kable(maxDate)
| var_code   | maxdate    |
|------------|------------|
| loans_hh   | 2017-07-01 |
| loans_nfc  | 2017-07-01 |
| conso_d    | 2017-10-01 |
| conso_nd   | 2017-10-01 |
| conso_s    | 2017-10-01 |
| defgdp     | 2017-10-01 |
| definves   | 2017-10-01 |
| gdp        | 2017-10-01 |
| houseprice | 2017-10-01 |
| inves      | 2017-10-01 |
| hours      | 2018-01-01 |
| longrate   | 2018-01-01 |
| pop        | 2018-01-01 |
| wage       | 2018-01-01 |
| networth   | 2018-04-01 |
| riskrate   | 2018-04-01 |
| shortrate  | 2018-04-01 |
minmaxDate <- min(maxDate$maxdate)
rawdata %<>% filter(period <= minmaxDate) %>% select(-var_name)
rawdata %>%
  spread(var_code,value) %>%
  write.csv(file = "US_CMR_rawdata.csv", row.names= FALSE)

We therefore filter the database up to 2017 Q3, the last quarter available for all series. Raw data can be downloaded directly here.

Then the data are normalized per capita and deflated where needed. In the end we have 14 series: the 12 series similar to CMR's, plus the loans to households and house price series.

US_CMR_data <- 
  rawdata %>%
  spread(var_code,value) %>% 
  transmute(period,
            gdp_rpc=1e+9*gdp/(1000*pop),
            conso_rpc=1e+9*(conso_nd+conso_s)/(1000*pop),
            inves_rpc=1e+9*(inves+conso_d)/(1000*pop),
            defgdp = defgdp/100,
            wage_rph=wage/defgdp,
            hours_pc=1e+9*hours/(1000*pop),
            pinves_defl=definves/defgdp,
            loans_nfc_rpc=1e+9*loans_nfc/(1000*pop)/defgdp,
            loans_hh_rpc=1e+9*loans_hh/(1000*pop)/defgdp,
            houseprice_defl=houseprice/defgdp,
            networth_rpc=1e+9*networth/(1000*pop)/defgdp,
            re=shortrate/100,
            slope=(longrate - shortrate)/100,
            creditspread=(riskrate - longrate)/100)

US_CMR_data %>%
  mutate(period=gsub(" ","",as.yearqtr(as.Date(period)))) %>%
  write.csv("US_CMR_data.csv", row.names=FALSE)
plot_US_CMR_data <- 
  US_CMR_data %>% 
  gather(var, value, - period) %>%
  mutate(var = as.factor(var))
levels(plot_US_CMR_data$var) <- listVar

ggplot(plot_US_CMR_data,aes(period,value))+
  geom_line(colour=blueObsMacro)+
  facet_wrap(~var,ncol=3,scales = "free_y")+
  scale_x_date(expand = c(0.01,0.01)) +
  theme + xlab(NULL) + ylab(NULL)+
  theme(strip.text=element_text(size=10),
        axis.text=element_text(size=9))+
  ggtitle("CMR data for the US")

(Figure: CMR data for the US, one panel per normalized series)

You can also download ready-to-use (normalized) data for the estimation here.
