Last exercise set

Table of critical values (quantiles) from the standard normal and chi-squared distributions

qnorm(c(0.005, 0.025, 0.05, 0.95, 0.975, 0.995))
[1] -2.575829 -1.959964 -1.644854  1.644854  1.959964  2.575829
qchisq(c(0.9, 0.95, 0.99), 1)
[1] 2.705543 3.841459 6.634897
qchisq(c(0.9, 0.95, 0.99), 2)
[1] 4.605170 5.991465 9.210340
qchisq(c(0.9, 0.95, 0.99), 3)
[1]  6.251389  7.814728 11.344867
qchisq(c(0.9, 0.95, 0.99), 11)
[1] 17.27501 19.67514 24.72497
qchisq(c(0.9, 0.95, 0.99), 12)
[1] 18.54935 21.02607 26.21697
qchisq(c(0.9, 0.95, 0.99), 13)
[1] 19.81193 22.36203 27.68825

Exercise 1: Quality control and the CLT, based on Dekking et al. (2005)

A factory produces links for heavy metal chains. The research lab of the factory models the length (in cm) of a link by the random variable \(X\), with \(\mathbb{E}\left(X\right)=5\) and \(\mathsf{Var}\left(X\right)=0.04\). The length of a link is defined in such a way that the length of a chain is equal to the sum of the lengths of its links. The factory sells chains of 50 meters; to be on the safe side, 1002 links are used for such chains. The factory guarantees that the chain is not shorter than 5000 cm. If by chance a chain is too short, the customer is reimbursed and given a new chain for free.

  1. What is the probability that more than 1002 links are needed for a chain of at least 5000 cm? For what percentage of the chains does the factory have to reimburse clients and provide free chains?

  2. The sales department of the factory notices that it has to hand out a lot of free chains and asks the research lab what is wrong. After further investigation, the research lab reports to the sales department that the expected value 5 is incorrect and that the correct value is 4.99 (cm). Do you think it was necessary to report such a minor change in this value?

  3. What R commands would you use to answer the previous two questions?
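
For item 3, here is a minimal sketch of the kind of CLT-based command one might reach for in R; the numbers plugged in mirror the setup above, and whether this is exactly the probability to report is part of the exercise:

# Sketch: normal approximation, via the CLT, to the total length of n links,
# each with mean 5 cm and variance 0.04 cm^2
n <- 1002; mu <- 5; sigma2 <- 0.04
pnorm(5000, mean = n * mu, sd = sqrt(n * sigma2))  # approx. P(a 1002-link chain is shorter than 5000 cm)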

Exercise 2: Conceptual understanding of the properties that “good” estimation procedures possess

Suppose you are provided two estimators \(S\) and \(T\) for a parameter \(\theta\). Let \(n\) be the sample size. Suppose \(\mathbb{E}\left(S\right)=\theta\) and \(\mathbb{E}\left(T\right)=\theta+3/\sqrt{n}\). Furthermore, \(\mathsf{Var}\left(S\right)=4\) and \(\mathsf{Var}\left(T\right)=40/n\).

  1. Are \(S\) and \(T\) unbiased estimators of \(\theta\)? No explanation is needed.

  2. Are \(S\) and \(T\) consistent estimators of \(\theta\)? Show why or why not.

  3. Using your results, shed light on whether unbiasedness implies consistency and whether consistency implies unbiasedness.
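
If it helps to build intuition for item 2, here is a small simulation sketch; treating \(S\) and \(T\) as normally distributed is purely an illustrative assumption, and only the means and variances stated above come from the exercise.

# Sketch: simulate S ~ N(theta, 4) and T ~ N(theta + 3/sqrt(n), 40/n) for growing n
# (normality assumed only for illustration) and track how often each lands near theta
theta <- 0
for (n in c(10, 100, 10000)) {
  Sn <- rnorm(10^4, mean = theta, sd = 2)
  Tn <- rnorm(10^4, mean = theta + 3 / sqrt(n), sd = sqrt(40 / n))
  print(c(n = n, S.near.theta = mean(abs(Sn - theta) < 0.1), T.near.theta = mean(abs(Tn - theta) < 0.1)))
}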

Exercise 3: Confidence intervals, based on Stine and Foster (2004)

Hoping to lure more shoppers downtown, a city builds a new public parking garage in the central business district. The city plans to pay for the structure through parking fees. During the past 44 weekdays, daily fees collected averaged $1264 with a standard deviation of $150.

  1. What assumptions and conditions would you need to make in order to use these statistics for inference?

  2. Describe the parameter of interest to the city. Be specific.

  3. Compute a 95% confidence interval for the parameter in item 2. (A sketch of one possible R calculation appears after this list.)

  4. The consultant who advised the city on this project predicted that parking revenues would average $1300 per day. On the basis of your confidence interval, do you think the consultant was correct? Why or why not?

  5. Do you think the true value of the parameter in item 2 is in the confidence interval you computed in item 3?
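
For item 3 (as noted there), here is a minimal sketch of one possible R calculation, assuming a t-based interval built from the summary statistics reported above:

# Sketch: t-based 95% confidence interval from the reported summary statistics (n = 44 weekdays)
xbar <- 1264; s <- 150; n <- 44
xbar + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)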

Exercise 4: Reading R code and conceptual understanding of sampling distributions and confidence intervals

Read the R code below and answer the questions that follow. In the code, you will see two digits preceding every line; these two digits give the line number. Whenever possible, be very specific with your answers.

Let \(W_{1},\ldots,W_{15}\overset{\mathsf{IID}}{\sim}Exp\left(4\right)\). Note that \(\mathbb{E}\left(W_t\right)=1/4\) and \(\mathsf{Var}\left(W_t\right)=1/16\).

01 reps <- 10^4
02 wbar <- numeric(reps)
03 temp <- matrix(NA, nrow = reps, ncol = 2)
04 for (i in 1:reps)
05 {
06     w <- rexp(15, 4)
07     wbar[i] <- mean(w)
08     temp[i, ] <- c(mean(w)-1.96*sd(w)/sqrt(15), mean(w)+1.96*sd(w)/sqrt(15))
09 }
10 hist(wbar)
11 mean(wbar)
12 sd(wbar)
13 mean(temp[, 1] <= 1/4 & temp[, 2] >= 1/4)
14 mean(wbar-1/4 >= 0.1)

  1. Describe what specific statistical object line 8 is supposed to represent.

  2. Describe what specific statistical object line 10 is supposed to represent.

  3. What output do you expect line 11 to produce?

  4. What output do you expect line 12 to produce?

  5. What output do you expect line 13 to produce?

  6. What output do you expect line 14 to produce?

Exercise 5: Detecting mechanical defects using hypothesis testing, partially based on Stock and Watson (2019)

With a perfectly balanced roulette wheel, in the long run red numbers should turn up 18 times in 38. To test its wheel, one casino records the results of 3800 plays. Here you will set up the statistical model and conduct a hypothesis test. Let \(X_{t}=1\) when a red number turns up and \(X_t=0\) when a red number does not turn up.

  1. Based on what is described, \(X_{1},\ldots,X_{3800}\) are IID draws from what distribution?

  2. Under the assumption that in the long run red numbers should turn up 18 times in 38, what would be the mean \(\mu\) and variance \(\sigma^{2}\) of the distribution in the previous item?

  3. If there are really too many or too few reds, what can you say about the mean \(\mu\) of the distribution in item 1?

  4. You are asked to determine whether the roulette wheel is working properly or has a mechanical problem (too many or too few reds). Based on your previous answers, write down the null hypothesis (in mathematical form). Write down the alternative hypothesis (in mathematical form).

  5. Suppose that you observe 1890 red numbers in 3800 plays. What would you conclude?

  6. What R commands would you use to give a \(p\)-value?
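
For item 6, here is a minimal sketch of commands that could produce a \(p\)-value; a large-sample two-sided z-test is assumed, with binom.test() shown as a built-in alternative:

# Sketch: two-sided test of the hypothesized long-run proportion 18/38, given 1890 reds in 3800 plays
phat <- 1890 / 3800; p0 <- 18 / 38
z <- (phat - p0) / sqrt(p0 * (1 - p0) / 3800)
2 * pnorm(-abs(z))                  # approximate p-value via the CLT
binom.test(1890, 3800, p = 18/38)   # an exact alternative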

Exercise 6: I SEE THE MOUSE

Recall our example on I SEE THE MOUSE and the definition of \(Y\) and \(X_1\) for that example. You already know that the CEF of \(Y\) given \(X_1\) is different from the BLP of \(Y\) given \(X_1\).

  1. We can always write \(Y=\mathbb{E}\left(Y|X_1\right)+\varepsilon\). Find the distribution of the CEF error \(\varepsilon\). Find \(\mathbb{E}\left(\varepsilon|X_1\right)\) and \(\mathsf{Var}\left(\varepsilon|X_1\right)\).
  2. We can also always write \(Y=2+X_1+\varepsilon\). Find the distribution of the BLP error \(\varepsilon\). Find \(\mathbb{E}\left(\varepsilon|X_1\right)\) and \(\mathsf{Var}\left(\varepsilon|X_1\right)\).
  3. Is \(Y=2+X_1+\varepsilon\), where \(\varepsilon\) is a BLP error, a correctly specified linear regression model? Explain.
  4. Suppose you have data randomly sampled from the joint distribution of \(\left(X_1, Y\right)\), and you run a least squares regression of \(Y\) on \(X_1\) using the default settings of lm(). You then test the null hypothesis that \(\beta_1=1\) against the alternative that \(\beta_1\neq 1\) at the 5% level. Below you will find R code that evaluates the properties of the procedure just described. Discuss the Monte Carlo findings evaluating the procedure used to implement the test.

source <- matrix(c(1,3,3,5,0,2,1,1), ncol=2)
source # see what it looks like
     [,1] [,2]
[1,]    1    0
[2,]    3    2
[3,]    3    1
[4,]    5    1
get.test.stat <- function(n)
{
  # Follow the joint distribution of I SEE THE MOUSE
  data <- source[sample(nrow(source),size=n,replace=TRUE),]
  temp <- lm(data[,1]~data[,2])
  beta1hat <- coef(temp)[[2]]
  se.beta1hat <- sqrt(diag(vcov(temp)))[[2]]
  teststat <- (beta1hat-1)/se.beta1hat
  return(teststat)
}
results <- replicate(10^4, get.test.stat(50))
mean(results > 1.96 |  results < -1.96)
[1] 0.0293
results <- replicate(10^4, get.test.stat(800))
mean(results > 1.96 |  results < -1.96)
[1] 0.0169

Exercise 7: The linear probability model

Recall that the CEF can be equal to the BLP when you have a saturated model. This is the case where \(Y\) may be continuous or discrete, but the \(X\)’s all have to be binary (dummy) variables.

Consider the workplace smoking ban dataset described here. You will try to make sense of what a linear probability model is all about, how to interpret the coefficients of a linear probability model, how they relate to summaries of the data, and what a saturated model really means. The latter is usually used to justify the use of linear regression in settings where \(Y\) is discrete.

Let \(Y\) be the variable smoker, \(X_1\) be the variable smkban, and \(X_2\) be the variable female. All variables are dummies.

Below you will find R code related to this exercise.

01 library(foreign)
02 smoking <- read.dta("https://www.princeton.edu/~mwatson/Stock-Watson_3u/Students/EE_Datasets/Smoking.dta")
03 lm(smoker ~ 1, data = smoking)
04 mean(smoking$smoker)
05 lm(smoker ~ smkban, data = smoking)
06 tapply(smoking$smoker, smoking$smkban, mean)
07 lm(smoker ~ female, data = smoking)
08 tapply(smoking$smoker, smoking$female, mean)
09 lm(smoker ~ smkban + female, data = smoking)
10 lm(smoker ~ smkban + female + smkban:female, data = smoking)
11 tapply(smoking$smoker, list(smoking$smkban, smoking$female), mean)

Here is the R output:


Call:
lm(formula = smoker ~ 1, data = smoking)

Coefficients:
(Intercept)  
     0.2423  
[1] 0.2423

Call:
lm(formula = smoker ~ smkban, data = smoking)

Coefficients:
(Intercept)       smkban  
    0.28960     -0.07756  
        0         1 
0.2895951 0.2120367 

Call:
lm(formula = smoker ~ female, data = smoking)

Coefficients:
(Intercept)       female  
    0.25762     -0.02718  
        0         1 
0.2576209 0.2304417 

Call:
lm(formula = smoker ~ smkban + female, data = smoking)

Coefficients:
(Intercept)       smkban       female  
    0.29877     -0.07538     -0.01864  

Call:
lm(formula = smoker ~ smkban + female + smkban:female, data = smoking)

Coefficients:
  (Intercept)         smkban         female  smkban:female  
      0.31045       -0.09676       -0.04236        0.03965  
          0         1
0 0.3104493 0.2680895
1 0.2136860 0.2109795

  1. Since \(Y\) is a binary 0/1 variable, what does \(\mathbb{E}\left(Y\right)\) represent, and how does this help you interpret the results in lines 3 and 4?
  2. The conditional expectation \(\mathbb{E}\left(Y|X_1\right)\) is really a subgroup population average. What does this object represent? Justify the name “linear probability model”.
  3. Use the R output to determine whether least squares can be used to recover subgroup averages. How would you interpret the coefficients from the corresponding lm() command?
  4. Repeat Item 3 for \(\mathbb{E}\left(Y|X_2\right)\).
  5. Now consider \(\mathbb{E}\left(Y|X_1, X_2\right)\). Use the R output (lines 9 to 11) to determine how you should apply least squares so that you can recover this particular CEF.
  6. Based on what you have seen so far, what is now your understanding of a saturated model?
  7. Can you conclude that the causal effect of the smoking ban is represented by \(\mathbb{E}\left(Y|X_1=1,X_2=x\right)-\mathbb{E}\left(Y|X_1=0,X_2=x\right)\)?
  8. What if there were another dummy variable, say \(X_3\) for black? How would you apply least squares so that you can recover \(\mathbb{E}\left(Y|X_1, X_2, X_3\right)\)?
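
For item 8, here is a minimal sketch of the kind of fully interacted regression one might run, under the assumption that the dataset indeed contains a dummy variable named black:

# Sketch: saturated specification in three dummies (all main effects and all interactions)
lm(smoker ~ smkban * female * black, data = smoking)
# subgroup means to compare against, cell by cell
tapply(smoking$smoker, list(smoking$smkban, smoking$female, smoking$black), mean)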

Exercise 8: SAT scores, based on Wooldridge (2019)

Below you will find R code estimating the model \[\mathrm{sat} = \beta_0+\beta_1 \mathrm{hsize}+\beta_2\mathrm{hsize}^2+\beta_3 \mathrm{female}+\beta_4\mathrm{black}+\beta_5\mathrm{female}\times\mathrm{black}+\varepsilon \] The description[1] of the dataset can be found here.

Suppose that the conditions of Version 2 of the correctly specified linear regression model hold.

library(foreign)
gpa2 <- read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/gpa2.dta")
summary(gpa2$hsize)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.03    1.65    2.51    2.80    3.68    9.40 
result <- lm(sat ~ hsize + hsizesq + female + black + female:black, data=gpa2)
result

Call:
lm(formula = sat ~ hsize + hsizesq + female + black + female:black, 
    data = gpa2)

Coefficients:
 (Intercept)         hsize       hsizesq        female         black  
    1028.097        19.297        -2.195       -45.091      -169.813  
female:black  
      62.306  
library(sandwich)
vcovHC(result)
             (Intercept)        hsize       hsizesq        female         black
(Intercept)    40.560861 -20.31005096  2.3502359605 -9.052500e+00  -11.59581004
hsize         -20.310051  14.62999282 -1.9129657400  8.208576e-02    1.28851117
hsizesq         2.350236  -1.91296574  0.2764387415 -6.278926e-04   -0.08077265
female         -9.052500   0.08208576 -0.0006278926  1.789903e+01    8.83660577
black         -11.595810   1.28851117 -0.0807726456  8.836606e+00  242.04401938
female:black    8.248203   0.05581902  0.0326601173 -1.785279e+01 -241.54439938
              female:black
(Intercept)     8.24820279
hsize           0.05581902
hsizesq         0.03266012
female        -17.85279148
black        -241.54439938
female:black  379.20953866
library(car)
Loading required package: carData
linearHypothesis(result, c("hsize", "hsizesq"), rhs = 0, vcov = vcovHC(result), test = "Chisq")
Linear hypothesis test

Hypothesis:
hsize = 0
hsizesq = 0

Model 1: restricted model
Model 2: sat ~ hsize + hsizesq + female + black + female:black

Note: Coefficient covariance matrix supplied.

  Res.Df Df  Chisq Pr(>Chisq)    
1   4133                         
2   4131  2 29.552  3.826e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linearHypothesis(result, c("hsize + 3 * hsizesq"), rhs = 0, vcov = vcovHC(result), test = "Chisq")
Linear hypothesis test

Hypothesis:
hsize  + 3 hsizesq = 0

Model 1: restricted model
Model 2: sat ~ hsize + hsizesq + female + black + female:black

Note: Coefficient covariance matrix supplied.

  Res.Df Df  Chisq Pr(>Chisq)    
1   4132                         
2   4131  1 28.653  8.656e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  1. Show that \[\begin{eqnarray*}&&\mathbb{E}\left(\mathrm{sat}|\mathrm{hsize}=2, \mathrm{female}=0, \mathrm{black}=1\right)-\mathbb{E}\left(\mathrm{sat}|\mathrm{hsize}=1, \mathrm{female}=0, \mathrm{black}=1\right)\\ &=&\beta_1+3\beta_2\end{eqnarray*}\]
  2. Use Item 1 along with the R output to provide a possible, but valid, interpretation of the predicted differences in SAT scores for subgroups whose high school graduating classes differ in size.
  3. Compare what you have calculated in Item 1 to \[\frac{\partial \mathbb{E}\left(\mathrm{sat}|\mathrm{hsize}=x, \mathrm{female}=0, \mathrm{black}=1\right)}{\partial x}\] Comment on the discrepancies if there are any.
  4. Using the given information, test the null hypothesis that \(\beta_1+3\beta_2=0\) against the alternative that \(\beta_1+3\beta_2\neq 0\) at the 5% level using the following approach[2] (a sketch of how the calculation could be organized in R appears after this exercise's questions):
    1. Express in words what this null hypothesis means.
    2. Compute \(\widehat{\beta}_1+3\widehat{\beta}_2\).
    3. Find \(\mathsf{Var}\left(\widehat{\beta}_1+3\widehat{\beta}_2\right)\). Use the formula for the variance of a sum you derived before here.
    4. Compute the standard error of \(\widehat{\beta}_1+3\widehat{\beta}_2\).
    5. Calculate the relevant test statistic.
    6. Communicate your finding.
  5. The R output above also has a calculation of the test of the null that \(\beta_1+3\beta_2=0\). Do both approaches point to the same finding? What do you notice about how the test statistic from the computer output is related to what you found in Item 4e?
  6. What type of standard errors were constructed in the R output?
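
As flagged in Item 4, here is a minimal sketch of how the by-hand calculation could be organized in R, reusing result from the output above:

# Sketch: test of H0: beta1 + 3*beta2 = 0 built from the estimated coefficients and vcovHC()
library(sandwich)
b <- coef(result)
V <- vcovHC(result)
est <- b["hsize"] + 3 * b["hsizesq"]
se  <- sqrt(V["hsize", "hsize"] + 9 * V["hsizesq", "hsizesq"] + 6 * V["hsize", "hsizesq"])
est / se   # compare with qnorm(0.975); squaring this should match the Chisq statistic reported above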

Exercise 9: Effects of assuming independence

Consider the following equation \(Y=\beta_0+\beta_1X+\varepsilon\). You have encountered different types of assumptions on \(\varepsilon\):

A. \(\varepsilon\) is independent of \(X\) and \(\mathbb{E}\left(\varepsilon\right)=0\).

B. \(\mathbb{E}\left(\varepsilon|X\right)=0\).

C. \(\mathbb{E}\left(\varepsilon\right)=0\) and \(\mathbb{E}\left(X\varepsilon\right)=0\).

  1. Show that Assumption A implies Assumption B. Show that Assumption B implies Assumption C.
  2. Show that Assumption A implies that \(\mathsf{Var}\left(\varepsilon|X\right)=\mathsf{Var}\left(\varepsilon\right)\).
  3. Use item 2 to justify why MRW assumed conditional homoscedasticity when computing standard errors.

Exercise 10: Estimating elasticities

So far, you have encountered multiple applications where the target parameter is an elasticity, or at least where the variables may be logarithmically transformed. Another potential issue with the logarithmic transformation is explored in this exercise. Recall that the elasticity of \(Y\) with respect to \(X\) is the ratio of the percentage difference in \(Y\) to the percentage difference in \(X\), i.e. \[\frac{\partial \log Y}{\partial \log X}=\frac{\dfrac{1}{Y}\partial Y}{\dfrac{1}{X}\partial X}=\frac{X}{Y}\cdot\frac{\partial Y}{\partial X}\]

Consider a stylized situation where \(Z=\alpha_{1}W^{\alpha_{2}}V^{\alpha_{3}}\eta\) and \(\mathbb{E}\left(\eta|W, V\right)=1\). The target parameter of interest is \(\alpha_{2}\).

  1. Show that the CEF of \(Z\) given \(W\) and \(V\) is \(\mathbb{E}\left(Z|W,V\right)=\alpha_{1}W^{\alpha_{2}}V^{\alpha_{3}}\).
  2. Show that \[\dfrac{\partial\log\mathbb{E}\left(Z|W,V\right)}{\partial\log W} = \alpha_{2}\]
  3. Note that you can write \(Z=\alpha_{1}W^{\alpha_{2}}V^{\alpha_{3}}+v\) and \(v\) is a CEF error. Does this mean that A2 and A3 in Version 2 of the correctly specified linear regression model are satisfied?
  4. Given item 3, it is not surprising that many have tried to log-linearize this model, i.e., \[\log Z=\log\alpha_{1}+\alpha_{2}\log W+\alpha_3\log V+\log\eta\] Show that \[\begin{aligned}\dfrac{\partial\mathbb{E}\left(\log Z|W,V\right)}{\partial\log W} = \alpha_{2}+\dfrac{\partial\mathbb{E}\left(\log\eta|W,V\right)}{\partial\log W}=\alpha_{2}+W\cdot\dfrac{\partial\mathbb{E}\left(\log\eta|W,V\right)}{\partial W}\end{aligned}\]
  5. Think back to the context of MRW. Think of \(\eta\) as deviations from the initial level of technology, \(W\) as the saving rate, and \(V\) as the population growth rate. Use the results you have obtained and properties of the conditional expectation to explain why MRW impose an independence assumption which enables them to recover elasticities correctly by applying OLS to the log-linearized model.

Exercise 11: Is it a demand curve or a supply curve?

A model inspired by the theory of demand and supply could be written as: \[\begin{eqnarray}q_t^d &=&\alpha_0+\alpha_1 p_t+u_t \\ q_t^s &=&\beta_0+\beta_1 p_t+v_t\\ q_t^d &=& q_t^s=q_t\end{eqnarray}\] Assume IID sampling over \(t\), \(\mathbb{E}\left(u_t\right)=0\), \(\mathbb{E}\left(v_t\right)=0\), \(\mathsf{Var}\left(u_t\right)=\sigma^2_u\), \(\mathsf{Var}\left(v_t\right)=\sigma^2_v\), and \(\mathsf{Cov}\left(u_t,v_t\right)=0\). The variables \((q_t, p_t)\) for all \(t\) are observable, but \((u_t,v_t)\) are unobservable.

  1. Show that, in equilibrium, we have \[\begin{eqnarray} p_t &=& \frac{\beta_0-\alpha_0}{\alpha_1-\beta_1}+\frac{v_t-u_t}{\alpha_1-\beta_1} \\ q_t &=& \frac{\alpha_1\beta_0-\alpha_0\beta_1}{\alpha_1-\beta_1}+\frac{\alpha_1 v_t-\beta_1 u_t}{\alpha_1-\beta_1}.\end{eqnarray}\]
  2. Are the previous equations reduced forms? Explain.
  3. Suppose a researcher specifies \(q_t=\gamma_0+\gamma_1 p_t+\varepsilon_t\). Show that applying least squares to a regression of \(q_t\) on \(p_t\) will give an estimator for the slope of price having the following asymptotic behavior: \[\widehat{\gamma}_1\overset{p}{\to} \frac{\alpha_1\sigma^2_v+\beta_1\sigma^2_u}{\sigma^2_u+\sigma^2_v}.\]
  4. Notice that \(\widehat{\gamma}_1\) in large samples behaves like a weighted average of two objects. What are those two objects? Do you learn the slope of the demand curve? Do you learn the slope of the supply curve?
  5. There are no available instrumental variables in the setting described. If you really want to learn the slope of the demand curve from applying least squares to a regression of \(q_t\) on \(p_t\), what assumption can you impose to guarantee this?

Exercise 12: Demand and supply, redux

In the previous exercise, you encountered a situation where we are unable to use an instrumental variables strategy. Consider a modification of the model as follows: \[\begin{eqnarray}q_t^d &=&\alpha_0+\alpha_1 p_t+u_t \\ q_t^s &=&\beta_0+\beta_1 p_t+\beta_2 s_t + v_t\\ q_t^d &=& q_t^s=q_t\end{eqnarray}\] Assume IID sampling over \(t\), \(\mathbb{E}\left(u_t\right)=0\), \(\mathbb{E}\left(v_t\right)=0\), \(\mathsf{Var}\left(u_t\right)=\sigma^2_u\), \(\mathsf{Var}\left(v_t\right)=\sigma^2_v\), and \(\mathsf{Cov}\left(u_t,v_t\right)=0\). The variables \((q_t, p_t, s_t)\) for all \(t\) are observable, but \((u_t,v_t)\) are unobservable. Finally, assume that \(\mathbb{E}\left(s_tu_t\right)=0\) and \(\mathbb{E}\left(s_tv_t\right)=0\).

  1. Show that, in equilibrium, we have \[\begin{eqnarray} p_t &=& \frac{\beta_0-\alpha_0}{\alpha_1-\beta_1}+\frac{\beta_2}{\alpha_1-\beta_1}s_t+\frac{v_t-u_t}{\alpha_1-\beta_1} \\ q_t &=& \frac{\alpha_1\beta_0-\alpha_0\beta_1}{\alpha_1-\beta_1}+\frac{\alpha_1\beta_2}{\alpha_1-\beta_1}s_t+\frac{\alpha_1 v_t-\beta_1 u_t}{\alpha_1-\beta_1}.\end{eqnarray}\]
  2. Are the previous equations reduced forms? Explain.
  3. Suppose the researcher wants to estimate the parameters of the demand curve \(\alpha_0\) and \(\alpha_1\). Show that \(\mathbb{E}\left(p_tu_t\right)\neq 0\). What does this finding imply when we use OLS to estimate the parameters of the demand curve?
  4. Propose an instrumental variables strategy to allow you to estimate the parameters of the demand curve. Write down the set of moment conditions you have to impose so that you can estimate the said parameters.
  5. Specify the null hypothesis if you want to test whether or not the excluded instruments are weak.
  6. In a few sentences, describe what you think is the message of this exercise. How can we apply an instrumental variables strategy in a supply-and-demand context when our interest centers on estimating the parameters of the demand curve? Try to compare your findings with the previous exercise, where \(s_t\) is not present in the model.

Exercise 13: Empirical application on OLS and IV, partially based on Stock and Watson (2019)

During the 1880s, a cartel known as the Joint Executive Committee (JEC) controlled the rail transport of grain from the Midwest to eastern cities of the United States. The cartel preceded the Sherman Antitrust Act of 1890, and it legally operated to increase the price of grain above what would have been the competitive price. From time to time, cheating by members of the cartel brought about a temporary collapse of the collusive price-setting agreement. In this exercise, you will use variations in supply associated with the cartel’s collapses to estimate the elasticity of demand for rail transport of grain.

Data on weekly observations on the rail shipping price and other factors from 1880 to 1886 are loaded in R as the object JEC. The description of the variables may be found here.

A model for the demand for rail transport was specified as \[\log Q_t = \beta_0+\beta_1\log P_t + \beta_2 \mathrm{ice}_t + \sum_{j=1}^{12} \beta_{2+j} \mathrm{seas}_{jt}+\varepsilon_t\] and empirical results based on the estimation of such a model are presented below along with the R code and a summary of the output.

library(foreign)
JEC <- read.dta("https://www.princeton.edu/~mwatson/Stock-Watson_3u/Students/EE_Datasets/JEC.dta")
hist(JEC$price)

JEC$logQ <- log(JEC$quantity)
JEC$logP <- log(JEC$price)
result.1 <- lm(logQ ~ logP + ice, data = JEC)
result.2 <- lm(logQ ~ logP + ice + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12, data = JEC)
library(AER)
result.3 <- ivreg(logQ ~ logP + ice + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12 | cartel + ice + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12, data = JEC)
result.4 <- lm(logP ~ cartel + ice + seas1 + seas2 + seas3 + seas4 + seas5 + seas6 + seas7 + seas8 + seas9 + seas10 + seas11 + seas12, data = JEC)
library(tibble)
rows <- tribble(~label, ~result.1, ~result.2, ~result.3, ~result.4, 
                'Seasonal dummies', 'No', 'Yes', 'Yes', 'Yes',
                'Sum of squared residuals', format(sum(residuals(result.1)^2), digits = 4), format(sum(residuals(result.2)^2), digits = 4), format(sum(residuals(result.3)^2), digits = 4), format(sum(residuals(result.4)^2), digits = 4))
attr(rows, 'position') <- c(7, 8)
library(modelsummary)
results <- list()
results[["result.1"]] <- result.1
results[["result.2"]] <- result.2
results[["result.3"]] <- result.3
results[["result.4"]] <- result.4
modelsummary(results, coef_map = c("logP", "ice", "cartel"), gof_map = c("nobs"), add_rows = rows)
                           result.1   result.2   result.3   result.4
logP                        −0.634     −0.639     −0.867
                           (0.082)    (0.082)    (0.132)
ice                          0.410      0.448      0.423      0.035
                           (0.048)    (0.120)    (0.122)    (0.064)
cartel                                                        0.358
                                                            (0.025)
Seasonal dummies                No        Yes        Yes        Yes
Sum of squared residuals     54.54       49.4       50.6      13.99
Num.Obs.                       328        328        328        328

  1. What do you think is the base category used for the seasonal dummies?
  2. Suppose that the assumptions for the validity of OLS in this application hold.
    1. Provide a valid interpretation of the elasticity of demand in result.2.
    2. What type of standard errors do you think were computed in this application?
    3. Test the null hypothesis that price cannot predict demand against the alternative that price can predict demand at the 5% level. Communicate your finding.
    4. Suppose the researcher is interested in testing the null hypothesis that seasonal variations cannot predict demand at the 5% level. How many restrictions would there be?
    5. Do you have enough information to test the null hypothesis that seasonal variations cannot predict demand? If you can, calculate the relevant statistic and communicate your finding. If you cannot, what information would you need to be able to implement the test (giving the R commands for this is not necessary)?
  3. Do you think the IID assumption is plausible in this application? Explain.
  4. The interaction of supply and demand could make the OLS estimator of the elasticity of demand biased and inconsistent. What assumption do you think is violated by this concern?
  5. The researcher used the dummy variable cartel as part of an instrumental variables strategy. Consult the R code related to how the strategy was implemented.
    1. Provide an argument as to why cartel could be exogenous with respect to \(\varepsilon_t\).
    2. Provide an argument as to why cartel would satisfy instrument relevance.
    3. Is there evidence that cartel is a weak instrument? If you can, show your calculations and communicate your finding (a sketch of one possible calculation appears after this exercise's questions); if not, what information would you need to determine whether cartel is a weak instrument?
    4. What are \(L\) and \(K\) for this instrumental variables strategy?
    5. Are you able to test for overidentifying restrictions? If you can, compute the Sargan statistic and communicate your finding. If you cannot, what additional information would you need to be able to implement this test?
  6. Does the evidence suggest that the cartel was charging the profit-maximizing monopoly price?[3]
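
For item 5.3 (as flagged there), here is a minimal sketch of one possible check of instrument strength based on the first-stage regression result.4; comparing the first-stage F statistic with the usual rule-of-thumb threshold of about 10 is the check assumed here:

# Sketch: first-stage F statistic for the single excluded instrument (cartel) in result.4
library(car)
linearHypothesis(result.4, "cartel = 0", test = "F")
# or, directly from the reported first-stage output: t = 0.358 / 0.025, and F = t^2 with one restriction
(0.358 / 0.025)^2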

Footnotes

  1. Of course, descriptions of the variables will be made available in the exam.

  2. Note that this question could have been reframed in terms of a confidence interval.

  3. Click the links to review price elasticity of demand and monopoly pricing.