Exercise Set 04

Technical exercise 1: practice with interaction terms

Let \(Y_t\) be the regressand and \(X_t^\prime=\left(1,X_{1t},X_{2t},X_{1t}*X_{2t}\right)\) be the list of regressors, so that the regression line is given by \[\widehat{Y}_t=\widehat{\beta}_0+\widehat{\beta}_1X_{1t}+\widehat{\beta}_2X_{2t}+\widehat{\beta}_3 X_{1t}*X_{2t}\] Suppose \(X_{1t}\) and \(X_{2t}\) are both continuous regressors.

  1. What would be the fitted value if \(X_{1t}\) is fixed at \(x_1\) and \(X_{2t}\) is fixed at \(x_2\)? Call this fitted value \(\widehat{Y}_A\).
  2. What would be the fitted value if \(X_{1t}\) is fixed at the same value \(x_1\) but \(X_{2t}\) is fixed at a different value \(x_2^\prime=x_2+1\)? Call this fitted value \(\widehat{Y}_B\).
  3. Obtain a simplified expression for \(\widehat{Y}_B-\widehat{Y}_A\). What happens to this expression when \(x_1=0\)? What happens when \(x_1\neq 0\)?
  4. In this situation, is it possible to provide an interpretation for the coefficient of the interaction term alone? Explain why or why not.
  5. Map \(Y\) to stndfnl, \(X_1\) to atndrte, and \(X_2\) to priGPA. Using what you have seen in the previous items, provide the best interpretation (for communication purposes) of the coefficient on priGPA alone.
  6. If we compare students who have the same attendance rate of 80 percent, how do those students whose prior college GPA is higher by 1 point compare in terms of their standardized final exam scores?
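
One way to check your algebra for items 1 to 3 is to evaluate the fitted values numerically. The sketch below is only an illustration: the coefficient values are made up (they are not estimates from any dataset), and the function names `fitted` and `one_unit_difference` are hypothetical helpers introduced here.

```python
# Hypothetical coefficient estimates, chosen only for illustration.
# They are NOT estimates from any dataset.
b0, b1, b2, b3 = 2.0, -0.5, 1.5, 0.25


def fitted(x1, x2):
    """Fitted value from a regression line with an interaction term."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def one_unit_difference(x1, x2):
    """Fitted value at (x1, x2 + 1) minus the fitted value at (x1, x2)."""
    return fitted(x1, x2 + 1) - fitted(x1, x2)


# Vary x1 and the starting value x2, then inspect which of the two
# actually changes the difference in fitted values.
for x1 in (0.0, 1.0, 4.0):
    for x2 in (0.0, 3.0):
        diff = one_unit_difference(x1, x2)
        print(f"x1={x1}, x2={x2}: difference = {diff}")
```

Compare the printed differences with your simplified expression from item 3, paying attention to whether the difference depends on \(x_1\), on \(x_2\), or on both.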

Technical exercise 2: more on interaction terms

Let \(Y_t\) be the regressand and \(X_t^\prime=\left(1,X_{1t},X_{2t},X_{1t}*X_{2t}\right)\) be the list of regressors, so that the regression line is given by \[\widehat{Y}_t=\widehat{\beta}_0+\widehat{\beta}_1X_{1t}+\widehat{\beta}_2X_{2t}+\widehat{\beta}_3 X_{1t}*X_{2t}\] Suppose \(X_{1t}\) and \(X_{2t}\) are both dummy variables. Note that there would be four subgroups:

  • Those \(t\) for which \(X_{1t} =X_{2t}=1\)
  • Those \(t\) for which \(X_{1t} =X_{2t}=0\)
  • Those \(t\) for which \(X_{1t}=1\), but \(X_{2t}=0\)
  • Those \(t\) for which \(X_{1t}=0\), but \(X_{2t}=1\)

  1. Discuss how you would interpret \(\widehat{\beta}_0\), \(\widehat{\beta}_1\), \(\widehat{\beta}_2\), and \(\widehat{\beta}_3\). Repeat Item 4 of Technical exercise 1 for this case.
  2. Return to the article entitled “Is Economics a Good Major for Future Lawyers? Evidence from Earnings Data”. Refer to Table 3 of that paper. Do you think it makes sense to create interaction terms for the majors (for example, the dummy variable for electrical engineering multiplied by the dummy variable for chemistry)? Explain why or why not.
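
To check your interpretations in item 1 of this exercise, it may help to compare the fitted values of the four subgroups with their sample averages. The sketch below uses simulated data, so every number in it is made up; it assumes NumPy is available, and `group_mean` and `fitted_value` are hypothetical helper names introduced here.

```python
import numpy as np

# Simulated data for illustration only; the coefficients used to
# generate y are arbitrary and do not come from any real dataset.
rng = np.random.default_rng(0)
n = 400
x1 = rng.integers(0, 2, size=n)  # first dummy variable
x2 = rng.integers(0, 2, size=n)  # second dummy variable
y = 1.0 + 0.5 * x1 - 0.3 * x2 + 0.8 * x1 * x2 + rng.normal(size=n)

# Regressors: intercept, both dummies, and their interaction.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)


def group_mean(a, b):
    """Sample average of y among observations with x1 == a and x2 == b."""
    return y[(x1 == a) & (x2 == b)].mean()


def fitted_value(a, b):
    """Fitted value of the regression line at x1 = a and x2 = b."""
    return beta_hat @ np.array([1.0, a, b, a * b])


# Print the fitted value next to the subgroup average for each subgroup.
for a in (0, 1):
    for b in (0, 1):
        print(f"x1={a}, x2={b}: fitted = {fitted_value(a, b):.4f}, "
              f"subgroup mean = {group_mean(a, b):.4f}")
```

Writing each fitted value as a sum of the relevant coefficients, and then comparing subgroups pairwise, is one route to an interpretation of each \(\widehat{\beta}_j\).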

Authoring your third Quarto document

You will create your third Quarto document containing your attempt to analyze the data related to Hamermesh and Parker (2005).

Your only task is to provide your own data analysis of the course evaluation dataset. You may calculate numerical summaries of the data. You may estimate as many linear regressions as you want, but at least one of them must include interaction terms.

In the end, you should provide a report with the best interpretations of the reported coefficients for the purpose of communication. But you are not expected to make “nice-looking” tables.

What you will be expected to do

Email me a zip file (not rar, not 7z) with filename surname_exset04.zip, replacing surname with your actual surname. Make sure it contains:

  1. Scanned PDF solutions to the technical exercises, with filename surname_tech04.pdf (be mindful of the file size and keep it under 15 MB if possible),
  2. Your qmd file with filename surname_exset04.qmd and
  3. The HTML file associated with your qmd file.