Hamermesh and Parker (2005)

Author

Andrew Pua

Published

October 26, 2022

Preamble

Here I load the foreign package and then look at the names of the variables. I also calculate some summary statistics, but leave out the interpretation.

Code
rm(list=ls())
library(foreign)
ratings <- read.dta("TeachingRatings.dta")
names(ratings)
[1] "minority"    "age"         "female"      "onecredit"   "beauty"     
[6] "course_eval" "intro"       "nnenglish"  
Code
apply(ratings, 2, mean)
    minority          age       female    onecredit       beauty  course_eval 
1.382289e-01 4.836501e+01 4.211663e-01 5.831533e-02 4.754221e-08 3.998272e+00 
       intro    nnenglish 
3.390929e-01 6.047516e-02 
Code
apply(ratings[, c("age", "beauty", "course_eval")], 2, sd)
        age      beauty course_eval 
  9.8027420   0.7886477   0.5548656 

Remarks about summary statistics

In these solutions, I focus on interpreting the regression coefficients and simply report the summary statistics. Note that standard deviations of dummy variables are harder to interpret, but the means of dummy variables are really proportions. It is also possible to show some histograms.
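
As a minimal sketch of the point about dummy variables, the following uses a made-up 0/1 vector (`female_toy` is hypothetical and not taken from TeachingRatings.dta). It shows that the mean is the proportion of ones, and that the standard deviation is just a function of that proportion, which is why it adds little on its own.

```r
# Hypothetical 0/1 indicator, NOT taken from TeachingRatings.dta
female_toy <- c(1, 0, 0, 1, 1, 0, 0, 0)
n <- length(female_toy)

# The mean of a 0/1 variable is the share of ones
p <- mean(female_toy)
p_by_counting <- sum(female_toy == 1) / n

# The sample standard deviation depends only on p and n
sd_direct <- sd(female_toy)
sd_from_p <- sqrt(n / (n - 1) * p * (1 - p))

c(p = p, p_by_counting = p_by_counting)
c(sd_direct = sd_direct, sd_from_p = sd_from_p)
```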

General remarks about the solutions

You were asked to provide the best interpretations possible. This does not necessarily mean that you have to talk about statistical significance. For those who have not encountered this phrase, skip this remark. For those who have encountered it, statistical significance requires assumptions which may not be satisfied by this application. Furthermore, interpreting coefficients and deciding whether the coefficients are significantly different from zero (which is not the most correct terminology either) are two different tasks!

Regression 1

I recentered age at its mean to make the regression coefficients more interpretable.

Code
ratings$c.age <- ratings$age - mean(ratings$age)
lm(course_eval ~ beauty + c.age, data = ratings)

Call:
lm(formula = course_eval ~ beauty + c.age, data = ratings)

Coefficients:
(Intercept)       beauty        c.age  
  3.9982721    0.1340634    0.0002868  
  1. Intercept: The average course evaluation for instructors roughly 48 years old having an average beauty rating is 4 points.

  2. Coefficient of beauty: When we compare instructors with the same age but whose beauty ratings differ by a standard deviation, the instructors who have a higher beauty rating have average course evaluations that are higher by 0.11 points.

  3. Coefficient of c.age: When we compare instructors with the same beauty rating but whose ages differ by 10 years, the average course evaluation of older instructors is higher by 0.003 points.

Remarks

  1. According to the data description, beauty is centered at 0. So beauty = 0 means that the beauty rating is at the average.
  2. For the coefficient of beauty, the instructors you are comparing need not have c.age = 0. In fact, the resulting comparison is the same even if you fix c.age = 10, for example. Try it for yourself. One standard deviation for beauty ratings is about 0.79, and you multiply this by the coefficient of beauty to get the result.
  3. For the coefficient of c.age, the instructors you are comparing need not have beauty = 0. I chose 10 years here because the coefficient of c.age is very small in magnitude and 10 years is roughly a standard deviation in the age distribution.
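
The "try it for yourself" check can be sketched as follows. This is only an illustration that plugs in the coefficients printed above rather than refitting the model; `fitted_eval` is a helper name introduced here, and the beauty value 1.5 is an arbitrary choice.

```r
# Coefficients copied from the printed output of Regression 1
b1 <- c(intercept = 3.9982721, beauty = 0.1340634, c.age = 0.0002868)
sd_beauty <- 0.7886477  # standard deviation of beauty from the preamble

# Fitted average course evaluation (hypothetical helper, not in the original)
fitted_eval <- function(beauty, c.age) {
  unname(b1["intercept"] + b1["beauty"] * beauty + b1["c.age"] * c.age)
}

# Beauty differs by one SD; age fixed at the mean, then at 10 years above it
beauty_gap_at_0  <- fitted_eval(sd_beauty, 0)  - fitted_eval(0, 0)
beauty_gap_at_10 <- fitted_eval(sd_beauty, 10) - fitted_eval(0, 10)

# Ages differ by 10 years; beauty fixed at 0, then at an arbitrary 1.5
age_gap_at_0   <- fitted_eval(0, 10)   - fitted_eval(0, 0)
age_gap_at_1.5 <- fitted_eval(1.5, 10) - fitted_eval(1.5, 0)

round(c(beauty_gap_at_0, beauty_gap_at_10), 2)  # both about 0.11
round(c(age_gap_at_0, age_gap_at_1.5), 3)       # both about 0.003
```

Because this regression has no interaction term, each comparison gives the same answer no matter where the other variable is held fixed.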

Regression 2

Code
lm(course_eval ~ beauty + c.age + beauty:c.age, data = ratings)

Call:
lm(formula = course_eval ~ beauty + c.age + beauty:c.age, data = ratings)

Coefficients:
 (Intercept)        beauty         c.age  beauty:c.age  
   4.0215963     0.1517305     0.0006434     0.0101498  
  1. Intercept: The average course evaluation for instructors roughly 48 years old having an average beauty rating is 4 points.
  2. Coefficient of beauty: When we compare instructors roughly 48 years old but whose beauty ratings differ by a standard deviation, the instructors who have a higher beauty rating have average course evaluations that are higher by 0.12 points.
  3. Coefficient of c.age: When we compare instructors with average beauty rating but whose ages differ by 10 years, the average course evaluation of older instructors is higher by about 0.006 points (0.01 after rounding to two decimal places).
  4. Coefficient of beauty:c.age: When we compare instructors roughly 58 years old but whose beauty ratings differ by a standard deviation, the instructors who have a higher beauty rating have average course evaluations that are higher by 0.2 points.

Remarks on the interaction term

  1. Based on the solutions to the related Technical Exercises, it is extremely difficult to interpret the coefficient of the interaction term alone.
  2. I chose 58 years old because this is one standard deviation above the average age.
  3. To obtain the difference in average course evaluations, apply the result in the Technical Exercise if you wish. It is \((0.15+0.01\times(58-48))\times 0.79 \approx 0.2\). The value 0.79 is the standard deviation for beauty ratings.
  4. Expressing the interpretation of the coefficient of the interaction term alone in words is extremely difficult because you are comparing two comparisons (see the related Technical Exercise for this exercise set).
  5. Some students used a causal interpretation to get around this difficulty. They state something like “The effect on course evaluation of a one point increase in beauty increases with age”. This is ok if a causal interpretation could be justified, but in our data analysis so far, we are only interpreting what the data is telling us without any additional assumptions.
  6. Another way to interpret the coefficient of beauty:c.age is as follows: When we compare instructors whose beauty rating is a standard deviation above the average but whose ages differ by 10 years, older instructors have average course evaluations that are higher by 0.09 points.
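
The numbers in these remarks can be reproduced with a short sketch. It plugs in the coefficients printed for Regression 2 instead of refitting the model, and `fitted_eval2` is a helper name introduced here for illustration. The last step shows what "comparing two comparisons" means for the interaction coefficient.

```r
# Coefficients copied from the printed output of Regression 2
b2 <- c(intercept = 4.0215963, beauty = 0.1517305,
        c.age = 0.0006434, inter = 0.0101498)
sd_beauty <- 0.7886477  # standard deviation of beauty from the preamble

# Fitted average course evaluation (hypothetical helper, not in the original)
fitted_eval2 <- function(beauty, c.age) {
  unname(b2["intercept"] + b2["beauty"] * beauty + b2["c.age"] * c.age +
           b2["inter"] * beauty * c.age)
}

# One-SD beauty gap at about 48 years (c.age = 0) and about 58 years (c.age = 10)
beauty_gap_48 <- fitted_eval2(sd_beauty, 0)  - fitted_eval2(0, 0)
beauty_gap_58 <- fitted_eval2(sd_beauty, 10) - fitted_eval2(0, 10)

# 10-year age gap at average beauty and at one SD above average beauty
age_gap_avg <- fitted_eval2(0, 10)         - fitted_eval2(0, 0)
age_gap_sd  <- fitted_eval2(sd_beauty, 10) - fitted_eval2(sd_beauty, 0)

# Comparison of two comparisons: the one-unit beauty gap at c.age = 1
# minus the one-unit beauty gap at c.age = 0
diff_in_diff <- (fitted_eval2(1, 1) - fitted_eval2(0, 1)) -
  (fitted_eval2(1, 0) - fitted_eval2(0, 0))

round(c(beauty_gap_48, beauty_gap_58), 2)  # about 0.12 and 0.20
round(c(age_gap_avg, age_gap_sd), 3)       # about 0.006 and 0.086
diff_in_diff                               # equals the beauty:c.age coefficient
```

Unlike in Regression 1, each gap now depends on where the other variable is held fixed, which is why the interpretations above name a specific age or beauty rating.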