Exercise Set 05

Technical exercise 1: more on logarithmic transformations

  1. Prove that \[\log_b x = \frac{\log_a x}{\log_a b}\] for \(x>0\) and bases \(a>0\), \(b>0\) with \(a\neq 1\) and \(b\neq 1\). Start by letting \(y=\log_b x\) and rewriting this as an equation involving a power of \(b\). Then take logarithms of both sides of that equation with respect to a base (part of your task is to figure out which base) and apply properties of logarithms to derive what needs to be proved.

  2. Refer to the slides on logarithmic transformations. Given #1 and Technical Exercise 2 of Exercise Set 3, show why the regression slope of log10NetSales in the linear regression of log10TotalComp on log10NetSales is the same as the regression slope of logNetSales in the linear regression of logTotalComp on logNetSales. (A numerical check of both #1 and #2 appears after this list.)
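
As a sanity check (not a substitute for the derivations), both claims can be verified numerically in R. The data below are simulated, so x and y are stand-ins rather than the actual NetSales and TotalComp variables from the slides.

# Numerical check of #1 (change of base) and #2 (invariance of the slope)
set.seed(5)
x <- exp(rnorm(100, mean = 5))                      # simulated positive regressor
y <- exp(1 + 0.8 * log(x) + rnorm(100, sd = 0.2))   # simulated positive regressand
all.equal(log10(x), log(x) / log(10))   # change of base: log10 via natural logs
coef(lm(log10(y) ~ log10(x)))[2]        # slope using base-10 logs
coef(lm(log(y) ~ log(x)))[2]            # slope using natural logs; identical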

Technical exercise 2: a complication with logarithms

Recall from this slide that differences in the logarithms of a variable can be interpreted, approximately, as relative changes of that variable on the original scale. Your task is to get a sense of the resulting approximation error.
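
To see the approximation in action before doing the algebra, suppose a variable rises from \(x_1=100\) to \(x_2=110\), a relative change of exactly 10%. The difference in natural logarithms is \[p=\log 110-\log 100=\log 1.1\approx 0.0953,\] so reading \(p\times 100\%\) as the relative change gives roughly 9.5% instead of 10%.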

  1. Let \(x_1\) and \(x_2\) be the old level and new level of some variable, respectively. Suppose we are considering a comparison between those two levels after a logarithmic transformation, i.e. let \(p=\log x_2-\log x_1\), where \(\log\) denotes the natural logarithm. In other words, the logarithms differ by \(p\). Show that \[\left(e^p-1\right)\times 100\%=\left(\frac{x_2-x_1}{x_1}\right)\times 100\%\] This is the exact relative change.
  2. In contrast, the slides use an approximation of the form: \[p \times 100\% \approx \left(\frac{x_2-x_1}{x_1}\right)\times 100\%\] Create a table containing the exact and approximate relative changes for different values of \(p\) (from 0 to 1), and discuss the severity of the approximation error, if there is any. (A starting sketch in R follows this list.)
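
If you choose to build the table in R, one minimal sketch is below; the grid of \(p\) values is only a suggestion, and the discussion of the error is still up to you.

# Exact versus approximate relative changes for a grid of p values
p <- seq(0, 1, by = 0.1)
data.frame(p      = p,
           exact  = (exp(p) - 1) * 100,   # (e^p - 1) x 100%
           approx = p * 100)              # p x 100%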

Technical exercise 3: applying the formula for \(\widehat{\beta}\)

Your task is to work on showing what \(\widehat{\beta}\) looks like for the special cases found in the slides.

Start from \[\widehat{\beta}=\left(\frac{1}{n}\sum_{t=1}^n X_tX_t^\prime\right)^{-1}\left(\frac{1}{n}\sum_{t=1}^n X_tY_t\right)\] and work on the two special cases. For the first special case, you should show that \(\widehat{\beta}=\overline{Y}\). For the second special case, you should show that, combined with Technical Exercise 1 of Exercise Set 2, you obtain the formulas found here.
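
Before doing the algebra, it may help to confirm numerically what you are aiming for. The sketch below uses simulated data: it checks that an intercept-only regression returns \(\overline{Y}\), and that the matrix formula (the \(1/n\) factors cancel, leaving \(\left(X^\prime X\right)^{-1}X^\prime Y\)) reproduces the lm() coefficients.

# Numerical check of the two special cases (simulated data)
set.seed(5)
y <- rnorm(50, mean = 10)
x1 <- rnorm(50)
coef(lm(y ~ 1))                     # first special case: intercept only ...
mean(y)                             # ... should equal the sample mean of y
X <- cbind(1, x1)                   # regressor matrix with a column of ones
solve(t(X) %*% X) %*% t(X) %*% y    # matrix formula for beta-hat ...
coef(lm(y ~ x1))                    # ... should match lm()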

Technical exercise 4: where did the formula for \(\widehat{\beta}\) come from?

You are going to work out the details of a regression with only one regressor plus an intercept (really, simple linear regression). Let \(Y_t\) be the \(t\)th observation of the regressand. Recall that lm() performs OLS, which minimizes a sum of squared residuals.

Since our regression line for this case is just \(\widehat{Y}_t=\widehat{\beta}_0+\widehat{\beta}_1X_{1t}\), where \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\) need to be determined, you should be able to use what you learned in mathematical economics to minimize \[\sum_{t=1}^n \left(Y_t-\widehat{Y}_t\right)^2=\sum_{t=1}^n \left(Y_t-\widehat{\beta}_0-\widehat{\beta}_1X_{1t}\right)^2 \tag{1}\] with respect to both \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\).

Unlike in the solutions to the similar technical exercises from before, expanding the square here would be more complicated. Instead, you can use the fact that the derivative of a sum is the sum of the derivatives: \[\frac{d}{d\widehat{\beta}_1} \sum_{t=1}^n \left(Y_t-\widehat{\beta}_1X_{1t}\right)^2 = \sum_{t=1}^n \frac{d}{d\widehat{\beta}_1}\left(Y_t-\widehat{\beta}_1X_{1t}\right)^2=\sum_{t=1}^n 2\left(Y_t-\widehat{\beta}_1X_{1t}\right)\left(-X_{1t}\right)\]

  1. Show that the two first-order conditions for finding the optimal values of \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\) are given by: \[\begin{eqnarray} \sum_{t=1}^n \left(Y_t-\widehat{\beta}_0-\widehat{\beta}_1X_{1t}\right) &=& 0 \\ \sum_{t=1}^n X_{1t}\left(Y_t-\widehat{\beta}_0-\widehat{\beta}_1X_{1t}\right) &=& 0 \end{eqnarray}\]

  2. Show that you can rewrite these two equations as \[\begin{eqnarray} \sum_{t=1}^n Y_t &=& n\widehat{\beta}_0 + \widehat{\beta}_1 \sum_{t=1}^n X_{1t} \\ \sum_{t=1}^n X_{1t}Y_t &=& \widehat{\beta}_0\sum_{t=1}^n X_{1t}+ \widehat{\beta}_1 \sum_{t=1}^n X_{1t}^2 \end{eqnarray}\]

  3. Next, show that you can express these two equations in matrix form by filling in the entries marked with question marks: \[\begin{eqnarray}\begin{pmatrix}\ ? & \ ? \\ \ ? & \ ?\end{pmatrix} \begin{pmatrix} \widehat{\beta}_0 \\ \widehat{\beta}_1 \end{pmatrix}= \begin{pmatrix}\ ?\\ \ ? \end{pmatrix}\end{eqnarray}\]

  4. Use what you have seen so far in this exercise to show that

    1. the mean of the residuals has to be zero.
    2. the correlation coefficient between the regressor \(X_{1}\) and the residuals is zero.
    3. the mean of the actual values of \(Y\) is equal to the mean of the fitted values. (A numerical check of all three properties follows this list.)
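
All three properties can be seen in action with any fitted lm() object; here is a minimal check on simulated data (the tiny nonzero values you will see are floating-point error).

# Numerical check of the three properties (simulated data)
set.seed(5)
x1 <- rnorm(100)
y <- 2 + 3 * x1 + rnorm(100)
fit <- lm(y ~ x1)
mean(residuals(fit))                   # (1) essentially zero
cor(x1, residuals(fit))                # (2) essentially zero
all.equal(mean(y), mean(fitted(fit)))  # (3) TRUE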

Authoring your fourth Quarto document

You will create your fourth Quarto document containing your attempt to reproduce the results in Hamermesh and Parker (2005). The complete dataset can be found in the following csv file, and the description of the variables can be found here. The information is not very complete, so you may need to do some “detective work” with the data, the paper, and the descriptions to gain a sense of what is happening. The variable names can be suggestive.

Your task is to apply your R skills to reproduce the descriptive statistics found in Table 1 and only the regression coefficients in Table 3.

You may need to introduce weights into your calculation, based on the following:

  • The notes to Table 1 state that “All statistics except for those describing the number of students, the percent evaluating the instructor and the lower–upper division distinction are weighted by the number of students completing the course evaluation forms.” The weights come from the variable didevaluation.
  • The authors state on page 372 that “Table 3 presents weighted least-squares estimates of the equations describing the average course evaluations. As weights we use the number of students completing the evaluation forms in each class, …”

To compute a weighted average, you may want to consult weighted.mean(). Unfortunately, base R has no corresponding function for a weighted standard deviation, so you may need to do this computation manually. A weighted average is calculated in the following way: let \(w_t\) be the weight for the \(t\)th observation and \(X_t\) be the \(t\)th observation. Then the weighted average of the \(X_t\)’s is given by \[\overline{X}_w=\frac{\sum_{t=1}^n w_t X_t}{\sum_{t=1}^n w_t}\] The standard deviation is the square root of the variance, and the variance is itself an average: specifically, the average of the squared deviations from the mean. Therefore, the weighted variance is the weighted average of the squared deviations from the weighted mean, and taking the square root gives you the weighted standard deviation.
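
In symbols, the recipe just described for the weighted variance is \[s_w^2=\frac{\sum_{t=1}^n w_t\left(X_t-\overline{X}_w\right)^2}{\sum_{t=1}^n w_t}\] and the weighted standard deviation is \(s_w=\sqrt{s_w^2}\).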

Below you will find lines of code which illustrate how to implement the formula for the weighted mean. From here, you should be able to write your own code to calculate the weighted standard deviation.1

# Demonstration of weighted mean
x <- c(1, 2, 3)       # observations
w <- c(2, 1, 4)       # weights
sum(w * x) / sum(w)   # weighted mean computed from the formula
weighted.mean(x, w)   # built-in equivalent; the two results should match

For lm(), you need to specify weights = didevaluation as part of your lm() command. It is also possible to restrict lm() to subsets of the data directly via its subset argument. Consult the help file for lm() for more.
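
As a sketch, a weighted regression would look like the lines below. Only didevaluation is a confirmed variable name; courseevaluation and beauty are hypothetical placeholders for a regressand and a regressor, and ratings stands for whatever you call the data frame holding the csv.

# Hypothetical weighted regression; replace the placeholder names
fit <- lm(courseevaluation ~ beauty, data = ratings, weights = didevaluation)
coef(fit)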

Develop the code necessary for you to achieve this task, and document which portions cannot be reproduced very well. No data analysis and no interpretation of results are required of you, and you are not expected to make “nice-looking” tables.

What you will be expected to do

You will submit to my email a zip file (not rar, not 7z) with filename surname_exset05.zip, replacing surname with your actual surname, and making sure it contains

  1. Scanned PDF solutions to the technical exercises (be mindful of the size of the file; keep it under 15 MB if possible) with filename surname_tech05.pdf,
  2. Your qmd file with filename surname_exset05.qmd, and
  3. The HTML file associated with your qmd file.

Footnotes

  1. If you are ambitious, you may try to create your own function to calculate the weighted standard deviation. You can start from Lesson 16 of fasteR.