## Transcribed Text

C2 The data set in CEOSAL2 contains information on chief executive officers for U.S. corporations. The
variable salary is annual compensation, in thousands of dollars, and ceoten is prior number of years as
company CEO.
(i) Find the average salary and the average tenure in the sample.
(ii) How many CEOs are in their first year as CEO (that is, ceoten = 0)? What is the longest tenure
as CEO?
(iii) Estimate the simple regression model
$$\log(\mathit{salary}) = \beta_0 + \beta_1\,\mathit{ceoten} + u,$$
and report your results in the usual form. What is the (approximate) predicted percentage
increase in salary given one more year as CEO?
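As a minimal sketch of the log-level regression in part (iii), the snippet below fits a simple regression by hand and converts the slope into an approximate percentage effect. The data are made up for illustration and are not taken from CEOSAL2; the helper name `simple_ols` is likewise an assumption of this sketch.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on x by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up illustration data (NOT from CEOSAL2): salary grows 2% per year of tenure.
ceoten = [0, 1, 2, 5, 10, 20]
salary = [500.0 * math.exp(0.02 * t) for t in ceoten]

# The dependent variable is the natural log of salary.
log_salary = [math.log(s) for s in salary]
b0, b1 = simple_ols(ceoten, log_salary)

# In a log-level model, 100 * slope approximates the percentage change in
# salary associated with one more year as CEO.
pct_per_year = 100 * b1
print(f"intercept={b0:.4f}, slope={b1:.4f}, approx. % per year={pct_per_year:.2f}")
```

With the real data, the two lists would be replaced by the `ceoten` and `salary` columns of CEOSAL2.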
C9 Use the data in COUNTYMURDERS to answer these questions. Use only the data for 1996.
(i) How many counties had zero murders in 1996? How many counties had at least one execution?
What is the largest number of executions?
(ii) Estimate the equation
$$\mathit{murders} = \beta_0 + \beta_1\,\mathit{execs} + u$$
by OLS and report the results in the usual way, including sample size and R-squared.
(iii) Interpret the slope coefficient reported in part (ii). Does the estimated equation suggest a
deterrent effect of capital punishment?
(iv) What is the smallest number of murders that can be predicted by the equation? What is the
residual for a county with zero executions and zero murders?
(v) Explain why simple regression analysis is not well suited for determining whether capital
punishment has a deterrent effect on murders.
C2 Use the data in HPRICE1 to estimate the model
$$\mathit{price} = \beta_0 + \beta_1\,\mathit{sqrft} + \beta_2\,\mathit{bdrms} + u,$$
where price is the house price measured in thousands of dollars.
(i) Write out the results in equation form.
(ii) What is the estimated increase in price for a house with one more bedroom, holding square
footage constant?
(iii) What is the estimated increase in price for a house with an additional bedroom that is
140 square feet in size? Compare this to your answer in part (ii).
(iv) What percentage of the variation in price is explained by square footage and number
of bedrooms?
(v) The first house in the sample has sqrft = 2,438 and bdrms = 4. Find the predicted selling price
for this house from the OLS regression line.
(vi) The actual selling price of the first house in the sample was $300,000 (so price = 300). Find
the residual for this house. Does it suggest that the buyer underpaid or overpaid for the house?
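The arithmetic behind parts (ii), (iii), (v), and (vi) can be sketched as follows. The coefficient values are hypothetical round numbers chosen for illustration; they are not OLS estimates from HPRICE1, and `predict_price` is a helper name introduced only for this sketch.

```python
def predict_price(b0, b1, b2, sqrft, bdrms):
    """Fitted price (in $1000s) from price = b0 + b1*sqrft + b2*bdrms."""
    return b0 + b1 * sqrft + b2 * bdrms

# HYPOTHETICAL coefficients chosen for round numbers; they are not OLS
# estimates from HPRICE1.
b0_hat, b1_hat, b2_hat = 10.0, 0.1, 20.0

# Part (ii): one more bedroom, square footage held fixed.
effect_bedroom_only = b2_hat

# Part (iii): one more bedroom that also adds 140 square feet.
effect_bedroom_140 = 140 * b1_hat + b2_hat

# Part (v): fitted value for a house with sqrft = 2,438 and bdrms = 4.
predicted = predict_price(b0_hat, b1_hat, b2_hat, sqrft=2438, bdrms=4)

# Part (vi): residual = actual price minus fitted price (both in $1000s).
residual = 300.0 - predicted
print(effect_bedroom_only, effect_bedroom_140, predicted, residual)
```

A negative residual means the actual price lies below the price the fitted line predicts for a house with those characteristics; a positive residual means the opposite.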
C10 Use the data in HTV to answer this question. The data set includes information on wages, education,
parents' education, and several other variables for 1,230 working men in 1991.
(i) What is the range of the educ variable in the sample? What percentage of men completed
twelfth grade but no higher grade? Do the men or their parents have, on average, higher levels
of education?
(ii) Estimate the regression model
$$\mathit{educ} = \beta_0 + \beta_1\,\mathit{motheduc} + \beta_2\,\mathit{fatheduc} + u$$
by OLS and report the results in the usual form. How much sample variation in educ is
explained by parents' education? Interpret the coefficient on motheduc.
(iii) Add the variable abil (a measure of cognitive ability) to the regression from part (ii), and report
the results in equation form. Does "ability" help to explain variations in education, even after
controlling for parents' education? Explain.
(iv) (Requires calculus) Now estimate an equation where abil appears in quadratic form:
$$\mathit{educ} = \beta_0 + \beta_1\,\mathit{motheduc} + \beta_2\,\mathit{fatheduc} + \beta_3\,\mathit{abil} + \beta_4\,\mathit{abil}^2 + u.$$
Using the estimates $\hat{\beta}_3$ and $\hat{\beta}_4$, use calculus to find the value of abil, call it abil*, where educ
is minimized. (The other coefficients and values of parents' education variables have no effect;
we are holding parents' education fixed.) Notice that abil is measured so that negative values
are permissible. You might also verify that the second derivative is positive so that you do
indeed have a minimum.
(v) Argue that only a small fraction of men in the sample have "ability" less than the value
calculated in part (iv). Why is this important?
(vi) If you have access to a statistical program that includes graphing capabilities, use the estimates
in part (iv) to graph the relationship between the predicted education and abil. Set motheduc and
fatheduc at their average values in the sample, 12.18 and 12.45, respectively.
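The calculus step in part (iv) amounts to setting the derivative of the quadratic in abil to zero: with parents' education held fixed, the slope with respect to abil is $\beta_3 + 2\beta_4\,\mathit{abil}$, which vanishes at $-\beta_3/(2\beta_4)$, and the second derivative is $2\beta_4$. A minimal sketch of that calculation follows; the coefficient values are hypothetical and are not estimates from HTV, and the function names are introduced only for this illustration.

```python
def quadratic_turning_point(b3, b4):
    """Value of abil at which b3*abil + b4*abil**2 has zero slope."""
    if b4 == 0:
        raise ValueError("no turning point when the quadratic coefficient is zero")
    return -b3 / (2 * b4)

def is_minimum(b4):
    """The second derivative with respect to abil is 2*b4; positive means a minimum."""
    return 2 * b4 > 0

# HYPOTHETICAL coefficient values, not estimates from HTV.
b3_hat, b4_hat = 0.5, 0.0625

abil_star = quadratic_turning_point(b3_hat, b4_hat)
print(abil_star, is_minimum(b4_hat))
```

For part (v), one could then count the share of sample observations with abil below `abil_star`.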
C13 Use the data in GPA1 to answer this question. We can compare multiple regression estimates, where
we control for student achievement and background variables, and compare our findings with the
difference in means estimate in Computer Exercise C11 in Chapter 2.
(i) In the simple regression equation
$$\mathit{colGPA} = \beta_0 + \beta_1\,\mathit{PC} + u$$
obtain $\hat{\beta}_0$ and $\hat{\beta}_1$. Interpret these estimates.
(ii) Now add the controls hsGPA and ACT; that is, run the regression colGPA on PC, hsGPA, and
ACT. Does the coefficient on PC change much from part (i)? Does $\hat{\beta}_{hsGPA}$ make sense?
(iii) In the estimation from part (ii), what is worth more: owning a PC or having 10 more points on
the ACT score?
(iv) Now to the regression in part (ii) add the two binary indicators for the parents being college
graduates. Does the estimate of $\beta_1$ change much from part (ii)? How much variation are you
explaining in colGPA?
(v) Suppose someone looking at your regression from part (iv) says to you, "The variables hsGPA
and ACT are probably pretty highly correlated, so you should drop one of them from the
regression." How would you respond?
C3 Refer to Computer Exercise C2 in Chapter 3. Now, use the log of the housing price as the dependent
variable:
$$\log(\mathit{price}) = \beta_0 + \beta_1\,\mathit{sqrft} + \beta_2\,\mathit{bdrms} + u.$$
(i) You are interested in estimating and obtaining a confidence interval for the percentage
change in price when a 150-square-foot bedroom is added to a house. In decimal form, this is
$\theta_1 = 150\beta_1 + \beta_2$. Use the data in HPRICE1 to estimate $\theta_1$.
(ii) Write $\beta_2$ in terms of $\theta_1$ and $\beta_1$ and plug this into the log(price) equation.
(iii) Use part (ii) to obtain a standard error for $\hat{\theta}_1$ and use this standard error to construct a 95%
confidence interval.
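One way to see why the substitution in part (ii) is useful: replacing $\beta_2$ with $\theta_1 - 150\beta_1$ gives $\log(\mathit{price}) = \beta_0 + \beta_1(\mathit{sqrft} - 150\,\mathit{bdrms}) + \theta_1\,\mathit{bdrms} + u$, so $\theta_1$ becomes the coefficient on bdrms and its standard error comes straight from the regression output. The sketch below illustrates this on synthetic data generated inside the script (not HPRICE1); the `ols` helper and the use of 1.96 as a normal-approximation critical value are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Synthetic stand-ins for the HPRICE1 variables (NOT the real data).
sqrft = rng.uniform(1000, 4000, n)
bdrms = rng.integers(2, 6, n).astype(float)
log_price = 4.5 + 0.0004 * sqrft + 0.03 * bdrms + rng.normal(0, 0.1, n)

def ols(X, y):
    """OLS coefficients and usual standard errors; X must include a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

const = np.ones(n)

# Original parameterisation: theta1 is a combination of two coefficients.
beta, _ = ols(np.column_stack([const, sqrft, bdrms]), log_price)
theta_direct = 150 * beta[1] + beta[2]

# Reparameterised model: theta1 is the coefficient on bdrms.
beta_r, se_r = ols(np.column_stack([const, sqrft - 150 * bdrms, bdrms]), log_price)
theta_hat, theta_se = beta_r[2], se_r[2]

# Approximate 95% interval using the normal critical value 1.96.
ci = (theta_hat - 1.96 * theta_se, theta_hat + 1.96 * theta_se)
print(theta_direct, theta_hat, theta_se, ci)
```

The two point estimates agree up to floating-point error; only the reparameterised regression delivers the standard error directly.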
C6 Use the data in WAGE2 for this exercise.
(i) Consider the standard wage equation
$$\log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{educ} + \beta_2\,\mathit{exper} + \beta_3\,\mathit{tenure} + u.$$
State the null hypothesis that another year of general workforce experience has the same effect
on log(wage) as another year of tenure with the current employer.
(ii) Test the null hypothesis in part (i) against a two-sided alternative, at the 5% significance level,
by constructing a 95% confidence interval. What do you conclude?
C11 Use the data in HTV to answer this question. See also Computer Exercise C10 in Chapter 3.
(i) Estimate the regression model
$$\mathit{educ} = \beta_0 + \beta_1\,\mathit{motheduc} + \beta_2\,\mathit{fatheduc} + \beta_3\,\mathit{abil} + \beta_4\,\mathit{abil}^2 + u$$
by OLS and report the results in the usual form. Test the null hypothesis that educ is linearly
related to abil against the alternative that the relationship is quadratic.
(ii) Using the equation in part (i), test $H_0$: $\beta_1 = \beta_2$ against a two-sided alternative. What is the
p-value of the test?
(iii) Add the two college tuition variables to the regression from part (i) and determine whether they
are jointly statistically significant.
(iv) What is the correlation between tuit17 and tuit18? Explain why using the average of the tuition
over the two years might be preferred to adding each separately. What happens when you do use
the average?
(v) Do the findings for the average tuition variable in part (iv) make sense when interpreted
causally? What might be going on?
C2 Use the data in GPA2 for this exercise.
(i) Using all 4,137 observations, estimate the equation
$$\mathit{colgpa} = \beta_0 + \beta_1\,\mathit{hsperc} + \beta_2\,\mathit{sat} + u$$
and report the results in standard form.
(ii) Reestimate the equation in part (i), using the first 2,070 observations.
(iii) Find the ratio of the standard errors on hsperc from parts (i) and (ii). Compare this with the
result from (5.10).
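Equation (5.10) is not reproduced in this text. Assuming it is the large-sample approximation that a standard error shrinks in proportion to $1/\sqrt{n}$, the benchmark for part (iii) depends only on the two sample sizes. The sketch below computes that benchmark under this stated assumption; the function name is hypothetical.

```python
import math

def approx_se_ratio(n_full, n_sub):
    """Approximate se(full sample) / se(subsample), ASSUMING se is proportional to 1/sqrt(n)."""
    return math.sqrt(n_sub / n_full)

# Sample sizes stated in parts (i) and (ii).
ratio = approx_se_ratio(n_full=4137, n_sub=2070)
print(round(ratio, 3))
```

The ratio of the two estimated standard errors on hsperc can then be compared with this value.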
C3 In equation (4.42) of Chapter 4, using the data set BWGHT, compute the LM statistic for testing
whether motheduc and fatheduc are jointly significant. In obtaining the residuals for the restricted
model, be sure that the restricted model is estimated using only those observations for which all
variables in the unrestricted model are available (see Example 4.9).
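A common way to compute an LM statistic for exclusion restrictions is the n-R-squared form: estimate the restricted model, regress its residuals on all regressors of the unrestricted model, and multiply the R-squared from that auxiliary regression by the sample size. The sketch below follows those steps on synthetic data (not BWGHT), dropping rows with missing values first as the exercise requires; the function names and the simulated variables are assumptions of this illustration.

```python
import numpy as np

def ols_residuals(X, y):
    """Residuals from an OLS regression of y on X (X includes a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def lm_statistic(X_restricted, X_extra, y):
    """n * R-squared from regressing restricted-model residuals on all regressors."""
    # Keep only rows where every variable in the unrestricted model is observed.
    data = np.column_stack([X_restricted, X_extra, y])
    complete = ~np.isnan(data).any(axis=1)
    Xr, Xe, yc = X_restricted[complete], X_extra[complete], y[complete]

    # Step 1: residuals from the restricted model.
    u_tilde = ols_residuals(Xr, yc)
    # Step 2: auxiliary regression of those residuals on all regressors.
    e = ols_residuals(np.column_stack([Xr, Xe]), u_tilde)
    # The restricted residuals have mean zero, so their sum of squares is the SST.
    r2 = 1 - (e @ e) / (u_tilde @ u_tilde)
    # Step 3: LM = (number of complete observations) * R-squared.
    return complete.sum() * r2

# Synthetic illustration (NOT BWGHT): two extra regressors, some values missing.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
extra = rng.normal(size=(n, 2))
extra[:20, 0] = np.nan
X_r = np.column_stack([np.ones(n), x1])

y_null = 1 + 0.5 * x1 + rng.normal(size=n)        # extra regressors irrelevant
y_alt = y_null + 2 * np.nan_to_num(extra[:, 0])   # first extra regressor matters

lm_null = lm_statistic(X_r, extra, y_null)
lm_alt = lm_statistic(X_r, extra, y_alt)
print(lm_null, lm_alt)
```

Under the null hypothesis the statistic is compared with a chi-square distribution whose degrees of freedom equal the number of excluded variables, which is two in this exercise.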
C5 Consider the analysis in Computer Exercise C11 in Chapter 4 using the data in HTV, where educ is the
dependent variable in a regression.
(i) How many different values are taken on by educ in the sample? Does educ have a continuous
distribution?
(ii) Plot a histogram of educ with a normal distribution overlay. Does the distribution of educ
appear anything close to normal?
(iii) Which of the CLM assumptions seems clearly violated in the model
$$\mathit{educ} = \beta_0 + \beta_1\,\mathit{motheduc} + \beta_2\,\mathit{fatheduc} + \beta_3\,\mathit{abil} + \beta_4\,\mathit{abil}^2 + u?$$
How does this violation change the statistical inference procedures carried out in Computer
Exercise C11 in Chapter 4?
