## Transcribed Text

1. Suppose the joint probability distribution function f(x, y) (actually, a probability mass function, since
x and y are discrete) is given by:

| | x = 0 | x = 1 | x = 2 |
|---|---|---|---|
| y = 0 | 0.10 | 0.20 | 0.20 |
| y = 1 | 0.20 | 0.20 | 0.10 |

(a) Find the best constant predictor of y. In all of these subparts, you may assume that "best" means
"minimum mean squared error."
(b) Find the best linear predictor $\hat{y} = \alpha + \beta x$ for this population. Hint: Use your notes from
class.
(c) Find the best predictor of y given x.
(d) Is the best linear predictor equal to the best predictor? Why or why not?
(e) Write a function to draw random samples of size N from the above bivariate distribution. Show
your function works by drawing a sample of size N = 10,000 and tabulating the values (e.g., with
table()).
Hint 1: Is there any difference between writing the sample space as a 2 x 3 matrix versus writing it
as a 1 x 6 vector (i.e., a univariate distribution with discrete outcomes)? Have a look at the
documentation for the expand.grid() function.
Hint 2: Have a look at the documentation for the sample() function, paying particular attention to
the prob= argument.
2. Let the random variable $y$ be distributed $N(\mu, 1)$ and consider a sample with $N = 2$ drawn from this
population. We want to test $H_0: \mu = 0$ against $H_a: \mu = 1$ and have specified the critical region for
the test as
R
where $\bar{y}$ is the sample mean, $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$.
(a) Plot the critical region $R$ as a subset of the sample space $\Omega = (y_1, y_2) \in \mathbb{R}^2$.
(b) If $y_1$ and $y_2$ are both $N(\mu, 1)$, what is the distribution of the mean $\bar{y}$? Is the distribution exact or
asymptotic?
(c) Based on your answer to the previous part, explain why $T = \bar{y}$ is or is not pivotal.
(d) Repeat (b) and (c) for e
(e) Find the power function for the test with test statistic $T = \bar{y}$ and $R$ as defined above.
(f) Using your answer to the previous part, what is the size of the test?
(g) What is the probability of Type II Error?
3. Wooldridge, Exercise 4.13 (p. 86). The Stata data file cornwell.dta is available on Canvas.
4. Adapted from Hansen, Exercise 3.26 (p. 103)
(a) Using the CPS data set from Section 3.22, estimate a log wage regression for the subsample of
white male Hispanics. Your model specification should include the following:
- A constant.
- Region, coded as four categories: Northeast, Midwest, South, and West. Code the contrasts
  for your region variable as indicator variables all referencing the "default" category of Midwest.
- Marital status, coded as four categories: (1) Married; (2) Widowed or Divorced; (3) Separated;
  (4) Single/Never Married. Code the contrasts for your marital status variable as indicator
  variables all referencing the "default" category of Single/Never Married.
- Education, recoded as five categories: (1, "LTH") less than a high school diploma; (2, "HSD")
  high school graduate but not attained BA or equivalent; (3, "BA") BA or equivalent, but not
  attained a Master's degree; (4, "MA") MA or equivalent, but not attained a professional or
  Doctorate; (5, "PD") professional degrees or doctorate. Code the contrasts for your education
  variable as backward differences.
- Experience and experience squared.
(b) Estimate the marginal effect on the mean log wage of marital status "Married" versus
"Single/Never Married."
(c) Estimate the marginal effect on the mean wage implied by your result from the previous part.
(d) Estimate the marginal effect on the mean wage implied by an increase in educational attainment
from "BA or equivalent" to "MA or equivalent."
(e) Estimate the median marginal effect on the mean log wage, across your sample, of an increase
in experience of one year.

## Question 1

| | x = 0 | x = 1 | x = 2 |
|---|---|---|---|
| y = 0 | 0.10 | 0.20 | 0.20 |
| y = 1 | 0.20 | 0.20 | 0.10 |

Marginal Probability of y:

| y | y = 0 | y = 1 |
|---|---|---|
| P(Y=y) | 0.50 | 0.50 |

Expected Value of y, E(y) = 0*0.50 + 1*0.50 = 0.50

E(y^2) = 0^2*0.50 + 1^2*0.50 = 0.50

Var(y) = E(y^2) - E(y)^2 = 0.50 - 0.50^2 = 0.25

sd(y) = sqrt(var(y)) = 0.5

Marginal Probability of x:

| x | x = 0 | x = 1 | x = 2 |
|---|---|---|---|
| P(X=x) | 0.30 | 0.40 | 0.30 |
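
The marginal tables come from summing the joint pmf over the other variable. As a rough cross-check, here is a minimal Python sketch of that step (the problem's hints point to R, so Python is a stand-in here; the names `JOINT`, `marginal`, `marginal_x`, and `marginal_y` are illustrative and not from the original):

```python
from fractions import Fraction

# Joint pmf f(x, y) from the table in Question 1, keyed by (x, y).
# Fractions are used so the sums are exact rather than floating-point.
JOINT = {
    (0, 0): Fraction(10, 100), (1, 0): Fraction(20, 100), (2, 0): Fraction(20, 100),
    (0, 1): Fraction(20, 100), (1, 1): Fraction(20, 100), (2, 1): Fraction(10, 100),
}


def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 keeps x, axis 1 keeps y)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], Fraction(0)) + p
    return out


marginal_x = marginal(JOINT, 0)
marginal_y = marginal(JOINT, 1)

# Print as decimals for comparison with the tables.
print({k: float(v) for k, v in marginal_x.items()})
print({k: float(v) for k, v in marginal_y.items()})
```

Both marginals should sum to 1, which is a quick sanity check on the transcribed table.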

Expected Value of x, E(x) = 0*0.30 + 1*0.40 + 2 *0.30 = 1

E(x^2) = 0^2*0.30 + 1^2*0.40 + 2^2*0.30 = 1.60

Var(x) = E(x^2) - E(x)^2 = 1.60 - 1^2 = 0.6

sd(x) = sqrt(var(x)) = 0.7745967

E(xy) = sum over all (x, y) of x*y*f(x,y) = 1*1*0.20 + 2*1*0.10 = 0.40 (every other term is zero because x = 0 or y = 0)

Cov(x,y) = E(xy) - E(x)*E(y) = 0.40 - 1*0.50 = -0.10

rho = Corr(x,y) = Cov(x,y)/(sd(x)*sd(y)) = -0.10/(0.7745967*0.5) = -0.2581989
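
The moments above can also be computed directly from the joint pmf, without going through the marginals, using E[g(x, y)] = sum of g(x, y)*f(x, y). The following Python sketch is one way to check the arithmetic (Python is a stand-in for the R the problem hints at; `JOINT` and `expect` are hypothetical helper names):

```python
import math

# Joint pmf f(x, y) from Question 1, keyed by (x, y).
JOINT = {
    (0, 0): 0.10, (1, 0): 0.20, (2, 0): 0.20,
    (0, 1): 0.20, (1, 1): 0.20, (2, 1): 0.10,
}


def expect(g, joint=JOINT):
    """E[g(x, y)]: sum g(x, y) * f(x, y) over every cell of the table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x**2) - mean_x**2
var_y = expect(lambda x, y: y**2) - mean_y**2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

# Divide by the product of both standard deviations (note the parentheses).
rho = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(mean_x, 4), round(mean_y, 4))
print(round(var_x, 4), round(var_y, 4))
print(round(cov_xy, 4), round(rho, 7))
```

Working from the joint table keeps every moment tied to a single source of truth, so a transcription slip in one marginal cannot silently propagate.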

### (a)

Under MSE loss, the best constant predictor of y is the population mean, i.e., E(y), which is 0.50...
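
A short derivation of why the mean is the MSE-optimal constant may help here. The symbol c, denoting an arbitrary constant predictor, is introduced only for this sketch:

```latex
\begin{aligned}
E\big[(y - c)^2\big]
  &= E\big[(y - E(y))^2\big] + 2\,(E(y) - c)\,E\big[y - E(y)\big] + (E(y) - c)^2 \\
  &= \mathrm{Var}(y) + (E(y) - c)^2 .
\end{aligned}
```

The cross term vanishes because E[y - E(y)] = 0, so the MSE is smallest when c = E(y) = 0.50, and the minimum attained is Var(y) = 0.25.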