Question

1. Recall the mutual fund example from lecture. The data set Mutual.RData contains two objects, mu and V, that are the mean vector and covariance matrix of annual rates of return of five investment funds. Here is mu:

SP500 HighTech SmallCap USTreas CorpBond
0.06 0.10 0.08 0.02 0.04

This says, for example, that the annual mean return on investment for the US Treasuries fund is 2 percent. Assume that the rates of return are jointly distributed as multivariate normal.

Suppose that Tom and Ellen each invest $1000 at the beginning of the year. Tom puts $500 in the High Tech fund and $500 in the Small Cap fund. Ellen puts $200 in each of the five funds. What is the probability that Ellen will have made more money than Tom at the end of the year?

2. The data frame BCMort88 found in the file BCMort88.RData gives breast cancer mortality rates for 217 counties in nine states (six New England states plus NY, NJ, and PA) for the year 1988. The rates are adjusted to account for differing demographic characteristics across the counties. Here is a data description:

Variables:
Pop = Population of county
AdjRate = Adjusted mortality rate (per 100,000)
SE = Estimated standard error (per 100,000)

We want to identify counties whose mortality rates are significantly higher than 18 per 100,000 population, in order to devote resources to them. For this problem, assume that the adjusted rates are unbiased and normally distributed, although the latter clearly is not the case for smaller counties.

Limit your analysis to those counties that have a population of at least 20,000. Use a multiple testing method to identify those counties that should receive resources according to the criterion stated above, and explain why you chose it over alternative methods.
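One way to set up problem 2 is sketched below. It is not the posted solution: the data frame `demo` is a synthetic stand-in for BCMort88 (only the column names Pop, AdjRate, and SE come from the data description; every value is invented), and the 0.05 cutoff is an assumed level. Each county gets a one-sided z-test of H0: rate <= 18, and the p-values are then adjusted with `p.adjust()`. Holm controls the family-wise error rate, which suits the case where funding a county that is not truly elevated is costly; Benjamini-Hochberg controls the false discovery rate and usually flags more counties, which suits a screening step.

```r
# Synthetic stand-in for BCMort88: all values are invented for illustration.
set.seed(1)
n <- 50
demo <- data.frame(
  Pop     = round(runif(n, 5000, 500000)),
  AdjRate = rnorm(n, mean = 19, sd = 4),
  SE      = runif(n, 0.5, 4)
)

# Keep counties with a population of at least 20,000.
big <- subset(demo, Pop >= 20000)

# One-sided z-test of H0: rate <= 18 against H1: rate > 18.
z <- (big$AdjRate - 18) / big$SE
p <- pnorm(z, lower.tail = FALSE)

# Adjust the p-values for multiple testing.
big$p_holm <- p.adjust(p, method = "holm")  # family-wise error rate
big$p_bh   <- p.adjust(p, method = "BH")    # false discovery rate

# Counties flagged at the assumed 0.05 level under each method.
flag_holm <- big[big$p_holm < 0.05, ]
flag_bh   <- big[big$p_bh < 0.05, ]
```

With the real data, `demo` would be replaced by `BCMort88` after loading BCMort88.RData. Every county flagged by Holm is also flagged by Benjamini-Hochberg at the same level, so the choice between them comes down to which error rate matters for the funding decision.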

3. The data frame MTVR found in the file MTVR.RData gives information on n = 112 USMC Medium Terrain Vehicle Replacements (MTVRs) at Camp Lejeune, NC. Here is a data description:

Data on n = 112 MTVRs taken from internal Caterpillar engine diagnostic readings. Variables are as follows:

PTO = Percent of time in Power Takeoff (PTO) mode
Idle = Percent of time in idle mode
Miles = Number of miles driven
Load.factor = Percent of max. available power used by the engine
MPG = Fuel efficiency (miles per gallon)

Source: Penn State Applied Research Laboratory, 2013

(a) Use the pairs() command to identify two outliers: describe what they are and whether you believe it would be justified to delete these observations.

(b) Fit a least-squares regression model to predict MPG from the other variables both with and without the two outliers included. Do the outliers have a large effect on the fitted model?

(c) Refer to the regression with the two outliers removed. Plot the residuals (y-axis) versus the fitted values (x-axis) and comment on the pattern that you see. Is it consistent with what you know the correlation is between the residuals and fitted values?
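For part (c), the key fact is that in a least-squares fit with an intercept, the residuals are uncorrelated with the fitted values by construction, so the plot should show no linear trend even if it shows curvature or changing spread. The sketch below checks this on synthetic data. The data frame `demo` and its coefficients are invented for illustration; only the variable names come from the MTVR description.

```r
# Synthetic stand-in for MTVR with the outliers removed: values are invented.
set.seed(2)
n <- 110
demo <- data.frame(
  PTO         = runif(n, 0, 20),
  Idle        = runif(n, 20, 70),
  Miles       = runif(n, 500, 20000),
  Load.factor = runif(n, 10, 40)
)
demo$MPG <- 6 - 0.02 * demo$PTO - 0.03 * demo$Idle +
  0.01 * demo$Load.factor + rnorm(n, sd = 0.3)

# Least-squares fit of MPG on the other four variables.
fit <- lm(MPG ~ PTO + Idle + Miles + Load.factor, data = demo)

# Sample correlation between residuals and fitted values (zero up to rounding).
r <- cor(resid(fit), fitted(fit))

# Residuals (y-axis) versus fitted values (x-axis); drawn only in an
# interactive session so the script also runs in batch mode.
if (interactive()) {
  plot(fitted(fit), resid(fit), xlab = "Fitted MPG", ylab = "Residual")
  abline(h = 0, lty = 2)
}
```

A zero correlation rules out only a linear trend. A funnel shape or a curved band in the plot would still point to non-constant variance or a missing nonlinear term.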

Solution Preview

# setwd("")  # set this to the folder that contains the .RData files

load(file = "Mutual.RData")
load(file = "MTVR.RData")
load(file = "BCMort88.RData")

# Ex 1
funds <- c("SP500", "HighTech", "SmallCap", "USTreas", "CorpBond")
# names(mu) <- funds  # only needed if mu is not already named
mu
##    SP500 HighTech SmallCap USTreas CorpBond
##    0.06    0.10    0.08    0.02    0.04

# Dollar amounts invested in each fund, in the same order as mu
w_tom   <- c(0, 500, 500, 0, 0)
w_ellen <- rep(200, 5)

# Expected gains in dollars
mu_tom   <- sum(w_tom * mu)    # 0.10*500 + 0.08*500 = 90
mu_ellen <- sum(w_ellen * mu)  # 200 * sum(mu) = 60

# D = Ellen's gain - Tom's gain is a linear combination of the returns,
# so D is normal with mean w'mu and variance w'Vw (covariances included)
w_diff  <- w_ellen - w_tom
mu_diff <- sum(w_diff * mu)
v_diff  <- as.numeric(t(w_diff) %*% V %*% w_diff)

# P(Ellen makes more than Tom) = P(D > 0)
p_ellen_beats_tom <- 1 - pnorm(0, mean = mu_diff, sd = sqrt(v_diff))
p_ellen_beats_tom

# Ellen's expected gain ($60) is below Tom's ($90), so mu_diff = -30 and the probability is below 0.5: Tom is more likely to finish the year ahead of Ellen...
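The probability asked for in problem 1 follows from the distribution of the difference in gains, worked out below. The symbols $G_T$, $G_E$, $D$, and $w$ are notation introduced here for illustration and do not appear in the problem statement. The numerical answer depends on V, which is not reproduced on this page, so the last line is left in terms of $w^\top V w$.

```latex
\begin{align*}
X &= (X_1,\dots,X_5)^\top \sim N(\mu, V), \qquad
   \mu = (0.06,\ 0.10,\ 0.08,\ 0.02,\ 0.04)^\top \\
G_T &= 500X_2 + 500X_3, \qquad
G_E = 200\,(X_1 + X_2 + X_3 + X_4 + X_5) \\
D &= G_E - G_T = w^\top X, \qquad
   w = (200,\ -300,\ -300,\ 200,\ 200)^\top \\
D &\sim N\!\left(w^\top \mu,\ w^\top V w\right), \qquad
   w^\top \mu = 12 - 30 - 24 + 4 + 8 = -30 \\
P(G_E > G_T) &= P(D > 0)
   = \Phi\!\left(\frac{w^\top \mu}{\sqrt{w^\top V w}}\right)
   = \Phi\!\left(\frac{-30}{\sqrt{w^\top V w}}\right)
\end{align*}
```

Because the mean of $D$ is negative, the probability is below one half whatever V is. The covariance terms enter only through $w^\top V w$, which determines how far below.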
