2. The file 'f2018E1prob1.csv' contains 30 observations on two variables
'x' and 'y'. Read this into an R dataframe.
(a) Produce a scatterplot of 'y' versus 'x'. What do you see?
(b) Consider the model
$y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2, 3, \quad j = 1, \dots, 10.$
Explain in words what this model means. Do you think it is appropriate for the
data, judging by the scatterplot?
(c) Use R to fit the model
$y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2, 3, \quad j = 1, \dots, 10,$
by forming an appropriate design matrix X and using matrix algebra (not R
functions) to find the regression coefficients and the residuals. Use R² to judge
how well the model fits the data. Does it seem that the vector of residuals satisfies
the regression assumptions?
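One way part (c) could be approached is sketched below. The CSV is not reproduced here, so the data are simulated: the group means 2, 5, 9, the seed, and the assumption that 'x' takes three distinct values with 10 observations each are all made up for illustration, as are the names `levels_x`, `mu_hat`, `fitted_y`, `res` and `R2`.

```r
# Simulated stand-in for the CSV (assumption): x takes three values, 10 obs each.
set.seed(1)
x <- rep(c(1, 2, 3), each = 10)
y <- c(2, 5, 9)[x] + rnorm(30)

# Design matrix for the group-means model: one 0/1 indicator column per group.
levels_x <- sort(unique(x))
X <- outer(x, levels_x, "==") * 1        # 30 x 3 numeric matrix

# Least squares via the normal equations (X'X) mu = X'y.
mu_hat <- drop(solve(t(X) %*% X, t(X) %*% y))

# Fitted values and residuals.
fitted_y <- drop(X %*% mu_hat)
res <- y - fitted_y

# R^2 = 1 - RSS / TSS, with TSS taken about the overall mean of y.
R2 <- 1 - sum(res^2) / sum((y - mean(y))^2)

# Residual checks: constant spread across groups and approximate normality.
plot(fitted_y, res); abline(h = 0, lty = 2)
qqnorm(res); qqline(res)
```

With indicator columns, each estimated coefficient is simply the sample mean of 'y' in that group, which gives a quick check on the matrix algebra.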
3. The data set 'pima' contains data described as follows: "The National Institute of Diabetes and
Digestive and Kidney Diseases conducted a study on 768 adult female Pima Indians
living near Phoenix". We are interested in the variables 'bmi' = "Body mass
index (weight in kg/(height in metres squared))" and 'triceps' = "Triceps skin fold
thickness (mm)". The thinking is that 'bmi' and 'triceps' are related and we can
use a quick measure of triceps skin fold thickness to estimate body mass index,
which would normally require a more extensive examination.
(a) Produce a scatterplot of 'bmi' versus 'triceps'.
(b) There are many 0 values and one 99 value for 'triceps'. We will assume
these are missing values. Copy 'bmi' and 'triceps' into new variables 'bmi.new'
and 'triceps.new' with these observations omitted. We will use these for the rest
of the analysis.
(c) Produce a scatterplot of 'bmi.new' versus 'triceps.new'.
(d) Use lm to perform a regression of 'bmi.new' on 'triceps.new'. Is the regression significant?
(e) What is the regression equation for predicting 'bmi.new' from 'triceps.new'?
(g) Perform residual diagnostics. Can we believe the stated significance?
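The steps in this problem might be sketched as follows. The data frame below is a hypothetical stand-in for 'pima' with made-up values (including the 0 and 99 codes), used only so that the code runs on its own; with the real data set loaded, the simulation lines would be dropped. The names `keep` and `fit` are illustrative.

```r
# Hypothetical stand-in for 'pima' (made-up values, not the real study data).
set.seed(2)
triceps <- c(round(runif(40, 10, 50)), rep(0, 8), 99)
bmi <- round(18 + 0.4 * pmin(triceps, 50) + rnorm(49, sd = 3), 1)
pima <- data.frame(bmi = bmi, triceps = triceps)

# (b) Treat triceps values of 0 and 99 as missing and drop those observations.
keep <- pima$triceps != 0 & pima$triceps != 99
bmi.new <- pima$bmi[keep]
triceps.new <- pima$triceps[keep]

# (c) Scatterplot of the cleaned variables.
plot(triceps.new, bmi.new, xlab = "triceps.new", ylab = "bmi.new")

# (d), (e) Simple linear regression; coef() returns the intercept and slope.
fit <- lm(bmi.new ~ triceps.new)
summary(fit)     # reports the t-test on the slope and the overall F-test
coef(fit)

# (g) Residual diagnostics: residuals vs fitted values and a normal Q-Q plot.
plot(fitted(fit), resid(fit)); abline(h = 0, lty = 2)
qqnorm(resid(fit)); qqline(resid(fit))
```

The stated p-values rest on roughly constant residual variance and approximately normal residuals, which is what the last two plots are meant to probe.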
4. The file 'f2018E1 prob5.csv' contains observations on the variables 'x', 'y' and 'group'.
(a) Produce a scatterplot of 'y' versus 'x'. It has an unusual form.
(b) Produce a scatterplot of 'y' versus 'x' using a different symbol for each of
the two groups:
plot(x, y, pch=group)
What do you see?
(c) Use R to fit the two linear regressions simultaneously using a single linear
model by creating an appropriate design matrix X and using matrix algebra to find the
regression coefficients. What are they?
(d) Calculate the residuals and produce diagnostic plots. Describe what you see.
(e) Calculate and interpret the value of R² for this model.
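Parts (c) to (e) could be sketched along the following lines. The data are simulated because the CSV is not reproduced here: the two lines (1 + 2x and 8 - 0.5x), the sample size, and the coding of 'group' as 1 and 2 are assumptions, and the column names `int1`, `int2`, `slope1`, `slope2` are illustrative.

```r
# Simulated stand-in for the CSV (assumption): two groups, each with its own line.
set.seed(3)
n <- 20
group <- rep(c(1, 2), each = n)
x <- runif(2 * n, 0, 10)
y <- ifelse(group == 1, 1 + 2 * x, 8 - 0.5 * x) + rnorm(2 * n)

# (b) Different plotting symbol per group (pch needs numeric codes).
plot(x, y, pch = group)

# (c) One design-matrix column per parameter: an intercept and a slope per group.
g1 <- as.numeric(group == 1)
g2 <- as.numeric(group == 2)
X <- cbind(int1 = g1, int2 = g2, slope1 = g1 * x, slope2 = g2 * x)

# Least squares via the normal equations (X'X) beta = X'y.
beta_hat <- drop(solve(t(X) %*% X, t(X) %*% y))

# (d) Residuals and diagnostic plots.
fitted_y <- drop(X %*% beta_hat)
res <- y - fitted_y
plot(fitted_y, res); abline(h = 0, lty = 2)
qqnorm(res); qqline(res)

# (e) R^2 = 1 - RSS / TSS, with TSS taken about the overall mean of y.
R2 <- 1 - sum(res^2) / sum((y - mean(y))^2)
```

Because the two indicator columns sum to a column of ones, the model contains an overall intercept, so R² computed about the mean of 'y' has its usual interpretation as the proportion of variation explained.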