 # The data (n = 31) deals with pavement durability which contains mea...

## Question

1. The data (n = 31) deal with pavement durability and contain measurements on the following variables:
y = change in rut depth
x1 = viscosity of the asphalt
x2 = % of asphalt in the surface course
x3 = % of asphalt in the base course
x4 = % of fines in the surface course
x5 = % of voids in the surface course
x6 = run indicator

x6 is a run indicator which separates the data into two different experimental runs.
(a) Fit the full regression model with six predictors to the data set and use the ANOVA table to assess its overall fit.
(b) Exhibit the fitted equation for y when the run indicator is 1 and when it is -1.
(c) Use All Possible Subsets regression to select the best model based on their R² values. Perform all diagnostic tests and check the adequacy of this model.

2. The data were collected from a group of workers in the cotton industry to assess the prevalence of the lung disease byssinosis among these workers. The disease is caused by long-term exposure to particles of cotton, hemp, flax, and jute in this type of working environment. It can cause asthma-like symptoms, which can lead to death among sufferers. The response variable y is binary and refers to the number of workers suffering (response = yes) and not suffering (response = no), and the predictors are:
x1 = dustiness of the workplace (1 = high, 2 = medium, 3 = low)
x2 = race (1 = European, 2 = other)
x3 = sex (1 = male, 2 = female)
x4 = smoking history (1 = smoker, 2 = nonsmoker)
x5 = length of employment in the cotton industry
(1 = less than 10 years, 2 = between 10 and 20 years, 3 = more than 20 years)
Notice that all five predictors are qualitative variables and the responses are entered in the event/trial format.
(a) Fit a logistic regression model to the data set and discuss which of the predictors have a significant effect on the presence of byssinosis.
(b) Discuss the adequacy of the logistic regression model.
(c) (Needs to be done manually or by hand) From the final model you have selected in part (a), determine the probability that a person will suffer from byssinosis if given:
i. x1 = 2, x2 = 2, x3 = 1, x4 = 2, x5 = 3,
ii. x1 = 1, x2 = 2, x3 = 2, x4 = 1, x5 = 3,
iii. x1 = 2, x2 = 1, x3 = 1, x4 = 2, x5 = 2,
iv. x1 = 3, x2 = 1, x3 = 2, x4 = 2, x5 = 1.
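
As a rough illustration of the event/trial format and of the hand calculation requested in part (c), the sketch below fits a binomial `glm` to a small invented table and then reproduces one predicted probability by applying the inverse logit to the linear predictor. The data frame `toy`, its counts, and the reduced set of two predictors are assumptions made up for this example; they are not the byssinosis data and the resulting coefficients mean nothing for the actual question.

```r
# Hypothetical event/trial data: one row per covariate pattern.
# All counts are invented for illustration only.
toy <- data.frame(
  dust   = factor(c(1, 1, 2, 2, 3, 3)),
  smoke  = factor(c(1, 2, 1, 2, 1, 2)),
  yes    = c(12, 6, 4, 2, 3, 1),          # workers with the disease
  trials = c(80, 90, 100, 110, 120, 130)  # workers observed
)

# Event/trial form: the response is cbind(events, non-events).
fit_toy <- glm(cbind(yes, trials - yes) ~ dust + smoke,
               family = binomial, data = toy)

# Hand calculation for dust = 2, smoke = 2:
# add the intercept and the matching dummy coefficients, then invert the logit.
b      <- coef(fit_toy)
eta    <- b["(Intercept)"] + b["dust2"] + b["smoke2"]
p_hand <- 1 / (1 + exp(-eta))

# Cross-check against predict().
new_row <- data.frame(dust  = factor(2, levels = 1:3),
                      smoke = factor(2, levels = 1:2))
p_pred  <- predict(fit_toy, newdata = new_row, type = "response")

print(c(by_hand = unname(p_hand), by_predict = unname(p_pred)))
```

With treatment coding, the reference level of each factor (level 1 here) contributes nothing to the linear predictor, so only the coefficients for the non-reference levels that apply are added to the intercept.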

## Solution Preview


---
output:
  word_document: default
  pdf_document: default
  html_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(caret)
```

## Question 1

### (a)
```{r}
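# Load the pavement data; the response column is y
dat1 <- read.csv("asphalt.csv", stringsAsFactors = FALSE, fileEncoding = "UTF-8")

# Full model: y regressed on all six predictors
fit <- lm(y ~ ., data = dat1)
summary(fit)  # coefficient t-tests and the overall F-test
anova(fit)    # sequential (Type I) sums of squares, one row per predictor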
```

The p-value of the model's overall F-test is less than the 0.05 significance
level, which shows that at least one predictor is significant for the model.
Inspecting the p-value of each predictor, it is clear that the variables 'run'
and 'visc' are significant, as their p-values are less than 0.05.
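
For a single `lm` fit, `anova()` prints one sequential row per predictor, so the overall F-test is easier to read from `summary()` or from a comparison against the intercept-only model. The sketch below shows both routes and checks that they agree. It uses the built-in `mtcars` data purely as a stand-in, since `asphalt.csv` is not reproduced here; the names `fit_demo` and `null_demo` are illustrative.

```r
# Stand-in data: built-in mtcars, not the asphalt data.
fit_demo  <- lm(mpg ~ wt + hp + qsec, data = mtcars)
null_demo <- lm(mpg ~ 1, data = mtcars)   # intercept-only model

# Route 1: overall ANOVA as a comparison of the two nested models.
overall <- anova(null_demo, fit_demo)
print(overall)

# Route 2: the F statistic stored in the model summary.
fs <- summary(fit_demo)$fstatistic        # named: value, numdf, dendf
p_overall <- pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE)

print(c(F = unname(fs["value"]), p_value = unname(p_overall)))
```

A small overall p-value only says that at least one slope is nonzero; the per-coefficient t-tests in `summary()` are what point to individual predictors.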

### (b)

If the run indicator is 1:

y = -62.970450 + 0.003071*visc + 7.498028*X.surf + 6.225817*X.base +
0.522211*X.fines - 0.241275*X.voids - 5.386297

If the run indicator is -1:

y = -62.970450 + 0.003071*visc + 7.498028*X.surf + 6.225817*X.base +
0.522211*X.fines - 0.241275*X.voids + 5.386297
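
Because the run indicator only takes the values 1 and -1, its term can be folded into the intercept, leaving two parallel equations that differ by a constant. The short sketch below does that arithmetic with the two numbers copied from the equations above; the run coefficient of -5.386297 is inferred from the sign of the last term in each equation, and the variable names are illustrative.

```r
# Values copied from the fitted equations above.
b0    <- -62.970450   # intercept
b_run <- -5.386297    # coefficient on the run indicator (inferred from the two equations)

# Intercept after absorbing the run term.
intercept_run_pos <- b0 + b_run * 1      # run =  1
intercept_run_neg <- b0 + b_run * (-1)   # run = -1

# Constant vertical gap between the two fitted surfaces.
run_gap <- intercept_run_neg - intercept_run_pos

print(c(run_pos = intercept_run_pos,
        run_neg = intercept_run_neg,
        gap     = run_gap))
```

This gives an effective intercept of -68.356747 for run = 1 and -57.584153 for run = -1, so at the same predictor values the fitted y is 10.772594 higher when the run indicator is -1.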

### (c)
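
The question asks for all possible subsets ranked by R², whereas the chunk in this preview uses backward selection (`leapBackward`) with cross-validation, which visits only a nested path of models. As a minimal sketch of an exhaustive search, the base-R code below fits every non-empty subset of predictors and records R² and adjusted R². It runs on simulated data with three made-up predictors (`x1`, `x2`, `x3`), not on `asphalt.csv`, and all object names are illustrative.

```r
# Simulated stand-in data; the real asphalt.csv is not reproduced here.
set.seed(1)
n   <- 31
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 2 + 1.5 * sim$x1 - 0.8 * sim$x2 + rnorm(n, sd = 0.5)

predictors <- setdiff(names(sim), "y")

# Enumerate every non-empty subset of predictors (2^p - 1 models).
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit each candidate model and record its R^2 and adjusted R^2.
all_fits <- do.call(rbind, lapply(subsets, function(vars) {
  s <- summary(lm(reformulate(vars, response = "y"), data = sim))
  data.frame(size   = length(vars),
             model  = paste(vars, collapse = " + "),
             r2     = s$r.squared,
             adj_r2 = s$adj.r.squared)
}))

# Best model of each size by R^2.
best_by_size <- do.call(rbind, lapply(split(all_fits, all_fits$size),
                                      function(d) d[which.max(d$r2), ]))
print(best_by_size)
```

R² never decreases when a predictor is added, so it is only useful for comparing models of the same size; across sizes, adjusted R² (or a similar penalized criterion) is the more informative column.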

```{r}
set.seed(123)
# 10-fold cross-validation
train.control <- trainControl(method = "cv", number = 10)
# Backward selection via leaps; caret tunes nvmax, the maximum subset size
step.model <- train(y ~ ., data = dat1,
                    method = "leapBackward",
                    trControl = train.control)
step.model$results
step.model$finalModel

summary(step.model$finalModel)
coef(step.model$finalModel...
```
