# 1. Decision Trees

*(The question refers to a figure showing a partition of the X1/X2 feature space; the image is not reproduced here.)*


---
output:
  pdf_document: default
  html_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
# Libraries used for this assignment
library(ISLR)
library(rpart)
library(randomForest)
library(tree)
library(caret)
library(e1071)
```

## 1. Decision Trees

### (a)

![](1_a.png)

### (b)

![](1_b.png)

## 2. Regression Trees

### (a)

```{r}
# Setting seed
set.seed(1234)
# Splitting the data into a 75% training set and a 25% test set
sample_size <- floor(0.75 * nrow(Carseats))
# Randomly sampling 75% of the row indices
train_index <- sample(seq_len(nrow(Carseats)), size = sample_size)
# Picking training rows from the sample indexes
train <- Carseats[train_index, ]
# Picking remaining rows as test data
test <- Carseats[-train_index, ]
```
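Since `caret` is loaded in the setup chunk, an equivalent split could also be drawn with `createDataPartition`, which stratifies the sampling on the outcome so that `Sales` is distributed similarly in the two sets. This is only an alternative sketch, not part of the graded solution; the variable names `idx`, `train2`, and `test2` are illustrative.

```r
library(ISLR)
library(caret)

# Stratified 75/25 split on the response variable Sales
set.seed(1234)
idx <- createDataPartition(Carseats$Sales, p = 0.75, list = FALSE)
train2 <- Carseats[idx, ]
test2  <- Carseats[-idx, ]
```

Either approach is acceptable here; `sample()` gives a simple random split, while stratification can reduce variance in the estimated test error when the response is skewed.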

### (b)

```{r}
# Fitting decision tree on the training dataset
fit <- rpart(Sales ~ ., data=train)

# Printing fitted model
fit

# Plotting regression tree
plot(fit, main = "Regression Tree")
text(fit, use.n = TRUE, all = TRUE, cex = 0.3)

# Predicting values on test dataset
fitted <- predict(fit, test)

# Calculating Mean Squared Error (MSE)
MSE <- mean((test$Sales - fitted)^2)

# Printing MSE
cat("Test MSE is:", MSE)
```

The first decision node splits on ShelveLoc: if its value is Bad or Medium, the left subtree is followed; otherwise the right subtree is checked for the next rule. If the left subtree is taken, the next variable considered is Price, i.e. if Price is at or above the split threshold (105.5 in the fitted tree), the tree follows the left branch; otherwise the right branch. Whenever a leaf node is reached, its value (the mean of the training responses in that leaf) is the prediction for the observation.
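This traversal can be seen concretely by predicting a single test row and inspecting the split rules stored in the fitted object. A small sketch, assuming the chunks above have been run (the column selection from `fit$frame` is illustrative):

```r
# Predict the sale for one test observation; the returned value is the
# mean of the training responses in the leaf the observation falls into
predict(fit, newdata = test[1, , drop = FALSE])

# The split variables and per-node fitted values live in the model's frame;
# rows with var == "<leaf>" are terminal nodes
head(fit$frame[, c("var", "n", "yval")])
```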

### (c)

```{r}

# Plotting cross validation to get optimal tree size
plotcp(fit)
par(mfrow=c(1,2))
rsq.rpart(fit)

# Pruning the tree at the cp value with the lowest cross-validated error
prune_fit <- prune(fit, cp = fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"])

# Replotting decision tree
plot(prune_fit, uniform=TRUE,
main="Pruned Regression Tree")
text(prune_fit, use.n=TRUE, all=TRUE, cex=.4)

fitted <- predict(prune_fit, test)

# Calculating MSE for the pruned tree
MSE <- mean((test$Sales - fitted)^2)
cat("Test MSE after pruning is:", MSE)
```
The test MSE after pruning remains almost the same as before, so pruning does not appear to improve predictive performance in this case.
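One way to see why pruning changed so little is to inspect the complexity table: if the cp value minimising the cross-validated error is close to rpart's default stopping point, the "pruned" tree is nearly the full tree. A hedged sketch, assuming `fit` and `prune_fit` from the chunks above:

```r
# Cross-validated error at each complexity parameter considered
printcp(fit)

# Compare tree sizes: number of terminal (leaf) nodes before and after pruning
sum(fit$frame$var == "<leaf>")
sum(prune_fit$frame$var == "<leaf>")
```

If the two leaf counts are equal or nearly so, pruning removed few or no splits, which is consistent with the essentially unchanged test MSE.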
