## Question

a) Write a Monte Carlo simulation to determine the probability of winning the game. Use 90000 trials. You should find a probability that is close to 23%. On the line immediately before your `for` loop, have `set.seed()` with your student ID number, e.g., `set.seed(000028778)`.

Hint 1: you can generate a total of 10 moves with `x <- sample( c(-1,1),size=10,replace=TRUE)`, with -1 representing moving to the left (heads was flipped) and +1 representing moving to the right (tails was flipped).

Note 2: if we take the starting square as square 0, the square occupied by the Rook after (say) 4 moves is the 4th element of the vector of the cumulative sum of `x` (`cumsum(x)`), etc. Try making `x` and then running

```{r Q1 test case}

RNGversion("3.6.1"); set.seed(2012); x <- sample( c(-1,1),size=10,replace=TRUE)

plot.ts( cumsum(x),ylim=c(-2,10))

abline(h=c(0,7),col="red")

RNGversion("3.6.1"); set.seed(2022); x <- sample( c(-1,1),size=10,replace=TRUE)

plot.ts( cumsum(x),ylim=c(-2,10))

abline(h=c(0,7),col="red")

```

In the first instance, you win the game since the Rook stays on the chessboard (all positions are between 0 and 7). In the second instance, you lose the game since the Rook falls off on move 5.

Note 3: if any elements of `cumsum(x)` are negative or greater than 7, at some point the Rook would have fallen off and you would have lost the game! Equivalently, you win only if every element of `cumsum(x)` is one of 0, 1, 2, ..., 7.
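As a rough sketch of Note 3, the win condition can be wrapped in a small helper and checked against every possible game. The function name `rook_survives` and the variable names are hypothetical, and the sketch assumes the Rook starts on square 0 and must stay on squares 0 through 7 for all 10 moves. Because each of the 2^10 = 1024 move sequences is equally likely, counting the surviving ones gives an exact probability to compare with a simulated estimate.

```r
# Hypothetical helper: TRUE if the Rook never leaves squares 0..7
rook_survives <- function(x) {
  position <- cumsum(x)                 # square occupied after each move
  all(position >= 0 & position <= 7)
}

# Enumerate all 2^10 equally likely sequences of 10 moves (-1 = left, +1 = right)
all.moves <- as.matrix(expand.grid(rep(list(c(-1, 1)), 10)))
survived <- apply(all.moves, 1, rook_survives)

exact.p <- mean(survived)               # fraction of sequences that win
exact.p
```

Enumeration is only practical here because the game is short; for longer games the Monte Carlo approach in part (a) is the realistic option.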

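```{r Q1a}
n.trials.q1a <- 90000
wins <- 0

set.seed(000483096)
for (trial in 1:n.trials.q1a) {
  x <- sample( c(-1,1),size=10,replace=TRUE)   # 10 moves: -1 = left, +1 = right
  position <- cumsum(x)                         # square occupied after each move
  if (all(position >= 0 & position <= 7)) wins <- wins + 1   # Rook never fell off
}

wins/n.trials.q1a   # estimated probability of winning
```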

b) Your estimated probability will be different than other people's in the class because everyone used different parts of the random number stream to run the simulation. Some of your classmates' estimated probabilities will be "close" to the actual probability, while others may be farther off. Using your estimated probability from (a), calculate the numerical value of the standard error, then explain in plain English what its value tells you.

```{r Q1b}

```
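One way to build intuition for the standard error is to imitate the whole class. The sketch below assumes a win probability of 0.235 (a stand-in value, not your estimate from (a)) and uses `rbinom` to generate the estimates that 1000 hypothetical classmates might obtain; the spread of those estimates should be close to the value from the usual formula sqrt(p(1-p)/n). All variable names here are illustrative.

```r
set.seed(1)                  # arbitrary seed for this illustration
n.trials <- 90000
p.assumed <- 0.235           # assumed win probability, NOT a real result

# Standard error from the usual formula
se.formula <- sqrt(p.assumed * (1 - p.assumed) / n.trials)

# 1000 hypothetical classmates each estimate p from 90000 trials
estimates <- rbinom(1000, size=n.trials, prob=p.assumed) / n.trials
se.simulated <- sd(estimates)   # typical distance between an estimate and p

c(formula=se.formula, simulated=se.simulated)
```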

**Response:**

c) Construct a 95% confidence interval for p using your results from (a) and (b) and print its lower/upper limits to the screen. Also show the confidence interval output by `binom.test`.

```{r Q1c}

```
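A minimal sketch of the two intervals, assuming a made-up count of 21150 wins in 90000 trials (replace it with your own `wins`). The first interval uses the normal approximation, estimate plus or minus about 1.96 standard errors; the second is the exact interval that `binom.test` reports in its `conf.int` component. With this many trials the two should nearly coincide.

```r
n.trials <- 90000
wins.example <- 21150            # hypothetical count, not a real result
p.hat <- wins.example / n.trials
se <- sqrt(p.hat * (1 - p.hat) / n.trials)

# Normal-approximation interval: estimate +/- z * standard error
ci.normal <- p.hat + c(-1, 1) * qnorm(0.975) * se
ci.normal

# Exact (Clopper-Pearson) interval reported by binom.test
ci.exact <- binom.test(wins.example, n.trials)$conf.int
ci.exact
```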

d) Is your confidence interval from (c) "correct"? In other words, does it cover the true value of p? Justify your answer.

**Response:**

e) If each of the 1000 Stat 201 students completed parts (a) through (c), about how many of the confidence intervals would you expect to cover p and how many would not? How could you identify which ones got it right and which ones got it wrong?

**Response:**

f) Imagine one of the Stat 201 students found a 95% confidence interval of the form (0.018, 0.020). 1. What's wrong with saying "There's a 95% chance that p is between 1.8% and 2.0%"? 2. The 95% confidence does indeed refer to *something* that has a 95% chance of occurring. What?

**Response 1:**

**Response 2:**

********************

**Question 2:** An inventory control policy has been proposed and it's your job to determine the probability that a stockout will occur when using it. You decide to write a Monte Carlo simulation to estimate it. The boss wants to make sure that the margin of error of your estimate is no more than 0.5 percentage points, i.e. 0.005.

a) What is the relationship between the margin of error of an estimated probability and the standard error of an estimated probability?

**Response:**

b) With no information about what the probability of a stockout might be, what is the maximum number of trials that your Monte Carlo simulation must have to achieve the desired margin of error?

```{r Q2b}

```

c) A trial takes quite a while to run, so your answer to (b) is a little unsettling. You decide to run 50 trials to get an initial estimate, and it turns out that a stockout occurred in 3 of them. Based on this additional information, about how many trials total will be necessary to achieve the desired margin of error?

```{r Q2c}

```
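The sample-size reasoning in parts (b) and (c) can be sketched with a small helper. The function name `trials_needed` is hypothetical, and the sketch assumes the usual normal-approximation margin of error, z * sqrt(p(1-p)/n), solved for n. Since p(1-p) is largest at p = 0.5, that value gives the worst case when nothing is known about p.

```r
# Hypothetical helper: trials needed so that z * sqrt(p(1-p)/n) <= moe
trials_needed <- function(p, moe, conf=0.95) {
  z <- qnorm(1 - (1 - conf) / 2)     # about 1.96 for 95% confidence
  ceiling(z^2 * p * (1 - p) / moe^2) # round up to a whole number of trials
}

n.worst.case <- trials_needed(0.5, moe=0.005)    # no information about p
n.with.pilot <- trials_needed(3/50, moe=0.005)   # pilot estimate: 3 stockouts in 50

c(worst.case=n.worst.case, with.pilot=n.with.pilot)
```

Using the rounded value 1.96 instead of `qnorm(0.975)` can shift the answers by a trial or two.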

********************

**Question 3:** It's possible to do more than just buy and sell stocks on the stock market. Options contracts (e.g. put and call options) can be purchased which allow the option to buy or sell stock at a later date for a particular price. For example, Disney stock (DIS) closed at 138.84 on Sep 5, 2019. We could purchase a call option that allows us the ability to buy DIS at 140 in two weeks (if we choose to exercise the option). If we think DIS is going to increase a lot over the course of two weeks (e.g. to 145), this would allow us to buy DIS for cheaper than market price, and we can immediately resell for a profit. "Exotic options" can be more complex (e.g., an option to buy DIS at 140 in two weeks but only if the price never goes above 142 during that time span).

When it comes to pricing options, having accurate probabilities is a necessity! Let's write a Monte Carlo simulation to explore various probabilities.

For this Monte Carlo simulation, we're going to assume that stock prices move "at random" and cannot be predicted by other factors. The random walk hypothesis (https://en.wikipedia.org/wiki/Random_walk_hypothesis) posits that this is indeed the case, though many academics and economists disagree. Regardless, if movements are predictable, the ability to predict them is rather minimal.

Imagine that each day, the stock's price changes by a percentage of its current value, anywhere from losing up to 1% of its current value to gaining up to 1.05% of its current value. If we write the percentage change as a decimal between -0.01 and 0.0105, the model is:

`Price on Day 2 = Price on Day 1 * ( 1 + random percentage change )`

`Price on Day 3 = Price on Day 2 * ( 1 + random percentage change )`

`Price on Day 4 = Price on Day 3 * ( 1 + random percentage change )`

The command `runif(1,min=-0.01,max=0.0105)` creates one random number between -0.01 and 0.0105 that can give the percentage change. The average daily change under this scheme is 0.025%, which compounds over 252 trading days to an average yearly return of about 6.5%.

```{r Q3}

runif(1,min=-0.01,max=0.0105)

runif(1,min=-0.01,max=0.0105)

runif(1,min=-0.01,max=0.0105)

```
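The model amounts to repeated multiplication, so a short price path can also be written without a loop. This is only an illustrative sketch: the seed, the 5-day horizon, and the variable names are arbitrary choices, and part (a) below still asks for a `for` loop.

```r
set.seed(1)       # arbitrary seed for this illustration
n.days <- 5       # short horizon, just to show the mechanics

# One random percentage change for each day after the first
change <- runif(n.days - 1, min=-0.01, max=0.0105)

# Each price is the previous price times (1 + change)
price <- 100 * cumprod(c(1, 1 + change))
price
```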

a) Let the starting price on day 1 of a stock be 100. Define the variable `price` to be equal to 100 to reflect this starting condition. Using a `for` loop, evolve the price of the stock in accordance with our model and define the 2nd, 3rd, ..., 252nd elements of `price` accordingly (252 trading days in a year). Use `set.seed(471)` on the line immediately before your `for` loop. Plot the contents of `price` with `plot.ts(price)`.

```{r Q3a 252 trading days}

#Sanity check for first 6 elements of price

#100.00000 99.02010 98.28398 98.76700 99.44171 99.61581

```

b) The evolution in (a) was just one of a nearly infinite number of possibilities for how the stock's price could have changed. If you run the code with a different random number seed, you'll see quite a different trajectory. I'm interested in the probability that, after 252 trading days, the price of the stock is 110 or larger. Estimate this probability with a Monte Carlo simulation using 5000 trials (you should find a value a little larger than 1/3). You don't need to set a random number seed.

* Hint 1: initialize a variable `counter` to be equal to 0. After each trial, check to see if `price[252]` is at least 110. If it is, bump up the value of `counter` by 1.

* Hint 2: you can basically copy/paste your code from (a), putting it into the `for` loop associated with your Monte Carlo simulation (taking out `set.seed(471)` so you don't get the same evolution every time, and taking out `plot.ts(price)` so you don't make 5000 plots).

* Hint 3: make sure the name of your looping variable representing the trial number is different than the name of the looping variable you used to evolve the stock's price!
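The three hints fit together roughly as in the sketch below. It is one possible layout rather than the required solution: the seed is arbitrary (the question does not ask for one), and the names `n.trials`, `n.days`, `trial`, and `day` are illustrative.

```r
set.seed(2019)     # arbitrary; only here to make the sketch reproducible
n.trials <- 5000
n.days <- 252
counter <- 0       # number of trials ending at 110 or above

for (trial in 1:n.trials) {            # outer loop: one trial per year simulated
  price <- rep(100, n.days)            # day 1 price is 100
  for (day in 2:n.days) {              # inner loop: evolve the price day by day
    price[day] <- price[day-1] * (1 + runif(1, min=-0.01, max=0.0105))
  }
  if (price[n.days] >= 110) counter <- counter + 1
}

counter / n.trials   # estimated probability of finishing at 110 or above
```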

```{r Q3b}

```

c) Having your probabilities wrong by a fraction of a percent can be disastrous if you're trading on the stock market. The margin of error of your estimate in (b) from 5000 trials is about 0.0135. How many total trials are required to shrink the margin of error by a factor of 10 (i.e., from 0.0135 to 0.00135)? Note: you shouldn't need to do any computation in R to be able to answer this.
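The scaling behind this question can be written out as follows. This is a sketch assuming the usual normal-approximation margin of error with $z \approx 1.96$, where $n$ is the current number of trials and $n'$ (a symbol introduced here) is the new number of trials:

$$
\text{MOE}_n = z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \propto \frac{1}{\sqrt{n}}
\quad\Longrightarrow\quad
\frac{\text{MOE}_{n'}}{\text{MOE}_{n}} = \sqrt{\frac{n}{n'}} = \frac{1}{10}
\quad\Longleftrightarrow\quad
n' = 100\,n
$$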

**Response:**

********************

**Question 4:** A company has redesigned their banner ad with the hope that it will increase the click-thru rate. Extensive data from web-hosting services reveal that the old ad has had a click-thru rate of 0.0218. After a week, 32112 visitors had seen the new ad on various websites, and 726 had clicked on it (for a click-thru rate of about 0.0226). Assuming the fraction of visitors that have clicked on the ad during that week provides a reasonable estimate of the probability that a random visitor clicks on the ad in the future, does the evidence suggest that the click-thru rate has improved? Support your answer by finding a 95% confidence interval for p using `binom.test`.

```{r Q4}

```
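A minimal sketch of the `binom.test` call, using the counts given in the question. The variable names are illustrative; the final line simply checks whether the old click-thru rate lies inside the interval, which is the comparison the question asks you to interpret.

```r
clicks <- 726
visitors <- 32112
old.rate <- 0.0218

# Exact 95% confidence interval for the new ad's click-thru probability
result <- binom.test(clicks, visitors, conf.level=0.95)
ci <- result$conf.int
ci

# Does the interval contain the old rate?
old.rate >= ci[1] & old.rate <= ci[2]
```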

**Response:**

## Solution Preview


## Part A

### Problem 1

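```{r Problem 1}
library(ggplot2)  # provides the mpg data set (and ggplot, used in Problem 2)

# Build a data frame from the columns of `data` given in `...`;
# each argument may be a column index or a column name
selectCols <- function(data, ...) {
  df <- data.frame(matrix(ncol=0, nrow=nrow(data)))
  args <- list(...)
  for (arg in args) {
    df <- cbind(df, data[, arg, drop=FALSE])  # drop=FALSE keeps the column name
  }
  return(df)
}

data <- selectCols(mpg, 2, 'displ', 'drv')
```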

### Problem 2

```{r Problem 2}

plotCols <- function(data) {

varnames <- colnames(data)

for (i in varnames) {

coldata <- data[[i]]

g <- ggplot(data=data)...
```
