## Question

Data for this exercise.

```{r}
Year <- c(1936, 1946, 1951, 1963, 1975, 1997, 2006)
CaloriesPerRecipeMean <- c(2123.8, 2122.3, 2089.9, 2250.0, 2234.2, 2249.6, 3051.9)
CaloriesPerRecipeSD <- c(1050.0, 1002.3, 1009.6, 1078.6, 1089.2, 1094.8, 1496.2)
```

# Exercise 1.

Revisit Exercise 1 from Homework 6. Print a list of messages describing the effect size for each unique pair of means from `CaloriesPerRecipe`. Your messages should look like

```

The difference between 1936 and 1946 is ####. This is a ???? difference.

The difference between 1936 and 1951 is ####. This is a ???? difference.

...

```

`####` should be replaced with the absolute value of the difference between the pair of means, and `????` will be `small`, `medium` or `large`. Calculate Cohen's $d$ for each pair, and use $d<0.2$ to determine small effects and $d>0.8$ for large effects; anything in between is a medium effect.
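As a reminder, one common form of Cohen's $d$ is sketched below. It assumes equal group sizes, so the pooled standard deviation is the root mean square of the two standard deviations; if Homework 6 used a different pooling formula, use that one instead.

$$
d = \frac{\lvert M_1 - M_2 \rvert}{\sqrt{\left(S_1^2 + S_2^2\right)/2}}
$$

For example, for 1936 and 2006 the difference is $\lvert 2123.8 - 3051.9 \rvert = 928.1$ and the pooled SD is $\sqrt{(1050.0^2 + 1496.2^2)/2} \approx 1292.5$, so $d \approx 0.72$, a medium effect under the thresholds above.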

Print each message on a single line. The result will look better if you use `cat` in R.

If you use SAS, use `scan` to iterate over the list macro variables `CaloriesPerRecipeMean` and `CaloriesPerRecipeSD`. You can put the results to the log.

# Exercise 2.

Calculate MSW, MSB, $F$ and $p$ for the data from Wansink Table 1, but start with the strings:

```{r}
Means <- "268.1 271.1 280.9 294.7 285.6 288.6 384.4"
StandardDeviations <- "124.8 124.2 116.2 117.7 118.3 122.0 168.3"
SampleSizes <- "18 18 18 18 18 18 18"
```

Tokenize the strings, then convert the tokens to create vectors of numeric values. Use these vectors to compute and print MSW, MSB, $F$ and $p$, reusing the formulas from Homework 4 or 6. Name the vectors appropriately so that you can reuse your code.

If you use SAS, do this in a macro. Use local macro variables to accumulate sums, and `%put` to report the results.

Compare your results to those from the previous homework, or to the resource given in the previous homework, to confirm that the text was correctly converted to numeric values.
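One way to approach this in R is sketched below. The helper `to.numeric` and the short vector names `M`, `S`, `n` are illustrative choices, and the formulas are the standard one-way ANOVA expressions from summary statistics, which are assumed (not confirmed) to match those from Homework 4 or 6.

```r
# Strings as given in the exercise
Means <- "268.1 271.1 280.9 294.7 285.6 288.6 384.4"
StandardDeviations <- "124.8 124.2 116.2 117.7 118.3 122.0 168.3"
SampleSizes <- "18 18 18 18 18 18 18"

# Hypothetical helper: split on whitespace, then convert tokens to numeric
to.numeric <- function(s) as.numeric(strsplit(trimws(s), "\\s+")[[1]])

M <- to.numeric(Means)
S <- to.numeric(StandardDeviations)
n <- to.numeric(SampleSizes)

k <- length(M)  # number of groups
N <- sum(n)     # total sample size

# Assumed formulas: one-way ANOVA from group means, SDs and sizes
grand.mean <- sum(n * M) / N
MSB <- sum(n * (M - grand.mean)^2) / (k - 1)
MSW <- sum((n - 1) * S^2) / (N - k)
F.ratio <- MSB / MSW
p <- pf(F.ratio, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

cat("MSW =", MSW, " MSB =", MSB, " F =", F.ratio, " p =", p, "\n")
```

If any token failed to convert, `as.numeric` would return `NA` with a warning, so checking `anyNA(M)` is a quick way to confirm the conversion worked.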

# Exercise 4.

Download the two files `zero.to.60.csv` and `quarter.mile.csv`. These are records of motorcycle performance for a standing start to 60 mph and for quarter-mile time. Each table has a column identifying the make and model for each entry, but the name of this column is different in each table.

## Part a.

There are some duplicates, so compute a mean of `Time` for each motorcycle, from both tables.
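A minimal sketch of the averaging step, using `aggregate`. The two data frames below are toy stand-ins for the CSV files: every name and time in them is invented, and the name columns `Motorcycle` and `Bike` are hypothetical, so substitute the real column names after reading the files with `read.csv`.

```r
# Toy stand-ins for the two files; all rows are made up for illustration
zero.to.60 <- data.frame(
  Motorcycle = c("Make A Model X", "Make A Model X", "Make B Model Y", "Make C Model Z"),
  Time = c(3.0, 3.2, 4.1, 5.0)
)
quarter.mile <- data.frame(
  Bike = c("Make A Model X", "Make B Model Y", "Make B Model Y", "Make D Model W"),
  Time = c(11.0, 12.4, 12.6, 13.9)
)

# One row per motorcycle, with duplicates collapsed to their mean Time
zero.means <- aggregate(Time ~ Motorcycle, data = zero.to.60, FUN = mean)
quarter.means <- aggregate(Time ~ Bike, data = quarter.mile, FUN = mean)

print(zero.means)
print(quarter.means)
```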

## Part b.

Create a new table with these means, but use only those motorcycles that are in both tables. You will need to merge these by names.
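Because the name column differs between the tables, `merge` needs `by.x` and `by.y`. The sketch below uses invented per-motorcycle means and hypothetical column names (`Motorcycle`, `Bike`); the default inner join keeps only motorcycles present in both tables.

```r
# Toy per-motorcycle means; all names and values are made up
zero.means <- data.frame(
  Motorcycle = c("Make A Model X", "Make B Model Y", "Make C Model Z"),
  Time = c(3.1, 4.1, 5.0)
)
quarter.means <- data.frame(
  Bike = c("Make A Model X", "Make B Model Y", "Make D Model W"),
  Time = c(11.0, 12.5, 13.9)
)

# Inner join on the differently named key columns; the suffixes
# distinguish the two Time columns in the result
both <- merge(zero.means, quarter.means,
              by.x = "Motorcycle", by.y = "Bike",
              suffixes = c(".0to60", ".quarter"))

print(both)
```

Only the two motorcycles that appear in both toy tables survive the merge; the key column keeps the name given in `by.x`.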

## Part c.

Plot the relationship between 0-to-60 time and quarter mile time.
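A base-graphics scatter plot is probably enough here. The sketch assumes a merged table with hypothetical columns `Time.0to60` and `Time.quarter`; the values are invented placeholders.

```r
# Toy merged table; replace with the table built in Part b
both <- data.frame(
  Motorcycle = c("Make A Model X", "Make B Model Y", "Make E Model V"),
  Time.0to60 = c(3.1, 4.1, 4.8),
  Time.quarter = c(11.0, 12.5, 13.2)
)

# 0-to-60 time on the x axis, quarter-mile time on the y axis
plot(both$Time.0to60, both$Time.quarter,
     xlab = "0-to-60 mph time (s)",
     ylab = "Quarter-mile time (s)",
     main = "Quarter-mile time vs. 0-to-60 time")
```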

# Exercise 5.

Read either file, `zero.to.60.csv` or `quarter.mile.csv`, from Exercise 4 into a table. Use partial matching to show the following sets of entries. You can assume the make is the first word in the motorcycle name, and the model is the remaining words.

## Part a

What entries in this list were made by `BMW`?

## Part b

Which entries include `Ninja` in the model name?

## Part c

List the motorcycles with model names ending in `R` (for racing, I suppose).

## Part d

List the motorcycles that might be smaller than 'liter' bikes (engine size < 1000 cc), based on their name. First, exclude motorcycles with `1` in the name (these will mostly be 1000+ numbers). From that set of names, select those with digits in the range `2-9` in their names.
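The four parts can be sketched with `grepl` and `sub`. The vector `name` below is a hand-written sample of model names used only as test strings; it is not the contents of either CSV file, so replace it with the name column from the table you read.

```r
# Sample strings for illustration only (not taken from the data files)
name <- c("BMW S1000RR", "BMW F800GS", "Kawasaki Ninja ZX-6R",
          "Kawasaki Ninja 250R", "Honda CBR600RR", "Yamaha YZF-R1",
          "Suzuki GSX-R750", "Ducati Monster 696", "Harley-Davidson Sportster")

# Make is the first word; model is everything after the first space
make <- sub(" .*$", "", name)
model <- sub("^[^ ]+ ", "", name)

# Part a: entries made by BMW
bmw <- name[make == "BMW"]

# Part b: entries with "Ninja" in the model name
ninja <- name[grepl("Ninja", model)]

# Part c: model names ending in "R"
ends.in.R <- name[grepl("R$", model)]

# Part d: drop names containing "1", then keep those with a digit 2-9
no.one <- name[!grepl("1", name)]
sub.liter <- no.one[grepl("[2-9]", no.one)]

print(bmw)
print(ninja)
print(ends.in.R)
print(sub.liter)
```

Matching on `model` rather than `name` in Parts b and c keeps a make that happens to contain the pattern from producing a false hit.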

## Solution Preview


# Exercise 1.

```{r}
# Cohen's d for two groups, pooling the SDs as their root mean square
cohen.d <- function(M_1, M_2, S_1, S_2) {
  abs(M_1 - M_2) / sqrt((S_1^2 + S_2^2) / 2)
}

len <- length(Year)
df <- data.frame()

# Collect one row per unique pair of years (i < j)
for (i in seq(1, len - 1)) {
  for (j in seq(i + 1, len)) {
    r <- data.frame(Year1 = Year[i], Year2 = Year[j],
                    Mean1 = CaloriesPerRecipeMean[i],
                    Mean2 = CaloriesPerRecipeMean[j],
                    CohenD = cohen.d(CaloriesPerRecipeMean[i], CaloriesPerRecipeMean[j],
                                     CaloriesPerRecipeSD[i], CaloriesPerRecipeSD[j]))
    df <- rbind(df, r)
  }
}

df$diff <- abs(df$Mean1 - df$Mean2)
df$effect <- ifelse(df$CohenD < 0.2, "small",
                    ifelse(df$CohenD > 0.8, "large", "medium"))

# Print one message per row. Looping over rows instead of apply() avoids
# coercing the numeric columns to padded strings and printing a list of NULLs.
for (k in seq_len(nrow(df))) {
  cat("The difference between ", df$Year1[k], " and ", df$Year2[k], " is ",
      df$diff[k], ". This is a ", df$effect[k], " difference.\n", sep = "")
}
```
