# Part 1 Merge the most relevant data found in the 3 tables (golub....

## Question

Part 1
Merge the most relevant data found in the 3 tables (golub.gnames, golub, and golub.cl) that make up the golub data in library(multtest) into one data.frame with the following properties:

Name: golub.df
Dimensions: patient rows and named gene columns, and an additional named column for the cancer classifications
Column Names: use the gene name (column 2) from golub.gnames and “classification”

Classification Column: use a factor column in golub.df that uses "ALL" and "AML" as the classifications
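The properties listed above can be sketched as a small helper. This is a minimal illustration, not the tutor's solution: `build_expr_df` and the toy inputs are hypothetical names introduced here, and the toy matrix stands in for the real data so the sketch runs without multtest installed.

```r
# Hypothetical helper (not from the original solution): build a patient-by-gene
# data frame from a gene-by-patient matrix, a vector of gene names, and 0/1 labels.
build_expr_df <- function(expr, gene_names, cl) {
  stopifnot(nrow(expr) == length(gene_names), ncol(expr) == length(cl))
  df <- as.data.frame(t(expr))   # transpose: patients become rows
  colnames(df) <- gene_names     # one named column per gene
  # extra factor column: 0 is labeled "ALL", 1 is labeled "AML"
  df$classification <- factor(cl, levels = 0:1, labels = c("ALL", "AML"))
  df
}

# Toy stand-in data (made up for illustration): 3 genes, 4 patients
set.seed(1)
toy_expr  <- matrix(rnorm(12), nrow = 3, ncol = 4)
toy_names <- c("geneA", "geneB", "geneC")
toy_cl    <- c(0, 0, 1, 1)

toy.df <- build_expr_df(toy_expr, toy_names, toy_cl)
dim(toy.df)
str(toy.df$classification)

# With the real data (assumes multtest is installed):
# data(golub, package = "multtest")
# golub.df <- build_expr_df(golub, golub.gnames[, 2], golub.cl)
```

Building the frame from the transposed matrix avoids hard-coding the dimensions, and the `stopifnot` check catches a mismatch between the matrix, the names, and the class labels before any columns are assigned.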

Part 2
Answer Chapter 3 Exercise 9 and a new Exercise 8 below using your new golub.df data.frame. Try not to cheat by reformulating the answers in the book – unless you get really stuck.

Exercise (Re-worded so as to use your golub.df dataframe)
a) Perform a hypothesis test to see whether the expression values for the Zyxin gene for the ALL patients are normally distributed.
b) Plot the PDF of N(0.3, 0.75²) on top of the histogram of the expression values for the Zyxin gene for the ALL patients, for the range -2 < x < 2. Does N(0.3, 0.75²) appear to model the data well?
c) Extra credit: Perform a hypothesis test to see whether the expression values for the Zyxin gene for the ALL patients are distributed according to N(0.3, 0.75²).
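One possible way to approach parts a) to c) is sketched below. It rests on several assumptions: the second parameter of the normal distribution is read as a variance of 0.75 squared (standard deviation 0.75), the gene column in golub.df is named "Zyxin", and `demo.df` is a simulated stand-in for golub.df so the code runs without multtest. The choice of the Shapiro-Wilk and Kolmogorov-Smirnov tests is also an assumption; the exercise does not name a specific test.

```r
# Simulated stand-in for golub.df (NOT real expression values), so the sketch
# runs on its own; replace demo.df with your own golub.df.
set.seed(42)
demo.df <- data.frame(
  Zyxin = rnorm(38, mean = 0.3, sd = 0.75),
  classification = factor(rep(c("ALL", "AML"), times = c(27, 11)))
)

# Expression values of the Zyxin column for the ALL patients only
zyx.all <- demo.df[demo.df$classification == "ALL", "Zyxin"]

# a) Shapiro-Wilk test; H0: the values come from a normal distribution
sw <- shapiro.test(zyx.all)
print(sw)

# b) Density-scaled histogram on -2 < x < 2 with the N(0.3, 0.75^2) PDF on top
hist(zyx.all, freq = FALSE, xlim = c(-2, 2),
     main = "Zyxin, ALL patients", xlab = "expression value")
curve(dnorm(x, mean = 0.3, sd = 0.75), from = -2, to = 2, add = TRUE)

# c) One-sample Kolmogorov-Smirnov test against the fully specified
#    N(0.3, 0.75^2) distribution
ks <- ks.test(zyx.all, "pnorm", mean = 0.3, sd = 0.75)
print(ks)
```

In both tests a small p-value is evidence against the null hypothesis. Setting `freq = FALSE` puts the histogram on the density scale, which is what makes it comparable to the curve drawn by `dnorm`.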

Part 3
Answer the Chapter 4 Exercises 1, 3, 6, 8, and 10 using your new golub.df data.frame. Try not to cheat by reformulating the answers in the book – unless you get really stuck.

Notes and Hints:
1. In Parts 2 and 3, use only your golub.df data.frame from Part 1 to answer the questions. (Do not use the original golub, golub.cl, and golub.gnames matrices and vector.)
2. Don't use the gene index or gene ID (columns 1 and 3 in golub.gnames) in your new data.frame. Just have named gene columns and one extra named column for the cancer classification.

## Solution Preview


data(golub, package = "multtest")

dim(golub)
#this returns 3051 by 38

#let's transpose the matrix
golubt = t(golub)

dim(golubt)

#this returns 38 by 3051

#this creates a data frame filled with NAs: 3051 gene columns
#plus one extra column for the classification
golub.df = as.data.frame(matrix(nrow=38,ncol=3052))

#this fills the first 3051 columns of the data frame with the
#transposed Golub data
golub.df[1:38,1:3051]=golubt

#this builds the classification factor from golub.cl
#(0 is labeled "ALL", 1 is labeled "AML")
gol.fac <- factor(golub.cl,levels=0:1, labels= c("ALL","AML"))...
