Question

Problem 1
On the Golub et al. (1999) data set, we consider the correlation between the Zyxin gene expression values and each of the genes in the data set.
(a) How many of the genes have correlation values less than negative 0.5? (Those genes are highly negatively correlated with the Zyxin gene.)
(b) Find the gene names for the top five genes that are most negatively correlated with the Zyxin gene.
(c) Using the t-test, how many genes are negatively correlated with the Zyxin gene? Use a false discovery rate of 0.05. (Hint: use cor.test() to get the p-values then adjust for FDR. Notice that we want a one-sided test here.)

Problem 2
On the Golub et al. (1999) data set, regress the expression values for the GRO3 GRO3 oncogene on the expression values of the GRO2 GRO2 oncogene.
(a) Is there a statistically significant linear relationship between the two genes’ expression? Use appropriate statistical analysis to make the conclusion. What proportion of the GRO3 GRO3 oncogene expression’s variation can be explained by the regression on GRO2 GRO2 oncogene expression?
(b) Test if the slope parameter is less than 0.5 at the α = 0.05 level.
(c) Find an 80% prediction interval for the GRO3 GRO3 oncogene expression when GRO2 GRO2 oncogene is not expressed (zero expression value).
(d) Check the regression model assumptions. Can we trust the statistical inferences from the regression fit?

Solution Preview


# Problem 1
# On the Golub et al. (1999) data set, we consider the correlation between the Zyxin gene
# expression values and each of the genes in the data set.

library("multtest")
data(golub)
dim(golub)
# [1] 3051   38
golub <- data.frame(golub)
gol.fac <- factor( golub.cl, levels=0:1, labels=c("ALL","AML"))
# (a) How many of the genes have correlation values less than negative 0.5? (Those genes are
# highly negatively correlated with the Zyxin gene.)
golub.gnames[2124,]
# [1] "4847"      "Zyxin"    "X95735_at"
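# Correlate every gene (row) with the Zyxin expression profile (row 2124).
zyxin <- as.numeric(golub[2124, ])
correlations <- apply(golub, 1, cor, zyxin)
# The question asks for correlations below negative 0.5, so the threshold is -0.5.
correlations.less.than.neg.05 <- correlations < -0.5
# sum() of a logical vector counts its TRUE entries, i.e. the number of such genes.
sum(correlations.less.than.neg.05)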
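
The count for part (a) can be cross-checked without apply(), because cor() correlates the columns of a matrix with a vector. The following is a minimal sketch, assuming the expression data are a genes-by-samples numeric matrix or data frame; count_below is a hypothetical helper name, not a multtest function.

```r
# Hypothetical helper: count genes whose correlation with a reference
# gene (given by row index) falls below a threshold.
count_below <- function(expr, ref.row, threshold = -0.5) {
  expr <- as.matrix(expr)
  # cor() works column-wise, so transpose to put genes in columns.
  r <- cor(t(expr), expr[ref.row, ])[, 1]
  sum(r < threshold)
}

# With the objects loaded above, the call would be:
# count_below(golub, 2124)
```

The default threshold is -0.5, matching the wording "less than negative 0.5" in the question.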

# (b) Find the gene names for the top five genes that are most negatively correlated with
# the Zyxin gene.
o <- order(correlations)
golub.gnames[o,][1:5,2]
# [1] "Macmarcks"                                                                                                      
# [2] "Inducible protein mRNA"                                                                                       
# [3] "C-myb gene extracted from Human (c-myb) gene, complete primary cds, and five complete alternatively spliced cds"
# [4] "Oncoprotein 18 (Op18) gene"                                                                                    
# [5] "54 kDa protein mRNA"...
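
For part (c), the hint in the question points to cor.test() with a one-sided alternative, followed by an FDR adjustment of the p-values. The following is a minimal sketch of that approach and not necessarily the method used in the full solution; count_negatively_correlated is a hypothetical helper name, and the use of p.adjust() with method "fdr" (Benjamini-Hochberg) is an assumption about which FDR procedure is intended.

```r
# Hypothetical helper: number of genes significantly negatively
# correlated with a reference gene at a given false discovery rate.
count_negatively_correlated <- function(expr, ref.row, fdr = 0.05) {
  expr <- as.matrix(expr)
  ref <- expr[ref.row, ]
  # One-sided t-test per gene: H0 "rho >= 0" against H1 "rho < 0".
  p.values <- apply(expr, 1, function(x)
    cor.test(x, ref, alternative = "less")$p.value)
  # Benjamini-Hochberg adjustment ("fdr" is an alias for "BH").
  p.adjusted <- p.adjust(p.values, method = "fdr")
  sum(p.adjusted < fdr)
}

# With the objects loaded above, the call would be:
# count_negatively_correlated(golub, 2124)
```

The reference gene is tested against itself as well; its correlation is 1, so its one-sided p-value is 1 and it never enters the count.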
