Question

Transcribed Text

Study the effect of outliers on factor analysis using the hemangioma data in Table 8.2.

1) Identify the outliers using marginal plots and scatter plots.
2) Perform the factor analysis and find the appropriate number of factors.
3) Repeat step 2) with the outliers excluded.
4) Compare the factor loadings between 2) and 3) and summarize your findings.
5) How do the factors and loadings differ between the two analyses (with and without outliers)? Explain why.

Table 8.2: Age, in days, and expression of genetic markers for infants who were surgically treated for hemangioma

Age    RB   p16      DLK  Nanog  C-Myc  EZH2  IGF-2
81    2.0  3.07   308975     94   6.49  2.76  11176
95    6.5  1.90    70988    382   1.00  7.09   5340
95    3.6  3.82   153061    237   0.00  5.57   6310
165   1.9  3.74   596992     88   0.00  2.47   7009
286   2.6  5.17   369601    282  12.23  1.63   7104
299   2.9  5.76  1119258    177   8.76  3.51   9342
380   1.9  2.40   214071     45   5.76  1.41   3726
418   7.1  3.38    69511    265   1.17  3.07   8039
420   6.4  3.37    81457    659   1.88  3.87  12583
547   6.4  4.05    64348    336   0.78  4.76   6505
590   1.8  5.15   164881   2012  35.65  9.45  32722
635   6.7  2.67   126016   3072   0.00  4.35  11763
752   1.8  3.28   567858    127   4.13  1.00  10283
760   7.3  0.92    43438    698   1.77  3.32  11518
1171  1.8  6.56   716260    392  12.92  2.90  13264
1277  1.3  0.05       94     15   0.36  3.83     30
1520  4.0  2.79    31125    454   0.62  2.33   1163
2138  0.5  0.00     2331     33   0.03  0.17     66
3626  4.2  4.24   560208    340   5.43  1.36  21174

Data courtesy of Dr. Deepak Nayaran

Solution Preview


### Load the required libraries
library(psych)
library(reshape)
library(ggplot2)

# Function to find the outliers based on standard deviation
findOutlier <- function(data, cutoff = 3) {
  ## Calculate the mean and sd of each column
  means <- sapply(data, mean, na.rm = TRUE)
  sds <- sapply(data, sd, na.rm = TRUE)
  ## Identify the cells lying more than cutoff * sd from the column mean
  result <- mapply(function(d, m, s) {
    which(abs(d - m) > cutoff * s)
  }, data, means, sds, SIMPLIFY = FALSE)  # always return a list, one entry per column
  result
}
# This function masks the flagged outliers by setting them to NA
removeOutlier <- function(data, outliers) {
  result <- mapply(function(d, o) {
    res <- d
    res[o] <- NA
    return(res)
  }, data, outliers)
  return(as.data.frame(result))
}

# Selected columns
col_selected <- c('EA1', 'TA1', 'PA1', 'NA1', 'EA2', 'TA2', 'PA2', 'NA2')...
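
As a rough sketch of steps 2) and 3) in the question, the base-R code below compares the eigenvalues of the correlation matrix with and without flagged rows. It rests on two assumptions that the preview above does not confirm: rows are flagged when any variable has |z| > 3, and the number of factors is taken as the number of eigenvalues above 1 (the Kaiser rule, one heuristic among several). The helper `count_factors` and the variable names are hypothetical, and base `eigen` stands in for a full factor analysis, so no loadings are estimated here.

```r
# Table 8.2 as printed in the question (19 infants, 8 variables).
hemangioma <- read.table(header = TRUE, check.names = FALSE, text = "
Age RB p16 DLK Nanog C-Myc EZH2 IGF-2
81 2.0 3.07 308975 94 6.49 2.76 11176
95 6.5 1.90 70988 382 1.00 7.09 5340
95 3.6 3.82 153061 237 0.00 5.57 6310
165 1.9 3.74 596992 88 0.00 2.47 7009
286 2.6 5.17 369601 282 12.23 1.63 7104
299 2.9 5.76 1119258 177 8.76 3.51 9342
380 1.9 2.40 214071 45 5.76 1.41 3726
418 7.1 3.38 69511 265 1.17 3.07 8039
420 6.4 3.37 81457 659 1.88 3.87 12583
547 6.4 4.05 64348 336 0.78 4.76 6505
590 1.8 5.15 164881 2012 35.65 9.45 32722
635 6.7 2.67 126016 3072 0.00 4.35 11763
752 1.8 3.28 567858 127 4.13 1.00 10283
760 7.3 0.92 43438 698 1.77 3.32 11518
1171 1.8 6.56 716260 392 12.92 2.90 13264
1277 1.3 0.05 94 15 0.36 3.83 30
1520 4.0 2.79 31125 454 0.62 2.33 1163
2138 0.5 0.00 2331 33 0.03 0.17 66
3626 4.2 4.24 560208 340 5.43 1.36 21174
")

# Hypothetical helper: eigenvalues of the correlation matrix and the
# number of them above 1 (Kaiser rule, assumed here as the selection rule).
count_factors <- function(df) {
  ev <- eigen(cor(df), symmetric = TRUE, only.values = TRUE)$values
  list(eigenvalues = ev, n_factors = sum(ev > 1))
}

# Assumed outlier rule: drop a row if any variable has |z| > 3.
z <- scale(hemangioma)
is_outlier_row <- rowSums(abs(z) > 3) > 0

full    <- count_factors(hemangioma)
trimmed <- count_factors(hemangioma[!is_outlier_row, ])

# Side-by-side eigenvalues for the two analyses.
print(round(cbind(full = full$eigenvalues, trimmed = trimmed$eigenvalues), 3))
cat("Rows flagged as outliers:", which(is_outlier_row), "\n")
cat("Factors suggested (full / trimmed):",
    full$n_factors, "/", trimmed$n_factors, "\n")
```

Loadings for the chosen number of factors could then be estimated on each data set, for example with `psych::fa` or base `factanal`, and compared as step 4) asks.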
