Question

Transcribed Text

Below are descriptions for the first six problems of the exam. These problems will make up approximately 60 – 80% of the exam. Use α = .05 for all inference on these problems. The exam is open book, open notes, etc. Make sure you bring this information sheet with you, along with any work you have done ahead of time on these seven problems. Also make sure you bring the statistical tables from the notes. You may use a smart phone to access the calculator, and you may use a computer to access course notes and tables. However, no internet access (including Moodle) is allowed during the exam (so, download the tables, notes, etc. that you might want to use prior to the exam).

The remainder of the exam will be typical in-class questions where you may see printout, have to do some short calculations, interpret results, select the best analysis approach, answer multiple choice questions, etc.

Please work alone on these problems! Any questions should be directed to the class instructor or teaching assistant – no one else!

Problem #1:

We want to model grip strength in people age 50 and older. Possible independent variables are sex, self-reported general health (good, fair, or poor), age, and systolic blood pressure. A plot of grip strength vs. age is given, where the plotting symbol represents the sex of the person. This is followed by two analyses. The first is a simple linear regression that models grip strength from age; the second is a general linear model (ANACOVA) that uses all four independent variables. Be prepared to report and interpret results for each analysis, and to explain why the results of the two may differ.

Problem #2:

We want to test if median grip strength differs from 15 for men 60 and older. We measure this in a sample of n=15 such men; the results are below.

17.65 18.20 8.55 17.95 16.40 16.00 13.25 15.85 14.80 16.75 16.45 17.35 17.05 16.80 18.05

Be prepared to provide results of two relevant nonparametric tests that the median is 15. Also be prepared to demonstrate how the results for the simpler of the two tests are obtained.

Problem #3:

We want to model the probability of full recovery after a woman has shoulder surgery. The main independent variable is whether or not she attended physical therapy (PT). We also record the age of the woman, and whether or not she has gone through menopause. For the categorical variables, referent levels are no PT and not having gone through menopause. Three logistic regression models have been run. The first contains all three variables, the second contains PT and age, and the third contains only PT. Be prepared to compare any two models (including a formal test), and to interpret the results of any model.

Problem #4:

We want to compare two intervention programs (one based on diet, the other based on exercise) and a control group, in how well they reduce cholesterol. Twelve people are randomly assigned to each of the three groups. Each person has cholesterol measured before and after the program. In addition to plots, cholesterol is modeled as a function of group, time, and their interaction, where time is treated as a categorical variable. The analysis used a compound symmetric covariance structure. Solutions for fixed effects, least square means (“sliced” by time), and other estimates have been requested. Be prepared to interpret all results of this model.

Problem #5:

We want to compare two screening tests (labeled as Test A and Test B) for diabetes, given to the same 180 people. You will be asked to calculate sensitivity, specificity, and PPV for both tests (assume 18% of the population of interest has diabetes). We also want to compare the two tests, based on their performance when looking at the 80 people with diabetes (see the last table). Be prepared to present a measure of agreement for the tests, and to test if there is a difference in the proportion who have a positive screen.
For this problem, you would be wise to do some work before coming to the exam.

Problem #6:

We want to model survival (months) after a diagnosis of bladder cancer. The main variable of interest is treatment type (Standard or Standard+Drug). We also want to control for sex, race (Black, Hispanic, White), and age at diagnosis. Prior to doing the survival analysis, we perform a two-sample t-test to see if the mean age is the same for the two treatment types. We also used contingency tables to look at the association between treatment and sex, as well as treatment and race. For the survival analysis, Kaplan-Meier curves and associated tests are presented for treatment type. Next, a Cox model containing all three independent variables is used. Be prepared to interpret results of all tests.

Edited Printout for Problem #1:

The REG Procedure
Dependent Variable: grip

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1          11.40676       11.40676       2.87    0.0927
Error             125         496.64268        3.97314
Corrected Total   126         508.04944

Root MSE            1.99327    R-Square    0.0225
Dependent Mean     13.62260    Adj R-Sq    0.0146
Coeff Var          14.63211

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              15.52869           1.13876      13.64      <.0001
age           1              -0.02751           0.01623      -1.69      0.0927

The GLM Procedure
Dependent Variable: grip

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               5       249.6260067     49.9252013      23.38    <.0001
Error             121       258.4234359      2.1357309
Corrected Total   126       508.0494425

R-Square    Coeff Var    Root MSE    grip Mean
0.491342     10.72787    1.461414     13.62260

Source      DF       Type I SS    Mean Square    F Value    Pr > F
sex          1     139.7672702    139.7672702      65.44    <.0001
health       2      97.9862673     48.9931336      22.94    <.0001
age          1      10.2674896     10.2674896       4.81    0.0302
systolic     1       1.6049797      1.6049797       0.75    0.3877

Source      DF     Type III SS    Mean Square    F Value    Pr > F
sex          1     115.1452037    115.1452037      53.91    <.0001
health       2      93.5687313     46.7843657      21.91    <.0001
age          1      10.1352510     10.1352510       4.75    0.0313
systolic     1       1.6049797      1.6049797       0.75    0.3877

Parameter              Estimate    Standard Error    t Value    Pr > |t|
Intercept        16.99140284 B         1.74779751       9.72      <.0001
sex Female       -2.03763735 B         0.27750910      -7.34      <.0001
sex Male          0.00000000 B                  .          .           .
health fair       0.13023072 B         0.46714795       0.28      0.7809
health good       1.83954225 B         0.44924159       4.09      <.0001
health poor       0.00000000 B                  .          .           .
age              -0.02630992           0.01207746      -2.18      0.0313
systolic         -0.00991040           0.01143220      -0.87      0.3877

Least Squares Means

sex       grip LSMEAN
Female     12.5597440
Male       14.5973814

health    grip LSMEAN    LSMEAN Number
fair       13.0522024                1
good       14.7615139                2
poor       12.9219717                3

Least Squares Means for effect health
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: grip

i/j         1         2         3
1                <.0001    0.9581
2      <.0001              0.0002
3      0.9581    0.0002

Edited Printout for Problem #2:

The UNIVARIATE Procedure
Variable: grip

Moments

N                          15    Sum Weights                15
Mean               16.0733333    Sum Observations        241.1
Std Deviation      2.45683207    Variance           6.03602381
Skewness           -2.3272315    Kurtosis           6.26915894
Uncorrected SS       3959.785    Corrected SS       84.5043333
Coeff Variation    15.2851435    Std Error Mean     0.63435131

Basic Statistical Measures

Location               Variability
Mean      16.07333     Std Deviation          2.45683
Median    16.75000     Variance               6.03602
Mode             .     Range                  9.65000
                       Interquartile Range    1.80000

Tests for Location: Mu0=15

Test           -Statistic-       -----p Value------
Student's t    t    1.692017     Pr > |t|     0.1128
Sign           M         4.5     Pr >= |M|    0.0352
Signed Rank    S        37.5     Pr >= |S|    0.0313

Quantiles (Definition 5)

Level         Quantile
100% Max         18.20
99%              18.20
95%              18.20
90%              18.05
75% Q3           17.65
50% Median       16.75
25% Q1           15.85
10%              13.25
5%                8.55
1%                8.55
0% Min            8.55

Extreme Observations

-----Lowest----     ----Highest----
Value      Obs      Value      Obs
 8.55        3      17.35       12
13.25        7      17.65        1
14.80        9      17.95        4
15.85        8      18.05       15
16.00        6      18.20        2

Stem    Leaf        #
18      002         3
16      04488046    8
14      88          2
12      2           1
10
 8      6           1

[Boxplot column not recoverable from the transcription; the lowest observation is flagged with *.]

Edited Printout for Problem #3:

Model with PT, age, and menopause:

The LOGISTIC Procedure

Model Information

Data Set                     WORK.THREE
Response Variable            recovery
Number of Response Levels    2
Model                        binary logit
Optimization Technique       Fisher's scoring

Response Profile

Ordered Value    recovery    Total Frequency
1                Full                    369
2                NotFull                 142

Probability modeled is recovery='Full'.

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC                 605.947                     572.459
SC                  610.183                     589.405
-2 Log L            603.947                     564.459

Type 3 Analysis of Effects

Effect       DF    Wald Chi-Square    Pr > ChiSq
PT            1            26.6563        <.0001
age           1             2.1754        0.1402
menopause     1             2.8356        0.0922

Analysis of Maximum Likelihood Estimates

Parameter        DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept         1      1.8836            0.6908             7.4345        0.0064
PT Yes            1      1.1751            0.2276            26.6563        <.0001
age               1     -0.0216            0.0146             2.1754        0.1402
menopause Yes     1     -0.4255            0.2527             2.8356        0.0922

Odds Ratio Estimates

Effect                 Point Estimate    95% Wald Confidence Limits
PT Yes vs No                    3.238    2.073    5.059
age                             0.979    0.951    1.007
menopause Yes vs No             0.653    0.398    1.072

Model with PT and age:

The LOGISTIC Procedure

Model Information

Data Set                     WORK.THREE
Response Variable            recovery
Number of Response Levels    2
Model                        binary logit
Optimization Technique       Fisher's scoring

Response Profile

Ordered Value    recovery    Total Frequency
1                Full                    369
2                NotFull                 142

Probability modeled is recovery='Full'.

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC                 605.947                     573.315
SC                  610.183                     586.024
-2 Log L            603.947                     567.315

Type 3 Analysis of Effects

Effect    DF    Wald Chi-Square    Pr > ChiSq
PT         1            26.0415        <.0001
age        1             8.6476        0.0033

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1      2.3669            0.6301            14.1083        0.0002
PT Yes        1      1.1569            0.2267            26.0415        <.0001
age           1     -0.0356            0.0121             8.6476        0.0033

Odds Ratio Estimates

Effect          Point Estimate    95% Wald Confidence Limits
PT Yes vs No             3.180    2.039    4.959
age                      0.965    0.942    0.988

Model with PT:

The LOGISTIC Procedure

Model Information

Data Set                     WORK.THREE
Response Variable            recovery
Number of Response Levels    2
Model                        binary logit
Optimization Technique       Fisher's scoring

Response Profile

Ordered Value    recovery    Total Frequency
1                Full                    369
2                NotFull                 142

Probability modeled is recovery='Full'.

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC                 605.947                     580.187
SC                  610.183                     588.660
-2 Log L            603.947                     576.187

Type 3 Analysis of Effects

Effect    DF    Wald Chi-Square    Pr > ChiSq
PT         1            25.1168        <.0001

Analysis of Maximum Likelihood Estimates

Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1      0.5609            0.1200            21.8348        <.0001
PT Yes        1      1.1243            0.2243            25.1168        <.0001

Odds Ratio Estimates

Effect          Point Estimate    95% Wald Confidence Limits
PT Yes vs No             3.078    1.983    4.778

Edited Printout for Problem #4:

The Mixed Procedure

Covariance Parameter Estimates

Cov Parm    Subject    Estimate
CS          id           129.36
Residual                37.5177

Solution for Fixed Effects

Effect        group       time      Estimate    Standard Error      DF    t Value    Pr > |t|
Intercept                             228.92            3.7291    41.2      61.39      <.0001
group         Control                24.1667            5.2738    41.2       4.58      <.0001
group         Diet                    8.7500            5.2738    41.2       1.66      0.1047
group         Exercise                     0                 .       .          .           .
time                      Before     22.2500            2.5006      33       8.90      <.0001
time                      zAfter           0                 .       .          .           .
time*group    Control     Before    -22.0000            3.5364      33      -6.22      <.0001
time*group    Diet        Before     -7.4167            3.5364      33      -2.10      0.0437
time*group    Exercise    Before           0                 .       .          .           .
time*group    Control     zAfter           0                 .       .          .           .
time*group    Diet        zAfter           0                 .       .          .           .
time*group    Exercise    zAfter           0                 .       .          .           .

Type 3 Tests of Fixed Effects

Effect        Num DF    Den DF    F Value    Pr > F
group              2        33       3.58    0.0393
time               1        33      74.30    <.0001
time*group         2        33      20.04    <.0001

Least Squares Means

Effect        time      group       Estimate    Standard Error      DF    t Value    Pr > |t|
time*group    Before    Control       253.33            3.7291    41.2      67.93      <.0001
time*group    Before    Diet          252.50            3.7291    41.2      67.71      <.0001
time*group    Before    Exercise      251.17            3.7291    41.2      67.35      <.0001
time*group    zAfter    Control       253.08            3.7291    41.2      67.87      <.0001
time*group    zAfter    Diet          237.67            3.7291    41.2      63.73      <.0001
time*group    zAfter    Exercise      228.92            3.7291    41.2      61.39      <.0001

Tests of Effect Slices

Effect        time      Num DF    Den DF    F Value    Pr > F
time*group    Before         2      41.2       0.09    0.9179
time*group    zAfter         2      41.2      10.77    0.0002

Estimates

Label                                 Estimate    Std Error      DF    t Value    Pr > |t|
Control vs. Diet After Program         15.4167       5.2738    41.2       2.92      0.0056
Control vs. Exercise After Program     24.1667       5.2738    41.2       4.58      <.0001
Diet vs. Exercise After Program         8.7500       5.2738    41.2       1.66      0.1047

Tables Problem #5:

Test A    Diabetes    No Diabetes
+               76             22
-                4             78
Total           80            100

Test B    Diabetes    No Diabetes
+               68              7
-               12             93
Total           80            100

People with diabetes (rows: Test A, columns: Test B)

Test A    Test B +    Test B -    Total
+               66          10       76
-                2           2        4
Total           68          12       80

Edited Printout for Problem #6:

The TTEST Procedure
Variable: age

treatment       N       Mean    Std Dev    Std Err    Minimum    Maximum
Standard      129    56.6977    14.5777     1.2835    31.0000    80.0000
Std+Drug       85    55.2941    15.0924     1.6370    30.0000    79.0000
Diff (1-2)            1.4036    14.7838     2.0653

treatment     Method              Mean    95% CL Mean          Std Dev    95% CL Std Dev
Standard                       56.6977    54.1581  59.2373     14.5777    12.9897  16.6115
Std+Drug                       55.2941    52.0388  58.5495     15.0924    13.1149  17.7778
Diff (1-2)    Pooled            1.4036    -2.6676   5.4748     14.7838    13.5005  16.3387
Diff (1-2)    Satterthwaite     1.4036    -2.7018   5.5089

Method           Variances       DF    t Value    Pr > |t|
Pooled           Equal          212       0.68      0.4975
Satterthwaite    Unequal      175.5       0.67      0.5007

Equality of Variances

Method      Num DF    Den DF    F Value    Pr > F
Folded F        84       128       1.07    0.7165

The FREQ Procedure

Table of treatment by sex

treatment                Female      Male     Total
Standard    Frequency        76        53       129
            Percent       35.51     24.77     60.28
            Row Pct       58.91     41.09
            Col Pct       76.77     46.09
Std+Drug    Frequency        23        62        85
            Percent       10.75     28.97     39.72
            Row Pct       27.06     72.94
            Col Pct       23.23     53.91
Total       Frequency        99       115       214
            Percent       46.26     53.74    100.00

Statistics for Table of treatment by sex

Statistic                      DF      Value      Prob
Chi-Square                      1    20.9155    <.0001
Likelihood Ratio Chi-Square     1    21.5071    <.0001
Continuity Adj. Chi-Square      1    19.6538    <.0001
Mantel-Haenszel Chi-Square      1    20.8178    <.0001
Phi Coefficient                       0.3126
Contingency Coefficient               0.2984
Cramer's V                            0.3126

Fisher's Exact Test

Cell (1,1) Frequency (F)        76
Left-sided Pr <= F          1.0000
Right-sided Pr >= F         <.0001
Table Probability (P)       <.0001
Two-sided Pr <= P           <.0001

Sample Size = 214

The FREQ Procedure

Table of treatment by race

treatment                 Black    Hispanic     White     Total
Standard    Frequency        38          43        48       129
            Percent       17.76       20.09     22.43     60.28
            Row Pct       29.46       33.33     37.21
            Col Pct       64.41       63.24     55.17
Std+Drug    Frequency        21          25        39        85
            Percent        9.81       11.68     18.22     39.72
            Row Pct       24.71       29.41     45.88
            Col Pct       35.59       36.76     44.83
Total       Frequency        59          68        87       214
            Percent       27.57       31.78     40.65    100.00

Statistics for Table of treatment by race

Statistic                      DF     Value      Prob
Chi-Square                      2    1.6156    0.4458
Likelihood Ratio Chi-Square     2    1.6115    0.4467
Mantel-Haenszel Chi-Square      1    1.3818    0.2398
Phi Coefficient                      0.0869
Contingency Coefficient              0.0866
Cramer's V                           0.0869

Sample Size = 214

The LIFETEST Procedure

Stratum 1: treatment = Standard

Quartile Estimates

Percent    Point Estimate    Transform    95% Confidence Interval [Lower, Upper)
75                27.0000    LOGLOG       26.0000    31.0000
50                22.0000    LOGLOG       19.0000    25.0000
25                17.0000    LOGLOG       14.0000    18.0000

Stratum 2: treatment = Std+Drug

Quartile Estimates

Percent    Point Estimate    Transform    95% Confidence Interval [Lower, Upper)
75                29.0000    LOGLOG       26.0000    34.0000
50                23.0000    LOGLOG       20.0000    26.0000
25                18.0000    LOGLOG       15.0000    20.0000

Test of Equality over Strata

Test         Chi-Square    DF    Pr > Chi-Square
Log-Rank         1.0272     1             0.3108
Wilcoxon         0.5933     1             0.4411
-2Log(LR)        0.5451     1             0.4603

The PHREG Procedure

Summary of the Number of Event and Censored Values

Total    Event    Censored    Percent Censored
  214      148          66               30.84

Model Fit Statistics

Criterion    Without Covariates    With Covariates
-2 LOG L                941.371            811.077
AIC                     941.371            821.077
SBC                     941.371            836.063

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio      130.2936     5        <.0001
Score                 132.1085     5        <.0001
Wald                  117.2857     5        <.0001

Type 3 Tests

Effect       DF    Wald Chi-Square    Pr > ChiSq
treatment     1            10.7456        0.0010
sex           1            43.8598        <.0001
race          2             0.2958        0.8625
age           1           105.3042        <.0001

Analysis of Maximum Likelihood Estimates

Parameter             DF    Parameter Estimate    Standard Error    Chi-Square    Pr > ChiSq    Hazard Ratio    95% Hazard Ratio Confidence Limits
treatment Standard     1               0.63270           0.19301       10.7456        0.0010           1.883    1.290    2.748
sex Female             1              -1.40373           0.21196       43.8598        <.0001           0.246    0.162    0.372
race Black             1               0.00433           0.21872        0.0004        0.9842           1.004    0.654    1.542
race Hispanic          1               0.10083           0.20063        0.2526        0.6153           1.106    0.746    1.639
age                    1               0.07546           0.00735      105.3042        <.0001           1.078    1.063    1.094
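Problem #2 asks how the results of the simpler nonparametric test are obtained. The sketch below assumes the simpler test is the sign test and reproduces its two printed values, M and Pr >= |M|, from the fifteen observations. The variable names are illustrative; the two-sided p-value is the exact Binomial(n, 0.5) tail probability, with observations equal to 15 dropped (there are none in this sample).

```python
from math import comb

# Grip strength for the n = 15 men in Problem #2.
grip = [17.65, 18.20, 8.55, 17.95, 16.40, 16.00, 13.25, 15.85,
        14.80, 16.75, 16.45, 17.35, 17.05, 16.80, 18.05]
mu0 = 15  # hypothesized median

# Count observations above and below the hypothesized median (ties are dropped).
n_pos = sum(x > mu0 for x in grip)
n_neg = sum(x < mu0 for x in grip)
n = n_pos + n_neg

# Sign statistic as reported on the printout: M = (n+ - n-) / 2.
m_stat = (n_pos - n_neg) / 2

# Exact two-sided p-value: twice the lower Binomial(n, 0.5) tail at min(n+, n-).
k = min(n_pos, n_neg)
p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)

print(f"n+ = {n_pos}, n- = {n_neg}, M = {m_stat}, p = {p_value:.4f}")
```

With 12 observations above 15 and 3 below, this gives M = 4.5 and p = 2(1 + 15 + 105 + 455)/32768 ≈ 0.0352, matching the Sign row of the Tests for Location table.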

Solution Preview


Problem #1:

We want to model grip strength in people age 50 and older. Possible independent variables are sex, self-reported general health (good, fair, or poor), age, and systolic blood pressure.

A plot of grip strength vs. age is given, where the plotting symbol represents the sex of the person. This is followed by two analyses. The first is a simple linear regression that models grip strength from age; the second is a general linear model (ANACOVA) that uses all four independent variables.

Be prepared to report and interpret results for each analysis, and to explain why the results of the two may differ.

In the simple linear regression of grip strength on age, R-squared is very low at 0.0225, meaning that age explains only 2.25% of the variability in grip strength. The slope for age is also not significant at α = .05 (p = 0.0927), so this model gives little evidence of a linear relationship between age and grip strength.
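As a rough check on how the quantities quoted above fit together, the sketch below recomputes R-squared, the overall F statistic, and the t statistic for age from the REG printout. The inputs are the rounded values shown on the printout, so the results agree with the printed ones only up to rounding; the variable names are illustrative.

```python
# Values copied from the REG printout for Problem #1.
ss_model = 11.40676      # model sum of squares (1 df)
ss_total = 508.04944     # corrected total sum of squares
ms_error = 3.97314       # error mean square
slope, slope_se = -0.02751, 0.01623  # age estimate and its standard error

# R-squared is the share of total variability explained by the model.
r_squared = ss_model / ss_total

# With 1 model df, the model mean square equals the model sum of squares.
f_value = ss_model / ms_error

# t statistic for the slope; in simple linear regression t**2 equals F.
t_value = slope / slope_se

print(f"R-squared = {r_squared:.4f}")
print(f"F = {f_value:.2f}, t = {t_value:.3f}, t^2 = {t_value ** 2:.2f}")
```

Because there is only one predictor, the F-test for the model and the t-test for the age slope are the same test, which is why both carry the p-value 0.0927.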

For the GLM procedure, R-squared for the full model (sex, health, age, and systolic) improves to 0.4913, and the overall F-test is highly significant (p < .0001). Sex, health, and age all have significant effects at α = .05 (Type III p-values of <.0001, <.0001, and 0.0313); only systolic blood pressure does not (p = 0.3877). This means that a model that adjusts for sex and health is more appropriate than the simple regression on age alone....
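One hedged way to quantify the gap between the two analyses: the age-only regression is nested in the full GLM, so a partial F statistic for the three added variables (sex, health, systolic; 4 numerator df) can be formed from the two error sums of squares. This comparison does not appear on the printout; it is an illustration built from the printed values, and the variable names are assumptions.

```python
# Error sums of squares and df copied from the two printouts for Problem #1.
sse_reduced, df_reduced = 496.64268, 125    # REG: grip ~ age
sse_full, df_full = 258.4234359, 121        # GLM: grip ~ sex + health + age + systolic
ss_total = 508.0494425                      # corrected total sum of squares

# R-squared for each model, computed as 1 - SSE / SST.
r2_reduced = 1 - sse_reduced / ss_total
r2_full = 1 - sse_full / ss_total

# Partial F: drop in SSE per added df, divided by the full-model MSE.
num_df = df_reduced - df_full
mse_full = sse_full / df_full
partial_f = ((sse_reduced - sse_full) / num_df) / mse_full

print(f"R-squared: {r2_reduced:.4f} (age only) vs {r2_full:.4f} (full)")
print(f"Partial F = {partial_f:.2f} on ({num_df}, {df_full}) df")
```

The statistic comes out near 27.9 on (4, 121) df, which would be compared with an F table at α = .05. A value this large is consistent with the added variables, chiefly sex and health, accounting for most of the improvement in fit.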
