Question

Transcribed Text

1. Using data from a case-control study with 697 people, we want to model the probability of having lung cancer. The main independent variable is exposure to second-hand smoke. We also want to control for college degree (yes or no), type of work (blue collar or white collar), sex, and age. For the categorical variables, referent levels are no exposure, no college degree, blue collar, and female. Two logistic regression models have been run. The first (full) model contains all five variables, and the second contains only exposure, sex, and age.

i) Using the first model, interpret the results for exposure, college, sex, and age. For each variable, use the CI for the odds ratio (whether or not it includes 1) and/or the p-value to assess “significance”. For results that are significant, interpret the odds ratio (as an estimated relative risk) to complete your interpretation.

ii) Perform a likelihood ratio test that all non-significant variable(s) can be removed from the full model. Show how you calculated the test statistic, state the rejection region, and interpret the result of the test.

iii) There should be one number that must be the same for both the full and reduced models in order for the likelihood ratio test to be valid. Give that number.

iv) Give a brief plausible explanation why the variable(s) cannot all be removed in the likelihood ratio test in part (ii). What could you look at to see if your explanation was correct?

2. The number of times a person needed to take pain medicine in the three days after surgery is recorded for 58 patients. We also have the type of surgery used (experimental or standard), as well as the sex and age of the patient.

i) Explain why Poisson regression, rather than a linear model, is the better approach here.

ii) For each variable in the model, give the p-value from the partial test, and interpret the result. Use the exponentiated least squares means for any “significant” categorical variable to help complete the interpretation.

iii) Calculate and interpret the rate ratios for surgery type and age.

iv) This model has no interaction terms, but describe what an interaction between sex and surgery type would mean in the context of this problem. Hint: The dependent variable must be included as part of this description.

Printout for Problem #1:

Model #1: The LOGISTIC Procedure

Probability modeled is LungCa='yes'.

Model Fit Statistics

Criterion   Intercept Only   Intercept and Covariates
AIC         967.729          888.673
SC          972.276          915.954
-2 Log L    965.729          876.673

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   89.0561      5    <.0001
Score              84.5596      5    <.0001
Wald               76.2226      5    <.0001

Type 3 Analysis of Effects

Effect    DF   Wald Chi-Square   Pr > ChiSq
exposed   1    38.8496           <.0001
college   1    1.5540            0.2125
work      1    0.0509            0.8215
sex       1    35.0073           <.0001
age       1    9.0510            0.0026

Analysis of Maximum Likelihood Estimates

Parameter     DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept     1    -0.5278    0.2768           3.6348            0.0566
exposed yes   1    1.0797     0.1732           38.8496           <.0001
college yes   1    -0.3130    0.2511           1.5540            0.2125
work white    1    -0.0600    0.2662           0.0509            0.8215
sex male      1    -0.9713    0.1642           35.0073           <.0001
age           1    0.0130     0.00434          9.0510            0.0026

Odds Ratio Estimates

Effect               Point Estimate   95% Wald Confidence Limits
exposed yes vs no    2.944            2.096   4.134
college yes vs no    0.731            0.447   1.196
work white vs blue   0.942            0.559   1.587
sex male vs female   0.379            0.274   0.522
age                  1.013            1.005   1.022

Model #2: The LOGISTIC Procedure

Probability modeled is LungCa='yes'.

Model Fit Statistics

Criterion   Intercept Only   Intercept and Covariates
AIC         967.729          888.641
SC          972.276          906.828
-2 Log L    965.729          880.641

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   85.0878      3    <.0001
Score              81.0451      3    <.0001
Wald               73.4314      3    <.0001

Type 3 Analysis of Effects

Effect    DF   Wald Chi-Square   Pr > ChiSq
exposed   1    39.1151           <.0001
sex       1    35.3411           <.0001
age       1    9.0999            0.0026

Analysis of Maximum Likelihood Estimates

Parameter     DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept     1    -0.6315    0.2714           5.4123            0.0200
exposed yes   1    1.0791     0.1725           39.1151           <.0001
sex male      1    -0.9735    0.1637           35.3411           <.0001
age           1    0.0130     0.00432          9.0999            0.0026

Odds Ratio Estimates

Effect               Point Estimate   95% Wald Confidence Limits
exposed yes vs no    2.942            2.098   4.126
sex male vs female   0.378            0.274   0.521
age                  1.013            1.005   1.022

Printout for Problem #2:

The GENMOD Procedure

Class Level Information

Class     Levels   Values
surgery   2        exprmntl standard
sex       2        female male

Criteria For Assessing Goodness Of Fit

Criterion                  DF   Value       Value/DF
Deviance                   54   48.7827     0.9034
Scaled Deviance            54   48.7827     0.9034
Pearson Chi-Square         54   45.0095     0.8335
Scaled Pearson X2          54   45.0095     0.8335
Log Likelihood                  307.9718
Full Log Likelihood             -127.9937
AIC (smaller is better)         263.9874
AICC (smaller is better)        264.7421
BIC (smaller is better)         272.2292

Analysis Of Maximum Likelihood Parameter Estimates

Parameter          DF   Estimate   Standard Error   Wald 95% Confidence Limits   Wald Chi-Square   Pr > ChiSq
Intercept          1    2.6092     0.1993           2.2185    2.9999             171.31            <.0001
surgery exprmntl   1    -0.3756    0.1131           -0.5973   -0.1538            11.02             0.0009
surgery standard   0    0.0000     0.0000           0.0000    0.0000             .                 .
sex female         1    0.0483     0.1096           -0.1666   0.2632             0.19              0.6595
sex male           0    0.0000     0.0000           0.0000    0.0000             .                 .
age                1    -0.0150    0.0040           -0.0230   -0.0071            13.80             0.0002
Scale              0    1.0000     0.0000           1.0000    1.0000

LR Statistics For Type 3 Analysis

Source    DF   Chi-Square   Pr > ChiSq
surgery   1    11.27        0.0008
sex       1    0.19         0.6594
age       1    13.94        0.0002

surgery Least Squares Means

surgery    Estimate   Standard Error   z Value   Pr > |z|   Exponentiated
exprmntl   1.5866     0.08839          17.95     <.0001     4.8872
standard   1.9622     0.06887          28.49     <.0001     7.1151

sex Least Squares Means

sex      Estimate   Standard Error   z Value   Pr > |z|   Exponentiated
female   1.7986     0.07807          23.04     <.0001     6.0410
male     1.7503     0.07792          22.46     <.0001     5.7561

Solution Preview


1. i)

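In the first model, the estimated odds ratio for exposure is 2.944 (95% CI 2.096 to 4.134). Exposure is a yes/no variable, so this means that the odds of having lung cancer for people exposed to second-hand smoke are about 2.9 times the odds for unexposed people, holding the other variables fixed. An independent variable is judged significant when the 95% CI for its odds ratio does not include 1; an interval that includes 1 is consistent with the variable having no effect on the odds. By this criterion, exposure, sex (OR 0.379, 95% CI 0.274 to 0.522, p < .0001) and age (OR 1.013, 95% CI 1.005 to 1.022, p = 0.0026) are significant, while college (OR 0.731, 95% CI 0.447 to 1.196, p = 0.2125) and work (OR 0.942, 95% CI 0.559 to 1.587, p = 0.8215) are not. For the significant variables, males have about 0.38 times the odds of lung cancer that females have, and each additional year of age multiplies the odds by about 1.013.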
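The odds ratios and Wald confidence limits in the printout can be reproduced from the estimates and standard errors. The following is a minimal sketch, not part of the original solution: the dictionary and function names are illustrative, and the inputs are the rounded values from the Model #1 printout, so the last digit may differ slightly from the SAS output.

```python
import math

# (estimate, standard error) copied from the Model #1
# "Analysis of Maximum Likelihood Estimates" table.
MODEL1 = {
    "exposed yes": (1.0797, 0.1732),
    "college yes": (-0.3130, 0.2511),
    "work white": (-0.0600, 0.2662),
    "sex male": (-0.9713, 0.1642),
    "age": (0.0130, 0.00434),
}

Z_95 = 1.959964  # 97.5th percentile of the standard normal distribution


def odds_ratio_ci(estimate, std_error, z=Z_95):
    """Return (odds ratio, lower limit, upper limit) for a Wald interval."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )


def is_significant(lower, upper):
    """A variable is flagged when its odds-ratio interval excludes 1."""
    return not (lower <= 1.0 <= upper)


for name, (b, se) in MODEL1.items():
    odds_ratio, lower, upper = odds_ratio_ci(b, se)
    flag = "significant" if is_significant(lower, upper) else "not significant"
    print(f"{name:12s} OR={odds_ratio:.3f}  95% CI=({lower:.3f}, {upper:.3f})  {flag}")
```

The interval is built on the log-odds scale and then exponentiated, which is why it is not symmetric around the point estimate.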

1. ii)

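The test statistic is the difference between the -2 Log L values (Intercept and Covariates) of Model 2 and Model 1: 880.641 - 876.673 = 3.968. Under the null hypothesis that the coefficients of college and work are both zero, this statistic follows a chi-square distribution with 2 degrees of freedom (one for each variable removed), so at the 5% level the rejection region is χ² > χ²₂,₀.₉₅ = 5.991. Since our test statistic, 3.968, is less than 5.991, we fail to reject the null hypothesis: the data do not show that the full model fits significantly better than Model 2 in modelling the odds of having lung cancer...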
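As a check on the arithmetic, the likelihood ratio test can be computed with the standard library alone. This is a sketch under one convenient assumption: the test has exactly 2 degrees of freedom, for which the chi-square upper-tail probability has the closed form exp(-x/2), so the upper 5% critical value is -2 ln(0.05) ≈ 5.991. The constant and function names are illustrative and not part of the original solution.

```python
import math

# -2 Log L ("Intercept and Covariates") from the two printouts.
NEG2LOGL_FULL = 876.673     # Model #1: exposed, college, work, sex, age
NEG2LOGL_REDUCED = 880.641  # Model #2: exposed, sex, age
DF = 2                      # college and work are dropped


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


def chi2_critical_df2(alpha):
    """Upper-alpha critical value of a chi-square variable with 2 df."""
    return -2.0 * math.log(alpha)


# The reduced model can never fit better, so the difference is non-negative.
stat = NEG2LOGL_REDUCED - NEG2LOGL_FULL
critical = chi2_critical_df2(0.05)
p_value = chi2_sf_df2(stat)

# Cross-check: the same statistic is the difference of the two global
# likelihood ratio chi-squares reported in the printouts.
stat_from_global = 89.0561 - 85.0878

print(f"LR statistic  = {stat:.3f} (df = {DF})")
print(f"critical 5%   = {critical:.3f}")
print(f"p-value       = {p_value:.4f}")
print("reject H0" if stat > critical else "fail to reject H0")
```

For other degrees of freedom the closed form no longer applies, and a chi-square table or a statistics library would be needed instead.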
