# Study Design, Two-way ANOVA, Multiple Linear Regression and Cox Proportional Hazard

## Question

Answer the following questions. Copy and paste any required data charts or summaries into this Word document. Use additional space as needed. Be sure to include your name on the document and use the file naming convention. This exam is open book and open notes.

I. Study Design and Sample Size
You are asked to design a study to assess the risk of water contamination in water supply and development of GI cancer. The national GI cancer rate is reported at 1 per 100,000.
1. What kind of study design would you use? Why?
2. What type of statistical test would you use? Why?
3. Based on your study design, calculate the minimum sample size with alpha 0.05, power 0.80, and an assumed effect size of .30.
4. Interpret the result of the sample size calculation.

II. Two-way ANOVA
A researcher is comparing the effects of 2 different asthma drugs on test performance. The researcher suspects that at least one of the drugs may have a different effect on fresh versus tired test takers. Data was collected on test results achieved by fresh and tired test takers after using each of the 2 drugs. There is concern that the exposure to these drugs could impair more than a person’s ability to take a test and therefore become a Public Health issue. The Research Question is: Does Drug A or Drug B impair test performance in either fresh or tired test takers? Please answer this question using the Final Exam – ANOVA (SPSS document) dataset and the following steps:
1. Provide numeric descriptive statistics (include skewness and kurtosis if appropriate) and graphic descriptions for Alertness, Drug Treatment, and Test Performance.
2. Create histograms of the test performance results (dependent variable) for each combination of levels for the two independent variables. Describe the data and shape of the distributions.
3. Discuss whether the assumptions of homogeneity of variance of the groups and normality of the data on test performance are met. Be sure to include output to support your decision on whether the assumptions have been met. (Continue with the analyses even if assumptions are not met.)
4. Conduct two-way ANOVA with interaction and post hoc analysis (as appropriate) using Tukey to correct for multiple comparisons. Provide relevant SPSS output.
5. Interpret the analysis results in the context of the research question. Include important statistics from your analysis results to support your conclusion and generalize your results, if appropriate, to the relevant population(s).

III. Multiple Linear Regression
A health department randomly selected 400 subjects from a local community and monitored their cardiovascular condition. Data from this study are provided in the Final Exam – Linear and Logistic (SPSS document) dataset. The following variables are included in the database: sex, age, BMI, SBP, DBP, serum cholesterol, coronary heart disease, and follow-up.
1. Conduct a multiple linear regression using SPSS. Provide relevant SPSS output and assess the statistical significance of the effect of sex, age, and BMI on systolic blood pressure.
2. Explain the assumptions of Linearity, Sampling independence, Normality, and Homoscedasticity (or equal variance). How would you test whether these have been met? (Note: for the exam you do not need to test these assumptions)
3. Explain the practical implications of your finding. Include a reference to the R square of the model in your discussion.
4. Discuss whether or not there is interaction (effect modification) between sex and age.

IV. Multiple Logistic Regression
Use the Final Exam – Linear and Logistic (SPSS document) dataset to assess the impact of sex, age, and BMI on the risk of coronary heart disease.
1. Conduct simple logistic regression of coronary heart disease and sex.
2. Conduct a multiple logistic regression using SPSS to address the research question: What is the association between sex and coronary heart disease after controlling for age and BMI?
3. Discuss how the addition of age and BMI in the model affected the association of sex and coronary heart disease using the Odds Ratios and confidence intervals in your output.
4. Assess the statistical significance of the individual risk factors and explain the practical implication of your finding.

V. Cox Proportional Hazard
The Final Exam – Linear and Logistic (SPSS document) dataset, used in problems III and IV, also includes follow-up time (in days) from the beginning of the study to either onset of coronary heart disease or end of the study. This allows you to also look at the relationship of sex to CHD using survival analysis techniques.
1. Complete a Kaplan-Meier Survival Analysis using Followup as the Time variable, Chdfate as the status variable and Sex as the factor. Produce a plot of the survival function. Discuss whether the survival time appears related to whether the person is male or female based on the survival plot.
2. Use Kaplan-Meier in SPSS to test the assumption of proportionality. Create a Hazard plot with time = followed, status=Chdfate, and factor = sex. Interpret the results.
3. Conduct a Cox Proportional Hazard regression to compare the time to coronary heart disease event between men and women. Include a Plot of the Hazards function stratified by sex in the output. Interpret the results.
4. How does the hazard ratio compare to the odds ratio obtained from the simple logistic regression from the previous problem? Why might they differ?

## Solution Preview


I. 1. We would use a cross-sectional study design. For each selected person we determine whether they have or are developing GI cancer, and then classify whether or not their water supply is contaminated.
2. We would use categorical data analysis, because this type of study design calls for categorical methods such as odds ratios or logistic regression. If we identify additional variables that affect the outcome (GI cancer), we can use logistic regression; if we have only a 2x2 table, we can simply calculate the odds ratio.
3. The total sample size required for this study, with alpha 0.05, power of 80%, and an assumed effect size of 0.30, is 18. The G*Power output is shown below:
Input: Tail(s) = One
Effect size g = 0.3
α err prob = 0.05
Power (1-β err prob) = 0.80
Constant proportion = 0.5
Output: Lower critical N = 13.0000000
Upper critical N = 13.0000000
Total sample size = 18...
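
The settings above (one tail, effect size g, constant proportion 0.5) look like an exact one-sample binomial (sign) test, although the preview does not name the procedure, so that reading is an assumption. Under it, the alternative proportion is 0.5 + g = 0.8, and the sample size search can be sketched in Python roughly as follows. The function names `upper_tail` and `min_n_sign_test` are illustrative and are not part of G*Power or SPSS.

```python
from math import comb


def upper_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_n_sign_test(g=0.3, alpha=0.05, power=0.80, p0=0.5, n_max=1000):
    """Smallest n whose exact one-tailed binomial test reaches the target power.

    Assumption: H0 is p = p0 and the alternative is p = p0 + g.
    """
    p1 = p0 + g
    for n in range(1, n_max + 1):
        # Critical count: smallest k with P(X >= k | H0) <= alpha.
        # k = n + 1 always qualifies (empty sum), so the search cannot fail.
        crit = next(k for k in range(n + 2) if upper_tail(n, k, p0) <= alpha)
        achieved = upper_tail(n, crit, p1)
        if achieved >= power:
            return n, crit, achieved
    raise ValueError("no sample size up to n_max reaches the target power")


n, crit, achieved = min_n_sign_test()
print(n, crit, round(achieved, 3))  # prints: 18 13 0.867
```

Because the binomial distribution is discrete, the achieved power does not rise smoothly with n, which is why the search returns the first n that meets the target rather than solving a closed-form equation.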