## Question

The following contingency table is based on the GPAs (grade point averages) of a random sample of 300 students selected from all classes taught by a particular instructor during the past four years, and how these students evaluated this instructor.

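Counts by evaluation of instructor (rows) and GPA of student (columns):

| Evaluation of Instructor | GPA Below 2.5 | GPA 2.5 to 3.5 | GPA Above 3.5 |
|---|---|---|---|
| Excellent | 18 | 33 | 37 |
| Good | 17 | 27 | 43 |
| Average | 21 | 31 | 23 |
| Poor | 25 | 14 | 11 |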

a) Perform an appropriate chi-square test to determine whether there is any relationship between GPA and instructor evaluation. Check the assumptions of the test. State whether each assumption is met and provide support for your statements.

b) Graduate Students Only: Discuss all the measures of association used in contingency table analysis that SAS provides in the CHISQ option. Explain when it is appropriate to use each measure. For this problem, compute and interpret an appropriate measure of the strength of the association between GPA and instructor evaluation.

Problem #2: Perform in SAS

In the article “Whole-Body Dual-Modality PET/CT and Whole Body MRI for Tumor Staging in Oncology,” the authors cite the importance of accurately identifying the stage of a tumor. Accurate staging is critical for determining appropriate therapy. The article discusses a study involving the accuracy of positron emission tomography (PET) and computed tomography (CT) compared to magnetic resonance imaging (MRI). Using the data from file (R:\Fridline\Statistical Data Management\Project #2\ PET-CT Compared to MRI.txt) for 50 cancers analyzed with both technologies, does there appear to be a difference in accuracy? Does either technology appear better? What is the degree of association between the two methods?

Problem #3: Perform in SAS

A hypothesis has been suggested that a principal benefit of physical activity is to prevent sudden death from heart attack. The following study was designed to test the hypothesis: 100 men who died from a first heart attack and 100 men who survived a first heart attack in the age group 50-59 were identified and their wives were each given a detailed questionnaire concerning their husband’s physical activity in the year preceding their heart attacks. The men were then classified as active or inactive. Suppose that 30 of the 100 who survived and 10 of the 100 who died were physically active. Find the odds ratio in favor of physical activity for heart attack survivors versus deceased. Include in this discussion the interpretation of the 95% confidence interval. For graduate students, explain why calculating the Relative Risk would be inappropriate in this study.

Problem #4: Perform in SAS

Five bodies of water were measured for strontium levels. The five bodies of water were:

• Var1=Grayson’s Pond, Var2=Beaver Lake, Var3=Angler’s Cove, Var4=Appletree Lake, Var5=Rock River.

Determine whether the average strontium levels differ by body of water. Thirty water specimens were used (6 at each location) and the strontium level of each was measured. The data set is located at R:\Fridline\Statistical Data Management\Project #2\Strontium.txt.

a) Perform the appropriate data manipulation step for analysis.

b) Perform an ANOVA on these data. Check the assumptions of the hypothesis tests. State whether each assumption is met and provide support for your statements.

c) Do any of the strontium levels differ based on the location? If so, indicate which average strontium levels are statistically different. Provide the results of the analysis you used to make your conclusions.

d) Consider your check of the ANOVA assumptions in Part b). Choose the nonparametric ANOVA procedure that you feel is more appropriate, based on the assumption you consider to be most tenuous. Re-analyze the data using this nonparametric test and compare the results with those of the parametric ANOVA you performed in Part b).

e) Graduate Students Only: It is possible that Beaver Lake, Angler’s Cove, and Appletree Lake have similar strontium levels, but Rock River and Grayson’s Pond are quite different. Write two contrasts to compare the combined average strontium level of Beaver Lake, Angler’s Cove, and Appletree Lake with the mean strontium level of (a) Rock River and (b) Grayson’s Pond. Then perform the appropriate analysis based on these contrasts and report your results.

Problem #5: Perform in SAS and SPSS

Does spending more money on education have an impact on students’ learning? Consider the editorial piece titled “Meaningless Money Factor” (The Washington Post, September 12, 1993). In this article, political commentator George Will uses statistics to support his claim that there is a negative association between state educational expenditure and average SAT score. In this problem, you will use a data set similar to that used by George Will to evaluate his argument. Import the dataset R:\Fridline\Statistical Data Management\Project #2\SAT.xls into SAS and SPSS. These data were collected by Professor Lynn Guber of the University of Vermont. The file contains data from 1997 on all 50 states of the U.S. for the following variables:

• State: State name

• Expenditure: Per pupil expenditure in state, in thousands of dollars

• Student/Faculty Ratio: Number of pupils per faculty member in state

• Salary: Average teacher salary in state, in thousands of dollars

• Percent taking: Percentage of students taking the SAT

• Verbal: Average score on verbal part of the SAT for students who took it

• Math: Average score on math part of the SAT for students who took it

• Total SAT Score: Average combined score of the SAT for students who took it

• Expenditure (100s): Per pupil expenditure in state, in hundreds of dollars

a) Fit a simple linear regression with Y = Total SAT Score and X = Expenditure.

b) Build the “best” multiple regression model using all the possible predictors. Be sure to check all the assumptions.

## Solution Preview

Problem #1a) Perform an appropriate chi-square test to determine whether there is any relationship between GPA and instructor evaluation. Check the assumptions of the test. State whether each assumption is met and provide support for your statements.

Null Hypothesis (Ho): There is no relationship between GPA and instructor evaluation.

Alternative Hypothesis (Ha): There is a relationship between GPA and instructor evaluation.

Level of significance = 0.05

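Assumptions of the test: Both variables are categorical, and the data come from a random sample of 300 students, which supports treating the observations as independent. Every cell has an expected frequency greater than 5: the smallest expected count is 50 × 81 / 300 = 13.5 (Poor evaluation, GPA below 2.5). We can therefore conclude that the assumptions of the test are met.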
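The expected-frequency check can also be reproduced outside SAS. The following is a minimal Python sketch, not the SAS output from the solution file. The names `observed` and `expected_counts` are illustrative assumptions, and the counts are copied from the contingency table in the question.

```python
# Observed counts copied from the question's contingency table.
# Rows: Excellent, Good, Average, Poor.
# Columns: GPA below 2.5, GPA 2.5 to 3.5, GPA above 3.5.
observed = [
    [18, 33, 37],
    [17, 27, 43],
    [21, 31, 23],
    [25, 14, 11],
]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
smallest = min(min(row) for row in expected)

# Print the table of expected counts, then the rule-of-thumb check.
for row in expected:
    print("  ".join(f"{value:6.2f}" for value in row))
print(f"Smallest expected count: {smallest:.2f}")
print("All expected counts above 5:", smallest > 5)
```

In SAS itself, the same expected counts are typically requested alongside the test rather than computed by hand.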

b) Graduate Students Only: Discuss all the measures of association used in contingency table analysis that SAS provides in the CHISQ option. Explain when it is appropriate to use each measure. For this problem, compute and interpret an appropriate measure of the strength of the association between GPA and instructor evaluation.

Conclusion:

The Pearson chi-square test is used for categorical data when the expected-frequency assumption is satisfied. The likelihood-ratio chi-square test is appropriate under the same conditions. The Mantel-Haenszel chi-square tests for a linear association between the row and column variables, so it is appropriate when both variables are ordinal. The phi coefficient is a measure of association for two binary variables (a 2x2 table). The contingency coefficient measures the strength of association in a table of any size, but its maximum is below 1 and depends on the table's dimensions. Cramer's V rescales the chi-square statistic to lie between 0 and 1 and is suited to tables larger than 2x2. It is used as a follow-up measure of the strength of association once the chi-square test has established significance.

The measure that best answers our research question is based on the chi-square statistic. The chi-square test statistic is 22.9185 (6 degrees of freedom) with a p-value of 0.0008. Since the p-value is smaller than the 0.05 (5%) level of significance, we reject the null hypothesis and conclude that there is a statistically significant relationship between GPA and instructor evaluation....
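The reported statistic and p-value can be cross-checked by hand, and Cramer's V can be computed from the same statistic. This is a minimal Python sketch under stated assumptions, not the SAS procedure used for the solution. The function names are illustrative, and the p-value uses a closed form for the chi-square upper tail that is valid only for even degrees of freedom (here 6).

```python
import math

# Observed counts copied from the question's contingency table.
# Rows: Excellent, Good, Average, Poor; columns: GPA below 2.5, 2.5 to 3.5, above 3.5.
observed = [
    [18, 33, 37],
    [17, 27, 43],
    [21, 31, 23],
    [25, 14, 11],
]


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and the sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat, n


def chi_square_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; this closed form needs even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form only applies to positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


n_rows, n_cols = len(observed), len(observed[0])
stat, n = chi_square_statistic(observed)
df = (n_rows - 1) * (n_cols - 1)
p_value = chi_square_upper_tail_even_df(stat, df)
cramers_v = math.sqrt(stat / (n * min(n_rows - 1, n_cols - 1)))

print(f"chi-square = {stat:.4f}, df = {df}, p-value = {p_value:.4f}")
print(f"Cramer's V = {cramers_v:.3f}")
```

Under these assumptions the statistic and p-value agree with the values reported above, and Cramer's V comes out at about 0.195. On the 0 to 1 scale this is a fairly weak association, even though it is statistically significant.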
