#1. (Statistics Review) You are given an Excel data set with information about average incomes and the percent of returns that were audited by the IRS for select urban areas.

(a) Fill in the following table with the “Descriptive Statistics” for these two variables (treat the data as “samples”, not “populations”, when computing the standard deviation and variance):

                          Adjusted Gross Income    Percent Audited
    Average               ____________             ____________
    Standard Deviation    ____________             ____________
    Variance              ____________             ____________
    Maximum               ____________             ____________
    Minimum               ____________             ____________
    Range                 ____________             ____________

(b) Create a 95% confidence interval for the average percent audited.

(c) Could any of the observed “% audited” values for the urban areas reported here be considered “unusual”? Explain. (Hint: consider the Z-scores.)

#2. You are given an Excel data set with information about the retention rates and graduation rates for a set of online universities.

(a) Given the different forms of data sets described in this unit (Cross section, Time series, Pooled and Panel), identify and explain the format of this data set.

(b) If you were the President of ITT Technical Institute and were especially interested in the historic trends of the relationship between retention rates and graduation rates, would this be the best format of data to use? Why or why not? Can you think of a better format for a data set to investigate? Be specific.

(c) Create a scatterplot of this data, with retention rate on the x-axis. (Please copy/paste your scatterplot into this document. Add a trendline to the scatterplot, if possible.) How would you characterize the relationship between these variables? Explain.

(d) Do you think this relationship could be estimated using regression analysis? Explain why or why not. (Hint: please be sure to address the issue of causality.)
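For the computations in #1, the sample statistics, the 95% confidence interval, and the Z-scores can be sketched outside Excel as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the data values are made up and are not taken from the assignment's data set, and the interval uses a normal critical value (about 1.96), whereas a t critical value may be more appropriate for a small sample.

```python
import math
import statistics


def describe(sample):
    """Descriptive statistics, treating the data as a sample (n - 1 divisor)."""
    return {
        "average": statistics.mean(sample),
        "std_dev": statistics.stdev(sample),      # sample standard deviation
        "variance": statistics.variance(sample),  # sample variance
        "maximum": max(sample),
        "minimum": min(sample),
        "range": max(sample) - min(sample),
    }


def confidence_interval(sample, level=0.95):
    """Interval for the mean using a normal critical value (an assumption)."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    mean = statistics.mean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin


def z_scores(sample):
    """Z-score of each observation relative to the sample mean and std dev."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    return [(x - mean) / sd for x in sample]


# Hypothetical values for illustration only, NOT the assignment's data.
percent_audited = [1.0, 2.0, 3.0, 4.0, 5.0]

print(describe(percent_audited))
print(confidence_interval(percent_audited))
print(z_scores(percent_audited))
```

A common rule of thumb treats observations with a Z-score beyond about 2 (or, more strictly, 3) in absolute value as unusual; which cutoff to apply is a judgment call for part (c).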

