PROJECT OPTION #3:

Data file: Camera-Datafile.xlsx

NOTE: Use α = 0.05 as the significance level

Consumer Reports tests Point-and-Shoot digital cameras and assigns each model an overall score based upon factors such as the number of megapixels, weight (oz.), image quality, and ease of use. The overall score ranges from 0 to 100, with higher scores indicating better test results (and, presumably, a better camera). Selecting a camera with many options can be a difficult process, and price is certainly a key issue for most consumers. By spending more, will a consumer really get a superior camera? Are any other characteristics associated with higher scores?

The data file contains average retail price ($), number of megapixels, weight (oz), and the overall score for 28 digital cameras. You will analyze this data with the objective of informing the consumer about what to focus on to choose a better camera.

Directions:

1) First, you must describe your data set. In the case of regression, that means two things: presenting descriptive statistics in numerical form; and, graphing bivariate scatterplots for the dependent variable (DV) versus each of the independent variables (IVs).

In Excel: For each variable (Price, Megapixels, Weight, and Score), calculate the following descriptive statistics: mean, median, standard deviation, minimum, and maximum. You may do this using the Descriptive Statistics procedure in the Excel Data Analysis tool, or by calculating them yourself using Excel functions. Just be sure to leave the Excel formulas in place in the cells if you choose the latter option.

Graph three bivariate scatterplots: Score vs. Price; Score vs. Megapixels; and, Score vs. Weight. The dependent variable, Score, should always be on the y-axis. Label the axes and title the plots so that your audience can understand what each one shows. Modify the range on the x-axis and y-axis in the scatterplots so that the data points occupy most of the plot area. (You can modify the range of numbers on any axis in Excel by double-clicking the axis and then changing the Maximum and Minimum Bounds on the Format Axis: Axis Options tab that pops up.) Rule of thumb in regression scatterplots is to minimize whitespace; showing the origin on the axes is not necessary.

In the Managerial Report:

Include a table that shows the descriptive statistics for each variable. The statistics should be in rows and the variables in the columns. Include an indication somewhere about how many cameras are included in the analysis. When presenting numbers in your report, limit the results to four decimal places. Having more decimal places than that clutters up the report while adding little useful information for the reader.

Include all three bivariate scatterplots. Comment briefly on each plot: from what you see on each scatterplot, do you expect there to be a relationship between each independent variable and score? What direction do you expect that relationship (if there is one) to have?

2) Investigate the relationship between each independent variable (Price, Megapixels, and Weight) and the dependent variable (Score) by estimating three different simple linear regression models.

In Excel: Use Excel’s Regression tool (see section 14.7 in the book for directions) to estimate the following three simple linear regression models:

DV: Score, IV: Price

DV: Score, IV: Megapixels

DV: Score, IV: Weight

NOTE: when estimating the regression models, check the boxes to save the Residuals and the Standardized Residuals for each model.

For each model, determine whether there is a statistically significant relationship between the DV and the IV. For each model in which there is a statistically significant relationship, do the following:

Perform residual analysis by graphing two scatterplots:

Residuals vs. Predicted Score with Residuals on the y-axis. Adjust the scale on the x-axis so that the points take up the majority of the plot area. If you do not, you may miss important patterns that can be obscured by leaving the points clumped together in a small area.

Standardized Residuals vs. Predicted Score with Standardized Residuals on the y-axis. Adjust the scale on the x-axis so that the points take up the majority of the plot area (see #1 above for why)

Examine the scatterplots for evidence that any of the four regression assumptions are violated (see section 14.8, and the Regression Overview and Summary: Handout #2). What patterns do you see? Identify possible outliers.

In the Managerial Report:

Include Regression Output for all three models. Report the p-values for the slope on each IV and state whether there is a statistically significant relationship between each IV and Score (the DV). For any model in which there is a statistically significant relationship between the DV and the IV:

Report the estimated regression equation.

Report and interpret the coefficient of determination (R^2): what percentage of the variation in Score is explained by the IV?

Report and interpret the standard error of the estimate (s): what is the average error you can expect to make when using this estimated regression equation to predict Score?

Interpret the slope on the IV: How much do you expect Score to increase or decrease on average for a one unit increase in the IV?

Report and interpret the confidence interval for the slope on the IV.

Take a look at the descriptive statistics for the range of values of the IV. Choose two plausible values for the IV and calculate the predicted Score for those values. The values should not be exactly the same as any of the IV values in the data; instead, they should be within the range of values that the IV takes. Report the predicted scores.

Include both residual plots in the report. Discuss the residual analysis that you performed in Excel. Describe any patterns you may see in the Residual plot and discuss what assumptions (if any) might be violated. Circle any possible outlier(s) on the Standardized Residual plot. (You can use the Shape tool in Excel or Word with Shape Fill set to No Fill to do this). Which cameras are possible outliers (if any)? Suggest what might be done to address any violations of the assumptions of regression (if you find any).

Answer the question: what should a customer focus on to get the best camera?

**Subject Mathematics Advanced Statistics**