Situation

You are the VP of sales for a major US company. Your company currently sells products in 47 states. Its main product is targeted to males between the ages of 25 and 30. You are trying to understand the factors that impact the sales of that product. You have collected the following data, which represent some factors you believe could be important. The dataset includes the following variables for each state:

Y = sales (2014)

X1 = the number of males between the ages of 25 and 30 (your target market)

X2 = a binary variable distinguishing the southern states from the others (1 = southern, 0 = other)

X3 = the mean number of years of college education for people aged 25-30

X4 = amount of money spent on promotions (2014)

X5 = amount of money spent on promotions (2013)

X6 = percentage of males (25-30) who said they were in the market for your product (2014)

X7 = percentage of males (25-30) in the state population

X8 = percentage of low income males (25-30)

X9 = the unemployment rate for males (25-30)

X10 = median income for males (25-30)

Step 1

Provide an analysis of the distribution of each variable individually. Use the appropriate descriptive statistics and visual tools. Your analysis should include an interpretation of your results.
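One possible starting point for Step 1 is a small helper that summarizes a single variable. The sketch below uses only the Python standard library; the function name `describe` and the choice of statistics (mean, median, standard deviation, quartiles, moment-based skewness) are assumptions for illustration, not requirements of the assignment.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for one numeric variable.

    `values` is assumed to hold one observation per state (hypothetical input).
    """
    n = len(values)
    mean = statistics.mean(values)
    # Quartiles with the "inclusive" method treat the data as the full population.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Moment-based skewness: > 0 suggests a longer right tail, < 0 a longer left tail.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "n": n,
        "mean": mean,
        "median": median,
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "q1": q1,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
        "skew": skew,
    }


# Made-up numbers purely to show the call; they are NOT from the dataset.
summary = describe([1, 2, 3, 4, 5])
```

The same helper could be applied to Y and to each of X1 through X10 in turn. For the binary variable X2, the mean is simply the proportion of southern states, so a frequency count may be more informative than quartiles or skewness. For the visual side, a histogram or boxplot per variable (for example with a plotting library such as matplotlib) would complement these numbers.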

Step 2

Provide an analysis of the relationship between each explanatory variable (i.e., the Xs) and sales (Y). Your analysis should include the appropriate descriptive statistics and visual tools. Your analysis should include an interpretation of your results.
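For Step 2, one common choice is the Pearson correlation coefficient together with a one-predictor least-squares line for each X against Y. The sketch below is a dependency-free illustration; the function names `pearson_r` and `simple_ols` are hypothetical, and other measures (for example a rank correlation) may suit skewed variables better.

```python
import math


def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy, mean_x, mean_y) for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy, sxx, syy, mean_x, mean_y


def pearson_r(x, y):
    """Pearson correlation between one explanatory variable and sales."""
    sxy, sxx, syy, _, _ = _centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def simple_ols(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    sxy, sxx, _, mean_x, mean_y = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up numbers purely to show the calls; they are NOT from the dataset.
x_demo = [1, 2, 3, 4]
y_demo = [2, 4, 6, 8]
r = pearson_r(x_demo, y_demo)
slope, intercept = simple_ols(x_demo, y_demo)
```

A scatterplot of Y against each X is the natural visual companion. For the binary variable X2, the Pearson coefficient reduces to the point-biserial correlation, and side-by-side boxplots of Y by region are usually easier to read than a scatterplot.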

Step 3

Provide an analysis that compares the southern states with the other states over all of the variables included in the dataset. Your analysis should include an interpretation of your results.
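One way to approach Step 3 is to split each variable by the X2 indicator and compare group means, optionally with Welch's t statistic, which does not assume equal variances. The sketch below is illustrative only; `compare_groups` is a hypothetical helper, and it returns the t statistic without a p-value, which would need a t distribution from a statistics library or table.

```python
import math
import statistics


def compare_groups(values, is_south):
    """Compare one variable between southern (1) and other (0) states.

    `values` and `is_south` are assumed to be parallel lists, one entry per state.
    """
    south = [v for v, flag in zip(values, is_south) if flag == 1]
    other = [v for v, flag in zip(values, is_south) if flag == 0]
    mean_s, mean_o = statistics.mean(south), statistics.mean(other)
    # Welch's standard error uses each group's own sample variance.
    se = math.sqrt(
        statistics.variance(south) / len(south)
        + statistics.variance(other) / len(other)
    )
    return {
        "n_south": len(south),
        "n_other": len(other),
        "mean_south": mean_s,
        "mean_other": mean_o,
        "difference": mean_s - mean_o,
        "welch_t": (mean_s - mean_o) / se,
    }


# Made-up numbers purely to show the call; they are NOT from the dataset.
result = compare_groups([10, 12, 14, 20, 22, 24], [1, 1, 1, 0, 0, 0])
```

Running this for Y and for each of the other X variables gives a table of group differences. With only 47 states, and possibly few of them southern, large t statistics should be read with some caution.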

Step 4

Develop a multiple regression model using Y as the dependent variable. Your analysis should include an interpretation of your results.
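To make Step 4 concrete, the sketch below fits an ordinary least squares model by solving the normal equations with Gaussian elimination. It is a dependency-free illustration, not a recommended production method: the names `solve_linear_system` and `fit_ols` are hypothetical, and in practice a statistics package that also reports standard errors and p-values would likely be more useful.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk; return (coefficients, R-squared).

    `rows` is assumed to hold one list of predictor values per state.
    """
    design = [[1.0] + [float(v) for v in r] for r in rows]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Made-up numbers purely to show the call; they are NOT from the dataset.
demo_rows = [[1, 0], [0, 1], [1, 1], [2, 1], [3, 5]]
demo_y = [3, 4, 6, 8, 22]  # constructed as 1 + 2*x1 + 3*x2
coefficients, r_squared = fit_ols(demo_rows, demo_y)
```

Each slope is interpreted as the expected change in sales for a one-unit change in that predictor, holding the others fixed. Some predictors here may be strongly related to each other (for example, promotion spending in 2013 and 2014, or low income share and median income), which can make individual coefficients unstable, so checking pairwise correlations before settling on a final set of predictors seems prudent.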

Step 5

Provide an overview of your results that would be useful for making decisions in the future.
