1. For this problem, you will analyze your own data. This can be data that you’ve collected, or you can download any publicly available data set (do not use proprietary data!).

Pick a dependent variable and two key independent variables. Make sure your dependent variable is continuously measured, and also use a continuous variable for at least one of your independent variables. Answer the following questions:

-What is your dependent variable?

-What are your two key independent variables?

-Are there any variables in the data set that you will employ as controls? What are they, and why have you decided to use them as controls?

-Are any of your independent variables categorical? (If so, be sure to make dummies!)
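
One way to make the dummies mentioned above is sketched here in Python with NumPy. The helper `make_dummies` and the `region` variable are hypothetical names used only for illustration. The sketch creates one 0/1 column per category and leaves out a reference category, so that the dummies are not perfectly collinear with the intercept.

```python
import numpy as np

def make_dummies(values, reference=None):
    """Return (matrix, names): one 0/1 column per level, omitting the reference level.

    Assumes the variable has at least two distinct levels.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]          # default: first level in sorted order
    kept = [lv for lv in levels if lv != reference]
    arr = np.asarray(values)
    # Each column is 1.0 where the observation belongs to that level, else 0.0
    matrix = np.column_stack([(arr == lv).astype(float) for lv in kept])
    return matrix, kept

# Hypothetical categorical predictor with three levels
region = ["north", "south", "west", "south", "north", "west"]
dummies, names = make_dummies(region, reference="north")
print(names)
print(dummies)
```

Each coefficient on a dummy is then read as the expected difference in the dependent variable between that category and the omitted reference category, holding the other predictors fixed.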

Now estimate an OLS regression of your dependent variable on all of your independent variables (that is, your independent variables of interest and the controls, if you have any). Paste the output below. Then do/answer the following:

-What can you say about the fit of your model? In other words, how well do your independent variables explain the variance in your dependent variable?

-How many of your variables have a statistically significant relationship with the dependent variable? If applicable, why do you consider this variable (or these variables) to be statistically significant?

-If you have more than one continuous variable, use standardized coefficients to investigate which has the largest effect on the expected value of your dependent variable, all else being equal. What did you find?

-Are there any potentially problematic outliers in your model? How did you test for these? If there are outliers, account for them either by creating dummies or dropping them from the analysis. Paste the output below. How did your findings change?

-Are there any potentially problematic nonlinearities in the relationships between the continuous variable(s) in your regression and your dependent variable? How did you test for this? If there is nontrivial nonlinearity in your model, account for it either by transforming the variable(s) or by using a polynomial regression, depending on the nature of the nonlinearity. Paste the output below. How did your findings change?

-Estimate a model in which two of your independent variables are interacted.

-Interpret this interaction. (Doing so correctly will require that you take into account the level of measurement of each interacted variable.)
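
Several of the quantities asked for in the list above (fit, t-statistics, standardized coefficients, and an outlier diagnostic) can be computed as in the following sketch. It uses plain NumPy rather than a statistics package, and the data are simulated. The function name `ols`, the variables `x1` and `x2`, and the choice of Cook's distance as the outlier check are assumptions made for illustration, not requirements of the assignment.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit of y on an intercept plus the columns of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # prepend the intercept column
    k = A.shape[1]                                 # number of estimated coefficients
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta

    sse = resid @ resid                            # residual sum of squares
    sst = ((y - y.mean()) ** 2).sum()              # total sum of squares
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)

    sigma2 = sse / (n - k)                         # residual variance estimate
    xtx_inv = np.linalg.inv(A.T @ A)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))        # coefficient standard errors
    t = beta / se

    # Standardized slopes: b_j * sd(x_j) / sd(y)
    std_beta = beta[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

    # Leverage (diagonal of the hat matrix) and Cook's distance per observation
    h = np.einsum("ij,jk,ik->i", A, xtx_inv, A)
    cooks = resid ** 2 / (k * sigma2) * h / (1 - h) ** 2

    return {"beta": beta, "se": se, "t": t, "r2": r2, "adj_r2": adj_r2,
            "std_beta": std_beta, "cooks": cooks}

# Simulated data (assumed for illustration only)
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

fit = ols(np.column_stack([x1, x2]), y)
print("R-squared:", round(fit["r2"], 3), "adjusted:", round(fit["adj_r2"], 3))
print("t-statistics:", np.round(fit["t"], 2))
print("standardized slopes:", np.round(fit["std_beta"], 3))
print("observations with Cook's D > 4/n:", np.where(fit["cooks"] > 4 / n)[0])
```

The 4/n cutoff for Cook's distance is a common rule of thumb, not a formal test, so flagged observations should be inspected before they are dropped or given dummies. To probe nonlinearity in the same framework, a squared term such as `x1 ** 2` can be added as an extra column and its t-statistic examined.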

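For the interaction model, the sketch below interacts a continuous variable with a dummy, again on simulated data. The names `x` and `d` and the true coefficient values are assumptions chosen for illustration. With this combination of measurement levels, the coefficient on the product term is the difference in the slope of `x` between the two groups.

```python
import numpy as np

# Simulated data (assumed for illustration only)
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)                            # continuous predictor
d = rng.integers(0, 2, size=n).astype(float)      # dummy predictor (0 or 1)
y = 1.0 + 0.5 * x + 1.0 * d + 1.5 * x * d + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, both main effects, and their product
A = np.column_stack([np.ones(n), x, d, x * d])
b0, b_x, b_d, b_xd = np.linalg.lstsq(A, y, rcond=None)[0]

slope_d0 = b_x           # effect of a one-unit change in x when d = 0
slope_d1 = b_x + b_xd    # effect of a one-unit change in x when d = 1
print("slope of x when d = 0:", round(slope_d0, 3))
print("slope of x when d = 1:", round(slope_d1, 3))
```

If both interacted variables are continuous, say `x` and `z`, the same algebra gives a marginal effect of `x` equal to `b_x + b_xz * z`, which is usually reported at a few representative values of `z` such as its mean and one standard deviation above and below it.
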