## Question

a. Create a well-formatted (i.e. not copy-pasted from Stata) table showing the numbers of people giving each combination of answers on these questions.

b. Within each level of interest, identify what percentage of people (among those at that level of interest) felt that too little was spent on the military.

c. Identify how many people total indicated they were not at all interested in international issues and the number of these that would have been expected to say that too much is spent on the military if there were no relationship between the variables. Compare this number to the number within this level of interest that actually hold that view.

d. Estimate a χ² test for the relationship between international affairs interest and views on military spending and provide the χ² statistic and p-value.

e. Interpret your results statistically and, if there is evidence of a relationship, substantively for a general audience.
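As a rough illustration of parts (b) through (d), the sketch below computes a row percentage, an expected count under independence (row total × column total ÷ grand total), and the χ² statistic. The counts and category labels are made up for the example and are not the survey data. The p-value is left out because it needs a χ² distribution function, which the Python standard library does not provide.

```python
# Hypothetical counts for illustration only; they are NOT the survey data.
# Rows: interest in international issues. Columns: view on military spending.
observed = {
    "very interested":       {"too little": 30, "about right": 40, "too much": 30},
    "somewhat interested":   {"too little": 20, "about right": 50, "too much": 30},
    "not at all interested": {"too little": 10, "about right": 30, "too much": 60},
}

def row_total(table, row):
    return sum(table[row].values())

def col_total(table, col):
    return sum(cells[col] for cells in table.values())

def grand_total(table):
    return sum(row_total(table, r) for r in table)

def row_percent(table, row, col):
    """Percentage of the people in `row` who gave answer `col`."""
    return 100.0 * table[row][col] / row_total(table, row)

def expected_count(table, row, col):
    """Count expected in a cell if the two variables were unrelated."""
    return row_total(table, row) * col_total(table, col) / grand_total(table)

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    stat = 0.0
    for row, cells in table.items():
        for col, obs in cells.items():
            exp = expected_count(table, row, col)
            stat += (obs - exp) ** 2 / exp
    return stat

row = "not at all interested"
print("row % too little:", row_percent(observed, row, "too little"))
print("expected too much:", expected_count(observed, row, "too much"))
print("observed too much:", observed[row]["too much"])
print("chi-square:", chi_square(observed))
```

With a table of r rows and c columns, the statistic is compared against a χ² distribution with (r − 1)(c − 1) degrees of freedom.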

2. The sex variable is a dichotomous indicator of the respondent’s sex, where 1=male and 2=female. The sei10 and spsei10 variables are continuous measures, ranging from 0 to 100, of the socioeconomic status of the respondent and the respondent’s spouse, respectively.

a. Generate a new variable called seidiff equal to the difference between the individual’s status and the spouse’s status (that is, showing how much greater sei10 is than spsei10). Report separately for males and females the mean, standard deviation, and number of observations of this variable.

b. Generate side-by-side box plots showing the values this variable takes on separately for males and females.

c. State the null hypothesis and two-tailed alternative hypothesis for a difference of means test using sex as the independent variable and seidiff as the dependent variable.

d. Conduct this test, reporting the t-statistic and p-value.

e. Interpret your results statistically and, if there is evidence of a relationship, substantively for a general audience.
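The arithmetic behind a difference of means test can be sketched as follows. This assumes the equal-variance (pooled) form of the two-sample t-test. The two lists of seidiff values are invented for the example, and `pooled_t` is a hypothetical helper name. The p-value is omitted because it requires a t distribution function.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Two-sample t statistic assuming equal variances.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(group_a), len(group_b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    # Standard error of the difference in means.
    se = sqrt(sp2 * (1.0 / na + 1.0 / nb))
    return (mean(group_a) - mean(group_b)) / se, na + nb - 2

# Hypothetical seidiff values (sei10 - spsei10); NOT the survey data.
seidiff_male = [2, 4, 6, 8]
seidiff_female = [1, 3, 5, 7]

t_stat, df = pooled_t(seidiff_male, seidiff_female)
print("t =", round(t_stat, 4), "df =", df)
```

The null hypothesis is that the two group means are equal; a t statistic far from zero relative to its degrees of freedom counts as evidence against it.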

3. The content variable indicates the respondent’s yearly income in dollars, and the hrs1 variable shows the number of hours the respondent worked in the previous week. For the entirety of this question, include only individuals making less than $150,000 per year.

a. Generate a scatterplot showing hours worked on the x-axis (horizontal) and income on the y-axis (vertical), overlaid with a line of best fit.

b. Estimate a linear regression model, and provide the β coefficient along with its standard error, t statistic, and p-value.

c. Write down the equation for the line of best fit identified by the model.

d. Create and fill in a table like the one below showing the mean number of hours worked, the values of hours worked one standard deviation above and below the mean, and the income predicted by the model at each of these levels. Round all values to two decimal places as you do your calculations.

e. Interpret your results statistically and, if there is evidence of a relationship, substantively for a general audience.

| | 1 sd below mean | mean | 1 sd above mean |
|---|---|---|---|
| hours worked | | | |
| predicted income | | | |
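One way to fill in such a table is sketched below: fit the least-squares line, then evaluate it at the mean of hours worked and at one standard deviation on either side. The five data points are invented for the example and are not the survey data. The names `fit_line` and `predict` are hypothetical helpers.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data; NOT the survey data.
hours = [20, 30, 40, 50, 60]
income = [18000, 32000, 41000, 49000, 60000]

intercept, slope = fit_line(hours, income)

def predict(h):
    return intercept + slope * h

# Round intermediate values to two decimals, as the question asks.
h_mean = round(mean(hours), 2)
h_sd = round(stdev(hours), 2)

for label, h in [("1 sd below mean", h_mean - h_sd),
                 ("mean", h_mean),
                 ("1 sd above mean", h_mean + h_sd)]:
    print(label, round(h, 2), round(predict(h), 2))
```

The slope is the predicted change in income for each additional hour worked, so the gap between adjacent columns of predicted income equals the slope times one standard deviation of hours.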

4. The age variable indicates the age (in years) of the respondent, and the full-time variable indicates whether the respondent was (1) or was not (0) employed at a full time job at the time of the survey.

a. Estimate a logistic regression model using full-time employment as the dependent variable and age as the independent variable. Report the β coefficient, its standard error, the z-statistic, and the p-value.

b. Identify the median age in the data. For respondents of this age, calculate both the actual frequency of full-time employment and the frequency of full-time employment predicted by the model.

c. Generate a scatterplot with full-time employment as the dependent variable and age as the independent variable and draw the logistic fit curve predicted by the model above on this plot.

d. Interpret your results statistically and, if there is evidence of a relationship, substantively for a general audience.
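For part (b), the comparison between the actual and the predicted frequency can be sketched as below. A logistic model predicts a probability by passing β₀ + β₁·age through the inverse logit function. The records and the two coefficients are invented for the example; they are not estimates from the survey data.

```python
from math import exp
from statistics import median

def predicted_probability(b0, b1, x):
    """Inverse logit of the linear predictor b0 + b1 * x."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * x)))

# Hypothetical (age, fulltime) records; NOT the survey data.
records = [(25, 1), (30, 1), (40, 1), (40, 0), (40, 1), (55, 0), (70, 0)]

median_age = median(age for age, _ in records)

# Actual frequency: share employed full time among respondents at the median age.
at_median = [fulltime for age, fulltime in records if age == median_age]
actual = sum(at_median) / len(at_median)

# Hypothetical coefficients standing in for the logistic regression output.
b0, b1 = 2.0, -0.05
predicted = predicted_probability(b0, b1, median_age)

print("median age:", median_age)
print("actual frequency:", round(actual, 4))
print("predicted frequency:", round(predicted, 4))
```

The two numbers usually differ, because the model smooths across all ages while the actual frequency uses only the respondents at one age.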

5. The usetech variable is a continuous measure of the percentage of time a respondent spent using technology at work, and the tech50 variable is a dichotomous measure of whether this was 50% or more. The size of the respondent’s area (in thousands) is coded in the size variable. Whether (1) or not (0) the respondent holds at least a bachelor’s degree is coded in the college variable.

a. Estimate a multivariate linear regression using usetech as the dependent variable and including both size and college. Write down the equation for the estimated line of best fit.

b. Using the results from (a), calculate the predicted frequency of technology use for an individual with a college degree living in a city of 100,000 people.

c. Estimate a multivariate logistic regression using tech50 as the dependent variable and including both size and college. Write down the equation for the estimated (curved) line of best fit.

d. Using the results from (c) calculate the predicted frequency of technology use for an individual with a college degree living in a city of 100,000 people.
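The two predictions in (b) and (d) use the same inputs but different final steps, which the sketch below illustrates. Because size is coded in thousands, a city of 100,000 people enters as size = 100. All coefficients here are invented placeholders, not estimates from the data.

```python
from math import exp

# Hypothetical coefficients; substitute the estimates from your own models.
linear_coefs = {"const": 30.0, "size": 0.01, "college": 20.0}
logit_coefs = {"const": -1.0, "size": 0.002, "college": 0.8}

def linear_predictor(coefs, size, college):
    """const + b_size * size + b_college * college."""
    return coefs["const"] + coefs["size"] * size + coefs["college"] * college

def inverse_logit(z):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + exp(-z))

size = 100     # 100,000 people, since size is measured in thousands
college = 1    # holds at least a bachelor's degree

# Linear model: the linear predictor is itself the predicted percentage of time.
predicted_pct = linear_predictor(linear_coefs, size, college)

# Logistic model: the linear predictor is a log-odds and must be transformed.
predicted_prob = inverse_logit(linear_predictor(logit_coefs, size, college))

print("predicted usetech (percent):", round(predicted_pct, 2))
print("predicted P(tech50 = 1):", round(predicted_prob, 4))
```

The linear result is a percentage of work time, while the logistic result is the probability of using technology at least half the time, so the two are not directly comparable.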
