# Task 1 (10 marks) In this task you are required to analyse the Wai...

## Question

Task 1 (10 marks)
In this task you are required to analyse the Waist-to-Hip Ratio (WHR) in the sample (you can find more information about WHR in Section 2).
1.0. Open the file in Excel.
1.1. Compute the exact WHR for every individual. Use the variables Hip and Waist provided. Call the new variable WHR. Save the results in column I. (5 marks)
1.2. Obtain the descriptive statistics of WHR. Discuss your findings (e.g. What is the shape of the frequency distribution? How is the data set spread around the mean? What is the best measure of central tendency, and why?). (5 marks)
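
As a rough illustration of the arithmetic behind 1.1 and 1.2 (not the Excel steps the task asks for), the sketch below divides each waist measurement by the matching hip measurement and summarises the result. The measurements are hypothetical values chosen for readability and are not taken from the data set.

```python
import statistics

# Hypothetical waist and hip measurements in the same unit (not assignment data).
waist = [70.0, 80.0, 90.0, 100.0]
hip = [100.0, 100.0, 100.0, 100.0]

# WHR is the waist measurement divided by the hip measurement.
whr = [w / h for w, h in zip(waist, hip)]

mean_whr = statistics.mean(whr)
median_whr = statistics.median(whr)
sd_whr = statistics.stdev(whr)  # sample standard deviation (n - 1 denominator)

print(whr)
print(mean_whr, median_whr, sd_whr)
```

Comparing the mean with the median gives a first hint about skewness, and the standard deviation describes the spread around the mean.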

Task 2 (5 marks)
In this task you will plot some graphs to illustrate the distribution of the ethnic origin and household size.
2.1. Describe the distribution of the variable Origin by means of a Pie Chart. Discuss the results. (2 marks)
2.2. Describe the distribution of the variable hhsize by means of a Bar Chart. Discuss the results. (3 marks)

Task 3 (35 marks)
In this task you are required to analyse the Income distribution in the population. You will use the variable Income provided in the data set. This variable should be treated as a continuous variable.
3.1. Provide a table with the following information regarding the variable Income: code, meaning, lower threshold, upper threshold, midpoint of the interval and frequencies. The upper threshold of Class 30 will be assumed to be 300,000. Note that the meaning of the codes is provided in Section 2. (5 marks)
For example, the table would look as follows (all the cells must be filled in):

| Code | Answer | Lower threshold | Upper threshold | Midpoint | Frequency |
|---|---|---|---|---|---|
| 1 | Income less than £520 | 0 | 520 | | |
| 2 | From £520 to £1,600 or [520, 1,600) | 520 | 1,600 | | |
| ... | | | | | |
| 30 | From 140,000 to 300,000 or [140,000, 300,000) | 140,000 | 300,000 | | |

3.2. Compute the mean, mode and median. (5 marks)
3.3. Explain whether these measures suggest positive or negative skewness. (3 marks)
3.4. Compute the standard deviation and the Inter-Quartile Range (IQR). Would you prefer IQR to standard deviation as a measure of dispersion of the distribution? Explain your answer. (7 marks)
3.5. If we consider the 30th class as "from 140,000 to 150,000", how will this affect the measures computed above? (Do not recompute the descriptive measures.) Which upper threshold (150,000 or 300,000) may help us better describe the income distribution in the sample? Which definition of the 30th class do you prefer, "from 140,000 to 150,000" or "from 140,000 to 300,000"? Explain your answer. (15 marks)
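
Because Income is recorded in classes, the mean and median in Task 3 have to be estimated from class thresholds and frequencies. The sketch below shows one common approach, assuming observations sit at the class midpoint for the mean and are spread evenly within a class for the median. The four classes, their frequencies and the function names `grouped_mean` and `grouped_median` are hypothetical and stand in for the 30 classes of the data set.

```python
# Hypothetical classes: (lower threshold, upper threshold, frequency).
classes = [
    (0, 520, 10),
    (520, 1600, 30),
    (1600, 3000, 40),
    (3000, 5000, 20),
]

def grouped_mean(classes):
    """Frequency-weighted average of the class midpoints."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("no observations")
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

def grouped_median(classes):
    """Linear interpolation inside the class containing the n/2-th observation."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= n / 2:
            return lo + (n / 2 - cumulative) / f * (hi - lo)
        cumulative += f

# Same data, but with a wider (hypothetical) upper threshold for the last class.
wider_last_class = classes[:-1] + [(3000, 8000, 20)]

print(grouped_mean(classes), grouped_median(classes))
print(grouped_mean(wider_last_class), grouped_median(wider_last_class))
```

In this toy example, widening the last class raises the midpoint-based mean, while the interpolated median is unchanged because the median falls in an earlier class.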

Task 4 (25 marks)
Create a new Stata document and enter the data in Columns A-I in Excel (in other words, your Stata file should contain the variables origin, hhsize, income, age, hip, height, waist, weight and WHR copied or imported from your Excel file).
4.1. Prepare a Histogram representation of the data on WHR. Note that you can choose the number of bins OR the interval width (you can find more information in Section 3).
Explain whether the histogram suggests positive or negative skewness and whether or not you expect the mean to exceed the median. (5 marks)
4.2. Create a new variable with the BMI (Body Mass Index) value for every individual (BMI = kg/m²). Use the variables height and weight provided. Call the new variable BMI. (4 marks)
4.3. Obtain the following descriptive statistics of BMI: mean, median, min, max, range, sd, skewness and IQR. Discuss your findings (e.g. What is the shape of the frequency distribution? How is the data set spread around the mean? What is the best measure of central tendency, and why? Do you think that UK females have obesity problems?). (5 marks)
4.4. Prepare a Histogram representation of the data on BMI. Note that you can choose the number of bins OR the interval width (you can find more information in Section 3).
Compare it with the histogram you prepared for WHR in 4.1. Do you think that both variables are capturing the same factor? (5 marks)
4.5. Create a new variable (3 categories) with the Health Risk Based Solely on WHR (see "Additional information" in Section 2). Call the new variable WHR_class. Tabulate the new variable. (3 marks)
4.6. Create a new variable with the BMI classes for underweight, normal weight, overweight and obesity (see "Additional information" in Section 2). Call the new variable BMI_class. Tabulate the new variable. (3 marks)
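
The BMI formula in 4.2 and the classification in 4.6 can be sketched as follows. This is an illustration only, not the Stata commands the task requires. It assumes weight in kilograms and height in metres (a height recorded in centimetres would first be divided by 100). The cut-offs 18.5, 25 and 30 are an assumption based on commonly used bands. The assignment's own thresholds are in Section 2, which is not reproduced here, and should replace them. The individuals and the function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

def bmi_class(value):
    """Map a BMI value to a class label.

    ASSUMPTION: cut-offs 18.5 / 25 / 30 are commonly used bands, not taken
    from the assignment; use the thresholds given in Section 2 instead.
    """
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obesity"

# Hypothetical individuals: (weight in kg, height in m).
people = [(50.0, 1.70), (65.0, 1.65), (70.0, 1.60), (95.0, 1.60)]

values = [bmi(w, h) for w, h in people]
labels = [bmi_class(v) for v in values]

for v, label in zip(values, labels):
    print(round(v, 1), label)
```

The WHR classes in 4.5 follow the same pattern with the three risk bands from Section 2.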

Task 5 (25 marks)
In this task you will analyse the existence of correlation between WHR, BMI, Income, Origin and Age. You have to use Stata.
5.1. Create a cross table for the variables BMI_class and WHR_class. Is what the NHS states true: "While body mass index (BMI) is a good way to tell if you're a healthy weight, it doesn't tell the whole story"? In other words, can you see differences between the two measures? What percentage of obese individuals are considered "High Risk" based on the WHR measure? (5 marks)
5.2. Create a cross table for the variables Origin and BMI_class. Do you think that some ethnic groups are more prone to be in the "High Risk" class? (5 marks)
5.3. Create a cross table for the variables Income and BMI_class. Do you think that there is some correlation between both variables? (5 marks)
5.4. Prepare a scatter plot for age and WHR. Do you think that there is some correlation between both variables? Prepare also a scatter plot for age and BMI. Do you think that there is some correlation between both variables? Which variable (WHR or BMI) seems to be more correlated with age? (5 marks)
5.5. Why did I ask you to use a scatter plot in 5.4, and not in 5.2 or 5.3? (5 marks)
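
The percentage asked for in 5.1 is a row percentage of a cross table: the count in one cell divided by the total of its row. A minimal sketch of that calculation is shown below. The six paired observations, the labels "low", "moderate" and "high", and the function name `row_percentage` are hypothetical and are not taken from the data set or from Section 2.

```python
from collections import Counter

# Hypothetical paired labels: (BMI class, WHR risk class) for six individuals.
pairs = [
    ("obesity", "high"),
    ("obesity", "high"),
    ("obesity", "moderate"),
    ("normal weight", "low"),
    ("normal weight", "high"),
    ("overweight", "moderate"),
]

cell_counts = Counter(pairs)                # one count per table cell
row_totals = Counter(b for b, _ in pairs)   # one total per BMI class

def row_percentage(bmi_label, whr_label):
    """Percentage of individuals in a BMI class that fall in a given WHR class."""
    return 100 * cell_counts[(bmi_label, whr_label)] / row_totals[bmi_label]

print(row_percentage("obesity", "high"))
```

A cross table suits variables with a small number of categories, which is why the same approach does not carry over directly to continuous variables such as age.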

