Use R (RStudio Notebook) to analyze and interpret the following data set using two different approaches to data analysis (e.g. correlation, descriptive statistics, ANOVA and regression analysis, various tests as applicable, or time series analysis as applicable). The data set covers the entire US, but I only need to focus on California.

Solution Preview

Analysis of the top risk factors associated with Heart Disease in California

Introduction and Objective:
We want to analyze the given data for California to identify the top risk factors associated with heart disease. Because the database is very large, we first filter it: we keep the rows for California using the variable “LocationDesc” and the rows for risk factors using the variable “Category”, and we restrict the data to the “Age-Standardized” estimates. The data set contains many variables, so we retain only those needed for the analysis. The reported data cover the years 2011 to 2015.

The data set provides three breakdowns of interest: Race, Gender and Overall. The most suitable for this analysis are Race, which has 5 categories, and Gender, which has 2 categories. There are 7 risk factors in the data set: Cholesterol Abnormalities, Diabetes, Nutrition, Obesity, Physical Inactivity, Smoking and Hypertension. For each risk factor, we want to know whether the mean prevalence among US adults (20+) over these 5 years differs significantly between the Gender categories and among the Race categories.

The most suitable tests for this type of data are the t test for Gender and analysis of variance (ANOVA) for Race. We could also examine the trend over the years, but that would require data for more years; we have only 5 years, and the values are estimates reported for each category rather than individual-level data. We therefore apply only the t test and analysis of variance to the 7 risk factors, to see whether there is any statistically significant difference in mean prevalence over the past 5 years among US adults (20+) across the Gender and Race categories....
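
The filtering and the two tests described above could be sketched in R roughly as follows. This is a minimal illustration, not the solution's actual code. Only the column names “LocationDesc” and “Category” and the value “Age-Standardized” come from the text; the other column names (DataValueType, BreakOutCategory, BreakOut, Topic, Year, Prevalence), the placeholder race labels and all numeric values are assumptions made up so that the example runs on its own. With the real file, the synthetic data frame would be replaced by the imported data set.

```r
# Synthetic stand-in for the real data set. All values are made up.
# Assumed (hypothetical) column names: Topic, DataValueType,
# BreakOutCategory, BreakOut, Year, Prevalence.
set.seed(1)
years <- 2011:2015

make_rows <- function(state, group_type, groups) {
  # One row per year and group, mimicking category-level estimates
  grid <- expand.grid(Year = years, BreakOut = groups,
                      stringsAsFactors = FALSE)
  data.frame(
    LocationDesc     = state,
    Category         = "Risk Factors",
    Topic            = "Smoking",
    DataValueType    = "Age-Standardized",
    BreakOutCategory = group_type,
    BreakOut         = grid$BreakOut,
    Year             = grid$Year,
    Prevalence       = round(runif(nrow(grid), 10, 30), 1),
    stringsAsFactors = FALSE
  )
}

heart <- rbind(
  make_rows("California", "Gender", c("Male", "Female")),
  make_rows("California", "Race", paste("Race", LETTERS[1:5])),  # placeholder labels
  make_rows("Texas", "Gender", c("Male", "Female"))
)

# Step 1: keep California, risk factors, age-standardized estimates
ca <- subset(heart,
             LocationDesc == "California" &
               Category == "Risk Factors" &
               DataValueType == "Age-Standardized")

# Step 2: split one risk factor by breakdown type
gender <- subset(ca, Topic == "Smoking" & BreakOutCategory == "Gender")
race   <- subset(ca, Topic == "Smoking" & BreakOutCategory == "Race")

# Step 3: two-sample t test for Gender (R uses Welch's version by default)
gender_test <- t.test(Prevalence ~ BreakOut, data = gender)
print(gender_test)

# Step 4: one-way analysis of variance for Race
race_fit <- aov(Prevalence ~ BreakOut, data = race)
print(summary(race_fit))
```

The same two steps would be repeated for each of the risk factors by changing the value matched in Topic. Passing var.equal = TRUE to t.test gives the classical pooled-variance t test if equal variances are assumed.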
