Subject Mathematics Statistics-R Programming

Question

4 K-Nearest Neighbors
 Briefly describe the method in your own words. State any assumptions used and whether you think these assumptions are violated. Include relevant formula.
 Use V-fold cross-validation to choose a value of k.
 Plot the cross-validation error against k.
 State the value of k chosen and the cross-validation error for this k.
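
The cross-validation steps for k-nearest neighbors could be sketched in R roughly as follows. This is a minimal illustration, not the posted solution: the data are simulated stand-ins (pid.dat is not reproduced here), the choice V = 10 and the grid of k values are assumptions, and `knn()` comes from the class package that ships with R.

```r
library(class)  # recommended package bundled with R; provides knn()

set.seed(1)
# Hypothetical stand-in data: two predictors and a "neg"/"pos" label.
n <- 200
dat <- data.frame(glucose = rnorm(n, 120, 30), mass = rnorm(n, 32, 7))
score <- dat$glucose + 2 * dat$mass + rnorm(n, 0, 20)
dat$diabetes <- factor(ifelse(score > 185, "pos", "neg"))

# k-NN is distance based, so put the predictors on a common scale.
X <- scale(dat[, c("glucose", "mass")])
y <- dat$diabetes

V <- 10                                   # assumed number of folds
fold <- sample(rep(1:V, length.out = n))  # random fold label for each row
ks <- seq(1, 25, by = 2)                  # assumed grid of odd k values

# For each k, average the misclassification rate over the V held-out folds.
cv_err <- sapply(ks, function(k) {
  mean(sapply(1:V, function(v) {
    test <- fold == v
    pred <- knn(train = X[!test, , drop = FALSE],
                test  = X[test, , drop = FALSE],
                cl    = y[!test], k = k)
    mean(pred != y[test])
  }))
})

best_k <- ks[which.min(cv_err)]  # k with the smallest CV error
plot(ks, cv_err, type = "b", xlab = "k", ylab = "V-fold CV error")
```

Scaling is done once on the full data here for brevity; a stricter version would rescale inside each fold using only the training rows.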

5 QDA, LDA and FDA
 Briefly describe the methods in your own words. State any assumptions used and whether you think these assumptions are violated. Include relevant formula.
 Find the V-fold cross-validation error for QDA, LDA and FDA.
 Use the LDA and FDA coefficients to assess which variables increase the chances of diabetes.
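
One possible way to obtain the V-fold error for LDA and QDA is sketched below, using `lda()` and `qda()` from the MASS package bundled with R. The data are simulated stand-ins and V = 10 is an assumption. FDA is left out because it needs an add-on package (for example `fda()` in the mda package), but it would slot into the same loop.

```r
library(MASS)  # recommended package bundled with R; provides lda() and qda()

set.seed(1)
# Hypothetical stand-in data with a "neg"/"pos" label.
n <- 200
dat <- data.frame(glucose = rnorm(n, 120, 30),
                  mass    = rnorm(n, 32, 7),
                  age     = rnorm(n, 31, 10))
score <- dat$glucose + 2 * dat$mass + rnorm(n, 0, 20)
dat$diabetes <- factor(ifelse(score > 185, "pos", "neg"))

V <- 10                                   # assumed number of folds
fold <- sample(rep(1:V, length.out = n))  # random fold label for each row

wrong <- c(LDA = 0, QDA = 0)  # running count of misclassified held-out rows
for (v in 1:V) {
  train <- dat[fold != v, ]
  test  <- dat[fold == v, ]
  fit_lda <- lda(diabetes ~ ., data = train)
  fit_qda <- qda(diabetes ~ ., data = train)
  wrong["LDA"] <- wrong["LDA"] + sum(predict(fit_lda, test)$class != test$diabetes)
  wrong["QDA"] <- wrong["QDA"] + sum(predict(fit_qda, test)$class != test$diabetes)
}
err <- wrong / n  # V-fold cross-validation error for each method

# Coefficients of the linear discriminant, fitted on all of the data.
lda_coef <- lda(diabetes ~ ., data = dat)$scaling
```

The sign of each entry in `lda_coef` is only meaningful relative to the direction of the discriminant, so it is worth checking which class has the larger mean discriminant score before reading a positive coefficient as "increases the chance of diabetes".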

6 Classification Trees
 Briefly describe the method in your own words. State any assumptions used and whether you think these assumptions are violated. Include relevant formula.
 Find the V-fold cross-validation error for the classification tree fit using the default settings in rpart.
 Interpret the fitted tree using all of the data. Interpret the importance of each variable and the role each variable plays in predicting diabetes.
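
A rough sketch of the tree step with rpart's default settings is given below. The data are again simulated stand-ins and V = 10 is an assumption. The `variable.importance` component of the fitted object is one place to start when interpreting the role of each variable.

```r
library(rpart)  # recommended package bundled with R

set.seed(1)
# Hypothetical stand-in data with a "neg"/"pos" label.
n <- 200
dat <- data.frame(glucose = rnorm(n, 120, 30),
                  mass    = rnorm(n, 32, 7),
                  age     = rnorm(n, 31, 10))
score <- dat$glucose + 2 * dat$mass + rnorm(n, 0, 20)
dat$diabetes <- factor(ifelse(score > 185, "pos", "neg"))

V <- 10                                   # assumed number of folds
fold <- sample(rep(1:V, length.out = n))  # random fold label for each row

wrong <- 0  # running count of misclassified held-out rows
for (v in 1:V) {
  train <- dat[fold != v, ]
  test  <- dat[fold == v, ]
  fit   <- rpart(diabetes ~ ., data = train, method = "class")  # default control settings
  pred  <- predict(fit, newdata = test, type = "class")
  wrong <- wrong + sum(pred != test$diabetes)
}
tree_err <- wrong / n  # V-fold cross-validation error

# Tree fitted to all of the data, for interpretation.
full_fit <- rpart(diabetes ~ ., data = dat, method = "class")
full_fit$variable.importance
```

Calling `print(full_fit)` or `plot(full_fit); text(full_fit)` shows the splits themselves, which is usually what the interpretation is based on.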

7 Logistic Regression
 Briefly describe the method in your own words. State any assumptions used and whether you think these assumptions are violated. Include relevant formula.
 Use appropriate methods discussed in class to select a model.
 Find the V-fold cross-validation error for this model.
 Use the selected logistic regression model to interpret the role each variable plays in predicting diabetes.
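
The logistic regression steps might look something like the sketch below. Backward selection by AIC with `step()` is an assumption about what "methods discussed in class" covers, the 0.5 classification threshold and V = 10 are also assumptions, and the data are simulated stand-ins.

```r
set.seed(1)
# Hypothetical stand-in data with a "neg"/"pos" label; age is pure noise here.
n <- 200
dat <- data.frame(glucose = rnorm(n, 120, 30),
                  mass    = rnorm(n, 32, 7),
                  age     = rnorm(n, 31, 10))
score <- dat$glucose + 2 * dat$mass + rnorm(n, 0, 20)
dat$diabetes <- factor(ifelse(score > 185, "pos", "neg"))

# Model selection: start from the full model and drop terms by AIC.
full <- glm(diabetes ~ ., data = dat, family = binomial)
sel  <- step(full, trace = 0)
form <- formula(sel)

V <- 10                                   # assumed number of folds
fold <- sample(rep(1:V, length.out = n))  # random fold label for each row

wrong <- 0  # running count of misclassified held-out rows
for (v in 1:V) {
  train <- dat[fold != v, ]
  test  <- dat[fold == v, ]
  fit   <- glm(form, data = train, family = binomial)
  p     <- predict(fit, newdata = test, type = "response")  # estimated P(pos)
  pred  <- ifelse(p > 0.5, "pos", "neg")
  wrong <- wrong + sum(pred != test$diabetes)
}
logit_err <- wrong / n  # V-fold cross-validation error

# Odds ratios: values above 1 raise the odds of "pos" per unit increase.
odds_ratios <- exp(coef(sel))
```

Because the model is selected on all of the data before the folds are formed, this error estimate is somewhat optimistic; repeating the selection inside each fold would be the stricter alternative.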

Solution Preview


Data Description
1. Title: Applying different statistical models to hospital data
2. Source of data: This dataset (pid.dat) was extracted from a combined dataset from several United States (US) hospitals. The data were collected to determine the risk factors associated with diabetes.
3. Attribute Information: Measurements were taken from a total of 392 patients at these hospitals. The variables collected are:
1) Pregnant: the number of times the patient has been pregnant.
2) Glucose: the patient's plasma glucose concentration.
3) Pressure: the patient’s blood pressure (B.P.) (mm Hg).
4) Triceps: the patient's triceps thickness (mm).
5) Insulin: the patient's serum insulin (mu U/ml).
6) Mass: the patient's body mass index, weight (kg) divided by the square of height (m).
7) Pedigree: the patient's diabetes pedigree function.
8) Age: the patient's age in years.
9) Diabetes: class variable ("pos" or "neg").
4. Missing Attribute Values: None...
