Question

We are looking at regression with factors (categorical variables). In econometrics we often handle them with what we call 'dummy variables'. You will show that you can use and interpret regressions with factors.
1. Load the LDC dataset from the CSV file given below. The dataset has two factors, Size and Area.
2. Write the number 290742. Divide that number into three two-digit numbers, a, b, and c. If any of a, b, and c is greater than 72, divide it by 2. Delete the observations numbered a, b, and c from the dataset. Show the command you used.
3. Plot OMandA against the factors and reproduce the graphs. Explain what you get. What kind of graphs do you get? What do they tell you? Are the differences significant?
4. Regress OMandA on Area. Show the results in a nice table.
5. Why does only one area appear in the regression results? Explain how R chose which one to show.
6. Explain what the regression tells us.
7. Regress Total.Bill on Customers, OMandA, and Area. Show the results in a nice table.
8. You have just done a dummy-variable regression and found an intercept dummy. Explain what it tells you. Why is only one value of Area presented in the table?
9. For your last trick, you will find a slope dummy. To test whether there is an interaction between the number of Customers and the Area, add OMand:Area to the regression. Interpret the result.
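
The factor regression, the intercept dummy, and the slope dummy from steps 4 to 9 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the assignment's solution: `ldc_small` is a hypothetical eight-row subset copied from the CSV listing, and the interaction term is written as `Customers:Area` because the text describes an interaction between Customers and Area.

```r
# Eight rows copied from the CSV listing. The header has one fewer field than
# the data rows, so read.csv() uses the first field (utility name) as row names.
csv_text <- "Customers,OMandA,Dist.Cost,Total.Bill,Area,Size
Algoma,11581,839.00,56.15,132.47,N,S
Atikokan,1661,564.00,55.61,135.34,N,S
Brant,9741,490.00,36.34,114.13,S,S
CW,6496,299.00,33.35,110.82,S,S
Chapleau,1293,416.00,34.16,113.20,N,S
ELK,11276,214.00,36.00,115.98,S,S
Espanola,3299,326.00,40.88,119.99,N,S
Grimsby,10307,202.00,36.82,114.82,S,S"
ldc_small <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Regressing on a factor: R creates one dummy (AreaS) and uses the first
# level in alphabetical order (N) as the baseline.
fit_area <- lm(OMandA ~ Area, data = ldc_small)
group_means <- tapply(ldc_small$OMandA, ldc_small$Area, mean)
print(coef(fit_area))  # intercept = mean for N; AreaS = mean(S) - mean(N)
print(group_means)

# Design matrix containing an intercept dummy (AreaS) and a slope dummy
# (Customers:AreaS). The interaction term is an assumption, see the lead-in.
X <- model.matrix(~ Customers + Area + Customers:Area, data = ldc_small)
print(colnames(X))
```

With a single factor as the only regressor, the intercept is the mean of the baseline category and the dummy coefficient is the difference between the two group means, which is why only one area shows up in the coefficient table.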

Bonus:
10. Load library(effects)
11. Define the model M <- lm(Total.Bill ~ Size:OMandA). (This really just asks R to come up with a regression line for each of the categories. The colon is a shorthand way to indicate all the interactions.)
12. Plot the result using the effects package (it is designed to show the different regressions, when there are interactions, all on one graph). There are two versions:
> plot(allEffects(M))
> plot(allEffects(M), multiline=T)
Explain the resulting graphs briefly.
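
To see what a formula like the one in step 11 estimates, here is a minimal base-R sketch. It does not use the effects package, and `ldc_sizes` is a hypothetical nine-row subset copied from the CSV listing (three utilities for each of the sizes S, M, and L). The `data =` argument is added here so the block runs on its own.

```r
# Nine rows copied from the CSV listing, three per Size level (S, M, L).
csv_text <- "Customers,OMandA,Dist.Cost,Total.Bill,Area,Size
Algoma,11581,839.00,56.15,132.47,N,S
Brant,9741,490.00,36.34,114.13,S,S
CW,6496,299.00,33.35,110.82,S,S
Blue Water,35772,309.00,40.98,117.81,S,M
Brantford,37964,176.00,28.79,106.07,S,M
Burlington,64329,225.00,34.33,111.50,S,M
Enersource,195381,238.00,30.88,107.75,S,L
Horizon,235327,175.00,36.72,113.90,S,L
London,148331,209.00,34.83,112.03,S,L"
ldc_sizes <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Size:OMandA with no main effects: one common intercept plus a separate
# OMandA slope for every level of Size.
M <- lm(Total.Bill ~ Size:OMandA, data = ldc_sizes)
print(coef(M))
```

Each `Size...:OMandA` coefficient is the slope of Total.Bill on OMandA within that size class, which is what the two effects plots draw as separate lines.
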
Miscellaneous notes
f <- file.choose()
d <- read.csv(f)
d <- d[-c(1, 14, 21), ]
length(d[, 2])

When you do a regression with a dummy, one category must be left out. R does this automatically: it leaves out the first level of the factor in alphabetical order, so take care. You may want to change the baseline category to make the results easier to explain. You will have to think carefully about the meaning of your results.
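
One common way to change the baseline is `relevel()`. The sketch below is only illustrative: the data frame `toy` is hypothetical, holding four OMandA values and their Area codes taken from the CSV listing.

```r
# Four OMandA values and their Area codes, taken from the CSV listing.
toy <- data.frame(
  OMandA = c(839, 564, 490, 299),
  Area   = factor(c("N", "N", "S", "S"))
)

levels(toy$Area)  # "N" "S": N comes first, so it is the default baseline

# Default baseline (N): the reported dummy is AreaS.
fit_default <- lm(OMandA ~ Area, data = toy)

# Make S the baseline instead: the reported dummy becomes AreaN.
toy$Area_s_base <- relevel(toy$Area, ref = "S")
fit_relevel <- lm(OMandA ~ Area_s_base, data = toy)

print(coef(fit_default))
print(coef(fit_relevel))
```

Changing the baseline does not change the fit. The dummy coefficient only flips sign, and the intercept becomes the mean of the new baseline category.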

Contents of CSV file:
Customers,OMandA,Dist.Cost,Total.Bill,Area,Size
Algoma,11581,839.00 ,56.15 ,132.47 ,N,S
Atikokan,1661,564.00 ,55.61 ,135.34 ,N,S
Brant,9741,490.00 ,36.34 ,114.13 ,S,S
CW,6496,299.00 ,33.35 ,110.82 ,S,S
Chapleau,1293,416.00 ,34.16 ,113.20 ,N,S
Coop Embrun,1954,274.00 ,33.90 ,112.26 ,S,S
ELK,11276,214.00 ,36.00 ,115.98 ,S,S
Espanola,3299,326.00 ,40.88 ,119.99 ,N,S
Fort Frances,3775,345.00 ,22.98 ,98.32 ,N,S
Grimsby,10307,202.00 ,36.82 ,114.82 ,S,S
Hearst,2817,308.00 ,25.52 ,103.07 ,N,S
Hydro 2000,1208,264.00 ,35.16 ,114.85 ,S,S
Hawkesbury,5521,165.00 ,22.21 ,99.66 ,S,S
Kenora,5572,359.00 ,36.94 ,114.45 ,N,S
Lakefront,9976,217.00 ,31.68 ,109.94 ,S,S
Lakeland,9598,293.00 ,41.16 ,119.56 ,N,S
Midland,6951,258.00 ,34.18 ,113.04 ,S,S
NOTL,8000,238.00 ,35.13 ,112.70 ,S,S
NOW,6059,353.00 ,34.23 ,111.70 ,N,S
Orangeville,11248,263.00 ,34.43 ,112.03 ,S,S
Ottawa River,10555,253.00 ,12.73 ,87.30 ,N,S
Parry Sound,3441,383.00 ,41.23 ,120.77 ,N,S
Renfrew,4183,269.00 ,27.07 ,106.29 ,S,S
Rideau SL,4185,275.00 ,37.60 ,117.46 ,S,S
Sioux Lookout,2755,425.00 ,45.00 ,123.79 ,N,S
Tillsonburg,6745,330.00 ,32.07 ,109.34 ,S,S
Wasaga,12324,180.00 ,30.96 ,110.59 ,S,S
Wellington North,3626,432.00 ,38.07 ,117.37 ,S,S
Blue Water,35772,309.00 ,40.98 ,117.81 ,S,M
Brantford,37964,176.00 ,28.79 ,106.07 ,S,M
Burlington,64329,225.00 ,34.33 ,111.50 ,S,M
Cambridge,51584,209.00 ,33.94 ,110.30 ,S,M
Chatham kent,32132,209.00 ,26.73 ,106.26 ,S,M
Collus,15723,259.00 ,33.48 ,110.80 ,S,M
CNP,15708,279.00 ,39.68 ,111.15 ,S,M
Enwin,85083,268.00 ,41.14 ,118.12 ,S,M
ErieThames,18090,315.00 ,40.56 ,118.16 ,S,M
Essex,28094,197.00 ,36.91 ,115.43 ,S,M
Festival,19885,200.00 ,38.25 ,114.75 ,S,M
Greater Sudbury,46748,280.00 ,33.90 ,111.91 ,N,M
Guelph,50859,251.00 ,34.71 ,110.54 ,S,M
Haldimand,21070,346.00 ,46.73 ,125.78 ,S,M
Halton,21232,227.00 ,32.40 ,110.92 ,S,M
Innisfil,14826,281.00 ,43.59 ,123.09 ,S,M
Kingston,26844,224.00 ,34.40 ,111.16 ,S,M
KitchenerWilmot,87964,155.00 ,29.60 ,106.19 ,S,M
Milton,30485,210.00 ,35.76 ,112.63 ,S,M
Newmarket,33338,198.00 ,37.98 ,115.17 ,S,M
NPEI,51162,275.00 ,36.75 ,114.98 ,S,M
Norfolk,19032,251.00 ,49.50 ,127.73 ,S,M
North Bay,23850,224.00 ,36.29 ,113.97 ,N,M
Oakville,63614,206.00 ,32.75 ,109.72 ,S,M
Orillia,13035,345.00 ,31.57 ,108.13 ,S,M
Oshawa,53083,191.00 ,26.63 ,103.97 ,S,M
Peterborough,35270,199.00 ,30.75 ,108.24 ,S,M
Sault Ste Marie,32998,260.00 ,26.65 ,100.16 ,N,M
St Thomas,16436,225.00 ,28.85 ,105.64 ,S,M
Thunder Bay,49765,238.00 ,26.29 ,103.75 ,N,M
Waterloo North,53611,182.00 ,35.30 ,112.47 ,S,M
Welland,21768,242.00 ,37.77 ,115.81 ,S,M
Westario,22257,207.00 ,28.91 ,108.70 ,S,M
Whitby,40337,214.00 ,38.19 ,115.69 ,S,M
Woodstock,15181,251.00 ,41.05 ,118.40 ,S,M
Enersource,195381,238.00 ,30.88 ,107.75 ,S,L
Horizon,235327,175.00 ,36.72 ,113.90 ,S,L
Hydro One Brampton,137856,148.00 ,30.67 ,107.46 ,S,L
Ottawa,305266,191.00 ,34.67 ,111.44 ,S,L
London,148331,209.00 ,34.83 ,112.03 ,S,L
Powerstream,332993,184.00 ,32.21 ,108.65 ,S,L
Veridian,113709,181.00 ,31.38 ,108.81 ,S,L
Hydro One Networks,1210695,454.00 ,51.42 ,131.34 ,N,XL
Toronto,709323,328.00 ,39.59 ,116.73 ,S,XL

Solution Preview


Code with explanations

# Set the working directory to the folder where your file is saved.
#
# Setting up
#
# No. 1
LDC <- read.csv("LDC.csv")
#
# No. 2
student_number <- 290742
#
# This one is a bit tricky, because the assignment asks for 'two-digit numbers'.
# Each substring is converted from string to integer, so a leading zero is
# dropped: '07' is stored as just 7, and b ends up holding a single digit.
#
a <- strtoi(substr(student_number, 1, 2), base = 10L)
b <- strtoi(substr(student_number, 3, 4), base = 10L)
c <- strtoi(substr(student_number, 5, 6), base = 10L)
#
# Helper that checks a number and halves it if it is greater than 72.
# floor() rounds the result down to the nearest integer, because the value is
# used as an observation number and so must be a whole number.
# (In our case this does not matter: a, b, and c are all below 72.)
#
numberDivide <- function(num) {
  if (num > 72) {
    num <- floor(num / 2)
  }
  num
}
#...
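
The rest of the solution is not shown in the preview. As a rough, self-contained sketch of how step 2 could be finished (not the author's code), the block below splits the number, halves any part above 72, and drops the matching rows. The names `halve_if_large` and `toy` are hypothetical, and `toy` merely stands in for the 72-row dataset.

```r
student_number <- 290742L

# Keep the digits as a string so that a leading zero in a part is not lost
# before splitting.
digits <- sprintf("%06d", student_number)
parts <- as.numeric(substring(digits, c(1, 3, 5), c(2, 4, 6)))

# Halve any value above 72 and round down so it stays a valid row number.
halve_if_large <- function(num) {
  if (num > 72) floor(num / 2) else num
}
rows_to_drop <- vapply(parts, halve_if_large, numeric(1))

# Stand-in for the dataset: 72 numbered observations.
toy <- data.frame(obs = 1:72)

# Negative indices remove rows; drop = FALSE keeps the result a data frame.
toy_trimmed <- toy[-rows_to_drop, , drop = FALSE]
print(rows_to_drop)
print(nrow(toy_trimmed))
```

Applied to the real data, the same negative-index expression would be used on `LDC` in place of `toy`.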
