## Transcribed Text

Part 1: Measures of Center
Instructions: Analyze the Simulated Data. You will analyze one quantitative variable (2008). Report your findings in a
well-written narrative.
[Figure: histogram titled "Distribution of 2008 Classroom Averages"; x-axis: Classroom Averages; y-axis: relative frequency, 0% to 25%.]
The Center of the Distribution
What is the mean? Interpret the mean in context.
Mean = 692.5. The mean is the average of all scores. The scores split evenly around it: 50 of the 100 scores fall below the mean and 50 fall above it.
What is the median? Interpret the median in context.
Median = 693. The median is the middle value; with 100 scores, it is the average of the 50th and 51st values in the sorted list.
What is the mode? Interpret the mode in context.
Mode = 648. The mode is the value that is repeated more often than any other.
Compare the mean, median, and mode. The mean and median being almost the same is not surprising. Because the
mode of 648 was repeated only four times, it is not significant in this analysis. However, it is good to look at in case
one score was repeated, say 10 or more times, as this may indicate invalid scores.
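As a cross-check on the three measures above, here is a minimal Python sketch using the standard `statistics` module. The list `scores_2008` is the 2008 column of the Part 2 data table typed in by hand; the variable names are illustrative and not part of the assignment, which uses Excel.

```python
import statistics

# 2008 classroom averages, copied from the 2008 column of the Part 2 data table.
scores_2008 = [
    690, 667, 648, 742, 723, 697, 688, 759, 738, 781,
    692, 770, 648, 724, 646, 681, 600, 684, 728, 624,
    678, 731, 657, 694, 700, 659, 703, 589, 704, 677,
    681, 703, 681, 707, 620, 649, 698, 788, 674, 773,
    670, 710, 677, 803, 607, 609, 635, 748, 722, 621,
    678, 711, 742, 699, 717, 714, 672, 732, 652, 648,
    718, 619, 778, 689, 625, 714, 639, 648, 642, 704,
    755, 754, 613, 740, 716, 670, 747, 651, 690, 682,
    697, 706, 801, 689, 709, 663, 651, 691, 735, 660,
    719, 713, 643, 665, 706, 631, 705, 702, 753, 754,
]

mean = statistics.mean(scores_2008)      # arithmetic average
median = statistics.median(scores_2008)  # average of the 50th and 51st sorted values
mode = statistics.mode(scores_2008)      # most frequently repeated value
mode_count = scores_2008.count(mode)     # how many times the mode occurs

print(f"mean={mean}, median={median}, mode={mode} (repeated {mode_count} times)")
```

Counting how often the mode occurs, as `mode_count` does, is one way to run the "repeated 10 or more times" sanity check mentioned above.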
| School | Average |
| --- | --- |
| Adams | 709 |
| Harrison | 666 |
| Jackson | 707 |
| Jefferson | 660 |
| Madison | 703 |
| Monroe | 712 |
| Van Buren | 688 |
| Washington | 693 |
[Figure: bar chart titled "Score Average by School"; x-axis: School; y-axis: average score, 550 to 850; a line labeled MEAN marks the overall mean.]
How does the shape of the histogram relate to the measures of center?
The mean (average) is the most significant measure of center, and by inserting it as a trendline we can see how each school performed against the mean.
What does a side-by-side comparison reveal about the distribution of 2008 averages across the schools?
There are four above-average schools, two average-performing ones, and two well below average. It also shows that there are more low scores (below the mean), and that the high scores are relatively higher than the low ones are low.
Is there anything interesting, surprising, or noteworthy about the distribution?
It is interesting that there are more above-mean performing schools than there are average-performing ones. It is
noteworthy that the low-performing schools are well below mean/average.
Part 2: Measures of Spread and Outliers
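Summary statistics for the 2008 column:

| Statistic | 2008 |
| --- | --- |
| range | 214 |
| stdev | 46.77963 |
| IQR | 61.25 |
| Variance | 2188.333 |
| Q1 | 658.5 |
| Q3 | 719.75 |
| 1.5 IQR | 91.875 |
| mean | 692.5 |
| median | 693 |
| mode | 648 |
| min | 589 |
| max | 803 |

Classroom averages by school:

| School | 2008 | 2009 |
| --- | --- | --- |
| Monroe | 690 | 716 |
| Jackson | 667 | 712 |
| Monroe | 648 | 632 |
| Monroe | 742 | 714 |
| Harrison | 723 | 727 |
| Adams | 697 | 708 |
| Van Buren | 688 | 745 |
| Madison | 759 | 753 |
| Madison | 738 | 745 |
| Madison | 781 | 789 |
| Washington | 692 | 754 |
| Van Buren | 770 | 810 |
| Van Buren | 648 | 674 |
| Jackson | 724 | 711 |
| Jefferson | 646 | 675 |
| Jackson | 681 | 680 |
| Harrison | 600 | 637 |
| Adams | 684 | 690 |
| Adams | 728 | 745 |
| Washington | 624 | 672 |
| Washington | 678 | 720 |
| Washington | 731 | 774 |
| Madison | 657 | 702 |
| Jackson | 694 | 706 |
| Van Buren | 700 | 712 |
| Jackson | 659 | 671 |
| Harrison | 703 | 695 |
| Washington | 589 | 643 |
| Monroe | 704 | 729 |
| Harrison | 677 | 677 |
| Jefferson | 681 | 709 |
| Madison | 703 | 715 |
| Jackson | 681 | 711 |
| Adams | 707 | 713 |
| Jefferson | 620 | 596 |
| Jefferson | 649 | 700 |
| Van Buren | 698 | 680 |
| Monroe | 788 | 804 |
| Jefferson | 674 | 700 |
| Jackson | 773 | 738 |
| Van Buren | 670 | 673 |
| Washington | 710 | 738 |
| Jefferson | 677 | 680 |
| Washington | 803 | 867 |
| Jefferson | 607 | 638 |
| Van Buren | 609 | 622 |
| Harrison | 635 | 660 |
| Monroe | 748 | 770 |
| Van Buren | 722 | 710 |
| Harrison | 621 | 631 |
| Van Buren | 678 | 653 |
| Washington | 711 | 790 |
| Monroe | 742 | 731 |
| Monroe | 699 | 757 |
| Washington | 717 | 738 |
| Jefferson | 714 | 742 |
| Van Buren | 672 | 709 |
| Monroe | 732 | 765 |
| Jefferson | 652 | 673 |
| Madison | 648 | 670 |
| Washington | 718 | 786 |
| Harrison | 619 | 650 |
| Adams | 778 | 782 |
| Washington | 689 | 701 |
| Harrison | 625 | 651 |
| Jackson | 714 | 743 |
| Madison | 639 | 618 |
| Washington | 648 | 690 |
| Monroe | 642 | 642 |
| Van Buren | 704 | 744 |
| Jackson | 755 | 761 |
| Jackson | 754 | 760 |
| Monroe | 613 | 631 |
| Washington | 740 | 771 |
| Monroe | 716 | 733 |
| Adams | 670 | 661 |
| Harrison | 747 | 810 |
| Jefferson | 651 | 635 |
| Adams | 690 | 685 |
| Van Buren | 682 | 730 |
| Jefferson | 697 | 719 |
| Jackson | 706 | 712 |
| Jackson | 801 | 820 |
| Harrison | 689 | 669 |
| Madison | 709 | 694 |
| Jackson | 663 | 679 |
| Washington | 651 | 710 |
| Madison | 691 | 695 |
| Van Buren | 735 | 727 |
| Van Buren | 660 | 693 |
| Adams | 719 | 756 |
| Harrison | 713 | 690 |
| Harrison | 643 | 687 |
| Jefferson | 665 | 694 |
| Adams | 706 | 755 |
| Jackson | 631 | 690 |
| Jefferson | 705 | 727 |
| Monroe | 702 | 738 |
| Monroe | 753 | 771 |
| Monroe | 754 | 780 |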
Analyze the Simulated Data in Excel. Report your findings in a well-written narrative.
The Measures of Spread
Range
Range = 214. It is the difference between the largest average (803) and the smallest average (589).
IQR
IQR = 61.25. I got this answer from your friend, get help with this explanation: IQR tells how spread out the middle values are; it can also be used to tell when some of the other values are too far from the central value. Points that are too far away are called "outliers," because they "lie outside" the range in which we expect them. An outlier is any value that lies more than one and a half times the length of the box (the IQR) beyond either end of the box (Q1 or Q3).
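The 1.5 IQR rule described above can be sketched in a few lines of Python. This is a minimal illustration using the Q1, Q3, min, and max values reported in this worksheet; the names `lower_fence`, `upper_fence`, and `is_outlier` are hypothetical helpers, not part of the assignment.

```python
# Summary values reported for the 2008 column in this worksheet.
q1, q3 = 658.5, 719.75
minimum, maximum = 589, 803

iqr = q3 - q1                  # length of the box
lower_fence = q1 - 1.5 * iqr   # anything below this is flagged
upper_fence = q3 + 1.5 * iqr   # anything above this is flagged


def is_outlier(value):
    """Return True if value lies more than 1.5 * IQR beyond either end of the box."""
    return value < lower_fence or value > upper_fence


print(f"IQR={iqr}, fences=({lower_fence}, {upper_fence})")
print(f"min flagged: {is_outlier(minimum)}, max flagged: {is_outlier(maximum)}")
```

Checking only the minimum and maximum is enough here: if neither extreme falls outside the fences, no other value can.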
Variance
Variance = 2188.333 I got this answer from your friend, get help with this explanation
Standard Deviation
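A worked version of the variance and standard deviation may help with the explanation. It assumes the sample (n - 1) form that Excel's =stdev() uses, with n = 100 classrooms and a mean of 692.5; the sum of squared deviations, 216645, was computed from the 2008 column of the data table.

$$
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{216645}{99} \approx 2188.33,
\qquad
s = \sqrt{s^2} \approx 46.78
$$

The variance is in squared score units, so the standard deviation is usually easier to interpret: a typical classroom average sits roughly 47 points from the mean.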
What do the measures of spread relative to one-another reveal about the dispersion among 2008 CLASS AVERAGES?
You should also use the robust method (1.5 IQR Rule) to test for outliers.
Discuss how outliers do, or would, affect the various statistics:
Mean?
Median?
Mode?
Range?
IQR?
Variance?
Standard Deviation?
Were the outliers obviously in, or absent from, the histogram (Project 1 and above)? Did, or would, the outliers explain
the shape of the distribution (skew)? Do you think that the outliers should be excluded from the calculations? Why, or
why not?
What did you learn that was interesting, unexpected, or noteworthy?
Note: Measurements are relative. Be careful of making statements such as "The standard deviation is 100. So the data is very spread out." These measurements are all relative. 100 may be large in some context and small in another. Try comparing one measurement to another. For example, does the standard deviation seem small/large relative to the overall range?
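One way to make the comparison the note asks for is to express each measure of spread as a share of the overall range. The sketch below uses the summary values reported for the 2008 column; the variable names are illustrative only.

```python
# Summary values reported for the 2008 column in this worksheet.
value_range = 214
stdev = 46.77963
iqr = 61.25

# Express each spread measure as a fraction of the full range.
stdev_share = stdev / value_range
iqr_share = iqr / value_range

print(f"stdev is {stdev_share:.1%} of the range")
print(f"IQR is {iqr_share:.1%} of the range")
```

Read this way, the standard deviation is a bit over a fifth of the range and the middle 50% of classrooms span a bit over a quarter of it, which is a relative statement rather than a judgment on the raw numbers.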
Part 3: Regression Equation
Instructions: Analyze the Simulated Data in Excel. Report your findings in a well-written narrative.
Analyze the simulated 2008/2009 classroom averages using our bivariate analysis
methods. State the regression equation. Interpret the slope and the y-intercept. Do
you think that the regression equation would be useful in predicting responses? Why
or why not?
Excel Instructions:
Method 1: Using r, sy, sx, y-bar, and x-bar.
Since you already have r, the Excel formulae that you will need are:
=average(array)
=stdev(array)
Recall that you can calculate the slope and the y-intercept of the regression line using the formulae
below. Try to use cell references and mathematical operations to perform these calculations in Excel.
Round final values to the hundredth place.
$$
m = r\,\frac{s_y}{s_x}, \qquad b = \bar{y} - m\,\bar{x}
$$
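The Method 1 calculation (slope m = r * sy / sx, intercept b = y-bar - m * x-bar) can be sketched in Python as follows. This is an illustration under stated assumptions: it uses only the first eight rows of the data table, treats 2008 as x and 2009 as y (the worksheet does not say which is which), and computes r itself so the block is self-contained. The function `predict` is a hypothetical helper.

```python
import statistics

# First eight (2008, 2009) rows of the data table; a small subset for illustration.
pairs = [(690, 716), (667, 712), (648, 632), (742, 714),
         (723, 727), (697, 708), (688, 745), (759, 753)]
x = [p[0] for p in pairs]  # assumed explanatory variable: 2008 average
y = [p[1] for p in pairs]  # assumed response variable: 2009 average

x_bar, y_bar = statistics.mean(x), statistics.mean(y)  # =average(array)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)    # =stdev(array), sample form

# Pearson correlation r from the sample covariance.
n = len(pairs)
covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in pairs) / (n - 1)
r = covariance / (s_x * s_y)

m = r * s_y / s_x      # slope
b = y_bar - m * x_bar  # y-intercept


def predict(x_value):
    """Predicted 2009 average for a given 2008 average."""
    return m * x_value + b


print(f"y-hat = {round(m, 2)}x + {round(b, 2)}")
```

Running the same steps on all 100 rows instead of this subset would give the regression equation the assignment asks for; the fitted line always passes through the point (x-bar, y-bar).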
Method 2: Using linest(). The Excel function =linest(known y's, known x's, const, stats) will produce the slope and y-intercept of the Least Squares Regression Equation. In an empty cell, type the formula. Use the range of Salary cells for "known y's" and the range of Age cells for the "known x's". You can leave the other two arguments empty: "const" and "stats". Initially, you will get the slope only. To get the y-intercept, highlight the formula cell and the adjacent cell. Hit the "F2" key and release. Simultaneously press Ctrl, Shift, and Enter. The adjacent cell will populate with the intercept.
Note: If you want to add the regression line over your scatter, an easy way to do this is to highlight your original scatterplot by left-clicking on a point in the scatter and then right-clicking to bring up a formatting menu. Select "Add Trendline". The default option is a Least Squares Regression line. Just click on "OK" to return to your scatter.


Part 1:

The mean of the distribution of 2008 class averages is 692.5, its median is 693, and its mode is 648. The mean and median are nearly identical, which suggests that the distribution is roughly symmetric and bell-shaped with a single peak. If they differed substantially, we might suspect skew to the left or to the right. The following histogram supports this expectation about the shape of the distribution: it appears bell-shaped, as if drawn from a normal distribution.

The side-by-side comparison bar plots show that class averages are roughly similar across all schools. Without a formal statistical test, however, it is hard to say whether any of the averages differ from the others. I would say that they are not terribly different from one another.

It is interesting that the similarity of the averages across schools carries over to the shape of the overall distribution, which is bell-shaped, and it intuitively makes sense...