Data Matters with Fathom! Dynamic Statistics™ software

Activity 7.3

Section 7.3 suggests two ways to estimate the population variance from data in groups. One way looks at how the group means vary. The other way looks at how the observations within each group vary from the group mean. In this project, you will collect these estimates from many random samples from Rep US Sample. You can then explore the estimates and see whether they seem to work well.

Fathom computes both estimates but does not label them as variance estimates. To start, you will calculate the estimates by hand for two small datasets, then check them against Fathom's output to confirm that it does provide the two estimates.

The first dataset is:

  • Group A: 1, 2, 6
  • Group B: 4, 5, 6

Here are the calculations to find the two estimates:

Estimating Variance from Variation Within Groups

Group   Number   Group Mean   Deviation   Squared Deviation
A       1        3            –2          4
A       2        3            –1          1
A       6        3            3           9
B       4        5            –1          1
B       6        5            1           1
B       5        5            0           0

Sum of Squared Deviations: 16
Degrees of Freedom: 4
Estimated Variance: 4



Estimating Variance from Variation Between Groups

Group   Number   Group Mean   Overall Mean   Deviation   Squared Deviation
A       1        3            4              –1          1
A       2        3            4              –1          1
A       6        3            4              –1          1
B       4        5            4              1           1
B       6        5            4              1           1
B       5        5            4              1           1

Sum of Squared Deviations: 6
Degrees of Freedom: 1
Estimated Variance: 6
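
The two hand calculations above can also be checked outside Fathom. The following is a minimal Python sketch, not something Fathom provides; the function name variance_estimates is made up for illustration, and it assumes the data are supplied as a mapping from group label to a list of numbers.

```python
from statistics import mean


def variance_estimates(groups):
    """Return (between, within) variance estimates for a dict of group -> values."""
    all_values = [x for values in groups.values() for x in values]
    overall_mean = mean(all_values)

    # Within groups: squared deviation of each observation from its own group mean.
    ss_within = sum((x - mean(values)) ** 2
                    for values in groups.values() for x in values)
    df_within = len(all_values) - len(groups)

    # Between groups: squared deviation of the group mean from the overall mean,
    # counted once for every observation in the group (as in the table above).
    ss_between = sum(len(values) * (mean(values) - overall_mean) ** 2
                     for values in groups.values())
    df_between = len(groups) - 1

    return ss_between / df_between, ss_within / df_within


between, within = variance_estimates({"A": [1, 2, 6], "B": [4, 5, 6]})
print(between, within)  # 6.0 4.0
```

The degrees of freedom follow the tables: the number of observations minus the number of groups for the within-groups estimate, and the number of groups minus one for the between-groups estimate.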

To see where Fathom reports the two estimates of the variances, drag a new case table onto the Fathom workspace and enter the above data into two attributes, Group and Number.

When you’re done, the case table should look like this.

Group   Number
A       1
A       2
A       6
B       4
B       6
B       5

Once that data is entered, select Analyze, Test Hypothesis, Empty Test, Analysis of Variance. Drag Number onto Response attribute and Group onto Grouping attribute.

Fathom returns this table.

Source of variation   df   Sum of squares   Mean square
Groups                1    6                6
Error                 4    16               4

The two estimates of variance appear in the rightmost column, which Fathom calls the Mean Square. The Groups row holds the variance estimated from variation between the groups. The Error row holds the variance estimated from variation within the groups.

So that you can be sure the analysis of variance (ANOVA) provides these two estimates, here is a second dataset to enter and check.

  • Group A: 10, 30
  • Group B: 0, 60
  • Group C: 30, 50

Here are the calculations to get the two variance estimates.

Estimating Variance from Variation Within Groups

Group   Measure   Group Mean   Deviation   Squared Deviation
A       10        20           –10         100
A       30        20           10          100
B       0         30           –30         900
B       60        30           30          900
C       30        40           –10         100
C       50        40           10          100

Sum of Squared Deviations: 2200
Degrees of Freedom: 3
Estimated Variance: 733.3333

Estimating Variance from Variation Between Groups

Group   Measure   Group Mean   Overall Mean   Deviation   Squared Deviation
A       10        20           30             –10         100
A       30        20           30             –10         100
B       0         30           30             0           0
B       60        30           30             0           0
C       30        40           30             10          100
C       50        40           30             10          100

Sum of Squared Deviations: 400
Degrees of Freedom: 2
Estimated Variance: 200

Enter this data into the case table and Fathom will update the analysis of variance. Check that the table really includes the two estimates.
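
If you would like an independent check on what the updated table should show, here is a rough Python sketch that starts from case-table rows (one group label and one measure per case) and builds the same three columns. It is not how Fathom works internally; the function name anova_table and the row layout are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def anova_table(cases):
    """Build (source, df, sum of squares, mean square) rows from (group, measure) cases."""
    groups = defaultdict(list)
    for group, measure in cases:
        groups[group].append(measure)

    overall_mean = mean(measure for _, measure in cases)
    group_means = {g: mean(values) for g, values in groups.items()}

    # Groups row: variation of the group means around the overall mean.
    ss_groups = sum(len(values) * (group_means[g] - overall_mean) ** 2
                    for g, values in groups.items())
    # Error row: variation of the observations around their own group mean.
    ss_error = sum((x - group_means[g]) ** 2
                   for g, values in groups.items() for x in values)

    df_groups = len(groups) - 1
    df_error = len(cases) - len(groups)
    return [
        ("Groups", df_groups, ss_groups, ss_groups / df_groups),
        ("Error", df_error, ss_error, ss_error / df_error),
    ]


cases = [("A", 10), ("A", 30), ("B", 0), ("B", 60), ("C", 30), ("C", 50)]
table = anova_table(cases)
for source, df, ss, ms in table:
    print(f"{source:<8}{df:>4}{ss:>8}{ms:>12.4f}")
```

The Mean square values should come out as 200 for the Groups row and about 733.3333 for the Error row, matching the hand calculations for the second dataset.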

Next, you will collect the estimates from many samples. To see how that will work, select the Test Hypothesis box, then choose Analyze, Collect Measures. Drag a case table onto the workspace to see what measures are collected and where the two variance estimates will appear.

In the Measures Collection, the variance from variation between the groups is called ms_treatments. The variance from variation within the groups is called ms_error.

Now that you know where the variance estimates will be, we can take samples and see how these estimates do.

Open Rep US Sample. Select Analyze, Sample Cases. Inspect the sample collection to set sample size and Animation as you like. Each sample must contain more than one category and at least two observations in at least one category, so you'll need at least three observations in each sample. You will be able to try different sample sizes.
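
The requirement on each sample can be stated as a small check. This is only an illustrative Python sketch; the function name usable_for_anova is hypothetical, and it looks only at the category labels of the cases in one sample.

```python
from collections import Counter


def usable_for_anova(group_labels):
    """Return True if a sample can supply both variance estimates."""
    counts = Counter(group_labels)
    # More than one category is needed to estimate variance between groups.
    has_multiple_groups = len(counts) >= 2
    # A category with at least two observations is needed to estimate variance within groups.
    has_replicate = any(n >= 2 for n in counts.values())
    return has_multiple_groups and has_replicate


print(usable_for_anova(["A", "A", "B"]))  # True
print(usable_for_anova(["A", "B", "C"]))  # False: no category has two observations
print(usable_for_anova(["A", "A", "A"]))  # False: only one category
```

The smallest sample that passes has three cases, which is where the minimum of three observations comes from.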

Drag a case table onto the workspace so you can see the attributes in the Sample Collection.

Select Analyze, Test Hypothesis, Empty Test, Analysis of Variance. Drag a numeric attribute onto Response attribute and a categorical attribute onto Grouping attribute.

Select Analyze, Collect Measures. Drag a case table onto the workspace so you can see the variables in the Measures Collection.

Drag two new graphs onto the workspace and drag ms_error and ms_treatments into the graphs.

Inspect the Measures Collection to set the number of samples and Animation as you like.

Select Analyze, Estimate Parameters, Empty Test, Estimate Mean. Drag ms_error and ms_treatments onto the Estimate Parameters box to get their means. Drag your number-line attribute from the population data (the original collection) onto the Estimate Parameters box to get its standard deviation. Square the standard deviation to get its variance.
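
Squaring a standard deviation gives a variance, but which variance depends on which standard deviation was squared. The Python sketch below uses a small made-up list of values (not taken from Rep US Sample) to show the difference between the divide-by-N and divide-by-(N - 1) versions. Which one Fathom reports is not stated here, so treat this as a reminder to check rather than a description of Fathom.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical values, used only to illustrate the arithmetic.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the values as a whole population: divide by N.
population_sd = pstdev(values)
print(population_sd ** 2, pvariance(values))

# Treating the values as a sample: divide by N - 1.
sample_sd = stdev(values)
print(sample_sd ** 2, variance(values))
```

For a collection with many cases the two versions are nearly equal, so the choice matters little when comparing against the means of ms_error and ms_treatments.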

How do the estimated variances compare with the real variance of the population? Are the claims in Section 7.3 sensible?

Try different sample sizes. Would you like to propose any warnings for these variance estimates?
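
The whole collection process can also be mimicked outside Fathom. What follows is a rough Python sketch under several loudly stated assumptions: the population is synthetic (1,000 made-up cases, not Rep US Sample), the category of each case is assigned independently of its numeric value, and samples are drawn with replacement. The function names mean_squares and collect_measures are invented for this example.

```python
import random
from statistics import mean, pvariance


def mean_squares(sample):
    """Return (ms_treatments, ms_error) for (group, value) cases, or None if unusable."""
    groups = {}
    for group, value in sample:
        groups.setdefault(group, []).append(value)
    df_treatments = len(groups) - 1
    df_error = len(sample) - len(groups)
    if df_treatments < 1 or df_error < 1:
        return None  # needs several categories and a repeated category

    overall_mean = mean(value for _, value in sample)
    group_means = {g: mean(values) for g, values in groups.items()}
    ss_treatments = sum(len(values) * (group_means[g] - overall_mean) ** 2
                        for g, values in groups.items())
    ss_error = sum((x - group_means[g]) ** 2
                   for g, values in groups.items() for x in values)
    return ss_treatments / df_treatments, ss_error / df_error


def collect_measures(population, sample_size, n_samples, rng):
    """Draw samples with replacement and keep the two estimates from each usable one."""
    measures = []
    while len(measures) < n_samples:
        sample = [rng.choice(population) for _ in range(sample_size)]
        result = mean_squares(sample)
        if result is not None:
            measures.append(result)
    return measures


rng = random.Random(7)
# Synthetic population: category chosen independently of the numeric value (an assumption).
population = [(rng.choice("ABC"), rng.gauss(50, 10)) for _ in range(1000)]
measures = collect_measures(population, sample_size=12, n_samples=2000, rng=rng)

population_variance = pvariance([value for _, value in population])
mean_ms_treatments = mean(m[0] for m in measures)
mean_ms_error = mean(m[1] for m in measures)
print("population variance:", round(population_variance, 1))
print("mean ms_treatments: ", round(mean_ms_treatments, 1))
print("mean ms_error:      ", round(mean_ms_error, 1))
```

Because the categories here are unrelated to the values by construction, this sketch shows only the situation in which both estimates aim at the same target. Real attributes in Rep US Sample may not behave that way, which is worth keeping in mind when you propose warnings.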


©2008 Key College Publishing. All rights reserved.