General Information
  Home
Author Bio
Product/Purchase Info

Instructor Resources
Registration Required
  Register
Download Instructor Resources

Computer Activities
and Data Sets
  Table of Contents
Excel
Fathom
SPSS

Community
  Contact the Author
Ideas/Comments for Publisher
Testimonials
Coming soon!

Other Key Sites
  Key Curriculum Press

Key College Publishing

Data Matters with Excel®

Activity 7.1

If you were thinking of doing a z-test of a sample mean, you might subtract that mean from the population mean you were interested in and divide by the standard error that you estimate from the sample’s standard deviation. Section 7.1 claims that what you would get is something that is not normally distributed, but distributed more like a narrow volcano.

In this project, you will test this claim. To start, you will take samples with two observations each. You’ll be taking the samples from a normal distribution with a mean of 0 and standard deviation of 1. For each of several hundred samples, you will calculate a statistic that is the sample mean divided by the standard error estimated from the sample standard deviation.

Open a new worksheet. Label column A Sample. In cells A2 and A3, enter the formula =normsinv(rand()) . You know what rand() does: It returns a random number from 0 to 1. Normsinv() produces numbers from a standard normal distribution. A standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Normsinv() takes a proportion (represented as a decimal) and returns the z-value from a normal distribution that has that proportion below it. For example, normsinv(.5) is 0, and normsinv(.025) is –1.96. So normsinv(rand()) produces random observations from a normal distribution with a mean of 0 and standard deviation of 1. (Don’t believe it? Get a lot, then get a histogram of them.)

Label column B Mean; column C Standard Deviation; column D Standard Error; column E t ; and column F t-values. This table shows the columns, their labels, and their formulas.

Column

B C D

E

Label
(in row 1)

Mean Standard Deviation Standard Error

t

Formula
(in row 2)

=average(A2:A3) =stdev(A2:A3) =C2/sqrt(2)

=B2/D2

Because we are thinking of these t-values as part of a test that the population mean is 0, the formula for t is really “=(B2-0)/D2”, but there’s not really any reason to include subtracting 0.

Now you will use a macro to take new samples and store the t-values in column F (the sixth column).

Sub takeSamples()
  For SampleRow = 2 TO 501 ‘Change 501 to get as many samples as you like.
    Calculate
    Cells(SampleRow,6).Value = Cells(2,5).Value
  Next SampleRow
End Sub

You can get a histogram of the t-values (Tools, Data Analysis, Histogram, and so on).

What do you think about Gosset’s claim, discussed on page 392 in the text? Does it look as if you could do a regular z-test with those statistics?

Sort the statistics and check the value that is 2.5% up the list. With two observations, there is one degree of freedom, and Gosset claimed that 95% of the t-values would fall between –12.7 and 12.7. Did that work for your data? Try 4,000 samples. Do –12.7 and 12.7 work with 4,000 samples? Why does it make a difference how many samples you take?

If you used 2 and –2 as your cutoffs for significance, how often would you have rejected the true null hypothesis?

Try other population standard deviations and other sample sizes. Do they make a difference?


©2008 Key College Publishing. All rights reserved.