Data Matters with SPSS®
Activity 7.1
If you were thinking of doing a z-test of a sample mean, you might subtract the population mean you were interested in from the sample mean and divide by the standard error that you estimate from the sample's standard deviation. Section 7.1 claims that what you would get is something that is not normally distributed, but distributed more like a narrow volcano.
In this project, you will test this claim. To start, you will take samples with two observations each. You'll be taking the samples from a normal distribution with a mean of 0 and standard deviation of 1. For each of several hundred samples, you will calculate a statistic that is the sample mean divided by the standard error estimated from the sample standard deviation.
We will use a Syntax program to create the random samples. It is very much like programs we used in simulations that were not draws from the population (for example, the project in Section 4.3).
INPUT PROGRAM.
LOOP #Sample = 1 TO 400.
LOOP #Case = 1 to 2.
COMPUTE sample = #Sample.
COMPUTE z = RV.NORMAL(0,1).
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
When you run the program, you will see that it is set to take 400 samples of two observations each. You can change 400 or 2 to change the number of samples or the sample size.
In the Data Editor, click on Data, Aggregate. Make sample the break variable. Select z and click on the black triangle next to the Aggregate Variable(s) box. Do that again to get z in the box a second time. Leave the first copy as the mean of z. Select the second copy, click on Function, and select Standard Deviation. Click on Continue, choose Replace working data, and click OK.
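The same Aggregate step can be run from a Syntax window. This is a minimal sketch rather than the exact syntax the dialog pastes: it assumes the aggregated variables should be called z_1 (the sample mean) and z_2 (the sample standard deviation), which are the names used in the formulas in this activity. If your version of the dialog generates different names, adjust accordingly.

```spss
* Collapse the file to one row per sample, keeping the mean and SD of z.
AGGREGATE
  /OUTFILE=*
  /BREAK=sample
  /z_1=MEAN(z)
  /z_2=SD(z).
```

Here /OUTFILE=* replaces the working data, so the Data Editor should then show one row for each of the 400 samples.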
Now you will have to compute t. Use Transform, Compute to first calculate the standard error, SE, with this formula: z_2/sqrt(2), where z_2 is the sample standard deviation and 2 is the sample size. Then calculate t with this formula: z_1/SE, where z_1 is the sample mean.
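The two Compute steps can also be written as syntax. A minimal sketch, assuming the aggregated variables are named z_1 (mean) and z_2 (standard deviation) as in the formulas above:

```spss
* Standard error estimated from the sample SD, with a sample size of 2.
COMPUTE SE = z_2 / SQRT(2).
* The t statistic for the null hypothesis that the population mean is 0.
COMPUTE t = z_1 / SE.
EXECUTE.
```

If you change the sample size in the input program, the 2 inside SQRT() has to change to match.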
We are thinking of t-values as if they were in tests of the null hypothesis that the population mean is 0 (which it is), so you could subtract 0 in that formula, but there isn't any reason to write that down.
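Written out, the statistic is the usual one-sample t formula. The symbols here are notation assumed for illustration and are not taken from the activity: x-bar for the sample mean, s for the sample standard deviation, n for the sample size, and mu_0 for the hypothesized population mean. With mu_0 = 0 and n = 2, it reduces to z_1/SE.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
\qquad\Longrightarrow\qquad
t = \frac{\bar{x}}{s / \sqrt{2}}
\quad \text{when } \mu_0 = 0 \text{ and } n = 2
```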
Now you have a column of t-values. You can use Explore to see what they are like.
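If you prefer syntax to the menus, Explore could be run roughly as follows. This sketch assumes the t-values are in a variable named t. The histogram and normal probability plot are only one reasonable choice of output, not something the activity prescribes.

```spss
* Descriptive statistics, a histogram, and a normal probability plot of t.
EXAMINE VARIABLES=t
  /PLOT HISTOGRAM NPPLOT
  /STATISTICS DESCRIPTIVES EXTREME.
```

The EXTREME keyword lists the largest and smallest values, which is a quick way to see how far the tails reach.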
What do you think about Gosset's claim, discussed on page 392 of the text? Does it look as if you could do a regular z-test with those statistics?
Sort the statistics and check the value that is 2.5% up the list. With two observations, there is one degree of freedom, and Gosset claimed that 95% of the t-values would fall between -12.7 and 12.7. Did that work for your data? Try 4,000 samples. Do -12.7 and 12.7 work with 4,000 samples? Why does it make a difference how many samples you take?
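As an alternative to sorting by hand, the cutoffs can be checked with syntax. This is a sketch under stated assumptions: the t-values are in a variable named t, and crit and outside are made-up variable names introduced only for this example.

```spss
* Empirical 2.5th and 97.5th percentiles of the simulated t-values.
FREQUENCIES VARIABLES=t
  /FORMAT=NOTABLE
  /PERCENTILES=2.5 97.5.
* Theoretical upper cutoff for one degree of freedom, stored in every row.
COMPUTE crit = IDF.T(0.975, 1).
* Flag each sample whose t falls outside plus or minus 12.7.
COMPUTE outside = (ABS(t) > 12.7).
EXECUTE.
* The mean of a 0/1 flag is the proportion of flagged samples.
DESCRIPTIVES VARIABLES=crit outside
  /STATISTICS=MEAN.
```

If Gosset's claim holds, the mean of outside should be close to 0.05, and it should settle closer to that value as the number of samples grows.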
If you used -2 and 2 as your cutoffs for significance, how often would you have rejected the true null hypothesis?
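One way to count the rejections is to flag each sample and take the mean of the flag. This is a sketch that assumes the t-values are in a variable named t; reject and p_theory are hypothetical names chosen for this example.

```spss
* Flag samples that a z-style cutoff of plus or minus 2 would reject.
COMPUTE reject = (ABS(t) > 2).
* Two-tailed probability beyond 2 for a t distribution with 1 degree of freedom.
COMPUTE p_theory = 2 * (1 - CDF.T(2, 1)).
EXECUTE.
* Compare the observed rejection rate with the theoretical one.
DESCRIPTIVES VARIABLES=reject p_theory
  /STATISTICS=MEAN.
```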
Try other population standard deviations and other sample sizes. Do they make a difference?
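A variation of the whole run might look like the sketch below. The standard deviation of 3 and the sample size of 5 are arbitrary values picked for illustration. The names z_1 and z_2 are assumed to match the formulas used earlier in this activity.

```spss
* Draw 400 samples of 5 observations from a normal population with mean 0 and SD 3.
INPUT PROGRAM.
LOOP #Sample = 1 TO 400.
LOOP #Case = 1 TO 5.
COMPUTE sample = #Sample.
COMPUTE z = RV.NORMAL(0,3).
END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
* One row per sample with its mean and standard deviation.
AGGREGATE
  /OUTFILE=*
  /BREAK=sample
  /z_1=MEAN(z)
  /z_2=SD(z).
* The divisor must match the new sample size of 5.
COMPUTE SE = z_2 / SQRT(5).
COMPUTE t = z_1 / SE.
EXECUTE.
```

With five observations per sample there are four degrees of freedom, so the cutoffs to compare against are no longer -12.7 and 12.7.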
©2008 Key College Publishing. All rights reserved.