Data Matters with Excel®
Activity 6.3
Section 6.3 makes some very precise claims about the means of random samples. It says that if you gathered a bunch of random samples, you would find that they were roughly normally distributed and that the standard deviation of the means would be equal to the standard deviation of the observations in the population divided by the square root of the sample size. This standard deviation of sample means is called the standard error.
Your task in this project is to test those claims.
These are the steps for this task.
- Pick a variable to work with and find its mean and standard deviation in the population.
- Take random samples, saving their means.
- Find the mean and standard deviation of the means and get a histogram of the means to see whether these claims are correct.
Step 1: Pick a variable to work with and find its mean and standard deviation in the population.
This will be easiest if you use the workbook from the project in Section 6.2. For the mean, select an empty cell to the right of the data and type the formula =average( , then select the column of the data you want to work with, type ) , and press Enter. For the standard deviation, use =stdev([column]) to find the standard deviation. (For example, if your data is in column B, then enter =stdev(B:B) .)
Step 2: Take random samples, saving their means.
Using the same strategy you used in the project in Section 6.2, empty the Stats worksheet by right-clicking the upper-left gray cell and selecting Clear Contents.
In A1, label the first column Mean. Label the third column Means. In A2, enter a formula that calculates the mean of the column of the data you want to work with, but have it calculate from the data on the Sample worksheet. For example, if you want to work with the data in the second column, use the formula =average(sample!B:B) .
To take the samples and save their means, use exactly the same macro as in the project in Section 6.2. (The line that is in bold could be deleted, but leaving it in will cause no trouble.)
Sub Sampling()
Randomize Timer
For r = 2 To 55 Set your number of samples by changing 55.
For SampleRow = 2 To 22 Set your sample size by changing 22.
dataRow = 2 + Fix(50000 * Rnd)
Worksheets("Data").Rows(dataRow).Copy
Worksheets("Sample").Rows(SampleRow).PasteSpecial
Next SampleRow
Worksheets("Stats").Cells(r, 3).Value = Cells(2, 1).Value
Worksheets("Stats").Cells(r, 4).Value = Cells(2, 2).Value
Next r
End Sub
|
Step 3: Find the mean and standard deviation of the means and get a histogram of the means to see whether these claims are correct.
Use =average() and =stdev() to get the mean of the means and the standard deviation of the means.
To get a histogram, click on Tools, Data Analysis, select Histogram and click OK. Select the first mean, and drag down to select all of the means in the Means worksheet. Click OK. Click on the Graph button in the toolbar, then select Next, Finish.
What do you think? Does the standard error work? Does the standard deviation of the mean equal the standard deviation of the observations divided by the square root of the sample size?
Try other sample sizes. Does sample size affect how well the equation for the standard error works? Is sample size related to the shape of the samples distributions?
Try other attributes that start out with other distributions. Does the shape of the underlying distribution matter when there are small sample sizes? Does the shape of the populations distribution matter when there are large sample sizes?
| ©2008 Key College Publishing. All rights reserved. |
|