Data Matters with Fathom! Dynamic Statistics™ software

Activity 10.2

Violations of Pearson’s assumptions of constant variance and normality can wreck Pearson’s correlation test. How much can they mess it up? In this project, you will find out.

You will set Fathom to perform significance tests for correlation when the null hypothesis is true, then keep track of how often the true null hypothesis is rejected. You will do this by first setting up Fathom to test distributions, then trying a variety of distributions to see how they do.

Setting Fathom to Generate

Drag a case table onto the workspace. Add two attributes, x and y. Right-click on each attribute to set its function. Use the formula randomNormal(0,1) for both. Later you will change that to mess up the Pearson testing.

Drag a new hypothesis test onto the workspace. Select Empty Test, Test Correlation. Drag x and y onto the Test Correlation attributes.

Select Analyze, Collect Measures. Drag a case table onto the workspace. The first column is pValues. Add an attribute, Significant, using the formula pValue<.05. Drag a new estimate onto the workspace, then select Empty Estimate, Estimate Proportion. Drag Significant onto the attribute of the Estimate Proportion window. Now you can see what proportion of the samples lead Pearson’s correlation test to reject the true null hypothesis. When the test’s assumptions hold, that proportion should be close to .05.

Double-click on the Measures Collection to set the number of samples and the Animation option as you like. Use more than 400 samples.
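For readers who want to check the setup outside Fathom, here is a minimal Python sketch of the same experiment: draw independent normal x and y, run Pearson's test, and count how often the true null hypothesis is rejected. It is not Fathom's implementation. The function names, the sample size of 30, the 1,000 samples, and the seed are all assumptions made for illustration. The p-value uses the exact null distribution of r for normal data, integrated numerically.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / (math.sqrt(sxx) * math.sqrt(syy))
    return max(-1.0, min(1.0, r))  # guard against rounding past +/-1


def pearson_p_value(r, n, steps=200):
    """Two-sided p-value for H0: no correlation (requires n >= 4).

    Under H0 with normal data, r has density
    (1 - r^2)^((n - 4) / 2) / B(1/2, (n - 2) / 2) on [-1, 1].
    The tail area beyond |r| is integrated with Simpson's rule.
    """
    log_beta = (math.lgamma(0.5) + math.lgamma((n - 2) / 2)
                - math.lgamma((n - 1) / 2))

    def density(u):
        return math.exp(-log_beta) * max(0.0, 1.0 - u * u) ** ((n - 4) / 2)

    a = abs(r)
    h = (1.0 - a) / steps
    total = density(a) + density(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(a + i * h)
    return min(1.0, 2.0 * total * h / 3.0)


def rejection_rate(draw, sample_size=30, samples=1000, alpha=0.05, seed=1):
    """Share of samples in which the true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(samples):
        x = [draw(rng) for _ in range(sample_size)]
        y = [draw(rng) for _ in range(sample_size)]  # independent of x
        if pearson_p_value(pearson_r(x, y), sample_size) < alpha:
            rejected += 1
    return rejected / samples


# Counterpart of randomNormal(0,1) for both attributes.
rate = rejection_rate(lambda rng: rng.gauss(0, 1))
print(f"Rejected {rate:.1%} of true null hypotheses")
```

With normal data the printed share should land near 5%, which is the baseline that the other distributions are compared against.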

Try Other Distributions

Return to the original case table. You can change the sample size there.

Try editing the formulas for x and y. There are menus listing all of the formulas at your disposal. To get Raised to the power, click on “^.” The only restriction is that the formula for x cannot refer to y, and the formula for y cannot refer to x. At the core of every formula, there has to be a random number. At any point, you can click on Apply to see what kinds of numbers you get.

When you have a formula that you would like to test, open the Measures Collection and click on Collect More Measures. Here’s one that gets true null hypotheses rejected 16% of the time (at alpha=5%): randomCauchy()^4.
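The same comparison can be sketched in Python for a heavy-tailed formula. This sketch reads the example formula as a standard Cauchy draw raised to the fourth power and generates it as tan(π(U − 0.5))^4. Whether that matches the defaults of Fathom's randomCauchy() is an assumption, as are the function names, the sample size of 10, and the seed. The rejection rate depends on the sample size, so the output need not match the 16% quoted above.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / (math.sqrt(sxx) * math.sqrt(syy))
    return max(-1.0, min(1.0, r))  # guard against rounding past +/-1


def pearson_p_value(r, n, steps=200):
    """Two-sided p-value from the exact null density of r (n >= 4)."""
    log_beta = (math.lgamma(0.5) + math.lgamma((n - 2) / 2)
                - math.lgamma((n - 1) / 2))

    def density(u):
        return math.exp(-log_beta) * max(0.0, 1.0 - u * u) ** ((n - 4) / 2)

    a = abs(r)
    h = (1.0 - a) / steps
    total = density(a) + density(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(a + i * h)  # Simpson weights
    return min(1.0, 2.0 * total * h / 3.0)


def normal_draw(rng):
    """Counterpart of randomNormal(0,1)."""
    return rng.gauss(0, 1)


def cauchy_fourth_power(rng):
    """Standard Cauchy draw (inverse-CDF method) raised to the 4th power."""
    return math.tan(math.pi * (rng.random() - 0.5)) ** 4


def rejection_rate(draw, sample_size=10, samples=1000, alpha=0.05, seed=2):
    """Share of independent (x, y) samples where H0 is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(samples):
        x = [draw(rng) for _ in range(sample_size)]
        y = [draw(rng) for _ in range(sample_size)]
        if pearson_p_value(pearson_r(x, y), sample_size) < alpha:
            rejected += 1
    return rejected / samples


normal_rate = rejection_rate(normal_draw)
heavy_rate = rejection_rate(cauchy_fourth_power)
print(f"normal data:        {normal_rate:.1%} rejected")
print(f"Cauchy^4 (assumed): {heavy_rate:.1%} rejected")
```

Swapping in another `draw` function is the counterpart of editing the formulas for x and y in the case table.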

How do the violations of Pearson’s assumptions affect the Pearson correlation test?

Testing Independence

Pearson’s correlation test also assumes independence. Independence means you can’t predict an attribute’s observation from the other observations of that attribute. When independence is violated, Pearson’s correlation test can be very unreliable. You may have seen that if you used the prev() function in the first part of this project.

For this project, we need only a case table and a scatter plot of the data in the case table.

The case table has two variables. The formula for x is caseIndex. The formula for y is conditional. Type “If (caseIndex=1”. Click on the top question mark and enter 0. Click on the bottom question mark and enter prev(y) + randomNormal(0,1).

That formula makes each y equal to the previous y plus a random shift, which is known as a random walk. On average, the changes in y equal 0, so the formula builds in no trend and no correlation with x.
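As a rough Python counterpart of the conditional formula, the sketch below starts y at 0 for the first case and then adds a normal(0,1) step to the previous value. The function name `random_walk`, the 1,000 cases, and the seed are illustrative assumptions, not part of the Fathom activity.

```python
import random


def random_walk(n, rng):
    """y is 0 for case 1; each later y is the previous y plus a N(0,1) step."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(y[-1] + rng.gauss(0, 1))
    return y


rng = random.Random(7)
y = random_walk(1000, rng)

# The case-to-case changes are the random shifts themselves.
changes = [later - earlier for earlier, later in zip(y, y[1:])]
mean_change = sum(changes) / len(changes)

print(f"mean change in y: {mean_change:+.3f}")
print(f"last value of y:  {y[-1]:+.1f}")
```

The mean change should be close to 0 even though the last value of y can sit far from its starting point.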

Drag a new graph onto the workspace. Drag x to the x-axis and y to the y-axis. Select Scatter Plot, Line Scatter Plot.

Press “Ctrl-Y” to collect a new sample. Right-click on the scatter plot and select Rescale Graph Axes. Repeat these steps to see what happens.
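Repeating the collection many times can also be sketched in Python. The code below builds many independent random walks, correlates each with the case index, and reports how often |r| exceeds 0.5. It runs the same tally for y values that really are independent. The cutoff of 0.5, the 50 cases, the 2,000 repetitions, and all function names are assumptions chosen for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


def random_walk(n, rng):
    """y starts at 0; each later y is the previous y plus a N(0,1) step."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(y[-1] + rng.gauss(0, 1))
    return y


def independent_noise(n, rng):
    """n independent N(0,1) values, for comparison."""
    return [rng.gauss(0, 1) for _ in range(n)]


def share_strong(make_y, n=50, samples=2000, cutoff=0.5, seed=3):
    """Share of samples whose correlation with the case index exceeds cutoff."""
    rng = random.Random(seed)
    x = list(range(1, n + 1))  # counterpart of caseIndex
    strong = 0
    for _ in range(samples):
        if abs(pearson_r(x, make_y(n, rng))) > cutoff:
            strong += 1
    return strong / samples


walk_share = share_strong(random_walk)
noise_share = share_strong(independent_noise)
print(f"random walk:       |r| > 0.5 in {walk_share:.1%} of samples")
print(f"independent noise: |r| > 0.5 in {noise_share:.1%} of samples")
```

Each pass through the loop plays the role of one press of “Ctrl-Y” followed by a look at the scatter plot.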

In this part of the project, there are violations of independence. What effect would a lack of independence have on regression analysis?

Most analysis of time-related data, like stock prices, shows that observations are not independent. For example, today’s stock price is predictable from yesterday’s stock price. How does that affect correlational studies that try to predict something from the date?


©2008 Key College Publishing. All rights reserved.