General Information
  Home
Author Bio
Product/Purchase Info

Instructor Resources
Registration Required
  Register
Download Instructor Resources

Computer Activities
and Data Sets
  Table of Contents
Excel
Fathom
SPSS

Community
  Contact the Author
Ideas/Comments for Publisher
Testimonials
Coming soon!

Other Key Sites
  Key Curriculum Press

Key College Publishing

Data Matters with Excel®

Activity 9.2

Section 9.2 proposes that a way to find a line that does a good job estimating y-values from x-values is to find the line that goes through the mean of the y-values at each x-value, if that’s possible.

There are seven steps to testing this claim.

  1. Pick two x-values.
  2. Have your software select several random y-values for each x-value.
  3. Calculate the regression line that goes through the means of the y-values of each x-value.
  4. Use your software to calculate the mistakes the estimate makes.
  5. Add up all the mistakes the regression line makes.
  6. Does the line go through the mean-mean point?
  7. Shift the line by changing the slope and check how that affects the sum of the line’s errors.

Here’s how to do these steps.

Step 1: Pick two x-values.

Label column A and column B. Call them X and Y. Pick two values for x and type several of each into the cells in column A.

Step 2: Have your software select several random y-values for each x-value.

If you set Excel to select the y-values with its random number generator, =rand(), Excel will repeatedly replace the y-values as you try to work with them. So for this project, it will be fine to simply type in some numbers chosen at random. You don’t need a truly random sample here.

Step 3: Calculate the regression line that goes through the means of the y-values of each x-value.

You will need the mean of y for each value of x. To get a mean y-value, select an empty cell in column E, type =average( , click and drag to select the values you need the mean of, type ) and press “Enter.”

Calculate the equation of a line from the mean of y for one x-value to the mean of the y-values of the other x-value. To do this, you are calculating the equation for a line between two points. The x-values are the x-values of your data. The y-values are the means of the y-values at each x-value. (Hint: You can start by finding the slope—rise over run.)

Step 4: Use your software to calculate the mistakes the estimate makes.

Label column C Estimates. Enter your regression equation into C2 and drag it down to fill all the rows. Type an asterisk for multiplication and refer to the x-value as “A2.” For example, if your regression equation is y=3x-2, then enter =3*A2-2 .

Label column D Error. Enter its formula, =B2-C2 .

Step 5: Add up all the mistakes the regression line makes.

Select the cell below the column of errors. Click on the summation symbol and press “Enter.”

Step 6: Does the line go through the mean-mean point?

To get the mean x-value and mean y-value, you can put their means underneath each column. Select the cell below the x-values. Type =average( , select the x-values, type ) , and press “Enter.” You can drag that equation under the y-values to get the mean of the y-values.

Is the mean-mean point a point on the line described by your regression equation?

Step 7: Shift the line by changing the slope and check how that affects the sum of the line’s errors.

Go to cell C2. Edit the formula, changing the slope and/or intercept. After each change, drag the equation down to fill all the lines of data. How do your changes affect the sum of the errors?


©2008 Key College Publishing. All rights reserved.