Monday, June 17, 2013

Regression, Influential Points, and Residual Analysis

Today I finished up a whirlwind look at simple linear regression. The class had a lot of questions, especially about R², which is a difficult concept to grasp. Given how quickly we are covering this material, I expected a lot of questions.

The first half of the class was spent reviewing what we did in the last class and why. This was coupled with showing students how to calculate regression equations from summary statistics and with their calculators. I then had the class find regression equations for the different data sets we had, write a sentence interpreting the b1 coefficient in the context of the problem, and write a sentence explaining the meaning of the R² value.

After this, I focused on influential points and on reading residual plots. For our purposes, I defined an influential point as one that substantially affects the slope of the regression equation. I provided some examples, and we discussed that an influential point is not necessarily an outlier in terms of either the response or the predictor variable. I used the exercise of drawing ovals around a scatter plot to help students see how they could assess potential influential points.
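One way to check the definition above is to fit the line twice, once with the suspect point and once without, and compare the slopes. The sketch below is only an illustration: the data are made up, and `fit_line` is a helper name I am introducing here, not something from the class materials.

```python
def fit_line(xs, ys):
    """Return the least-squares intercept b0 and slope b1 for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data (an assumption for illustration): five points that lie
# exactly on y = 2x, plus one extra point that falls well below that trend.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 4, 6, 8, 10, 12]

_, slope_with = fit_line(xs, ys)
_, slope_without = fit_line(xs[:-1], ys[:-1])

print(f"slope with the last point:    {slope_with:.3f}")
print(f"slope without the last point: {slope_without:.3f}")
```

If the two slopes differ substantially, the point fits the definition of influential used in class. How big a change counts as "substantial" is a judgment call that depends on the context of the data.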

We wrapped up by working with residuals. We looked at residual plots from the data sets we were working with and discussed whether or not they looked relatively random, which is the main point I wanted students to walk away with.
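For readers who want to see where the residuals come from, here is a small sketch that computes them directly as observed y minus predicted y. The data are made up (y = x squared, chosen on purpose so that a straight line is the wrong model), and `fit_line` is a hypothetical helper, not the calculator routine we used in class.

```python
def fit_line(xs, ys):
    """Return the least-squares intercept b0 and slope b1 for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical curved data (an assumption for illustration): y = x squared.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

b0, b1 = fit_line(xs, ys)

# Residual = observed y minus the value predicted by the fitted line.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.1f}")
```

Here the residuals are positive at both ends and negative in the middle. That kind of pattern is the opposite of "relatively random" and suggests a straight line is not a good description of the data.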

Below is the outline used for today's class, with comments italicized within square brackets, [like this].

Regression and Correlation
o Influential points
o y-hat = b0 + b1(x)
o Regression line passes through the mean-mean point (x-bar, y-bar), so y-bar = b0 + b1(x-bar)
o b1 = r(sy / sx) [emphasized for those not able to calculate regressions on their calculator]
o R-squared = r² [got into a nice connection by comparing (y - y-bar)² and (y - y-hat)²]
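The formulas in this part of the outline can be checked numerically. The sketch below uses a small made-up data set (an assumption, not one of the class data sets). It builds the slope and intercept from the summary statistics, then compares the two sums of squares mentioned in the last bullet. All variable names are mine.

```python
import statistics

# Hypothetical data (an assumption for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Summary statistics: means, sample standard deviations, and correlation r.
x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
s_x = statistics.stdev(xs)
s_y = statistics.stdev(ys)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

# Slope from b1 = r(sy / sx); intercept from the mean-mean point.
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

# Compare squared deviations from y-bar with squared deviations from y-hat.
sst = sum((y - y_bar) ** 2 for y in ys)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - sse / sst

print(f"y-hat = {b0:.2f} + {b1:.2f}x")
print(f"r squared = {r ** 2:.3f}, 1 - SSE/SST = {r_squared:.3f}")
```

The two printed values in the last line agree, which is the connection from class: R-squared is the fraction of the squared variation around y-bar that is no longer left over once we measure around y-hat instead.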

Residual Analysis and Standard Error of Estimate
Assumptions for regression [didn't really get into these specifics]
o Linear relationship [this was emphasized when discussing correlation]
o Errors have
    Equal variances
    Independence
    Normal distribution [since normal distributions have not been covered yet, it didn't make sense to get into this]
o Show plots of residuals
o Have students create a residual plot for orbit drop data
