Tuesday, July 16, 2013

Statistical Testing Errors

The focus today was on statistical testing errors, specifically the nature and consequences of making Type I and Type II errors.

The class was asking good questions about hypothesis testing, so I took some time to go through these. I also discussed the differences between constructing confidence intervals and conducting hypothesis tests. With a confidence interval we are estimating a value: we use our level of confidence to establish the range of values that "make sense" based on the sample data we collected. With a hypothesis test we are simply giving a yea or nay to one specific value (the hypothesized population parameter). If our well-collected data is consistent with the null hypothesis, it's a yea vote; if our data is inconsistent with the null hypothesis, it's a nay vote.

I used a problem from the last lesson regarding average apparel expenditures to illustrate this idea. We worked through the hypothesis test and rejected the null hypothesis. Since the null hypothesis was tossed out, it is only natural to wonder what a more reasonable value for the population mean would be. A confidence interval establishes that range of plausible values, with our best single guess simply being the mean of our sample.
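For anyone who wants to see the same two-step idea in software rather than on a calculator, here is a minimal sketch in Python using scipy. The expenditure numbers and the hypothesized mean are made up for illustration; they are not the figures from the class problem.

```python
import numpy as np
from scipy import stats

# Hypothetical monthly apparel expenditures (not the class data)
expenditures = np.array([182, 205, 167, 240, 198, 173, 221, 189, 210, 176,
                         195, 230, 160, 188, 214, 201, 179, 225, 192, 207])

mu0 = 175  # hypothesized population mean (illustrative only)

# Step 1: one-sample t test of H0: mu = 175 against Ha: mu != 175
t_stat, p_value = stats.ttest_1samp(expenditures, popmean=mu0)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# Step 2: if H0 is rejected, a confidence interval shows which values
# for the population mean are still plausible given the sample.
mean = expenditures.mean()
sem = stats.sem(expenditures)  # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(expenditures) - 1,
                                   loc=mean, scale=sem)
print(f"sample mean = {mean:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

With these invented numbers the hypothesized mean of 175 gets a nay, and the interval that follows is centered at the sample mean, which is exactly the "best guess" students keep asking about.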

From here I moved to the issue of statistical testing errors. I have a chart that I use to lay out the different errors and how quantities like the level of significance and the power of the test relate to them.

We spent quite a bit of time discussing these errors and their ramifications within the context of a problem. I like to use a drug trial as an example. Would the drug manufacturer rather see a Type I or a Type II error made, and why? What about the regulator who approves the drug? Consider the scenario of a company evaluating a sales training program: what happens if they make a Type I error, and what happens if they make a Type II error? Questions and discussions like these really help bring meaning to the errors.

After this, I pointed out graphically the relationship between the significance level and power. If the level of significance is reduced (made smaller), the rejection region shrinks, so the power of the test is also reduced, and vice versa. In practice, the way to reduce the level of significance without sacrificing power is to increase the sample size.
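To put some numbers behind that picture, here is a small sketch that computes power directly for a one-sided z test. The means, standard deviation, and sample sizes are invented; the point is only to watch power fall as alpha shrinks and recover when n grows.

```python
import numpy as np
from scipy.stats import norm

def z_test_power(alpha, n, mu0=100, mu1=105, sigma=15):
    """Power of a one-sided z test of H0: mu = mu0 vs Ha: mu > mu0
    when the true mean is mu1.  All default values are hypothetical."""
    z_crit = norm.ppf(1 - alpha)                # rejection cutoff on the z scale
    shift = (mu1 - mu0) / (sigma / np.sqrt(n))  # how far the true mean sits from mu0
    return 1 - norm.cdf(z_crit - shift)         # P(reject H0 | true mean is mu1)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: power is {z_test_power(alpha, n=30):.3f} at n = 30, "
          f"{z_test_power(alpha, n=60):.3f} at n = 60")
```

Reading down the printout, power drops as alpha is tightened at a fixed sample size; reading across, the larger sample buys that power back.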

We then looked at non-parametric testing methods for use when the sample is small and the data are skewed. This was to make students aware that other methods are available for data sets that don't meet the usual conditions.
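For the curious, scipy has the Wilcoxon signed-rank test built in. The sketch below tests a hypothesized median against a small, right-skewed sample of made-up repair costs; only the idea, not the numbers, comes from class.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical small, right-skewed sample (e.g., repair costs in dollars)
costs = np.array([48, 52, 55, 60, 61, 63, 70, 95, 140, 210])

m0 = 75  # hypothesized population median (illustrative only)

# Wilcoxon signed-rank test of H0: median = 75, applied to the differences
stat, p_value = wilcoxon(costs - m0)
print(f"W = {stat}, p = {p_value:.4f}")
```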

I spent some time showing students how the calculator assists in computing confidence intervals, t-scores, and p-values. We practiced using the calculator on a couple of different problems: one we had worked before and one new one.

I also passed out sample project reports and rubrics and gave students time to look through the samples to see how items in the report represented the requirements of the rubric.

Finally, it was course evaluation time. I expect that I will be scored low on a number of fronts as many students have struggled and their grades are lower than they would like.

Below is an outline of today's lesson, with my comments enclosed in square brackets [like this]:

· Test errors – we may make a wrong decision and not know it, because we usually don't know the true population values
  o Show the diagram of correct/incorrect hypotheses versus decisions and the errors that result
  o Discuss alpha, beta, and power
  o Show the chart relating alpha and beta
  o Discuss what happens if the sample size increases and its impact on alpha, beta, and power
· What if you have a small sample that is skewed?
  o Use the Wilcoxon signed-rank test
  o Explained in section 9.6 of the book [provided a slide with an example describing the process and how to calculate the test statistic; a short sketch of that calculation follows the outline]
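And here is the promised sketch of the test statistic calculation itself, using the same made-up differences as the repair-cost example earlier (the slide example from class isn't reproduced here). Many introductory texts take the smaller of the two rank sums as the statistic, which is what this does.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical differences (sample value minus hypothesized median of 75),
# taken from the made-up repair-cost sample above
diffs = np.array([-27, -23, -20, -15, -14, -12, -5, 20, 65, 135])

# 1. Rank the absolute differences (rankdata averages the ranks of ties).
ranks = rankdata(np.abs(diffs))

# 2. Sum the ranks attached to the positive and the negative differences.
w_plus = ranks[diffs > 0].sum()
w_minus = ranks[diffs < 0].sum()

# 3. Take the smaller of the two rank sums as the test statistic.
W = min(w_plus, w_minus)
print(f"W+ = {w_plus}, W- = {w_minus}, test statistic W = {W}")
```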
