Thursday, July 18, 2013

Hypothesis Tests of Means Using Two Samples

The focus today was on testing hypotheses about means when you have two samples. We also had a quiz today covering confidence intervals and hypothesis tests of a mean for a single sample. Before the quiz I reviewed material that students had questions about, working through a couple of examples that addressed those questions.

Before taking the quiz, I covered hypothesis testing for paired samples. Paired samples include before and after measurements, working with siblings or spouses, and, as in our first day experiment, comparing things such as dominant and non-dominant measurements.

I used the coin stacking data from our first class, which is readily recognizable as paired data. I asked the class to formulate null and alternative hypotheses for this problem. We discussed a one-tail alternative, that the dominant hand performs better, and a two-tail alternative, that the two hands perform differently. Which one you use depends entirely on the question you are addressing. If you are trying to answer the question, "Will your dominant hand stack more coins than your non-dominant hand?" then your alternative hypothesis is that the mean dominant hand stack is greater than the mean non-dominant hand stack, a one-tail upper test. If you ask the question, "Will my two hands stack the same number of coins?" then your alternative hypothesis is that the mean dominant hand stack is not the same as the mean non-dominant hand stack, a two-tail test. In both cases the null hypothesis is that the two means are equal.
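Written in symbols, the two sets of hypotheses discussed above look like this. The subscripts are my own shorthand (dom for the dominant hand, non for the non-dominant hand) and are not notation from the class or the textbook.

```latex
% One-tail upper test: does the dominant hand stack more coins?
H_0 : \mu_{\text{dom}} = \mu_{\text{non}}
\qquad
H_A : \mu_{\text{dom}} > \mu_{\text{non}}

% Two-tail test: do the two hands stack different numbers of coins?
H_0 : \mu_{\text{dom}} = \mu_{\text{non}}
\qquad
H_A : \mu_{\text{dom}} \neq \mu_{\text{non}}
```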

To perform a paired sample analysis, you take the difference between your individual paired data values and treat these differences as a single sample. All the tests, confidence intervals, and results work exactly the same as a single sample test, except that the results are for the mean of the differences.
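The paired procedure described above can be sketched in a few lines of Python. This is only an illustration: the coin counts are made up (they are not the class data) and the function name is hypothetical. It stops at the t statistic and degrees of freedom, leaving the p-value to a calculator or software.

```python
import math
import statistics

# Hypothetical coin counts, made up for illustration (not the class data).
dominant = [32, 28, 41, 35, 30, 38]
non_dominant = [29, 27, 36, 35, 26, 33]

def paired_t_statistic(first, second):
    """Return (mean difference, standard error, t statistic, degrees of freedom)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Reduce the two paired samples to a single sample of differences.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, exactly as in a one-sample test.
    se = statistics.stdev(diffs) / math.sqrt(n)
    # Test statistic for the null hypothesis that the mean difference is 0.
    t_stat = mean_diff / se
    return mean_diff, se, t_stat, n - 1

mean_diff, se, t_stat, df = paired_t_statistic(dominant, non_dominant)
print(f"mean difference = {mean_diff:.2f}, SE = {se:.2f}, t = {t_stat:.2f}, df = {df}")
```

Once the differences are computed, nothing in the function refers to the original two samples, which is the whole point of the paired approach.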

After the quiz we covered the situation where you work with two independent samples. In this case the sample sizes may be different and there is no connection between the two samples. The combined sampling model we use is still a t-model, with a mean equal to the difference of the two means. Its standard error is calculated by squaring the standard error of each sample mean, adding the two squares, and then taking the square root; variances add, standard deviations do not. I showed students how to use their calculator to perform this test. The calculator or software will provide the degrees of freedom. As I told the class, there is a formula to determine the degrees of freedom, but it is not something you want to calculate by hand. We worked through an example involving operation times under two different protocols, conducting a hypothesis test and creating a confidence interval.
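For readers curious about what the calculator is doing, here is a rough sketch of the two-independent-sample calculation. I am assuming the calculator uses the unpooled approach, with the Welch-Satterthwaite approximation for the degrees of freedom; the post does not say which formula the calculator or textbook uses. The operation times are invented for illustration and the function name is hypothetical.

```python
import math
import statistics

# Hypothetical operation times in minutes, made up for illustration.
protocol_a = [42, 51, 38, 47, 55, 44, 49]
protocol_b = [36, 40, 33, 45, 39]

def two_sample_summary(sample_a, sample_b):
    """Return (difference of means, standard error, t statistic, approximate df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean: s^2 / n.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    # Variances add; take the square root only after adding.
    se = math.sqrt(se2_a + se2_b)
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    # Test statistic for the null hypothesis of no difference in means.
    t_stat = diff / se
    # Welch-Satterthwaite approximation (assumed here, not confirmed by the post).
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return diff, se, t_stat, df

diff, se, t_stat, df = two_sample_summary(protocol_a, protocol_b)
print(f"difference = {diff:.2f}, SE = {se:.2f}, t = {t_stat:.2f}, df = {df:.1f}")
```

The degrees of freedom usually come out as a non-integer somewhere between the smaller sample size minus one and the combined sample size minus two, which is one reason it is easier to let technology handle it.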

Students are working on finishing their project papers over the next few days and trying to get ready for the final. The project is due next week and the final is one week from today.

Below is an outline of today's lesson with italicized comments enclosed in square brackets [like this].

—Inference for 2 Population Means
   ·         What would you try if you wanted to compare two samples, such as heights of males and females? [reversed the order of presentation since the paired sample was good review for the quiz]
   ·         If the samples are independent we can take the difference between the two
      o   Will still have a t-model
      o   Mean is difference of the means
      o   Standard error is determined by adding variances [due to limited time just stated this result versus having students work through sample data to see that this was the case]
      o   Degrees of freedom—let your calculator or computer figure it out
   ·         Proceed with same process as single sample testing except now you are working with hypotheses about the difference in values
      o   Confidence intervals
      o   p-value and inference
   ·         Practice two independent samples [used example in book for surgery times; samples were of different sizes, which students typically question. The point is that we are testing a hypothesis about the difference in the means of the two samples; the individual sample sizes only affect the variances, and hence the standard deviations, when the sampling models are combined.]
   ·         What if two samples are not independent, such as coin stacking?
      o   Take differences in values since they are paired
      o   Now have a one-sample situation
      o   Proceed as if dealing with a single sample and infer about difference
         §  The differences should be roughly symmetric; the shape of the raw data doesn’t matter
   ·         Practice paired sample [Used coin stacking from the first day; this makes a nice closure to the entire semester.]
