My AP Statistics class is now entering a phase of exploring associations between two quantitative variables. We've been looking at scatter plots and discussing what the plot indicates about association. We also discussed how to draw an ellipse around a scatter plot and use the major and minor axes of the ellipse to estimate the absolute value of the correlation. (For those of you who have never done this, if L is the major axis and W is the minor axis then (L - W) / L provides a very good estimate, easily within 0.2 of the correlation.)
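For anyone who wants to check the ellipse arithmetic quickly, here is a minimal Python sketch of the estimate described above. The function name and the sample measurements are my own hypothetical choices, and I am treating L and W as the measured lengths of the major and minor axes.

```python
def estimate_abs_correlation(major_length, minor_length):
    """Rough estimate of |r| from the lengths of an ellipse's two axes."""
    if major_length <= 0 or not 0 <= minor_length <= major_length:
        raise ValueError("expected 0 <= minor_length <= major_length, major_length > 0")
    # (L - W) / L: a long, thin ellipse gives a value near 1,
    # a nearly circular ellipse gives a value near 0.
    return (major_length - minor_length) / major_length


# Hypothetical measurements (in cm) taken from a hand-drawn ellipse.
print(round(estimate_abs_correlation(10.0, 3.0), 2))
```

The estimate only gives the strength of the association; the sign still has to be read from the direction of the scatter plot.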
I have developed or borrowed a few data collection activities that are easy to do, engage students, and provide reasonable data for use in describing scatter plots and in creating linear regression models.
ORBIT EXPRESS
This is an activity that I modified from a session I attended at the NCTM Conference. The original was presented as an experiment to test two different materials. I still do the original when working through experimental design, but for this unit I focus on a single material and measure drops from different heights.
The scenario is that students work in the design and analysis department for Orbit Express, a package delivery company that will deliver packages by dropping them from orbit. Take a sheet of paper and wad it up; this is the delivery vehicle. Students have to measure six drops from different heights. They drop toward a target on the floor (a coin or piece of tape works well). They measure the height of the drop and the distance of the delivery vehicle from the target once the delivery vehicle has come to rest.
CAR MILEAGE
I don't recall where this one came from. I think it came from a book, so I apologize in advance for not citing the source. (If you happen to know, contact me and I will rectify the citation issue.) In this activity, students record the year of the car and the mileage for the primary vehicle that they drive. Once a regression model is built, this leads to some interesting discussions about the meaning of the y-intercept and the slope in the context of the situation.
DA VINCI'S VITRUVIAN MAN
Another activity whose source I have forgotten. Da Vinci's sketch, the Vitruvian Man, indicates that a person's arm span is equal to that person's height. The natural question is to ask whether or not Da Vinci was correct. Students measure their height and arm span length and take a look. This also provides a basis for regression inference in terms of a confidence interval for the slope. Ask students which is the explanatory variable and which is the response variable. It gets them thinking about what is happening in the context of the situation. Discuss how applicable the model is for different heights.
PASS THE BUCK
This activity came from a pre-AP session provided by my school district. In this scenario, three students pass a buck (I use a 3" x 5" card) from hand to hand as quickly as they can. After coming to agreement as a class as to what constitutes a pass, the method of passing, and other measurement conditions, we start accumulating data. The first person in line holds the buck and says, "Go." A timer starts timing at that moment. The buck is quickly passed from person to person. When the last person has possession of the buck they yell, "Stop!" and the timer stops timing. The number of people in line and the time it took them to pass the buck are recorded. Three more students join the line and the process is repeated. Continue until the entire class, except for the timer, is in the line. It's a fun activity that leads to some interesting discussions. First, students are used to using time as the explanatory variable, but in this case it is the response variable. Another fun thing to do, if it doesn't happen naturally, which it often does, is to secretly ask one student to fumble the buck, causing a delay and an outlier in the scatter plot. Be sure to discuss the meaning of the slope and y-intercept in the context of the problem.
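To make the slope and y-intercept discussion concrete, here is a minimal Python sketch of a least-squares line for Pass the Buck. The data values are hypothetical, made up only for illustration, and the function name is my own; your class data will differ.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x and sum of cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical class data: number of people in line (explanatory)
# and seconds needed to pass the buck (response).
people = [3, 6, 9, 12, 15]
seconds = [2.1, 3.4, 5.2, 6.3, 8.0]

slope, intercept = least_squares_line(people, seconds)
print(f"predicted time = {intercept:.2f} + {slope:.2f} * (number of people)")
```

With these made-up numbers the slope is about 0.49 seconds per additional person and the intercept is about 0.59 seconds. The intercept is the predicted time for a line of zero people, which is a good prompt for asking students whether that value has any meaning in context.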
I have used all of these and found them to produce workable data that lead to pertinent discussions. If you need more information on how to set any of these up let me know; I'd be happy to provide additional information.
And, if you have some favorite activity for generating data for regression models and are willing to share it, I am always looking for new things to incorporate into my lessons.
Saturday, October 5, 2013
Saturday, September 28, 2013
Why Do You Graph Data in Statistics?
My Inferential Probability and Statistics class is now transitioning from how to gather data to how to organize and begin to analyze data. We spent some time working with categorical data, which I discussed in a previous post. We also worked through some preliminaries for quantitative data using the name rank data described in a previous post.
The classes conducted an experiment for testing the melting times for chocolate chips (for a full description of the process see posts IPS Day 47 and IPS Day 48). The classes had gone through the basics of stem plots, histograms, and bin sizes of histograms using the name rank data. We also examined medians and means. It was time to look at the melting time data.
Students wrote their melting times on the board under the three columns for milk chocolate, semi-sweet chocolate, and white chocolate. I asked the students to graph all of the data in one graph. Students were a bit confused as to why I would want to do this, so I explained that we had chocolate melting times and, even though they came from different types of chocolate, they were still, fundamentally, melting times. Many students used stem plots; some entered their data into a calculator to create histograms.
I also asked students to calculate the median and mean of the distribution and decide which was more representative of the center. We had already looked at many sample graphs doing this and they had a pretty good sense that skewed data call for the median.
The stem and leaf plots looked similar to the one displayed below:
Students described this graph as being skewed positively and many thought the graph was uni-modal. A few students described the graph as being multi-modal. The graph was what I expected to see; after all, we were dealing with three different types of chocolate.
I asked the class what the possibilities could be for the melting times. This is a difficult question for students who are not used to considering practical aspects of problems. If all the chocolates melt at relatively the same rate we should see a graph that reflects this, basically a uni-modal graph with possible skew.
If two chocolates melt at approximately the same rate and the third melts at a different rate, we should expect to see two modes in the graph although the modes may be masked because of overlap and number of items of each type being graphed. (We actually saw this phenomenon when we graphed the heights of students in the class. The data should be bi-modal because of males and females but the number of females was so large compared to males that the male mode looked more like a wave than a mode since the height overlaps between the genders filled in the gap between the two modes.)
The third possibility is that all three chocolate types have different melting times, resulting in three modes. The amount of separation between the melting times could mask how distinct each mode is from the others.
I sketched the three possibilities out and asked students to consider which of the three their graph most resembled.
As students looked at these possibilities they realized that the data was not uni-modal. The stem plot follows a pattern of three distinct melt times. Without going back into the data, we cannot say, at this point, which melts faster and which melts slower, but the graph strongly suggests three distinct melt times, which provides a foundation for analyzing the data as three separate groups.
Students had never considered what a graph should look like and then compared their results to the possibilities. This was a real eye opener for them as they realized that graphs aren't just made to appease some requirement or make a pretty picture; graphs are made to gain understanding about the data we are about to analyze. We can learn so much about the data just by creating a graph and thinking about what the possible graph shapes can be and what our graph actually looks like.
I was able to push this further as we looked at histograms for the data. As I told the class, stem plots are great because they capture all of the raw data. They also show comparable results to histograms when working with smaller data sets. However, stem plots are limiting when making decisions about how to split apart the data, especially when you start working with larger data sets.
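As a small illustration of how a stem plot keeps every raw value visible, here is a minimal Python sketch. The melting times are hypothetical whole-second values that I made up, not the class data, and the function name is my own.

```python
from collections import defaultdict


def stem_plot(values):
    """Build a text stem plot: stems are tens (and above), leaves are ones digits."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the distribution remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return "\n".join(lines)


# Hypothetical melting times in seconds.
melting_times = [52, 55, 58, 61, 63, 64, 67, 70, 85, 88, 90, 92,
                 93, 95, 97, 101, 160, 165, 168, 170, 172, 175, 181]
print(stem_plot(melting_times))
```

The stems are fixed at a width of 10, which is exactly the limitation noted above: you cannot easily regroup the data at a different scale the way you can with histogram bins.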
Below are histograms for the chocolate melting times; the only difference is in the bin width that was used: 50, 30, 15, and 5 seconds.
The 50 second bin width hides too much detail. From this graph we would describe the distribution as uni-modal with a slight positive skew.
The 30 second bin width tells a slightly different story. We now have two distinct modes. There still appears to be a slight skew to the right, but too much detail remains hidden.
The 15 second bin width clearly shows three distinct modes. It also shows that one mode sits close to 170 seconds while the other two modes are closer together at approximately 60 seconds and 90 seconds.
The 5 second bin width shows more detail. Notice how the lower mode is definitely below 70 seconds. The green bar at approximately 80 seconds is interesting, especially considering the mode near 100 seconds. This peak at 80 seconds is what I would call the tidal effect, as the tails of two distributions push against each other and raise up values where they overlap (recall my discussion of heights).
Although the 15 second bin width is the easiest to describe, the 5 second bin width graph provides additional insight into the data. Using histograms provides the flexibility to analyze our data distribution by examining the distribution at different levels.
At this point the class started to understand that graphs were a tool to help them better understand the data they are tasked with analyzing.
Why do you graph data in statistics? Is it to make a pretty picture? Or is it to tell a story about data distributions and their implications for analysis?
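If you want to experiment with the effect of bin width yourself, here is a minimal Python sketch that bins one data set at the four widths discussed above and prints a text histogram for each. The melting times are hypothetical values I made up for illustration, not the class data, and the function names are my own.

```python
from collections import Counter


def bin_counts(values, width):
    """Count values per bin; each key is the lower edge of a bin of the given width."""
    counts = Counter((v // width) * width for v in values)
    start, stop = min(counts), max(counts)
    # Keep empty bins so gaps between modes are not hidden.
    return {lower: counts.get(lower, 0) for lower in range(start, stop + width, width)}


def text_histogram(values, width):
    """Return one line per bin, with one asterisk per observation."""
    lines = []
    for lower, n in bin_counts(values, width).items():
        lines.append(f"{lower:>4}-{lower + width - 1:<4} {'*' * n}")
    return "\n".join(lines)


# Hypothetical melting times in seconds.
melting_times = [52, 55, 58, 61, 63, 64, 67, 70, 85, 88, 90, 92,
                 93, 95, 97, 101, 160, 165, 168, 170, 172, 175, 181]

for width in (50, 30, 15, 5):
    print(f"Bin width: {width} seconds")
    print(text_histogram(melting_times, width))
    print()
```

Running the same data through several widths side by side makes it easier to see which features of the shape are real and which are products of the binning.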
A First Look at Graphing Distributions of Quantitative Data
[This post was supposed to be published last week, but I got really busy and forgot I had put this post together. Then, I realized that I didn't have this post to refer back to as I was writing my next post. So here it is. Just realize there were several lessons that took place between this post and the next even though both posts were published on the same day.]
Up to this point, the data we've been working with in my Inferential Probability and Statistics class has been categorical data. We are beginning to transition to working with quantitative data. To that end we conducted an observational study on name popularity and conducted an experiment on the melting time of chocolate chips.
The name popularity activity comes from Making Sense of Statistical Studies. Students found the ranking of their given name for their birth year and for the most recent year for which data was available using the Social Security Administration's Baby Name database. One of the questions asked students "to construct a graphical display that shows the distribution of the [year] ranks data."
Students either had no idea how to proceed or started to create graphs displaying categorical characteristics of the data set, such as year versus gender or high versus low ranks. When I asked my classes about histograms, only a handful indicated they knew about or how to use histograms. This was interesting since students had been taught (note: I am not using the term learned, as it is evident that learning did not take place) histograms since the sixth grade.
I wonder why students have so much exposure to histograms yet seem so ignorant as to their use. Is it possible that so much time is spent teaching students how to make "pretty" graphs without really thinking about what the graph is telling you about the data distribution? Even with an AP Statistics class, it is difficult to move students away from just making a graph to considering what the graph tells you about your data.
I will be working primarily with stem plots, histograms, and box plots to analyze data sets. It will be interesting to see how well students move from "here is how to make a graph" to "here is what the graph is telling you about the data distribution."
Friday, September 20, 2013
Teaching the normal model using a z-table
We are just wrapping up working with z-scores and the normal model in my AP Statistics class. In the past, I had students use the empirical rule (68%, 95%, and 99.7% of the distribution within 1, 2, and 3 standard deviations of the mean) and then moved to using a graphing calculator.
This summer, while teaching the introductory statistics class at MSU Denver, I had to teach the normal model using a z-table since a graphing calculator was not required. What I discovered was that using the table made the conversions among raw scores, z-scores, and percentages much more transparent to students and, hence, more understandable. I decided that I needed to bring this experience to my AP classes.
I started through the z-score and normal model unit as I have in the past, including showing students how to find cumulative probability and z-scores from percentiles. Then we started working through a ten-part problem that required using the empirical rule and explicit values. I showed students how to read a z-table and asked a few questions to confirm they could find a percentage from a z-score or a z-score from a percentile. I then told students they could not use their calculators to work through the problem; they had to use the z-table.
There were a few questions as students started working through the problem parts but all readily picked up on how to make use of the table. But, more importantly, students started getting a better feel of how to transition easily between a raw score through a z-score to determine a percentage or vice-versa.
For the assessment on this unit, it was interesting to see that more than half the class asked if they could use the z-table, which was, of course, okay to use. I regret that I did not see the value of teaching to use tables earlier. If you are like me and have not previously taught using the z-table and just worked with technology, I highly recommend spending some time using the z-table in your instruction. You may be pleasantly surprised, as I am.
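For checking table lookups, here is a minimal Python sketch of the same raw score to z-score to percentage chain, and the reverse. It uses `NormalDist` from Python's standard `statistics` module in place of a printed z-table. The function names and the example mean and standard deviation are hypothetical choices of mine.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_score(raw, mean, sd):
    """Convert a raw score to a z-score."""
    return (raw - mean) / sd


def proportion_below(raw, mean, sd):
    """Raw score -> z-score -> cumulative proportion (what a z-table lists)."""
    return STANDARD_NORMAL.cdf(z_score(raw, mean, sd))


def raw_from_percentile(proportion, mean, sd):
    """Cumulative proportion -> z-score -> raw score (reading the table backwards)."""
    z = STANDARD_NORMAL.inv_cdf(proportion)
    return mean + z * sd


# Hypothetical model: scores with mean 100 and standard deviation 15.
print(round(z_score(115, 100, 15), 2))
print(round(proportion_below(115, 100, 15), 4))
print(round(raw_from_percentile(0.90, 100, 15), 1))
```

Each function mirrors one step a student performs by hand with the table, which is the transparency I found so valuable.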
Friday, September 13, 2013
Using Contingency Tables and Segmented Bar Graphs to Determine Association
For my Inferential Probability and Statistics class, I am beginning to bounce back and forth between gathering data and organizing data. Students used sampling methods to collect information on cars in our school parking lot. The data are primarily categorical in nature. From preliminary discussions I saw that the classes were solid in creating pie graphs and bar graphs for their data. To help push their thinking about how to analyze data, I focused on making use of contingency tables and segmented bar graphs.
It is easy to get students started on contingency tables. I created a 3 x 3 table on the board, labeling the vertical boxes on the left as Male, Female, and Total (see table below). Across the top boxes I used the labels Jeans, No Jeans, and Total. I then took a quick poll of boys and girls as to whether they were wearing denim jeans or not. Voila, a contingency table.
Next we worked through calculating percentages of total, row percentages, and column percentages. I like to use different color markers for this so it is easier to reference percentage types. From here, it is an easy matter to begin discussing marginal and conditional distributions, to compare these distributions, and to use these to have students begin to think about variables being dependent or independent.
It is surprising how difficult it is for students to get independence and dependence straight in their minds. They are so used to thinking of independent variables and dependent variables from a mathematical perspective that it is difficult for them to shift gears. Rather than get into a formal look at independence and dependence (such as with probabilities), I simply ask them to consider if the marginal and conditional distributions are tracking along similarly. For example, if the class is 60% male, then it should be reasonable to assume that we would see 60% of jean wearers being male. If the marginal and conditional distributions are "close," then we treat the two variables as independent.
On the other hand, if we see marked differences between the marginal and conditional distributions, then we treat the two variables as dependent. Knowing that an individual has a certain characteristic, such as being female, alters my perception of how likely that person is to wear jeans.
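Here is a minimal Python sketch of the comparison between a marginal and a conditional distribution for the jeans poll. The counts are hypothetical, chosen so that the class is 60% male, and the function names are my own.

```python
# Hypothetical poll results: rows are gender, columns are jeans status.
counts = {
    "Male": {"Jeans": 12, "No Jeans": 6},
    "Female": {"Jeans": 9, "No Jeans": 3},
}


def marginal_row_distribution(table):
    """Proportion of the whole class in each row (ignores the column variable)."""
    total = sum(sum(row.values()) for row in table.values())
    return {label: sum(row.values()) / total for label, row in table.items()}


def conditional_row_distribution(table, column):
    """Proportion in each row among only those who fall in the given column."""
    column_total = sum(row[column] for row in table.values())
    return {label: row[column] / column_total for label, row in table.items()}


marginal = marginal_row_distribution(counts)
among_jeans = conditional_row_distribution(counts, "Jeans")

for label in counts:
    print(f"{label}: {marginal[label]:.1%} of class, "
          f"{among_jeans[label]:.1%} of jeans wearers")
```

With these made-up counts, 60% of the class is male and about 57% of the jeans wearers are male. Whether that counts as "close" is a judgment call at this stage, which is exactly the conversation I want students to have.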
I conducted another quick poll for hair color and eye color. I like to keep it simple, so I just broke these into light and dark categories. These two variables are very much dependent, so the marginal and conditional distributions will show differences that everyone can readily see.
At this point, I have lots of data that students can work with. We used the class survey data to gauge association between political leaning and gender. I have a worksheet that looks at highest level of education completed and whether or not the person is a smoker. This practice allows students to create contingency tables and segmented bar graphs from the data. The discussions of what is created enable students to see many examples and allow me to point out strengths to emulate and weaknesses to avoid.
With all of this practice under their belts, they can now turn to working with the car data that they collected. I have them formulate a question about association and then create a contingency table and segmented bar graph to assess the association. It is easy for students to lose sight of why they are doing this work. As they grind through the data, double checking their counts and calculating percentages, it is easy to forget why they are putting all this effort into working with this data. I reminded the class numerous times to not lose sight of the question they were addressing. We are not creating tables and graphs simply to make the data look nice; we are doing it to understand relationships that may or may not exist in the data.
This work took up nearly 90 minutes of time. For the next class we looked at what students created. The class presentations are helpful because they allow me to focus in on things that are done well and point out issues that need attention. They also enable students to see and hear how to communicate their results. Finally, they provide a forum to view data through multiple lenses, hopefully broadening students' perspective on how to analyze data.
The presentations went well. Below is one example of the results presented in class. The group decided there appeared to be an association and the class concurred.
I was pleased with the results presented and the classes indicated they felt comfortable working with contingency tables.
Saturday, August 31, 2013
Student Engagement and Learning
Last year, after having my formal observation by one of the assistant principals at my high school, we sat down for my post-observation debriefing. I try to maintain interactive and engaging classrooms, which was affirmed during the observation. The question that was posed was how do I know how much each individual has learned, other than during assessments? My assistant principal was asking me to push further, to make direct linkages to the learning objectives I set and verification that those objectives were met. She wanted me to push students to be more personally accountable for their learning.
I thought about this conversation over the summer and while I was teaching the introductory statistics class at MSU Denver. Due to the compressed time-frame of the class, I elected to give weekly quizzes that covered material from the previous week. Toward the end of each lesson, I tried to provide a problem that reflected what I would be quizzing on the following week.
I decided to implement a similar program in my Inferential Probability and Statistics course. I am giving a weekly quiz that focuses on last week's materials. The questions cover foundational knowledge that students need to be successful. During the week, I provide problems and ask students to write their responses in their notebooks.
In the past, I had asked students to answer the questions but did not require them to write down their responses. Of course, there would be some students who would appear to be thinking about the problem but were more likely thinking about where they were going to eat lunch that day and with whom.
After we discuss responses, I tell students my expectations toward their understanding. For example, in working through various sampling strategies, I provided scenarios and students needed to identify the sampling technique being described. I told my classes that the expectation is that they should be able to read a scenario and determine the sampling technique.
I realize this is the first two weeks and we're still in the "honeymoon" phase of the semester, but students seem much more engaged and attentive. The scores on the first quiz were quite good: approximately 85% of the students received an A or B grade.
I still need to make more explicit connections to the lesson's learning objective and what students are doing. I will continue to require students to write answers down when working through example problems. I will see if the level of engagement and learning continues to hold over the next few weeks and throughout the semester.
Tuesday, August 27, 2013
Statistical Graphs and Thinking About Data
This semester I have been more direct about why we graph data distributions. I think in the past my students viewed graphing as part of the response requirement but weren't really considering the ramifications of the data distribution. So I have decided to hammer this point home.
We generated some data and looked at graphs that displayed the data distributions. In looking at shape, I emphasized that symmetry tells us which summary statistics we can employ to describe the data. Moderately to extremely skewed data indicate that the mean and standard deviation are not appropriate to use; in this situation we should use the median and interquartile range (IQR).
The same goes for gaps and outliers. The presence of these adversely affects the mean and standard deviation. As I told my class, an outlier may be screwing up your analysis but you can't simply drop the outlier due to annoyance. We worked with how many movies each student watched in a movie theater this summer. The data contained an outlier. I said that unless we knew something about this data point, such as the individual being a movie critic, we were stuck working with the outlier included. In fact, our work just increased because we need to understand what impact, if any, the outlier is having on the overall distribution. This means we calculate summary statistics with and without the outlier included.
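To illustrate the with-and-without comparison, here is a small Python sketch. The movie counts are made up for illustration and are not the class data, and the quartiles come from the default method in Python's `statistics.quantiles`, which may differ slightly from the textbook or calculator convention.

```python
import statistics

# Hypothetical movie counts (NOT the class data); 40 plays the role of the outlier.
movies = [0, 1, 1, 2, 2, 3, 3, 4, 5, 40]


def summarize(values):
    """Return the summary statistics discussed above for one data set."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }


with_outlier = summarize(movies)
without_outlier = summarize([m for m in movies if m != 40])

for name in ("mean", "sd", "median", "iqr"):
    print(f"{name:>6}: with = {with_outlier[name]:6.2f}   "
          f"without = {without_outlier[name]:6.2f}")
```

In this made-up data the mean and standard deviation move a great deal when the outlier is removed, while the median barely changes, which is the reason for reporting the resistant measures when outliers are present.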
Modes produce a different issue. Multiple modes typically indicate that distinct sub-groups are present in the data. This means we'll need to determine if the sub-groups exist and then analyze each identified sub-group separately. More work again!
It is important to realize that graphing is a key step in understanding what you can and cannot do with your data. I am hopeful that students will gain a better appreciation of this fact and become better at determining appropriate statistical measures that can be applied to the data.
Monday, August 26, 2013
Starting Statistics with Generating Data
This year, I have gone back to beginning my Inferential Probability and Statistics course (a one-semester, non-AP statistics class) with data collection. After teaching an introductory college statistics course this summer, I liked how starting with experiments gets students engaged quickly.
I began this semester's course with the penny stacking experiment, as I described over the summer. It was engaging and enabled me to cover many topics of experimental design and to build understanding of vocabulary.
I will be spending time on observational studies next and then move into sample surveys. Along the way we will collect data that we can analyze. I'll use these data to work through analysis techniques and graphs.
The first week is off to a good start.
Friday, August 23, 2013
AP Statistics Report Writing Self-evaluation
Last post I described using mentor texts to help students become better statistical report writers. The second step in the process is to have students become more reflective about their writing. To do this, I share the scoring rubric for their first writing assignment. I have students go through the rubric and ask any clarifying questions they may have. I want to be sure they have a good grasp of the assignment's expectations.
Students swap papers and then use the scoring rubric to assess the paper they are handed. The scoring and paper are then discussed. Reading another student's paper gives additional exposure to ways of communicating statistical thinking. Each student now must use the feedback and learning from this scoring exercise to revise and improve their report.
I will score each final draft and provide written feedback. Students will then have one last opportunity to revise their work and make a final submission. I have found that this process builds important meta-cognitive supports to their learning.
Thursday, August 22, 2013
AP Statistics Report Writing and the Use of Mentor Texts
My school has been participating in a district-wide program on literacy. This has been going on for a number of years now. One tool that we were shown a few years ago was mentor texts. These are texts that are models of what good writing should look like. Students are able to read through and analyze these texts for elements that they should be striving to achieve in their own writing.
In statistics, there are any number of studies available to use for mentor texts. Unfortunately, most of these statistical reports are multiple pages (sometimes 20 or more pages) and are written for graduate and post-graduate-level readers. Definitely not appropriate for high school students.
Since one of my first writing assignments in AP Statistics requires the writing of a newspaper article, I decided to make use of data-driven newspaper articles for mentor texts. I make use of six articles that are brief, contain specific data references and graphs, and provide a decent introduction and conclusion regarding the topic.
I have used mentor texts for the past three years and have found that they help students focus on sticking to the facts and referencing explicit statistics. To remind students of what they found in the mentor texts, I post a list or a wordle of their findings.
Below is this year's wordle. As you can see, words such as percentage, data, examples, and charts all are prominent in the display. This puts students on a fast-track to writing better statistical reports.
Wednesday, August 21, 2013
Document Cameras and Tablet PCs
This year I have one AP Statistics class in my normal classroom and one in my computer lab. I am used to traveling back and forth between classroom and lab, so that is not much of a big deal. The computer lab isn't ideal in terms of configuration to have students work collaboratively, but we'll work around it.
I do use a document camera on a regular basis, so I took an extra one that was in our math office and set it up in the computer lab. I tested it out and everything was working beautifully on Monday.
Today in class we were going over exploratory data analysis for categorical data. One question asked students to create a segmented bar graph. In past years, students have struggled with creating a good graph and I use the document camera to display both well and poorly drawn graphs. So, as the discussion progressed, we reached the segmented bar graph piece and I asked students to share their graphs. I picked the first graph, turned toward the desk that held the document camera and NOTHING!!! Someone in my department had decided that the document camera was not being used enough across all of the periods and removed it without letting me know.
So there I sat, paper in hand with no document camera. I was not happy and I told my class, this is my not happy face (which they will see on occasion throughout the year). I had no choice but to move on.
During lunch I found out it was my good friend who made the decision. The only thing is he failed to tell me and made no plans to accommodate my need for a document camera during my AP Statistics class. Great.
We talked over the situation and as we talked, the idea bubbled up (I honestly don't recall who first mentioned it but I'll take credit since it's my blog) to take a picture of the document using a tablet PC. I then thought about simply connecting the tablet via USB and opening up the picture folder on the tablet and opening the picture in preview mode. Our school computers have Smart Notebook software and Smart Ink installed on them, so I can write comments and annotate the picture on the Smartboard and then capture it to a Smart Notebook file. My friend said it might be even better if the image could be displayed on individual computer screens. We actually have software installed in our computer lab that allows me to do this.
I tested the basic functionality out during my planning period and everything seems to work as envisioned. I'll have my next AP Statistics class in the lab on Friday. I'll try it out then to see how things go. If this works, it means we could buy tablet PCs for people in the department for roughly $200 versus spending $600 for a document camera. Plus, a tablet has so much more functionality!
So even though I was very annoyed today, it turns out that it could be a big blessing in disguise.
Thoughts on the First Day of Class
Classes started and I'm off and running on another school year. I spent time reviewing grading so students understand what is expected of them.
Establishing Group Norms
I also spent time on formally establishing group norms since I have students sitting in groups of three or four. I didn't go through this process last year and had issues with student behavior. By letting students set the norms and expectations of working in groups I hope to eliminate many of the poor behaviors I saw last year. My first step was to have each student write down three norms or expectations they had for working in groups. I then asked the students in their groups to come to a consensus on three norms. I then had groups share out their list and added items as we moved through the room. Once the entire list was up I asked students if there were any questions or concerns about any of the items. Finally I asked students to show their agreement by raising a thumb straight up; if there were any concerns or questions they should hold their thumb out to the side. Each class gave a thumbs up to their norms and they are now posted in the classroom.
A Safe Classroom Environment?
In one class, we were discussing ideas concerning why a survey was administered. We knew the facts as to who was asked questions and the types of questions asked, but were not told why the survey was given. As the discussion was wrapping up I asked if any other students had ideas. Two students whom I have had for the past two years were sitting next to each other. One student indicated that the other had an idea. I encouraged this second student to share his thoughts. He said several times that he really didn't have one. I then told the class "There are no incorrect responses and you all should feel safe sharing your thoughts, unless you say something stupid." This elicited many laughs from the class. Nothing like getting the class off to a good start.
Saturday, August 17, 2013
Getting Ready for the 2013-2014 School Year
After teaching an introductory statistics course at MSU Denver this summer, I have been tweaking my Inferential Probability and Statistics course content. I am also planning on introducing some new investigations and tasks to my AP Statistics course this year.
For the Inferential Probability and Statistics course, I have gone back to starting with a unit on generating data. The sequence will be experimental design, observational studies, and then survey sampling. From there, we'll move into a unit on organizing data and describing data distributions using summary statistics. The third unit will cover probability with an emphasis on simulations. The fourth unit will then transition to using probabilities and simulations to make inferences.
I will not provide a day-to-day recap of each lesson, as much of the material was documented last school year. I will spend time covering new activities and I will comment on how the sequencing is either helping or hindering students' understanding of statistical analysis.
I wish everyone a successful school year. Please feel free to post comments and questions regarding what you are doing in your classes.
Tuesday, July 23, 2013
Final Statistics Review
There's not much to report on today. Students turned in their project papers and there were some interesting topics that I am looking forward to reading. The review was mostly students working through problems of their choice and me walking around answering questions and providing guidance as needed. Students primarily worked in groups and were conscientiously working together to discuss how to work through problems and compare results. Given the compressed nature of the summer semester, I figured that allowing students time to work through problems would be the best use of time. While some students still seemed lost, many started to see how straightforward many of the former quiz and test questions really were.
Monday, July 22, 2013
Statistics Review - Two Sample Hypothesis Tests for Means and Probability
Today was the first of two days of review. The quiz on one-sample hypothesis tests for means indicated that students were still struggling with this idea. A larger segment than I would care to see had found the sample mean and simply compared it to the hypothesized value. There was no reference to the sampling model or to the probability of drawing a sample with the characteristics they saw.
To start things off, I reiterated the ideas behind statistical analysis and inference. I drew a large cloud on the board and said this was the population we were studying. Inside the cloud I drew a small circle and said this represented the sample we drew. The idea is to draw conclusions about the entire population from the small snapshot that we took via our sample.
I drew more circles throughout the population cloud, some of which overlapped. Each sample we draw provides a different snapshot of the population. We need to account for every possible sample we draw. This is where the sampling distribution model comes into play. The sampling distribution model describes what we should expect given the sample size we have drawn from the population.
We cannot simply compare our sample mean to the hypothesized population mean. Sure, this time it may be greater than our hypothesized value, but the next sample drawn could show a mean that is lower or higher. Every sample could and probably will be different. We need to take the one sample we drew and use that to draw conclusions about all the possible samples that could be drawn and from this draw a conclusion about the population we are studying.
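The picture of many circles inside the population cloud can be imitated with a short simulation. This is only a sketch under assumed values: the population below is invented (roughly normal with mean 50 and standard deviation 10), and the sample size of 25 and the 1,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population (NOT course data): 10,000 values with mean near 50.
population = [random.gauss(50, 10) for _ in range(10_000)]


def sample_means(pop, sample_size, num_samples):
    """Draw many samples and record the mean of each one."""
    return [
        statistics.mean(random.sample(pop, sample_size))
        for _ in range(num_samples)
    ]


means = sample_means(population, sample_size=25, num_samples=1_000)

print("population mean:     ", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(means), 2))
print("sd of population:    ", round(statistics.stdev(population), 2))
print("sd of sample means:  ", round(statistics.stdev(means), 2))
```

Each entry in `means` is one "circle". The sample means scatter around the population mean, with much less spread than the individual values, and that spread is what the sampling distribution model describes.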
With that said, we went through the quiz questions, working through results. As students did this, I passed out sets of two dice; one die in each set was colored green and the other blue. I asked students to roll both dice six times and to count the number of times each colored die won. As soon as we finished going through the quiz problems I collected the dice and we started working with the data that was generated.
I described the situation to the entire class, since not every student had a die set passed to them. I described the data collected and asked them what they expected to happen. Several students said they expected the number of wins for each die to be the same. From here I asked them to state null and alternative hypotheses.
H0: The mean number of blue die wins equals the mean number of green die wins (μb = μg)
Ha: The mean number of blue die wins does not equal the mean number of green die wins (μb ≠ μg)
I then asked the students if the samples we had were independent or not. Here there was some disagreement. One student said that if you knew the number of blue wins then you would also know the number of green wins. Another said that wouldn't necessarily be true since you could have ties. I then brought up that you might not know the exact number of wins but you certainly would know the maximum number of wins. This indicates that the two samples are not independent of each other. Students need to carefully consider the samples they draw as to whether or not there is any direct connection between the two samples.
I asked students to analyze the results of these samples and draw a conclusion about the data. There was a lot of confusion about how to analyze the data. With a matched-pairs test, you take the difference between the two values in each matched pair and then analyze the differences as a single sample.
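As a sketch of the matched-pairs calculation, the Python below computes the differences and the one-sample t statistic on them. The win counts are invented for illustration and are not the data my students collected. Turning the t statistic into a p-value still requires a t-table, calculator, or statistics package.

```python
import math
import statistics

# Hypothetical win counts out of six rolls per student (NOT the class data).
blue_wins = [2, 3, 1, 2, 3, 2, 1, 2]
green_wins = [3, 3, 4, 3, 2, 4, 4, 3]


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for a matched-pairs t-test."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # Standard error of the mean difference, treating differences as one sample.
    se = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / se, n - 1


t_stat, df = paired_t(blue_wins, green_wins)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The null hypothesis μb = μg corresponds to a mean difference of zero, which is why the differences can be analyzed as a single sample.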
Their analysis resulted in a p-value of 0.04. Students were somewhat shocked by this result. The conclusion was to reject the null hypothesis and conclude that the mean numbers of wins for the blue die and the green die were different. I told them not to be too shocked, as the green die had two faces showing a five and no face showing a two.
Next, I asked several students to roll an individual die 3-4 times and record the values rolled. I made two columns on the board, one for blue die rolls and one for green die rolls. Students recorded their results on the board. I then asked if these two samples were independent. The response this time was a resounding "Yes!"
I asked what an appropriate null hypothesis would be and what the alternative hypothesis should be. Because students now knew that the green die was "unfair", they concluded that the alternative should be that the green die rolls would exceed the blue die rolls. We had
H0: The mean roll of the blue die equals the mean roll of the green die (μb = μg)
Ha: The mean roll of the blue die is less than the mean roll of the green die (μb < μg)
I asked students to analyze the samples and draw a conclusion. In this situation, students had questions about how to calculate their degrees of freedom. While the text does provide a formula, it is long and complicated, and a calculator or computer can easily do the computation for you. I told students this. For the few students whose calculators would not figure the degrees of freedom for them, I told them to simply add the degrees of freedom for the two individual samples, which provides an upper bound on the number of degrees of freedom. For most reasonably large samples the results will not be affected.
For this analysis, students calculated that the p-value was 0.056. Using a 5% significance level we would fail to reject the null hypothesis. A student pointed out that this was the wrong decision. For our sample, we committed a Type II error. We discussed how this error differs from a Type I error.
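For the curious, the long formula is the Welch-Satterthwaite approximation, which is what most calculators use for a two-sample t-test without pooling. Here is a sketch in Python with invented rolls (not the values recorded on my board) that also shows the shortcut acting as an upper bound.

```python
import statistics


def welch_df(sample_a, sample_b):
    """Welch-Satterthwaite approximate degrees of freedom for two samples."""
    va = statistics.variance(sample_a) / len(sample_a)
    vb = statistics.variance(sample_b) / len(sample_b)
    return (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )


# Hypothetical rolls (NOT the class data).
blue_rolls = [1, 2, 3, 3, 4, 5, 6, 2, 4, 6]
green_rolls = [1, 3, 4, 5, 5, 6, 5, 4, 3, 6]

approx_df = welch_df(blue_rolls, green_rolls)
upper_bound = (len(blue_rolls) - 1) + (len(green_rolls) - 1)  # the shortcut
lower_bound = min(len(blue_rolls), len(green_rolls)) - 1      # conservative choice

print(f"Welch df = {approx_df:.2f} (between {lower_bound} and {upper_bound})")
```

The approximate value always falls between the smaller of the two individual degrees of freedom and their sum, so the shortcut is slightly optimistic while the smaller value is the cautious choice.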
To confirm that we, in fact, committed a Type II error, I asked students to calculate the expected value (mean) of rolling the green die. This entails constructing a probability model and determining the expected value. Many students struggled with this but were able to complete the task with some assistance. I then asked the class to compute the expected value of the blue die. The values were 4 and 3.5 respectively. So, we did indeed commit a Type II error.
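The probability models can be written out in a few lines of Python. This sketch assumes the green die's remaining faces are the standard 1, 3, 4, and 6, and that every face is equally likely.

```python
from fractions import Fraction

blue_faces = [1, 2, 3, 4, 5, 6]
green_faces = [1, 3, 4, 5, 5, 6]  # assumed layout: two fives, no two


def expected_value(faces):
    """Sum of face value times probability, each face having probability 1/6."""
    return sum(Fraction(face, len(faces)) for face in faces)


print("E(green) =", expected_value(green_faces))  # 4
print("E(blue)  =", expected_value(blue_faces))   # 7/2
```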
I then asked students to determine the probability of the green die roll exceeding the blue die roll. Students seemed baffled as to how to proceed. I told them they needed to consider all the possible outcomes and which of those met the desired criteria. Listing out the 36 possible outcomes, it becomes readily apparent that there are 18 outcomes in which the green roll exceeds the blue roll, resulting in P(green > blue) = 0.5. Proceeding further, I asked what P(green = blue) equals. Students used their outcomes and came to a result of 1/6. This baffled them for some moments as they seemed to expect it would be different.
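Listing the 36 outcomes by hand is the point of the exercise, but the count is easy to confirm with a short enumeration. As before, this assumes the green die's faces are 1, 3, 4, 5, 5, 6 and that all faces are equally likely.

```python
from fractions import Fraction
from itertools import product

blue_faces = [1, 2, 3, 4, 5, 6]
green_faces = [1, 3, 4, 5, 5, 6]  # assumed layout: two fives, no two

# All 36 equally likely (green, blue) pairs.
outcomes = list(product(green_faces, blue_faces))


def probability(event):
    """Fraction of the 36 outcomes for which event(green, blue) is true."""
    favorable = sum(1 for g, b in outcomes if event(g, b))
    return Fraction(favorable, len(outcomes))


p_green_wins = probability(lambda g, b: g > b)
p_tie = probability(lambda g, b: g == b)
p_blue_wins = probability(lambda g, b: g < b)

print(p_green_wins, p_tie, p_blue_wins)  # 1/2 1/6 1/3
```

The tie probability is still 1/6 because each blue face matches exactly one of the six green faces on average: the blue 5 matches two green faces, the blue 2 matches none, and every other blue face matches one.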
I asked students to pick 2-3 questions from past quizzes and exams or chapter review problems that they did not know how to complete. We will use these as a basis for further review next class.
I concluded by working with students who had questions about the project reports. Most of the reports that I was shown looked to be on the right track. A few needed to focus more on comparing the two data sets rather than simply viewing them as two distinct, non-related entities. I am looking forward to seeing what they produce for their final versions.
To start things off, I reiterated the ideas behind statistical analysis and inference. I drew a large cloud on the board and said this was the population we were studying. Inside the cloud I drew a small circle and said this represented the sample we drew. The idea is to draw conclusions about the entire population from the small snapshot that we took via our sample.
I drew more circles throughout the population cloud, some of which overlapped. Each sample we draw provides a different snapshot of the population. We need to account for every possible sample we draw. This is where the sampling distribution model comes into play. The sampling distribution model describes what we should expect given the sample size we have drawn from the population.
We cannot simply compare our sample mean to the hypothesized population mean. Sure, this time it may be greater than our hypothesized value, but the mean of the next sample drawn could be less or more. Every sample could, and probably will, be different. We need to take the one sample we drew, use it to draw conclusions about all the possible samples that could be drawn, and from this draw a conclusion about the population we are studying.
With that said, we went through the quiz questions, working through results. As students did this, I passed out two sets of dice; in each set, one die was colored green and the other blue. I asked students to roll both dice six times and to count the number of times each colored die won. As soon as we finished going through the quiz problems, I collected the dice and we started working with the data that was generated.
I described the situation to the entire class, since not every student had a die set passed to them. I described the data collected and asked them what they expected to happen. Several students said they expected the number of wins for each die to be the same. From here I asked them to state null and alternative hypotheses.
H0: The mean number of blue dice wins equals the mean number of green dice wins (μb = μg)
Ha: The mean number of blue dice wins does not equal the mean number of green dice wins (μb ≠ μg)
I then asked the students if the samples we had were independent or not. Here there was some disagreement. One student said that if you knew the number of blue wins then you would also know the number of green wins. Another said that wouldn't necessarily be true since you could have ties. I then brought up that you might not know the exact number of wins but you certainly would know the maximum number of wins. This indicates that the two samples are not independent of each other. Students need to carefully consider the samples they draw as to whether or not there is any direct connection between the two samples.
I asked students to analyze the results of these samples and draw a conclusion about the data. There was a lot of confusion about how to analyze the data. With a matched pair test, you take the difference in values between the matched numbers and then analyze the differences as a single sample.
Their analysis resulted in a p-value of 0.04. Students were somewhat shocked by this result. The conclusion was to reject the null hypothesis and conclude that the number of wins for the blue die and the green die were different. I told them not to be too shocked as the green die had two number fives on its face and no number two.
Next, I asked several students to roll an individual die 3-4 times and record the values rolled. I made two columns on the board, one for blue die rolls and one for green die rolls. Students recorded their results on the board. I then asked if these two samples were independent. The response this time was a resounding "Yes!"
I asked what an appropriate null hypothesis would be and what the alternative hypothesis should be. Because students now knew that the green die was "unfair," they concluded that the alternative should be that the green die rolls would exceed the blue die rolls. We had:
H0: The mean roll of the blue die equals the mean roll of the green die (μb = μg)
Ha: The mean roll of the blue die is less than the mean roll of the green die (μb < μg)
For this analysis, students calculated that the p-value was 0.056. Using a 5% significance level, we would fail to reject the null hypothesis. A student pointed out that this was the wrong decision: for our sample, we committed a Type II error. We discussed how this error differs from a Type I error.
Thursday, July 18, 2013
Hypothesis Tests of Means Using Two Samples
The focus today was on testing hypotheses about means when you have two samples. We also had a quiz today covering confidence intervals and hypothesis tests of a mean for a single sample. I reviewed material that students had questions about, working through a couple of examples that addressed their questions.
Before taking the quiz, I covered hypothesis testing for paired samples. Paired samples include before and after measurements, working with siblings or spouses, and, as in our first day experiment, comparing things such as dominant and non-dominant measurements.
I used the coin stacking data from our first class, which is readily recognizable as paired data. I asked the class to formulate null and alternative hypotheses for this problem. We discussed a one-tail alternative, that the dominant hand performs better, and a two-tail alternative, that the two hands perform differently. Which one you use depends entirely on the question you are addressing. If you are trying to answer the question, "Will your dominant hand stack more coins than your non-dominant hand?" then your alternative hypothesis will be that the mean dominant hand stack is greater than the mean non-dominant hand stack, a one-tail upper test. If you ask the question, "Will my two hands stack the same number of coins?" then your alternative hypothesis is that the mean dominant hand stack is not the same as the mean non-dominant hand stack, a two-tail test.
To perform a paired sample analysis, you take the difference between your individual paired data values and treat these differences as a single sample. All the tests, confidence intervals, and results work exactly the same as a single sample test, except that the results are for the mean of the differences.
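As a rough sketch of the paired procedure just described, here it is in Python. The coin counts below are invented purely for illustration (they are not the class data), and the function name is my own. Turning the t statistic into a p-value still needs a t-table, calculator, or statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second, hypothesized_diff=0.0):
    """Return (mean difference, standard error, t statistic, degrees of freedom)."""
    if len(first) != len(second):
        raise ValueError("paired data must have the same number of values")
    # Step 1: reduce the two paired lists to a single sample of differences.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)
    # Step 2: treat the differences exactly like a one-sample problem.
    se = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 divisor
    t = (d_bar - hypothesized_diff) / se
    return d_bar, se, t, n - 1


# Hypothetical coin counts, invented for illustration only.
dominant = [12, 15, 9, 14, 11, 13]
non_dominant = [10, 13, 9, 11, 10, 12]
d_bar, se, t, df = paired_t_statistic(dominant, non_dominant)
print(f"mean difference = {d_bar:.2f}, SE = {se:.3f}, t = {t:.2f}, df = {df}")
```

Note that the raw counts never enter the calculation after the first step; only the differences do.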
After the quiz we covered the situation where you work with two independent samples. In this case the sample sizes may be different and there is no connection between the two samples. The combined sampling model we use is still a t-model, centered on the difference of the two sample means. Its standard error is calculated by adding the variances of the two sample means (the squares of the individual standard errors) and then taking the square root. I showed students how to use their calculator to perform this test. The calculator or software will provide the degrees of freedom. As I told the class, there is a formula to determine the degrees of freedom but it is not something you want to calculate by hand. We worked through an example involving operation times using two different protocols and conducted a hypothesis test and created a confidence interval.
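For the curious, here is a sketch of what the calculator is doing. I am assuming the degrees-of-freedom formula in question is the usual Welch-Satterthwaite approximation for unpooled variances. The summary numbers are hypothetical and are not the surgery-time data from the book.

```python
from math import sqrt


def two_sample_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference in means, standard error, t statistic, Welch df)."""
    v1 = sd1 ** 2 / n1  # variance of the first sample mean
    v2 = sd2 ** 2 / n2  # variance of the second sample mean
    se = sqrt(v1 + v2)  # variances add; standard errors do not
    diff = mean1 - mean2
    t = diff / se       # tests H0: the two population means are equal
    # Welch-Satterthwaite approximation (assumed to match the calculator).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se, t, df


# Hypothetical summary statistics with unequal sample sizes.
diff, se, t, df = two_sample_summary(50, 6, 9, 45, 8, 16)
print(f"difference = {diff}, SE = {se:.3f}, t = {t:.3f}, df = {df:.2f}")
```

The degrees of freedom generally come out as a non-integer, which is one reason to leave this step to the calculator or software.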
Students are working on finishing their project papers over the next few days and trying to get ready for the final. The project is due next week and the final is one week from today.
Below is an outline of today's lesson with italicized comments enclosed in square brackets [like this].
Inference for 2 Population Means
· What would you try if you wanted to compare two samples, such as heights of males and females? [reversed the order of presentation since the paired sample was good review for the quiz]
· If the samples are independent we can take the difference between the two
  o Will still have a t-model
  o Mean is difference of the means
  o Standard error is determined by adding variances [due to limited time just stated this result versus having students work through sample data to see that this was the case]
  o Degrees of freedom: let your calculator or computer figure it out
· Proceed with same process as single sample testing except now you are working with hypotheses about the difference in values
  o Confidence intervals
  o p-value and inference
· Practice two independent samples [used example in book for surgery times; samples were of different sizes, which students typically bring into question. The point is we are testing a hypothesis about the difference in the means of each sample; the individual sample sizes will only affect the variances and hence standard deviations when the sampling models are combined.]
· What if two samples are not independent, such as coin stacking?
  o Take differences in values since they are paired
  o Now have a one-sample situation
  o Proceed as if dealing with a single sample and infer about difference
    § The differences should be roughly symmetrical, the raw data doesn’t matter
· Practice paired sample [Used coin stacking from the first day; this makes a nice closure to the entire semester.]
Tuesday, July 16, 2013
Statistical Testing Errors
The focus today was on statistical testing errors, specifically the nature and consequences of making Type I and Type II errors.
The class was asking good questions about hypothesis testing, so I took some time to go through these. I also discussed the differences between creating confidence intervals and conducting hypothesis tests. For confidence intervals, we are interested in estimating a value. We use our level of confidence to establish the range of values that "make sense" based upon the sample data that we collected. For hypothesis testing we are simply giving a yea or nay to a specific value (the hypothesized population parameter). If our well-collected data is consistent with the null hypothesis it's a yea vote; if our data is inconsistent with the null hypothesis it's a nay vote.
I used a problem from the last lesson regarding average apparel expenditures to illustrate this idea. We worked through the hypothesis test and rejected the null hypothesis. Since the null hypothesis was tossed out, it is only natural to wonder what a more reasonable value for the population mean would be. A confidence interval is used to establish a new range of possible values with our best guess simply being the mean of our sample.
From here I moved to the issue of statistical testing errors. I have a chart that I can use to discuss the different errors and how values like the level of significance and power of the test relate.
We spent quite a bit of time discussing these and the ramifications of these errors within the context of a problem. I like to use a drug test as an example. Would the drug manufacturer rather see a Type I or Type II error made and why? What about the drug manufacturing regulator? Consider the scenario of a company evaluating a sales training program: what happens if they make a Type I error and what happens if they make a Type II error? These questions and discussions really help bring meaning to the errors.
After this, I pointed out graphically the relationship between significance level and power. Basically if the level of significance is reduced (making the value smaller) the power is also reduced and vice versa. The only way to reduce the level of significance without affecting power is to increase sample size.
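The same relationship can be seen numerically. This is only a sketch: to keep the arithmetic simple it assumes a one-tail upper z-test with a known population standard deviation rather than a t-test, and every number in it is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail_z(alpha, mu0, mu_true, sigma, n):
    """Power of a one-tail upper z-test of H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean lands above this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(sample mean falls at or below the cutoff | true mean is mu_true)
    beta = NormalDist(mu_true, se).cdf(cutoff)
    return 1 - beta


# Hypothetical values chosen only to show the pattern.
for alpha in (0.10, 0.05, 0.01):
    for n in (25, 100):
        p = power_upper_tail_z(alpha, mu0=100, mu_true=103, sigma=15, n=n)
        print(f"alpha = {alpha:.2f}, n = {n:3d}, power = {p:.3f}")
```

Reading down the output for a fixed n, power drops as alpha shrinks; comparing n = 25 with n = 100 at the same alpha shows the larger sample buying the power back.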
We then looked at non-parametric testing methods for when you have small sample sizes that are skewed. This was to make students aware that other methods are available for non-conforming data sets.
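One such method is the Wilcoxon signed-rank test named in the outline for this lesson. Below is a sketch of how its rank sums are commonly computed; the book's section may handle ties or zero differences differently, and the sample is hypothetical. Deciding significance from the rank sums still requires a table or software.

```python
def signed_rank_statistic(values, hypothesized_median):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Differences from the hypothesized median; zero differences are dropped.
    diffs = [v - hypothesized_median for v in values if v != hypothesized_median]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # tied values share the average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical small, skewed sample; not data from the class or the book.
sample = [3, 5, 6, 8, 9, 14, 30]
print(signed_rank_statistic(sample, hypothesized_median=7))  # (18.0, 10.0)
```

Because only the ranks of the differences are used, the extreme value of 30 counts no more than any other largest value would, which is what makes the method attractive for skewed data.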
I spent some time showing students how a calculator assists in calculating confidence intervals, t-scores, and p-values. We practiced using the calculator on a couple of different problems; one we had worked before and one new one.
I also passed out sample project reports and rubrics and gave students time to look through the samples to see how items in the report represented the requirements of the rubric.
Finally, it was course evaluation time. I expect that I will be scored low on a number of fronts as many students have struggled and their grades are lower than they would like.
Below is an outline of today's lesson with italicized comments enclosed in square brackets [like this]
· Test errors – may be wrong and won’t know because usually won’t know population values
  o Show diagram of correct/incorrect hypotheses versus decisions and errors
  o Discuss alpha, beta and power
  o Show chart connecting relationship between alpha and beta
  o Discuss what happens if sample size increases and its impact on alpha, beta, and power
· What if you have a small sample that is skewed?
  o Use the Wilcoxon signed-rank test
  o Explained in section 9.6 of the book [provided a slide with example to describe the process and how to calculate the test statistic]
Monday, July 15, 2013
Hypothesis Test for One-Sample Mean
Today we transitioned from creating confidence intervals to conducting hypothesis tests. While mechanically these are closely related, there are a few significant points to keep in mind:
- Confidence intervals are based upon a model developed using our sample statistics. We use our sample data to estimate a probable value for the true population mean.
- Hypothesis tests are based upon a hybrid model that is centered on an assumed value for the population mean but makes use of our sample standard deviation to create a standard error for the model.
- Confidence intervals are equivalent to a two-tailed hypothesis test: we are excluding extremely small and extremely large values.
- Hypothesis tests can be one-tailed or two-tailed; there is not an equivalent confidence interval for a one-tailed test.
We will get into test errors and their meanings next class. I will also provide project write-up examples so students better understand the end product they need to produce. Ideally I would have a range of samples and have students rank-sort the papers, and we would discuss whether they were A-, B-, C-level or worse. Unfortunately, I only have high-quality examples, so I'll have them read through and look for characteristics from the scoring rubric that are demonstrated in the papers.
Below is the outline of today's lesson with italicized comments enclosed in square brackets [like this].
Hypothesis Test for One Population Mean
· Compare hypothesis testing to trial
· Provide hypotheses mentor texts [mentor texts are problem statements along with appropriate hypothesis statements that show what these hypothesis statements look like. Four examples were provided that included one-tail upper, one-tail lower, and two-tailed alternative hypotheses.]
  o Use always, sometimes, never [In the context of the problem statement, what do you always see, sometimes see, and never see in a hypothesis statement.]
  o Discuss what was seen [At this point students have a better sense of what is being tested]
· Hypotheses are statements about population parameters
  o Provide examples for H0 and Ha [Examples included commentary on hypothesis statement structure and content.]
· Significance level and alpha values
  o Discuss picking a probability for which you would reject your hypothesis: the alpha level
  o Connect to critical values and significance level
· Usually don’t know the population standard deviation so use t-test [Just stated this was the case and we were working with t-model]
  o Works the same way
  o Must include degrees of freedom so know which t-model was used [The degrees of freedom (df) specifies the exact t-model being used from family of all possible t-models]
  o Can use with moderate to large samples, even if data is not symmetric [The t-test is a robust test that works reasonably well with even relatively small samples that are somewhat skewed. Provided students a rule of thumb for different sample size ranges but basically said unless the sample is small and highly skewed to not worry about it.]
· Discuss meaning of p-value
  o Conditional probability: given the null hypothesis is true, what is the probability of seeing a random sample at least as unusual as the one that was drawn?
  o The more unusual the sample the smaller the p-value; it’s not likely to be seen
  o Conclusion: either we drew a bad sample or the null hypothesis is wrong
  o If we followed good data collection procedures then the conclusion must be the null hypothesis is incorrect
· Use sibling data and have students calculate a p-value
  o What if our class is viewed as a sample; does sample support claim that the mean number of siblings in the US is 1.86?
  o Work through problems in the book
Thursday, July 11, 2013
Confidence Intervals, Margin of Error and the t-model
We continued looking at confidence intervals today, specifically focusing on critical values, margin of error, and the t-model. We used the class sibling data set and found critical values and confidence intervals at the 80% and 98% confidence levels. We then used these intervals to discuss margin of error. Using the formula for margin of error, we were then able to see what sample size was needed to obtain a margin of error roughly half of the current one.
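Here is a sketch of that sample-size calculation in Python. It uses z critical values and treats the standard deviation as known, as we did at this point in the lesson. The standard deviation and class size below are hypothetical stand-ins, not the actual sibling data.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_z(confidence):
    """z* with the central `confidence` share of a normal model between -z* and z*."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def margin_of_error(confidence, sigma, n):
    """ME = z* times the standard deviation of the sampling model."""
    return critical_z(confidence) * sigma / sqrt(n)


def sample_size_for(confidence, sigma, target_me):
    """Solve ME = z* * sigma / sqrt(n) for n and round up to a whole number."""
    return ceil((critical_z(confidence) * sigma / target_me) ** 2)


# Hypothetical standard deviation and class size, for illustration only.
sigma, n = 1.5, 25
for confidence in (0.80, 0.98):
    me = margin_of_error(confidence, sigma, n)
    needed = sample_size_for(confidence, sigma, me / 2)
    print(f"{confidence:.0%}: z* = {critical_z(confidence):.3f}, "
          f"ME = {me:.3f}, n for half the ME = {needed}")
```

Because n sits under a square root, cutting the margin of error in half takes about four times the sample size, whatever the confidence level (rounding up can add one).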
From here we moved to the t-model. This is typically what you end up working with because you know neither the mean nor the standard deviation of the population. It is natural to replace the population standard deviation with the sample standard deviation. However, this introduces more variability into the model; specifically, every sample size results in a slightly different standardized model. This family of models is known as t-models, and the specific family member used is determined by the sample size through the degrees of freedom (df), which is one less than the sample size, i.e. df = n - 1.
We worked through the same examples we used before but no longer assumed the population standard deviation and sample standard deviation were the same. In this case, since we are using the sample standard deviation, sx, to estimate the population standard deviation, we distinguish this by calling the t-model's standard deviation a standard error. Otherwise, t-models behave and are used similarly to a normal model.
We were in the computer lab today, so I was able to show students how to conduct their analyses using the Minitab software. Although we won't cover hypothesis testing until next class, there were enough foundational pieces in place that students could follow the general idea and see how they could proceed with their projects.
Below is an outline of the lesson along with italicized comments enclosed in square brackets [like this].
· Margin of error
  o Length of CI/2 or value of what is being added and subtracted to mean [actually looked at value being added/subtracted first and then mentioned the interval length divided by 2 gives same value]
  o Calculate margin of error for two practice confidence intervals
· Estimate sample size needed
  o Solve algebraically starting with ME value [worked through a couple of problems using different confidence levels and margins of error]
· Typically don’t know the population standard deviation, just like we don’t know the population mean – what can we do
  o Use sample standard deviation for population standard deviation
  o This introduces more error
    § No longer have standard deviation; have standard error
· t-model
  o looks like normal model, same basic properties
  o as sample size increases looks more and more like a normal model
· t-table
  o in book and handout [book did not include a t-table that provides df and t-score and then shows percentage in upper tail]
· Confidence intervals
  o Same as before except use t-score and t-table instead of z-score and normal table
  o Practice problem as before but assume don’t know population standard deviation
  o Use calculator [introduced this after making use of tables on several intervals]
· Sample size for t-model – worst case is to use z-score since don’t know sample size
  o Can get a better estimate after by recalculating [didn't get into this, just have them estimate using z-score]
Tuesday, July 9, 2013
Confidence Intervals for Population Means Assuming Known Variance
Today we transitioned from looking at the sampling model of sample means to working with this model to construct confidence intervals. I first went through looking at subjective confidence intervals. This allows students to see that as they gain confidence their interval widens to capture more values. It also allows me to communicate how confident we are that the true mean lies within the interval created.
From here I moved to asking students to consider how we could make use of the central limit theorem and the sampling model. I used the class's sibling data and assumed that the class standard deviation was in fact equal to the population standard deviation. We created the sampling model and I drew a graph of this model on the board with the mean and standard deviations labeled above and below the mean. I then asked students what would be the 68.26% confidence interval? Once students grasped that it was just the interval from one standard deviation below to one standard deviation above the mean, they were much quicker about determining the 95.44% confidence interval. I hadn't labeled the graph beyond two standard deviations, so I asked them what the 99.74% confidence interval would be? Most students found the new end points although a few still had questions.
I pointed out that what we had done to construct the confidence interval was to add or subtract an integer multiple of the sampling model standard deviation from the mean, i.e. μ ± Nσ where N = 1, 2, 3. This is all well and good, but saying we are 95.44% confident seems a bit much; it would be nicer to have confidence levels at whole-number percentages rather than at whole-number multiples of the standard deviation.
I asked students what z-scores would result in a 95% confidence interval rather than a 95.44% confidence interval. This threw many of them for a loop. It was obvious that they still were not comfortable working with normal models. After a bit more guidance students determined that z-scores of -1.96 and 1.96 are what were needed. I told the class the value of 1.96 was called a critical value as it was the z-score value that was needed to construct a 95% confidence interval.
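For anyone who wants to check these values without a table, here is a sketch using Python's standard library (not something we used in class). The function names are my own.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def coverage(z):
    """Share of a normal model within z standard deviations of the mean."""
    return STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z)


def critical_value(confidence):
    """z-score that leaves (1 - confidence) / 2 in each tail."""
    return STANDARD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)


for z in (1, 2, 3):
    print(f"within {z} SD: {coverage(z):.4f}")
print(f"95% critical value: {critical_value(0.95):.2f}")  # 1.96
```

The coverages print as 0.6827, 0.9545, and 0.9973. The 68.26%, 95.44%, and 99.74% figures above come from doubling four-decimal z-table entries, so they differ slightly in the last digit. The same `critical_value` function works for any other confidence level.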
For homework, I asked students to determine the critical values for a 90% and 99% confidence interval.
Below is the outline of today's lesson with italicized comments enclosed in square brackets [like this].
From here I moved to asking students to consider how we could make use of the central limit theorem and the sampling model. I used the class's sibling data and assumed that the class standard deviation was in fact equal to the population standard deviation. We created the sampling model and I drew a graph of this model on the board with the mean and standard deviations labeled above and below the mean. I then asked students what would be the 68.26% confidence interval? Once students grasped that it was just the interval from one standard deviation below to one standard deviation above the mean, they were much quicker about determining the 95.44% confidence interval. I hadn't labeled the graph beyond two standard deviations, so I asked them what the 99.74% confidence interval would be? Most students found the new end points although a few still had questions.
I pointed out that what we had done to construct the confidence interval was to add or subtract a integral multiple of the sampling model standard deviation away from the mean, i.e. μ ± Nσ where N = 1, 2, 3. This is all well and good but saying we are 95.44% confident seems a bit much; it would be nicer to have our confidence intervals at integral values rather than the multiples of the standard deviation.
I asked students what z-scores would result in a 95% confidence interval rather than a 95.44% confidence interval. This threw many of them for a loop. It was obvious that they still were not comfortable working with normal models. After a bit more guidance, students determined that z-scores of -1.96 and 1.96 were needed. I told the class that the value 1.96 is called a critical value, as it is the z-score needed to construct a 95% confidence interval.
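For anyone who wants to verify the critical value without a table, the following is a minimal sketch using the inverse of the standard normal model from Python's standard library. The function name `critical_value` is an illustrative choice, not anything we used in class.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z-score that leaves (1 - confidence) / 2 in each tail of the standard normal."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

z_star = critical_value(0.95)
print(round(z_star, 2))  # 1.96
```

The key step is splitting the leftover 5% evenly between the two tails, so the upper cutoff sits at the 97.5th percentile rather than the 95th.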
For homework, I asked students to determine the critical values for 90% and 99% confidence intervals.
Below is the outline of today's lesson with italicized comments enclosed in square brackets [like this].
Confidence Interval for One Population Mean
· Confidence questions
  o Want students to realize as they gain confidence the spread of values increases
· Take a sample, what is estimate of mean?
  o Use CLT to say best estimate is mean of sample
· How can you account for sample variation?
  o Can the normal model help?
· Every sample creates a different confidence interval [showed graphs of 20 confidence intervals developed from 20 samples, able to indicate that the interval may not contain the true population mean]
· Calculating confidence intervals
  o Find z-scores, multiply by standard deviation, add and subtract from sample mean [focused on just adding and subtracting integral values and had students use practice problem below to first construct a 95.44% confidence interval]
  o Use calculator [decided to hold off on this until next class]
· Practice – age of civilian workforce
  o Calculate 90% and 95% confidence intervals [used 95.44% confidence interval and then asked students to find 95% critical value, will continue this next class]
Monday, July 8, 2013
Building understanding of the Central Limit Theorem
Today I focused on building understanding of the Central Limit Theorem and how it can be used. I used a series of problems to help students work directly with sampling distributions and then moved to using the results of the central limit theorem.
To start things off, we had a data set of 5 basketball players and their heights. We looked at all possible samples of two individuals. Students found the mean height for each of the 10 samples. They calculated the mean and standard deviation of their 10 sample means, and we compared those to the mean and standard deviation of the population. We also constructed a histogram of the sample mean distribution. We then compared these results to the theoretical model. In this case, the means aligned and the standard deviation was off slightly, but it was close. As I explained to the class, models are useful to help explain behavior, but they may not be accurate. We also calculated the probability of the sample mean equaling the population mean and the probability of the sample mean being within 1.0 inch of the population mean.
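The enumeration can be reproduced with a short script. The five heights below are hypothetical (I am not reproducing the class data set), and the finite population correction at the end is offered only as one likely reason the observed standard deviation differs from the σ/√n model when sampling without replacement from such a small population.

```python
from itertools import combinations
from math import sqrt
from statistics import pstdev

heights = [72, 74, 76, 78, 80]  # hypothetical heights in inches (assumption)
n = 2                           # sample size

# Mean of every possible sample of two players (10 samples in all).
sample_means = [sum(s) / n for s in combinations(heights, n)]

pop_mean = sum(heights) / len(heights)
pop_sd = pstdev(heights)

mean_of_means = sum(sample_means) / len(sample_means)
sd_of_means = pstdev(sample_means)   # observed spread of the sample means
model_sd = pop_sd / sqrt(n)          # what the sigma / sqrt(n) model predicts

# Finite population correction for sampling without replacement.
N = len(heights)
corrected_sd = model_sd * sqrt((N - n) / (N - 1))

# Probabilities found by counting samples.
p_equal = sum(m == pop_mean for m in sample_means) / len(sample_means)
p_within_1 = sum(abs(m - pop_mean) <= 1.0 for m in sample_means) / len(sample_means)

print(mean_of_means, pop_mean)
print(round(sd_of_means, 3), round(model_sd, 3), round(corrected_sd, 3))
print(p_equal, p_within_1)
```

With these made-up heights the mean of the sample means matches the population mean exactly, while the observed standard deviation of the sample means comes out smaller than σ/√n and agrees with the corrected value.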
Next, we looked at a couple of situations, discussed the population and variable of interest and then compared the sampling distribution models for two different sample sizes. This helped students to get comfortable with specifying the sampling distribution model we were using.
Finally, we looked at two problem situations that assumed specific population parameters and then asked what percentage of samples of a certain size would fall within given ranges. This basically gets back to finding z-scores and working with a normal model. Students still wanted to use the population standard deviation when determining probabilities, but with some reinforcement, most students seemed to understand why the sampling distribution had a different standard deviation.
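To illustrate why the choice of standard deviation matters, here is a rough sketch with invented parameters (population mean 100, population standard deviation 15, samples of size 25); none of these numbers come from the problems we used in class, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_between(low, high, mu, sigma, n):
    """P(low <= sample mean <= high) under the normal sampling model."""
    model = NormalDist(mu, sigma / sqrt(n))  # sampling model uses sigma / sqrt(n)
    return model.cdf(high) - model.cdf(low)

# Hypothetical parameters (assumption): mu = 100, sigma = 15, n = 25.
p_sample_mean = prob_sample_mean_between(97, 103, mu=100, sigma=15, n=25)

# The common mistake: using the population standard deviation (same as n = 1).
p_individual = prob_sample_mean_between(97, 103, mu=100, sigma=15, n=1)

print(f"{p_sample_mean:.4f} vs {p_individual:.4f}")
```

Here the sampling model has a standard deviation of 15 / 5 = 3, so the range 97 to 103 is one standard deviation either side of the mean for sample means, but only a fifth of a standard deviation either side for individual values.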
Afterward, we went through a review activity that I use often and described in a previous post. The second mid-term is tomorrow, after which we will begin developing the concept of confidence intervals.