Kellogg’s has issued Froot Loops fruit snacks in the shape of digits. (Side note: Cheez-Its need to get on board with this! There have been Scrabble tile Cheez-Its for years. We want numbers, operations and relational symbols!)

Naturally I bought some.

Tabitha (8 years old) asked, as she does in these scenarios, which occur with great frequency, “Are you just buying that because it’s mathy?”

Yes, sweetie. Yes I am.

But how to put them to use?

After many rejected ideas, here’s my favorite.

The task

Here are the contents of one pack.

That’s 5, 2, 9, 1, 3, 2, 4, 3, 9. Their sum is 38.

I’m setting the over/under on the sum of the next pack at 41. Do you want the over or the under? Why?
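
For anyone who wants a starting point, here is a minimal Python sketch of one naive model. It assumes, with no evidence from the box, that every pack holds nine pieces and that each digit from 0 through 9 is equally likely. Both are assumptions made for illustration, not facts about the product, and questioning them is part of the game.

```python
from fractions import Fraction

# Digits observed in the one pack described above.
pack = [5, 2, 9, 1, 3, 2, 4, 3, 9]
observed_sum = sum(pack)

# ASSUMPTION (not from the post): every pack has this many pieces,
# and each digit 0-9 is equally likely to appear.
pieces_per_pack = len(pack)
mean_digit = Fraction(sum(range(10)), 10)
expected_sum = pieces_per_pack * mean_digit

print("observed sum:", observed_sum)
print("expected sum under the uniform model:", float(expected_sum))
```

A different assumption (no zeros in the pack, say, or a variable number of pieces) changes the expected sum, so one pack of data is not much to go on.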

Play along with your questions and answers in the comments.

Michael Pershan boldly posted How I Teach Probability recently. I have struggled mightily with the teaching of probability over the years, so I took his video as an invitation to share and discuss.

The following is a copy/paste of an email I sent him (with, perhaps, some light editing).

I have been working on a longer, better edited version of the basic idea that I lay out here. But I’m working on tons of other stuff, too, so I cannot promise that the other one will ever see the light of day.

So here you are. Enjoy.

—

An important bit of honesty first: I am crap at teaching probability. Many reasons, mostly having to do with the ephemerality and abstraction of the topic, in contrast to the cold, hard demonstrable reality of (say) fractions.

With that said, I seem to have been making progress with an approach that adds one important feature to the sort of work you show in your video.

Your video has (1) a scenario, (2) guesses, (3) small group data collection, and (4) large group data collection.

To that I add (2.5) explicit discussion of students’ probability models that inform their guesses, and then (5) some related follow-up activities.

After collecting guesses, I ask someone in class to describe WHY they said what they said. My job, then, is to press for details and to capture what they are thinking as carefully and accurately as possible on the board.

Once I have that, I ask for someone to describe a different way of thinking about it. Even if slightly different, we get it on the board. My writing isn’t about me dispensing wisdom; it is about making a permanent record of a student’s model in enough detail that we will be able to test it later. For anything even moderately complicated (such as rolling two dice and considering their sum), I am disappointed if we don’t get at least four different models.

Now the data collection doesn’t just tell us who guessed closest; it can rule out at least some of our models.

As an example, in your scenario I would expect something like this:

Model 1: Zero multiples of three and one multiple of three are equally likely, so I’ll bet on either one. There are only four possibilities: 0, 1, 2 and 3 multiples of three, so the probability of each of these outcomes is 1/4.

Model 2: There is more than one way to get 1 multiple of 3. Our model should account for that. There are three ways to get 1 multiple of 3 (on Die1, Die2 or Die3), three ways to get 2 multiples of 3, and only 1 way each to get 0 or 3 multiples of 3. That’s eight possibilities, so the probability of getting 1 multiple of 3 is 3/8, while 0 multiples is 1/8.

Model 3: Each die can come up either “Mo3” or “NotMo3”. Getting all “NotMo3” has probability 1/8 (1/2*1/2*1/2), while getting one “Mo3” and 2 “NotMo3” is also 1/8, but there are 3 ways to do it, so it’s 3/8 altogether.

Et cetera

As many different ways of thinking about it as my students have, I will dutifully record. I encourage argument, as each new argument suggests a new model, which we can test.

Now before we roll those dice, we set up a way to test these models. In my experience, I have to devise that test. My students are not sophisticated enough to think this way yet.

In your example, and with these models, I might identify an important difference to be that Model 1 predicts equal numbers of 0 multiples of 3 and 1 multiple of 3, while Models 2 and 3 predict three times as many 1s as 0s. Hopefully we also have a model that predicts something in between.

So now as we roll, we are not just looking at which outcome happens more often, but at relative frequency. The model that better predicts the relative frequency that actually happens has got to be the better model.
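
To make that comparison concrete, here is a minimal Python sketch that puts each model’s predicted ratio of 1s to 0s next to a brute-force count. It assumes three fair six-sided dice, which the email does not state outright, and the names (MODEL_1, enumerate_counts, and so on) are invented for illustration.

```python
from fractions import Fraction
from itertools import product

# Each model maps "number of multiples of three" to a predicted probability.
MODEL_1 = {k: Fraction(1, 4) for k in range(4)}
# Models 2 and 3 reach the same numbers by different reasoning.
MODEL_2_AND_3 = {0: Fraction(1, 8), 1: Fraction(3, 8),
                 2: Fraction(3, 8), 3: Fraction(1, 8)}

def enumerate_counts(sides=6, dice=3):
    """Count multiples of three over every ordered roll.

    ASSUMPTION: fair dice, so all sides**dice ordered rolls are equally likely.
    """
    counts = {k: 0 for k in range(dice + 1)}
    for roll in product(range(1, sides + 1), repeat=dice):
        counts[sum(1 for face in roll if face % 3 == 0)] += 1
    total = sides ** dice
    return {k: Fraction(c, total) for k, c in counts.items()}

def ones_to_zeros(model):
    """How many times as many 1s as 0s the model predicts."""
    return model[1] / model[0]

ENUMERATED = enumerate_counts()
for name, model in [("Model 1", MODEL_1),
                    ("Models 2 and 3", MODEL_2_AND_3),
                    ("Enumeration", ENUMERATED)]:
    print(name, "predicts a 1s-to-0s ratio of", ones_to_zeros(model))
```

Under the fair-dice assumption the enumeration lands between the student models, which is the kind of in-between prediction the class data can then confirm or rule out.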

I had a lovely argument with Chris Lusto about luck yesterday. If you’re going to take the time to dig into this, read his post first.

As I see it, we have outlined two stands on the meaning of luck.

Luck describes the fact that what happened is not what we expected to happen. When what happened is better than we expected, that is good luck. When it is worse, that is bad luck. In this view, luck is not causal. It doesn’t make stuff happen. It simply describes the size and direction of variation from the mean.

Luck causes things to happen, or at least makes them more (or less) likely to happen. When what happened is better than we expected, we don’t just describe this as lucky, we attribute the cause of the event to luck. In this view, luck is viewed as a semi-controllable bias; a property with which some people or objects are imbued.

I argue that (1) is a probabilistically sophisticated view, but that (2) is much closer to the commonly held meaning. Furthermore, I argue that probability and statistics instruction that doesn’t directly address (2) is unlikely to make useful shifts in people’s thinking about probability and chance.

I got to thinking about all of this in reading Thinking Fast and Slow by Daniel Kahneman this summer. Kahneman doesn’t define luck. But he uses the term.

A few years ago, John Brockman, who edits the online magazine Edge, asked a number of scientists to report their “favorite equation”. These were my offerings:

success=talent+luck
great success=a little more talent+a lot of luck

The unsurprising idea that luck often contributes to success has consequences when we apply it…

He goes on to describe a common scenario in which a golfer scores very well on day 1 of a tournament and more poorly on day 2.

I don’t question whether Kahneman himself understands luck in the sense of either (1) or (2) above. I do think, however, that he is writing in a way that supports (2), and I think that’s unfortunate (heh).

The key for me is that Kahneman doesn’t write: “success=talent+random variation from the mean”. He writes “success=talent+luck” and he writes “luck often contributes to success”.

Luck contributes. In the same sense that talent contributes. That’s causal, and that’s conception (2).

Failing to see eye-to-eye with both Lusto and Kahneman (two formidable minds, to be sure), I went to my next favorite source of wisdom.

Me: What is luck? What does it mean to be lucky?

Griffin (8 years old): Well, if you found five dollars on the street, you would be lucky. Or if I opened my drawer and there was a bunch of gold in there, that would be lucky.

Me: Why would that be lucky?

G: Cause five dollars is a lot of money. And because gold is very valuable.

Me: What if you found a penny? Would that be lucky?

G: If it was heads-up it would be. Some people think that finding a penny heads-up is lucky.

Me: Right. But what would that mean for it to be lucky?

G: Well, if it was New Year’s Eve, you could make three wishes and they would come true.

Me: OK. And what if it weren’t New Year’s Eve. What if it’s just a regular day and you’re walking up to Romolo’s [our neighborhood pizza joint] and you found a penny on the ground, heads-up. What would it mean for that to be a lucky thing to have happen?

G: Well maybe only one wish would come true.

Me: So if something is lucky, it makes good things happen, like wishes coming true. Is that right?

My son (Griffin, nearly 7) saw Hex Bugs in a store months ago and was instantly smitten. His dream came true recently when my wife bought him one.

The Hex Bug now does shows in his home/stadium. See video.

There is something quite lovely about this guy’s random walk. But I can’t quite figure out what scenario I can put him in to generate a real probability problem. So I ask, What can you do with this?

As with all useful management strategies, I have no idea where I got this idea. If there’s something useful in it, please adapt and share widely.

At the college level, class attendance works very differently than it does in K-12 schools. Most college instructors don’t take attendance regularly. Of those who do, a large proportion pass around a sign-in sheet.

Coming out of a K-12 tradition, I know that taking roll call is an important part of learning my students’ names (this, incidentally, is not universally valued in higher ed). But I don’t have time to take roll every class period.

Furthermore, I know that attendance is better when students know I am keeping track. And it’s even better if attendance is part of their grade. It turns out that it doesn’t seem to matter how much of their grade, just that it’s part.

I used to factor attendance in by counting the number of missed classes, and running that number through a rubric. But as I mentioned I don’t have time to take attendance every class period, and sometimes I forget. And then I am imperfect at reconstructing attendance after class is over (strong personalities tend to get noticed as absent more often than wallflowers do, for example).

You see the problem.

Enter the attendance spinner.

On 22 randomly selected days each semester, I begin class by putting the spinner on the document projector and spinning with great dramatic flourish. Each student present when the spinner is spun receives the number of attendance points that comes up on the spinner. Attendance is taken, attendance points are recorded and we begin class.

In order to get maximum credit for attendance towards your grade, you need to accumulate more than 100 attendance points. The next category is for 90-100 attendance points, etc. There are five categories.

You don’t know in advance on which days I will spin, but I do; they’re determined by a Fathom document at the beginning of the semester and posted where I can refer to it in my office.

If the 22 spins don’t total at least 101 points, we will adjust the cutoffs for the categories proportionately. This has never happened.

If you are absent or come after the spin, you get zero attendance points for the day.

I will not tell you whether we will spin next class, nor whether we spun last class (ask your classmates; odds are you have at least 40 of them).

If I was supposed to spin and I forget, I’ll do it next class. You’ll never know about it, though.

Recall that, in playing and analyzing dice games appropriate for middle school students’ study of probability, I was challenging my secondary methods students (to whom I refer as “483 students” after the course number, not because of how many of them there are) to justify that there are 3 ways, not 2, to roll a 10 when rolling two six-sided dice. The idea is that my future teachers know that there are 3 equally likely possibilities: (1) a 6 and a 4, (2) a 4 and a 6, and (3) two 5’s. But lots of seventh grade students do not know this. Instead, many will view the first two as being the same.
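
The two competing counts can be listed by brute force. The following is a minimal Python sketch, not something we used in class; the variable names are invented for illustration. Note that the second model treats its 21 outcomes as equally likely, which is exactly the assumption under dispute.

```python
from itertools import combinations_with_replacement, product

faces = range(1, 7)

# Order matters: (4, 6) and (6, 4) are different rolls.
ordered = list(product(faces, repeat=2))
ordered_tens = [roll for roll in ordered if sum(roll) == 10]

# Order ignored: "a 4 and a 6" is a single outcome, as many
# seventh graders see it.
unordered = list(combinations_with_replacement(faces, 2))
unordered_tens = [roll for roll in unordered if sum(roll) == 10]

print(len(ordered_tens), "of", len(ordered), ordered_tens)
print(len(unordered_tens), "of", len(unordered), unordered_tens)
```

Listing the rolls shows what each model claims, but it cannot say which model matches real dice. That takes data.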

I pushed hard on this point. My students suggested making a 6 by 6 chart, which is useful for some seventh graders. They suggested rolling one die at a time, or rolling two different color dice, or rolling one die twice. Each of these has the same theoretical probability as rolling two identical dice simultaneously. But not all seventh graders know this. I pushed on.

In particular, I was hoping to challenge my 483 students to wrestle with the complicated relationship between theoretical and experimental probability. Most of the time in middle school classrooms we study both of these but we dismiss discrepancies by waving our hands and saying “We don’t expect these to be exactly equal; we expect them to be close, and therefore we shouldn’t worry about goofy experimental probabilities.”

I was pressing my 483 students to consider whether experimental probabilities can ever provide convincing evidence that our theoretical model is incorrect. A recent article in Mathematics Teaching in the Middle School described a lesson in which seventh graders were asked to decide which dice were loaded and which were fair. I recall a lesson in my educational statistics class in which the professor opened a new deck of cards, shuffled several times and drew cards from the top of the deck. She was curious how many red cards in a row we would have to see before we suspected that something was up.

My challenge to my students was in a similar spirit, but I wanted to push them to design statistical tests that would demonstrate that “two ways to roll a 10” is a flawed model. This meant they needed to outline their procedures in full, state the data they would collect and (most importantly) which results would support their theoretical model. I added the additional constraints that the test could not take longer than 10 minutes to run and that they needed to be willing to stake their teaching licenses on the outcome. OK, I was flexible on that last constraint, but it helped lend seriousness to their thinking.

So here are two of the tests they devised:

(1) We will roll two dice 100 times. We will count the number of doubles and the number of non-doubles. If there are only two ways to roll 10, then there are 15 non-doubles and 6 doubles. If there are three ways to roll 10, then there are 30 non-doubles and 6 doubles. In 100 rolls with our theoretical model, we expect 83 non-doubles. With the competing model, we expect 71 non-doubles. We’ll split the difference. If there are 77 or more non-doubles in 100 rolls, then our model is correct.

(2) Keep rolling until you get ten 10’s. If there are only two ways to roll a 10, then we should expect to have to roll 105 times to get ten 10’s. If there are three ways to roll a 10, then we should expect to roll 120 times. Again, we can split the difference; if our test yields more than 112 rolls, this indicates that there are three ways to roll a 10.
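
The arithmetic behind both tests can be checked with a short Python sketch. The dictionary keys and function names below are invented for illustration; the fractions come straight from the two models the students described.

```python
from fractions import Fraction

ROLLS = 100        # test (1): fixed number of rolls
TENS_NEEDED = 10   # test (2): fixed number of 10's

# Order matters: 36 rolls, 30 of them non-doubles, 3 ways to roll a 10.
# Order ignored: 21 rolls, 15 of them non-doubles, 2 ways to roll a 10.
models = {
    "order matters": {"non_double": Fraction(30, 36), "ten": Fraction(3, 36)},
    "order ignored": {"non_double": Fraction(15, 21), "ten": Fraction(2, 21)},
}

def expected_non_doubles(model):
    """Expected non-doubles in ROLLS rolls under the given model."""
    return ROLLS * model["non_double"]

def expected_rolls_for_tens(model):
    """Expected number of rolls needed to see TENS_NEEDED tens."""
    return TENS_NEEDED / model["ten"]

for name, model in models.items():
    print(name,
          "| non-doubles in 100 rolls:", round(float(expected_non_doubles(model)), 1),
          "| rolls for ten 10's:", float(expected_rolls_for_tens(model)))
```

The students’ cutoffs sit at roughly the midpoints of each pair of expected values.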

BEFORE READING FURTHER, jot down which of these two tests you think is better for demonstrating which model is correct (Hint: one of them is much better than the other).

Notice that test (1) relies on common denominators while test (2) relies on common numerators. That is, test (1) sets the total number of rolls and asks how many non-doubles we got, while test (2) sets the number of 10’s and asks how many total rolls we made.

Each of these tests confirmed the correct model in a single trial in class.

But probability isn’t about one-time outcomes. It is about long-term results. So it’s worth asking whether our results in class were typical. In other words, how likely were these tests to work?

I have lately become curious about the potential for the software Fathom to help students to make these connections between experimental and theoretical probability. The software does lots of things well, but what makes it unique is its ability to do probability simulations (see my article in Mathematics Teacher).

We ran each test with dice in class a small number of times. In the time it took to run each test once, I set up a Fathom simulation, which can then be run many, many times. For the record, I think electronic simulations only make sense after collecting real-world data; otherwise they are too abstract for many students to learn from.

In 100 Fathom trials, test (2) only “works” 51 times. That is, the test is no better than a coin flip. Increase the number of 10’s required to 20 and the test still only succeeds 64% of the time.

Test (1) is much better. In 100 Fathom trials, the test “worked” 95 times.
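
For readers without Fathom, here is a rough Python stand-in for the same experiment. It is a sketch, not the Fathom document used in class: it assumes fair dice simulated with Python’s random module, and the function names, seed and trial count are invented for illustration. Exact rates will vary with the seed and the number of trials.

```python
import random

def roll_two(rng):
    """Roll two fair six-sided dice (fairness is an assumption)."""
    return rng.randint(1, 6), rng.randint(1, 6)

def doubles_test_works(rng, rolls=100, cutoff=77):
    """Test (1): it 'works' if at least `cutoff` of the rolls are non-doubles."""
    non_doubles = 0
    for _ in range(rolls):
        first, second = roll_two(rng)
        if first != second:
            non_doubles += 1
    return non_doubles >= cutoff

def tens_test_works(rng, tens_needed=10, cutoff=112):
    """Test (2): it 'works' if reaching ten 10's takes more than `cutoff` rolls."""
    rolls = tens = 0
    while tens < tens_needed:
        rolls += 1
        if sum(roll_two(rng)) == 10:
            tens += 1
    return rolls > cutoff

def success_rate(test, trials=2000, seed=483):
    """Fraction of simulated trials in which the test picks the right model."""
    rng = random.Random(seed)
    return sum(test(rng) for _ in range(trials)) / trials

rate_one = success_rate(doubles_test_works)
rate_two = success_rate(tens_test_works)
print("test (1) works in about", rate_one, "of trials")
print("test (2) works in about", rate_two, "of trials")
```

Running more trials than the 100 used in Fathom smooths out the estimate, but the gap between the two tests should look about the same.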

It turns out that devising a good experiment to determine which model is better (order matters vs. order does not matter) is hard. Therefore, we shouldn’t be surprised (1) that middle school students find it challenging to decide which model is correct, (2) that their own models, which are based on their informal observation of experimental probabilities in the world around them, get in the way of analyzing theoretical probabilities, nor (3) that teaching probability is hard.

I taught a class at MSU, Mankato titled “Math 483: Advanced Viewpoints on 5-8 Mathematics”. The class had a variety of goals, including pedagogical and mathematical ones. On the pedagogy side of things, we planned lessons, we read The Teaching Gap, we viewed the TIMSS videos and others, etc. On the mathematics side, we worked problems that came directly from middle school curricula, and we investigated questions that go deeper than we would expect middle school students to go, but that form an important foundation for making instructional decisions with middle school students.

In this last category, I tried something new last semester that I wanted to share with a larger audience. I would love critical feedback and questions about the activity and readers’ ideas about the mathematics involved.

My Math 483 students (I will refer to them as my “483 students” from here on out although there are not 483 of them) as a group had quite limited experience with middle school students. This was their first class examining the teaching and learning of mathematics, and most of them tended to envision becoming high school, not middle school, teachers. One of the roles I played in class was the voice of a middle school student.

We were playing two dice games in class this spring: the Sum Game and the Product Game. For those without experience with these two games, here is how they work. Two players are playing against each other; one is player A, the other is player B. They alternate turns rolling two dice. In the Sum Game, no matter who rolls, if the sum is odd, player A gets 1 point. If the sum is even, player B gets 1 point. The Product Game is the same except we use the product instead of the sum of the dice. In either case, the players roll some set number of times (say 20) and the person with the most points at the end wins.

In analyzing whether each game is fair (in the sense of each person having the same probability of winning), my students made the claim that the probability of rolling an even sum is 18/36 because there are 36 equally likely outcomes and 18 of them are even.
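
My students’ count is easy to reproduce. Here is a minimal Python sketch that enumerates the 36 ordered rolls and computes the chance that player B scores on a single roll in each game. It takes the 36-outcome model as given, and the helper name prob is invented for illustration.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION: the 36 ordered rolls are equally likely, as my students claimed.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event under the 36-outcome model."""
    favorable = sum(1 for roll in rolls if event(roll))
    return Fraction(favorable, len(rolls))

# Player B scores on an even result in both games.
p_even_sum = prob(lambda roll: (roll[0] + roll[1]) % 2 == 0)
p_even_product = prob(lambda roll: (roll[0] * roll[1]) % 2 == 0)

print("P(even sum) =", p_even_sum)
print("P(even product) =", p_even_product)
```

Under this model the two players have equal chances on each roll of the Sum Game, while the Product Game favors player B.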

My inner middle schooler questioned this calculation. My experience with seventh graders and probability is that they commonly consider (4,6) and (6,4) to be the same outcome: a 4 and a 6. The idea that the order of the dice matters is not intuitive to many middle school students. So I posed the question to my 483 students, “How would you convince a seventh grader that (4,6) and (6,4) are different rolls?”

As we worked through a variety of strategies, I came to realize that this wasn’t quite the right question. One of these might be closer to what I intended:

(1) How do you know your model (there are 36 different equally likely rolls of two six-sided dice) is the correct one? How do you really know that?

…or maybe…

(2) What evidence would it take to convince you that your model is incorrect?

…or maybe…

(3) Imagine we were not sure which model was correct, what experiment could we perform that would help us to decide?

In the next post, I’ll share my students’ answers to my original question, and the statistical tests they concocted to answer question number 3.