Category Archives: Lessons

Some lightly edited thoughts about teaching probability

Michael Pershan recently (and boldly) posted “How I Teach Probability”. I have struggled mightily with the teaching of probability over the years, so I took his video as an invitation to share and discuss.


The following is a copy/paste of an email I sent him (with, perhaps, some light editing).

I have been working on a longer, better edited version of the basic idea that I lay out here. But I’m working on tons of other stuff, too, so I cannot promise that the other one will ever see the light of day.

So here you are. Enjoy.

An important bit of honesty first: I am crap at teaching probability. Many reasons, mostly having to do with the ephemerality and abstraction of the topic, in contrast to the cold, hard demonstrable reality of (say) fractions.

With that said, I seem to have been making progress with an approach that adds one important feature to the sort of work you show in your video.

Your video has (1) a scenario, (2) guesses, (3) small group data collection, and (4) large group data collection.

To that I add (2.5) explicit discussion of students’ probability models that inform their guesses, and then (5) some related follow-up activities.

After collecting guesses, I ask someone in class to describe WHY they said what they said. My job, then, is to press for details and to capture what they are thinking as carefully and accurately as possible on the board.

Once I have that, I ask for someone to describe a different way of thinking about it. Even if it is only slightly different, we get it on the board. My writing isn’t about me dispensing wisdom; it is about making a permanent record of a student’s model in enough detail that we will be able to test it later. For anything even moderately complicated (such as rolling two dice and considering their sum), I am disappointed if we don’t get at least four different models.

Now the data collection doesn’t just tell us who guessed closest; it can rule out at least some of our models.

As an example, in your scenario I would expect something like this:

Model 1: Zero multiples of three and one multiple of three are equally likely, so I’ll bet on either one. There are only four possibilities: 0, 1, 2 and 3 multiples of three, so the probability of each of these outcomes is 1/4.

Model 2: There is more than one way to get 1 multiple of 3. Our model should account for that. There are three ways to get 1 multiple of 3 (on Die1, Die2 or Die3), three ways to get 2 multiples of 3, and only 1 way each to get 0 or 3 multiples of 3. That’s eight possibilities, so the probability of getting 1 multiple of 3 is 3/8, while 0 multiples is 1/8.

Model 3: Each die can come up either “Mo3” or “NotMo3”. Getting all “NotMo3” has probability 1/8 (1/2*1/2*1/2), while getting one “Mo3” and 2 “NotMo3” is also 1/8, but there are 3 ways to do it, so it’s 3/8 altogether.

Et cetera

As many different ways of thinking about it as my students have, I will dutifully record. I encourage argument, as each new argument suggests a new model, which we can test.

Now before we roll those dice, we set up a way to test these models. In my experience, I have to devise that test. My students are not sophisticated enough to think this way yet.

In your example, and with these models, I might identify an important difference to be that Model 1 predicts equal numbers of 0 multiples of 3 and 1 multiple of 3, while Models 2 and 3 predict three times as many 1s as 0s. Hopefully we also have a model that predicts something in between.

So now as we roll, we are not just looking at which outcome happens more often, but at relative frequency. Because the model that better predicts the relative frequency that actually happens has got to be the better model.
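The test described above can be sketched in a few lines of Python. This is only an illustration: the model probabilities come from the descriptions above, while the fair six-sided dice, the trial count, and all of the function names are assumptions about the scenario.

```python
from fractions import Fraction
from itertools import product
import random

# Predicted probabilities of 0 and 1 multiples of three, as each model states them.
MODELS = {
    "Model 1": {0: Fraction(1, 4), 1: Fraction(1, 4)},
    "Model 2": {0: Fraction(1, 8), 1: Fraction(3, 8)},
    "Model 3": {0: Fraction(1, 8), 1: Fraction(3, 8)},
}


def predicted_ratio(model):
    """How many times as often the model expects 1 multiple of three as 0."""
    return model[1] / model[0]


def count_multiples_of_three(roll):
    """Number of dice in the roll showing a multiple of three."""
    return sum(1 for face in roll if face % 3 == 0)


def enumerate_outcomes(faces=6, dice=3):
    """Tally every equally likely roll. ASSUMPTION: fair dice numbered 1..faces."""
    tally = [0] * (dice + 1)
    for roll in product(range(1, faces + 1), repeat=dice):
        tally[count_multiples_of_three(roll)] += 1
    return tally


def simulate(trials, faces=6, dice=3, seed=None):
    """Roll the dice `trials` times and tally the results, like the class data."""
    rng = random.Random(seed)
    tally = [0] * (dice + 1)
    for _ in range(trials):
        roll = [rng.randint(1, faces) for _ in range(dice)]
        tally[count_multiples_of_three(roll)] += 1
    return tally


for name, model in MODELS.items():
    print(name, "predicts a 1s-to-0s ratio of", predicted_ratio(model))

observed = simulate(trials=600, seed=1)
print("Simulated tallies for 0, 1, 2, 3 multiples:", observed)
if observed[0]:
    print("Observed 1s-to-0s ratio:", observed[1] / observed[0])
```

Under the six-sided assumption, enumerating all 216 rolls gives 64 with no multiples of three and 96 with exactly one. That is a ratio of 1.5, which sits between the predictions of 1 and 3.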

Much more to say, examples to offer, etc.

Skills practice [#NCTMDenver]

I attended E. Paul Goldenberg’s session on Thursday of NCTM in Denver. It was not at all, as advertised, in keeping with the proof strand. But that does not matter.

What matters is this. Goldenberg shared the video below. The whole video is worth your time, but I have queued it up to the 2-minute mark, where a beautiful classroom sequence unfolds (give yourself about 5 minutes for it).

My eyes tear up watching this sequence. I am neither kidding nor exaggerating. It gives me hope for quality classroom instruction in elementary mathematics.

Be sure to notice the transition to a new task at the 4-minute mark, and how the teacher deals with the struggle that occurs at the 6-minute mark.

Also please look in the kids’ eyes. Watch their body language and their waving hands. Watch them think.

Kids are practicing facts in this classroom. The teacher is providing instruction. Contrast with this.

[NOTE: As of 5/2/2013, the video referred to seems to have been removed from YouTube. My apologies. Go search YouTube for “EDI math” and you’ll find plenty of examples that are essentially equivalent to the one I refer to below.]

You can flip this latter instructional sequence because it involves telling and choral response.

You cannot flip the first instructional activity because it involves adapting instruction in response to student ideas, and it involves students justifying their thinking to the teacher and to each other.

You can’t flip that.

[NOTE: I have edited some of the comments below in order to focus on the practices that were exemplified in the videos (one of which is now private), rather than on the teachers in them. See my post on norms a while back. My apologies to anyone who feels their words have been altered in ways that do not convey their original meaning.]

Experiments with datasets

Go out and collect a modest-sized, discrete dataset. Name lengths of all of the students in your classroom, say, or the number of people in each of their households.

This bar graph is only tangentially relevant, being more of a case-value plot of four different populations. But it breaks up a texty post. So deal with it.


Now play with that data.

If we add one or more new (hypothetical) cases, can we…

  1. Increase both the median and the mean?
  2. Decrease both the median and the mean?
  3. Increase the mean while decreasing the median?
  4. Vice versa?
  5. Increase the Mean Absolute Deviation (MAD) while decreasing the mean?
  6. Vice versa?
  7. Decrease both the MAD and the range?
  8. Decrease the MAD while increasing the range?
  9. Vice versa?

If we delete one or more actual cases, can we…

  1. [same list as before]
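For checking answers to these questions quickly, here is a minimal sketch in Python. The name lengths are invented for illustration, the helper names are hypothetical, and MAD is taken to mean the average distance from the mean.

```python
from statistics import mean, median


def mad(data):
    """Mean absolute deviation: average distance of each value from the mean."""
    center = mean(data)
    return mean([abs(x - center) for x in data])


def summarize(data):
    """The four statistics the questions ask about."""
    return {
        "mean": mean(data),
        "median": median(data),
        "MAD": mad(data),
        "range": max(data) - min(data),
    }


def compare(before, after):
    """Report whether each statistic went up, went down, or stayed the same."""
    old, new = summarize(before), summarize(after)
    changes = {}
    for name in old:
        if new[name] > old[name]:
            changes[name] = "up"
        elif new[name] < old[name]:
            changes[name] = "down"
        else:
            changes[name] = "same"
    return changes


# HYPOTHETICAL name lengths, invented for illustration only.
name_lengths = [3, 4, 4, 5, 5, 6, 7, 9]

# Question 3: add cases that raise the mean while lowering the median.
added = [4, 4, 4, 30]
print(compare(name_lengths, name_lengths + added))

# Deleting cases works the same way: pass the shortened list as `after`.
without_longest = [x for x in name_lengths if x != 9]
print(compare(name_lengths, without_longest))
```

Three short names pull the median down while one very long name drags the mean up, which is one way to answer question 3 with this particular dataset.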

Thanks to Susan Friel, Connected Mathematics and tons of other creative folks for getting me started with this. Future elementary teachers to tackle this shortly. I’ll report back.

Measurement, explored

This idea started with someone else, but I do not remember his name. I believe he’s a shop teacher in a Twin Cities suburb. Inver Grove Heights, maybe? In any case, he was in a professional development session I was helping to run this year on the topic of fractions. We had a conversation over lunch in which he recounted a lesson he did that became the basis of the activity I am about to describe. If I can dig up the originator, I’ll revise to give credit.

In any case, while the kernel of this idea originated with someone else, I have given it the usual OMT treatment—expanding and complexifying in many ways.

Regular readers will know that I am always in search of ways to get my future elementary teachers to explore old ideas in new ways. Consider the cases of place value and the hierarchy of quadrilaterals. In that spirit, I give you the measurement exploration extravaganza. Do with it what you will.

The premise

Groups of three are each given a dowel (or, in this year’s case, a paper strip). The dowels vary in length. The lengths are chosen to provide a useful combination of compatibility and incompatibility. One may be 9 inches long, while another is 15 inches long. Choose numbers according to the skill level and age of your students (and yourself!)

But, and this is important, THESE LENGTHS ARE NEVER SPOKEN OF! You will never refer to these dowels using standardized lengths.

Each group names its unit. In recent semesters, we have had:

  • Stick
  • Woody
  • Shroydelshnop
  • Oompa Loomp
  • BOG
  • Ablue
  • Pen
  • Et cetera


The members of the group measure some stuff with their units. They make a tape measure to use for this purpose, and they decide how long a tape measure they would like to have.

For example, “How tall are you in Sticks?” requires (in all likelihood) a tape measure that is several Sticks long. Well, it does not require such a thing, but such a thing facilitates this measurement.

At this point, students are measuring only with their own units. It usually occurs to them to subdivide the unit in some way, and they will frequently report out fractions of (say) a Stick.

Next, each group is responsible for creating a partitioned unit from their original. They choose how many of these smaller units make up the original, and they name the smaller unit.

And then they create a composed unit from their original. Again, the choice is theirs to determine the number of original units that make up a composed unit. And again they are tasked with naming the composed unit.

interlude for important observations

The fun has only just begun and already we stumble upon some beautiful insights. Among them are these:

  1. Students nearly always partition in 4ths, 8ths and 16ths.
  2. Students almost never partition into 10ths.
  3. Students may group in threes or sixes, but they never ever partition this way.
  4. Students rarely think to group the same way they partition. That is, if they made 8ths, they might very well group in sixes. The convenience that would be afforded by consistency does not tend to occur to them in advance.

back to the instructional sequence

Now that we have the units, we need to measure some stuff. I typically choose things in our classroom environment. It is important that we all measure the same things and that these things range from smaller than the original unit to larger than the composed unit.

We need to express our measurements in (1) partitioned units only, (2) original units only, and (3) composed units only.

This semester I had students look at this table and I asked “What do you notice?” and “What do you wonder?” (These questions are, of course, not original to me. But this was a productive place to ask them.)

Working across systems

Next, it’s time to switch things up. We put the table away. Each group passes their original unit, together with instructions for creating a partitioned unit and a composed unit (and the names of these) to another group.

Now each group is charged with these tasks:

  1. Get to know the three units that have been handed to you.
  2. Express relationships between your units and these new ones.
  3. For each thing you measured (table, licorice fish, etc.), make this prediction: If you were to measure that thing with these new units, would you end up with a greater or lesser value than when you measured in your own units? (In this step, do not compute; make a qualitative comparison instead.)
  4. Compute your height in these new units, and compute at least 6 of the measurements in the grid.

You have never seen such fraction computation work as proceeds from this sequence of tasks. 
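To give a flavor of that computation, here is a small Python sketch using exact fractions. Everything specific in it is hypothetical: the 5-to-3 relationship between Sticks and Woodys (what strips 9 and 15 inches long would give), the Woody group’s choices of 4 parts and 6 units, and the function names.

```python
from fractions import Fraction

# HYPOTHETICAL relationship found by laying the strips side by side:
# 5 Sticks match 3 Woodys. No standard lengths are needed to see this.
STICKS_PER_WOODY = Fraction(5, 3)

# HYPOTHETICAL choices made by the Woody group.
PARTS_PER_WOODY = 4      # partitioned units in one Woody
WOODYS_PER_COMPOSED = 6  # Woodys in one composed unit


def predict(old_units_per_new_unit):
    """Task 3: qualitative prediction, made before any computing."""
    if old_units_per_new_unit > 1:
        return "lesser"   # longer unit, so fewer of them fit
    if old_units_per_new_unit < 1:
        return "greater"  # shorter unit, so more of them fit
    return "equal"


def sticks_to_woodys(sticks):
    """Task 4: convert a measurement in Sticks to Woodys."""
    return Fraction(sticks) / STICKS_PER_WOODY


def in_all_three_units(woodys):
    """Express one measurement in partitioned, original and composed units."""
    return {
        "partitioned": woodys * PARTS_PER_WOODY,
        "original": woodys,
        "composed": woodys / WOODYS_PER_COMPOSED,
    }


# A HYPOTHETICAL table measured at 2 3/8 Sticks.
table_in_sticks = Fraction(19, 8)
print(predict(STICKS_PER_WOODY))
print(in_all_three_units(sticks_to_woodys(table_in_sticks)))
```

With these made-up numbers, 2 3/8 Sticks comes out to 57/40 of a Woody, which is the kind of result students have to make sense of rather than just compute.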

Now we list these computed measurements on the board, compare to the table we generated earlier and discuss reasons for discrepancies.

We write about these reflection questions:

  1. How do your three units compare to a standard measurement system?
  2. How is using someone else’s units like (or unlike) converting between standard and metric systems?
  3. How did your choices for partitioning, composing and naming support or impede your work?
  4. What do you need in order to be able to do these computations on your own?

On to area

Next, students build each of their units into square units.

We consider the essential questions:

  1. How many square partitioned units in a square original unit?
  2. How many square original units in a square composed unit?
  3. How many square partitioned units in a square composed unit?
  4. Most importantly: How do you know each of these?
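One way to check answers to the first three questions is to count the tiles directly, as in the Python sketch below. The choices of 8 parts per original unit and 6 original units per composed unit are hypothetical, as is the function name.

```python
def count_small_squares(per_side):
    """Count the small squares that tile a big square, one row at a time."""
    count = 0
    for _row in range(per_side):
        for _column in range(per_side):
            count += 1
    return count


# HYPOTHETICAL choices for one group's units.
PARTS_PER_ORIGINAL = 8       # partitioned units along one original unit
ORIGINALS_PER_COMPOSED = 6   # original units along one composed unit

partitioned_in_original = count_small_squares(PARTS_PER_ORIGINAL)
original_in_composed = count_small_squares(ORIGINALS_PER_COMPOSED)
partitioned_in_composed = count_small_squares(
    PARTS_PER_ORIGINAL * ORIGINALS_PER_COMPOSED
)

print(partitioned_in_original, original_in_composed, partitioned_in_composed)
```

Counting row by row mirrors the tiling students do by hand. It also makes visible why the third answer is the product of the first two.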

Sample student observations at this point: 

  • Wow. The square partitioned unit looks a lot smaller relative to the square original unit than I expected.
  • Oh no! Why did we decide to put so many original units together to make the composed unit?

Now we measure something. 

This time around, I had them measure the area of a whiteboard in our classroom. Not the most exciting measurement to make, but straightforward and accessible. Working with these new square units is challenging enough; no need to get too fancy. It is important that the measurement be concrete and tangible, not abstract.

Students are encouraged to use known relationships in order to avoid tedious measurements, and to measure in order to avoid tedious computations.

Importantly (I think), most students want to use these square units to measure, rather than to measure with their tape measures and compute.


We use these experiences to discuss differences—both practical and conceptual—among measuring by (1) iterating and counting units, (2) using tools, and (3) computation.

We reflect on what these experiences can tell us about working within and across measurement systems.

We build on our fraction work and on the meanings of multiplication and division that were the focus of the preceding course.

I have not had students move to cubic units.

A place value thought experiment

The hundreds chart is a fixture of elementary classrooms. Such a fixture that most of us probably don’t stop to think about it.

That’s where I come in.


The trouble with place value is that it is too easy.

Q: Why does the hundreds chart have 10 columns?

A: Because our number system is base-10.

Q: Why does the hundreds chart have 10 rows?

A: Because our number system is base-10.

These answers are so simple that they mask the conceptual complexity underlying place value.

The hundreds chart is rich with patterns.

The double-digit numbers lie diagonally.


If you start with a number in the top row and read diagonally down and to the left, the digits of the numbers sum to the number in the top row.

Example starting with 9.


Example starting with 8.


And on and on.
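The diagonal pattern can be checked with a short Python sketch. It assumes a chart that starts at 0, so that the top row holds 0 through 9; the function names are made up for this illustration.

```python
def diagonal(start, columns=10):
    """Numbers met when stepping down and to the left from `start` in the top row.

    ASSUMPTION: the chart starts at 0, so the top row holds 0..columns-1.
    Moving one row down adds `columns`; moving one column left subtracts 1.
    """
    return [start + step * (columns - 1) for step in range(start + 1)]


def digit_sum(n):
    """Sum of the base-10 digits of n."""
    return sum(int(digit) for digit in str(n))


for top in (9, 8):
    numbers = diagonal(top)
    print(top, numbers, [digit_sum(n) for n in numbers])
```

Each step down and to the left adds one to the tens digit and takes one from the ones digit, so the digit sum never changes.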

Why do these patterns exist? Because the structure of the hundreds chart matches the structure of the number system.

But there is something unsatisfying about this answer.

So here’s a thought experiment.

The Mayans had a quasi-base-20 place value number system. Quasi-base-20 because the third place was not worth 20 twenties, but only 18 twenties. All other places have value 20 times the previous.
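As a sketch of how these place values work, the Python function below splits a number into digits using the rule just described: 20 in the first place, 18 in the second, and 20 everywhere after that. The function name is hypothetical, and the digits are returned as ordinary integers rather than Mayan symbols.

```python
def to_mayan_digits(n):
    """Digits of n, most significant first, in the quasi-base-20 system.

    The first place counts up to 19, the second only up to 17 (so the third
    place is worth 18 twenties, or 360), and every later place counts up to 19.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    position = 0
    while True:
        radix = 18 if position == 1 else 20
        n, digit = divmod(n, radix)
        digits.append(digit)
        position += 1
        if n == 0:
            break
    return digits[::-1]


for value in (19, 20, 359, 360, 7200):
    print(value, to_mayan_digits(value))
```

Note that 359 is the largest two-digit number in this system, and 360 is the first number that needs a third place.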

Imagine stepping into a second-grade classroom in a modern society that used the Mayan numeration system. What chart would they have on their walls instead of our hundreds chart?

What would a Mayan “hundreds” chart look like?

I have used this question as one of two parts of an A assignment in my math content course for future elementary and special ed teachers for several years*.

A common answer in students’ first drafts is the following (this image from wikipedia):

This is no good.

We want to represent place value in the hundreds chart and this chart does not do that. All of these are single digit numbers as far as Mayan place value is concerned.

That chart above is the equivalent of one that goes 0—9 in our decimal place value system.

Another common example has literally 100 cells, in 10 rows of 10.

Also no good. That chart is based on the structure of our number system, not the structure of the Mayan number system.

No, we want a Mayan “hundreds” chart that has patterns equivalent to those we find in our hundreds chart. Patterns such as the ones highlighted above.

Here is what we need.

Credit to student Angela Drietz for the complete chart.


Here are the double-digit numbers.



And if you look closely, you can add the “digits” on the leftward-running diagonal to get the number in the top row.
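The same check can be run on the Mayan chart with a few lines of Python. This sketch assumes the chart runs from 0 to 359 in 18 rows of 20, which is what the place values imply; the function names are made up for this illustration.

```python
COLUMNS = 20  # values the first place can take: 0..19
ROWS = 18     # values the second place can take: 0..17


def mayan_digits(n):
    """Split a chart entry (0..359) into its (second place, first place) digits."""
    return divmod(n, COLUMNS)


def mayan_diagonal(start):
    """Entries met when stepping down and to the left from `start` in the top row.

    ASSUMPTION: the chart holds 0..359 in 18 rows of 20 columns.
    """
    steps = min(start, ROWS - 1)  # stop at the left edge or the bottom row
    return [start + step * (COLUMNS - 1) for step in range(steps + 1)]


for top in (19, 5):
    entries = mayan_diagonal(top)
    print(top, [mayan_digits(n) for n in entries])
```

Every pair of digits on a diagonal sums to the number in the top row, just as in the base-10 chart, even though each step now adds 19 instead of 9.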



That, my friends, is the beauty of place value. It’s not 10, the quantity, that is special. It’s the set of symbols. It’s the 1 and the 0.


* The other part is to create number language to reflect the place value structure of the Mayan number system. How might the Mayans have read these numbers aloud? As far as I know, no one knows the answer to how they did read them aloud, so the task allows for structured creativity.