Unit 8 Big Ideas

Data, Variability, and Statistical Questions

This week, your student will work with data and use data to answer statistical questions. Questions such as “Which band is the most popular among students in sixth grade?” or “What is the most common number of siblings among students in sixth grade?” are statistical questions. They can be answered using data, and the data are expected to vary (i.e. the students do not all have the same musical preference or the same number of siblings).

Students have used bar graphs and line plots, or dot plots, to display and interpret data. Now they learn to use histograms to make sense of numerical data. The following dot plot and histogram display the distribution of the weights of 30 dogs.

A dot plot shows individual data values as points. In a histogram, the data values are grouped. Each group is represented as a vertical bar. The height of the bar shows how many values are in that group. The tallest bar in this histogram shows that there are 10 dogs that weigh between 20 and 25 kilograms.

The shape of a histogram can tell us about how the data are distributed. For example, we can see that more than half of the dogs weigh less than 25 kilograms, and that a dog weighing between 25 and 30 kilograms is not typical.
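The grouping step behind a histogram can be sketched in a few lines of Python. This is only an illustration: the weights below are hypothetical values made up for the example (they are not the 30 dog weights from the figure), and the function name `histogram_counts` is our own.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many values fall in each group [start, start + bin_width)."""
    # Integer division finds the lower edge of the group each value belongs to.
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical dog weights in kilograms, invented for this sketch.
weights = [12, 17, 21, 22, 24, 26, 31, 33]
bins = histogram_counts(weights, 5)

for start, count in bins.items():
    print(f"From {start} up to {start + 5} kg: {count}")
```

Each printed line corresponds to one bar: the group is the bar's position and the count is its height.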

Here is a task to try with your student:

This histogram shows the weights of 143 bears.

Figure: a histogram. The horizontal axis is labeled "weight in pounds" and the numbers 0 through 550, in increments of 50, are indicated. On the vertical axis, the numbers 0 through 40, in increments of 5, are indicated. There are also tick marks midway between. The approximate data for the bars are as follows:

  • From 0 up to 50 pounds, 6 bears
  • From 50 up to 100 pounds, 18 bears
  • From 100 up to 150 pounds, 40 bears
  • From 150 up to 200 pounds, 28 bears
  • From 200 up to 250 pounds, 14 bears
  • From 250 up to 300 pounds, 7 bears
  • From 300 up to 350 pounds, 11 bears
  • From 350 up to 400 pounds, 10 bears
  • From 400 up to 450 pounds, 6 bears
  • From 450 up to 500 pounds, 2 bears
  • From 500 up to 550 pounds, 1 bear

  1. About how many bears weigh between 100 and 150 pounds?

  2. About how many bears weigh less than 100 pounds?

  3. Noah says that because almost all the bears weigh between 0 and 500 pounds, we can say that a weight of 250 pounds is typical for the bears in this group. Using the histogram, explain why this is incorrect.

Solution:

  1. About 40 bears. This is the height of the tallest bar of the histogram.
  2. About 24 bears. The two leftmost bars represent the bears that weigh less than 100 pounds. Add the heights of these two bars.
  3. We can visually tell from the histogram that most bears weigh less than 250 pounds: the tallest bars are all to the left of 250. If we add the heights of the bars, fewer than 40 bears weigh 250 pounds or more, while over 100 bears weigh less than 250 pounds, so it is not accurate to say that 250 pounds is a typical weight.
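The three answers can be checked by adding bar heights. The sketch below uses the approximate heights given in the histogram description; the variable names are our own, and because the heights are read from a graph, the totals are approximate too.

```python
# Approximate bar heights read from the bear histogram.
# Key = lower edge of each 50-pound group, value = number of bears.
bear_counts = {
    0: 6, 50: 18, 100: 40, 150: 28, 200: 14, 250: 7,
    300: 11, 350: 10, 400: 6, 450: 2, 500: 1,
}

total = sum(bear_counts.values())
tallest = max(bear_counts.values())  # question 1: the 100-to-150 group
under_100 = sum(c for start, c in bear_counts.items() if start < 100)
under_250 = sum(c for start, c in bear_counts.items() if start < 250)
at_least_250 = total - under_250

print(total, tallest, under_100, under_250, at_least_250)
# prints: 143 40 24 106 37
```

With about 106 bears below 250 pounds and about 37 at or above it, 250 pounds sits well to the right of where most of the data are.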

Mean

This week, your student will learn to calculate and interpret the mean, or the average, of a data set. We can think of the mean of a data set as a fair share—what would happen if the numbers in the data set were distributed evenly. Suppose a runner ran 3, 4, 3, 1, and 5 miles over five days. If the total number of miles she ran, 16 miles, was distributed evenly across five days, the distance run per day, 3.2 miles, would be the mean. To calculate the mean, we can add the data values and then divide the sum by how many there are.
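The "add, then divide" rule can be written as a short Python sketch using the runner's distances from the paragraph above. The function name `mean` is our own choice for this illustration.

```python
def mean(values):
    """Add the data values, then divide the sum by how many there are."""
    return sum(values) / len(values)


miles = [3, 4, 3, 1, 5]  # miles run on each of the five days

print(sum(miles))   # total distance: 16
print(mean(miles))  # fair share per day: 3.2
```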

If we think of data points as weights along a number line, the mean can also be interpreted as the balance point of the data. The dots show the travel times, in minutes, of Lin and Andre. The triangles show each mean travel time. Notice that the data points are “balanced” on either side of each triangle.

Here is a task to try with your student:

  1. Use the data on Lin’s and Andre’s dot plots to verify that the mean travel time for each student is 14 minutes.
  2. Andre says that the mean for his data should be 13 minutes, because there are two numbers to the left of 13 and two to the right. Explain why 13 minutes cannot be the mean.

Solution:

  1. For Lin’s data, the mean is \frac{8 + 11 + 11 + 18 + 22}{5} = \frac{70}{5}, which equals 14. For Andre’s data, the mean is \frac{12 + 12 + 13 + 16 + 17}{5} = \frac{70}{5}, which also equals 14.

  2. Explanations vary. Sample explanations:

    • The mean cannot be 13 minutes because it does not represent a fair share: five trips of 13 minutes each would total 65 minutes, but Andre’s travel times total 70 minutes.
    • The mean cannot be 13 minutes because the data would be unbalanced. The two data values to the right of 13 (16 and 17) are much farther from 13 than the two values to the left (12 and 12).
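One way to test the balance-point idea is to add up the signed distances from each data value to a candidate point. At the mean, distances to the left and right cancel, so the sum is 0. The sketch below is an illustration only; the helper name `total_deviation` is our own.

```python
def total_deviation(values, point):
    """Sum of signed distances from each value to a candidate balance point."""
    return sum(v - point for v in values)


lin = [8, 11, 11, 18, 22]
andre = [12, 12, 13, 16, 17]

print(total_deviation(lin, 14))    # 0: balanced at 14
print(total_deviation(andre, 14))  # 0: balanced at 14
print(total_deviation(andre, 13))  # 5: the right side outweighs the left
```

A result other than 0 means the data would tip to one side, which is why 13 cannot be the mean of Andre's data.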

Median and Box Plots

This week, your student will learn to use the median to summarize the distribution of data.

The median is the middle value of a data set whose values are listed in order. To find the median, arrange the data in order from least to greatest, and look at the middle of the list.

Suppose nine students reported the following numbers of hours of sleep on a weeknight.

6 7 7 8 9 9 10 11 12

The middle number is 9, so the median number of hours of sleep is 9 hours. This means that half of the students slept for less than or equal to 9 hours, and the other half slept for greater than or equal to 9 hours.

Suppose eight teachers reported these numbers of hours of sleep on a weeknight.

5 6 6 6 7 7 7 8

This data set has an even number of values, so there are two numbers in the middle—6 and 7. The median is the number exactly in between them: 6.5. In other words, if there are two numbers in the middle of a data set, the median is the average of those two numbers.
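Both cases, an odd and an even number of values, can be sketched in Python with the two sleep data sets above. The function name `median` is our own; this is one straightforward way to write the procedure, not the only one.

```python
def median(values):
    """Sort the data, then take the middle value (or average the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two values in the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


students = [6, 7, 7, 8, 9, 9, 10, 11, 12]
teachers = [5, 6, 6, 6, 7, 7, 7, 8]

print(median(students))  # 9
print(median(teachers))  # 6.5
```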

The median marks the 50th percentile of sorted data. It breaks a data set into two halves. Each half can be further broken down into two parts so that we can see the 25th and 75th percentiles. The 25th, 50th, and 75th percentiles are called the first, second, and third quartiles (or Q1, Q2, and Q3).

A box plot is a way to represent the three quartiles of a data set, along with its maximum and minimum. This box plot shows those five numbers for the data on the students’ hours of sleep.
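The five numbers behind a box plot can be computed as sketched below for the students' sleep data. Quartile conventions differ between textbooks and software; this sketch assumes that Q1 and Q3 are the medians of the lower and upper halves, and that the middle value is left out of both halves when the count is odd. The function names are our own.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: with an odd count, the median itself is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


students = [6, 7, 7, 8, 9, 9, 10, 11, 12]
summary = five_number_summary(students)
print(summary)  # (6, 7.0, 9, 10.5, 12)
```

Under this convention the box would run from 7 to 10.5 hours, with a line at 9 and whiskers reaching 6 and 12.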

Here is a task to try with your student:

  1. Here is a table showing the number of points Jada scored in 10 basketball games.
    10 14 6 12 38 12 8 7 10 23

    What is her median score?

Solution:

  1. 11 points. First, sort the data: 6, 7, 8, 10, 10, 12, 12, 14, 23, 38. Then look at the middle of the list: the numbers 10 and 12 are the fifth and sixth numbers in the list. The median is the average of these numbers: \frac{10+12}{2} = 11.
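The answer can be double-checked with the `median` function from Python's standard `statistics` module, which sorts the data and averages the two middle values when the count is even. The variable names are our own.

```python
import statistics

# Jada's points in 10 basketball games, in the order given in the table.
scores = [10, 14, 6, 12, 38, 12, 8, 7, 10, 23]

ordered = sorted(scores)
print(ordered)                    # [6, 7, 8, 10, 10, 12, 12, 14, 23, 38]
print(ordered[4], ordered[5])     # fifth and sixth values: 10 12
print(statistics.median(scores))  # 11.0
```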