Lesson 16: Comparing Populations Using Samples

Learning Goal

Let’s compare different populations using samples.

Learning Targets

  • I can calculate the difference between two medians as a multiple of the interquartile range.

  • I can determine whether there is a meaningful difference between two populations based on a sample from each population.

Warm Up: Same Mean? Same MAD?

Problem 1

Without calculating, tell whether the data sets in each pair have the same mean and whether they have the same mean absolute deviation (MAD).

  1. set A

    • 1

    • 3

    • 3

    • 5

    • 6

    • 8

    • 10

    • 14

    set B

    • 21

    • 23

    • 23

    • 25

    • 26

    • 28

    • 30

    • 34

  2. set X

    • 1

    • 2

    • 3

    • 4

    • 5

    set Y

    • 1

    • 2

    • 3

    • 4

    • 5

    • 6

  3. set P

    • 47

    • 53

    • 58

    • 62

    set Q

    • 37

    • 43

    • 68

    • 72

Activity 1: With a Heavy Load

Problem 1

Consider the question: Do tenth-grade students’ backpacks generally weigh more than seventh-grade students’ backpacks?

Here are dot plots showing the weights of backpacks for a random sample of students from these two grades:

Two dot plots for “backpack weight in pounds” are labeled "grade 7" and "grade 10," and the numbers 1 through 22 are indicated. The data are as follows:  Grade 7: 2 pounds, 1 dot. 3 pounds, 2 dots. 4 pounds, 3 dots. 5 pounds, 2 dots. 6 pounds, 1 dot. 7 pounds, 2 dots. 9 pounds, 1 dot. 11 pounds, 1 dot. 12 pounds, 1 dot. 13 pounds, 1 dot.  Grade 10: 10 pounds, 1 dot. 11 pounds, 2 dots. 12 pounds, 1 dot. 13 pounds, 2 dots. 14 pounds, 2 dots. 15 pounds, 2 dots. 16 pounds, 1 dot. 18 pounds, 2 dots. 20 pounds, 1 dot. 22 pounds, 1 dot.
  1. Did any seventh-grade backpacks in this sample weigh more than a tenth-grade backpack?

  2. The mean weight of this sample of seventh-grade backpacks is 6.3 pounds. Do you think the mean weight of backpacks for all seventh-grade students is exactly 6.3 pounds?

  3. The mean weight of this sample of tenth-grade backpacks is 14.8 pounds. Do you think there is a meaningful difference between the weights of all seventh-grade and tenth-grade students’ backpacks? Explain or show your reasoning.
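The sample means quoted above can be reproduced from the dot plots. The following is a minimal Python sketch, assuming the dot counts in the description are read correctly; the helper names `expand`, `mean`, and `mad` are illustrative and not part of the lesson.

```python
def expand(counts):
    """Turn {weight: number_of_dots} into a flat list of weights."""
    return [weight for weight, dots in counts.items() for _ in range(dots)]

def mean(values):
    return sum(values) / len(values)

def mad(values):
    """Mean absolute deviation: average distance of the values from their mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

# Dot counts transcribed from the dot plot description (weights in pounds).
grade7 = expand({2: 1, 3: 2, 4: 3, 5: 2, 6: 1, 7: 2, 9: 1, 11: 1, 12: 1, 13: 1})
grade10 = expand({10: 1, 11: 2, 12: 1, 13: 2, 14: 2, 15: 2, 16: 1, 18: 2, 20: 1, 22: 1})

for label, sample in (("grade 7", grade7), ("grade 10", grade10)):
    print(label, "n =", len(sample),
          "mean =", round(mean(sample), 1),
          "MAD =", round(mad(sample), 1))
```

Each sample has 15 backpacks, and the rounded means agree with the 6.3 and 14.8 pounds stated in the questions.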

Activity 2: Do They Carry More?

Ten dot plots of ten samples of student backpack weights ranging from 0 to 18.

Here are 10 more random samples of seventh-grade students’ backpack weights.

sample number    mean weight (pounds)

Problem 1

  1. Which sample has the highest mean weight?

  2. Which sample has the lowest mean weight?

  3. What is the difference between these two sample means?

Problem 2

All of the samples have a mean absolute deviation of about 2.8 pounds. Express the difference between the highest and lowest sample means as a multiple of the MAD.

Problem 3

Are these samples very different? Explain or show your reasoning.

Problem 4

Remember our sample of tenth-grade students’ backpacks had a mean weight of 14.8 pounds. The MAD for this sample is 2.7 pounds. Your teacher will assign you one of the samples of seventh-grade students’ backpacks to use.

  1. What is the difference between the sample means for the tenth-grade students’ backpacks and the seventh-grade students’ backpacks?

  2. Express the difference between these two sample means as a multiple of the larger of the MADs.

Problem 5

Do you think there is a meaningful difference between the weights of all seventh-grade and tenth-grade students’ backpacks? Explain or show your reasoning.

Activity 3: Steel from Different Regions

When anthropologists find steel artifacts, they can test the amount of carbon in the steel to learn about the people that made the artifacts. Here are some box plots showing the percentage of carbon in samples of steel that were found in two different regions:

Region 1 box plot. Min (0.42), LQ (0.62), Median (0.64), UQ (0.67), Max (0.70). Region 2 box plot. Min (0.37), LQ (0.45), Median (0.46), UQ (0.48), Max (0.57). (Values approximate)

Problem 1

Was there any steel found in region 1 that had:

  1. more carbon than some of the steel found in region 2?

  2. less carbon than some of the steel found in region 2?

Problem 2

Do you think there is a meaningful difference between all the steel artifacts found in regions 1 and 2?

Problem 3

Which sample has a distribution that is not approximately symmetric?

Problem 4

What is the difference between the sample medians for these two regions?

sample      median (%)    IQR (%)
region 1
region 2

Problem 5

Express the difference between these two sample medians as a multiple of the larger interquartile range.

Problem 6

The anthropologists who conducted the study concluded that there was a meaningful difference between the steel from these regions. Do you agree? Explain or show your reasoning.

Lesson Summary

Sometimes we want to compare two different populations. For example, is there a meaningful difference between the weights of pugs and beagles? Here are histograms showing the weights for a sample of dogs from each of these breeds:

A histogram for two different populations. The horizontal axis shows the numbers 6 through 11 in increments of 0.5, labeled “pug weights in kilograms” from 6 through 8 and “beagle weights in kilograms” from 9 through 11. The vertical axis shows the numbers 0 through 8. The data represented by the bars are as follows: Pug weights in kilograms: from 6 up to 6.5, 5. From 6.5 up to 7, 5. From 7 up to 7.5, 7. From 7.5 up to 8, 3. A triangle is located at 6.9 kilograms. Beagle weights in kilograms: from 9 up to 9.5, 3. From 9.5 up to 10, 3. From 10 up to 10.5, 8. From 10.5 up to 11, 6. A triangle is located at 10.1 kilograms.

The red triangles show the mean weight of each sample, 6.9 kg for the pugs and 10.1 kg for the beagles. The red lines show the weights that are within 1 MAD of the mean. We can think of these as “typical” weights for the breed. These typical weights do not overlap. In fact, the distance between the means is 10.1 - 6.9, or 3.2 kg, over 6 times the larger MAD! So we can say there is a meaningful difference between the weights of pugs and beagles.

Is there a meaningful difference between the weights of male pugs and female pugs? Here are box plots showing the weights for a sample of male and female pugs:

Two box plots labeled “male pug weights in kilograms” and “female pug weights in kilograms.” The axis shows the numbers 4 through 8.5 in increments of 0.5. The five-number summaries for the box plots are as follows: Male pug weights in kilograms: minimum 6.4, Q1 7.2, median (Q2) 7.6, Q3 7.9, maximum 8.3. Female pug weights in kilograms: minimum 6.2, Q1 6.4, median (Q2) 6.9, Q3 7.3, maximum 8.

We can see that the medians are different, but the weights between the first and third quartiles overlap. Based on these samples, we would say there is not a meaningful difference between the weights of male pugs and female pugs.
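The same comparison can be made numerically from the quartiles in the box plots. This is a minimal Python sketch, assuming the quartile values read from the plots are exact; the variable and function names are illustrative.

```python
# Quartiles read from the box plots (kilograms).
male = {"q1": 7.2, "median": 7.6, "q3": 7.9}
female = {"q1": 6.4, "median": 6.9, "q3": 7.3}

def iqr(summary):
    """Interquartile range: distance between the first and third quartiles."""
    return summary["q3"] - summary["q1"]

difference = abs(male["median"] - female["median"])
larger_iqr = max(iqr(male), iqr(female))
multiple = difference / larger_iqr

print(f"difference in medians: {difference:.1f} kg")
print(f"larger IQR: {larger_iqr:.1f} kg")
print(f"difference as a multiple of the larger IQR: {multiple:.2f}")
```

The medians come out less than one IQR apart, which matches the overlap between the two boxes.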

In general, if the measures of center for two samples are at least two measures of variability apart, we say the difference in the measures of center is meaningful. Visually, this means the ranges of typical values for the two samples do not overlap. If the measures of center are closer than that, we don’t consider the difference to be meaningful.
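The rule of thumb above can be written as a small check. The sketch below is one possible reading, assuming that "two measures of variability" means twice the larger of the two samples' MADs (or IQRs), as in the activities. The function name `is_meaningful` and its `threshold` parameter are illustrative, not part of the lesson.

```python
def is_meaningful(center_a, center_b, spread_a, spread_b, threshold=2.0):
    """Return True when the two centers are at least `threshold` times the
    larger measure of variability apart (an assumed reading of the rule)."""
    larger_spread = max(spread_a, spread_b)
    return abs(center_a - center_b) >= threshold * larger_spread

# Backpack samples: means 14.8 and 6.3 pounds. The tenth-grade MAD is 2.7 pounds;
# 2.8 pounds is the seventh-grade MAD computed from the dot plot data.
print(is_meaningful(14.8, 6.3, 2.7, 2.8))  # True

# Male and female pug samples: medians 7.6 and 6.9 kg,
# IQRs taken from the box plot quartiles (7.9 - 7.2 and 7.3 - 6.4).
print(is_meaningful(7.6, 6.9, 7.9 - 7.2, 7.3 - 6.4))  # False
```

Using the larger of the two spreads makes the check stricter, so a difference is called meaningful only when it is large relative to both samples.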