S2 Chapter 6: Sampling Distributions

From Single Numbers to Patterns: Understanding the Nature of Statistical Investigation

Imagine you’re the quality control manager at a smartphone factory producing 10,000 phones daily. How do you ensure quality without testing every single phone? Or consider a political poll predicting election results from just 1,500 voters out of millions. How can such small samples reveal meaningful truths about vast populations?

This chapter explores the mathematical foundation that makes statistical inference possible — the theory of sampling distributions.

1.1 A Real-World Mystery: The Mobile Game Investigation

Before diving into formal definitions, let’s explore these concepts through a scenario that might be very familiar to you.

This investigation perfectly illustrates why we need to study sampling distributions. Let’s now build the formal vocabulary to analyze such problems systematically.

1.2 Building Our Vocabulary — The Five Fundamental Concepts

Now that we’ve seen these concepts in action, let’s define them precisely:

1.3 Real-World Examples: Connecting Concepts to Life

1.4 Statistics: The Bridge Between Sample and Population

Now let’s focus on the most crucial concept: what exactly makes something a “statistic”?

Let’s test your understanding with concrete examples:

The Critical Insight: Statistics are our “messengers” — they carry information from the sample to help us learn about the unknown population. But they’re imperfect messengers because they vary from sample to sample!

2. The Revolutionary Concept: Sampling Distributions

Remember our SSR investigation? We observed $\hat{p} = 0.015$ from 200 draws, which is higher than the claimed 0.01. But before concluding the game company is lying, we need to understand: How much should $\hat{p}$ vary due to random sampling?

2.2 Discovering Sampling Distributions Through Simulation

This experiment demonstrates the revolutionary insight: instead of thinking of $\hat{p}$ (or any statistic) as just a number, we recognize it as a random variable with its own distribution.

The Key Insight: Every time you take a sample, your statistic will be different. The sampling distribution tells you how these different values are distributed and helps you distinguish between “normal variation” and “something unusual is happening.”
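To make this concrete, here is a minimal simulation sketch. It assumes the claimed SSR rate of 0.01 is true and that each of the 200 draws is independent. The function name, the seed, and the number of repeated samples are illustrative choices, not part of the original experiment.

```python
import random

def simulate_p_hats(p=0.01, n=200, num_samples=10_000, seed=42):
    """Draw many samples of size n and return the sample proportion from each."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Count SSR cards in one sample of n independent draws.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

p_hats = simulate_p_hats()

# The centre of the simulated sampling distribution should sit near the true p.
mean_p_hat = sum(p_hats) / len(p_hats)

# How often does pure chance produce a proportion at least as large as 3/200 = 0.015?
observed = 3 / 200
share_at_least_observed = sum(1 for v in p_hats if v >= observed) / len(p_hats)

print(f"mean of p-hat values: {mean_p_hat:.4f}")
print(f"share of samples with p-hat >= 0.015: {share_at_least_observed:.3f}")
```

If a sizeable share of the simulated samples reaches 0.015 even though the true rate is 0.01, then an observed value of 0.015 falls within normal variation rather than signalling something unusual.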

2.4 Mathematical Analysis: From Simulation to Theory

Now that we’ve experienced sampling distributions through simulation, let’s see how to construct them mathematically. We’ll use a different discrete example to build our theoretical understanding.
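As one possible illustration of the mathematical construction, the sketch below enumerates every sample of size 2 drawn with replacement from a small hypothetical population (the values 1, 2, 3, each equally likely) and adds up the probabilities to get the exact sampling distribution of the sample mean. The population, the sample size, and the function name are assumptions made for this sketch.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def sampling_distribution_of_mean(values, probs, n):
    """Exact distribution of the sample mean for n independent draws."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for outcome in product(range(len(values)), repeat=n):
        # Probability of this ordered sample is the product of draw probabilities.
        prob = Fraction(1)
        for i in outcome:
            prob *= probs[i]
        mean = Fraction(sum(values[i] for i in outcome), n)
        dist[mean] += prob
    return dict(sorted(dist.items()))

values = [1, 2, 3]                # hypothetical population values
probs = [Fraction(1, 3)] * 3      # each value equally likely
dist = sampling_distribution_of_mean(values, probs, n=2)

for mean, prob in dist.items():
    print(f"P(sample mean = {mean}) = {prob}")

# The mean of the sampling distribution equals the population mean (2 here).
expected_mean = sum(m * p for m, p in dist.items())
```

Exact fractions are used so the probabilities sum to exactly 1, which makes the result easy to check by hand.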

Now we can return to our original question with the proper theoretical framework!

3.1 Deeper Analysis: Using the Right Distribution

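Now let’s approach this problem with a model suited to rare events. When the success probability $p$ is small and the number of trials $n$ is large, the count of successes is well approximated by a Poisson distribution with mean $\lambda = np$. For 200 draws at a claimed rate of 0.01, that gives $\lambda = 2$.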
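A quick way to see how good the approximation is for this problem is to compare the exact Binomial(200, 0.01) probabilities with the Poisson(2) probabilities. This is a sketch using only the standard library; the helper names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of observing a count of k when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 200, 0.01
lam = n * p  # Poisson mean matched to the binomial mean

# Largest gap between the two models over the most likely counts.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(6))

for k in range(6):
    print(f"k={k}: binomial={binomial_pmf(k, n, p):.4f}  poisson={poisson_pmf(k, lam):.4f}")
```

The two columns should agree to about two decimal places, which is why the simpler Poisson model is a reasonable stand-in here.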

4. Preview: The World of Hypothesis Testing

What we’ve just done is the foundation of statistical hypothesis testing — the subject of our next chapter!

The Process We Followed:

  1. Null Hypothesis: Assume the company is honest: “True SSR rate = 1%”
  2. Choose Right Statistic: Count of SSR cards: $X = 0$ (better than proportion for rare events)
  3. Find Sampling Distribution: Under null hypothesis, $X \sim \text{Poisson}(2)$
  4. Calculate p-value: $P(X \leq 0) = e^{-2} \approx 0.135$ (probability of evidence at least as extreme as ours)
  5. Make Decision: 13.5% is quite high → insufficient evidence to reject company’s claim
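The five steps above can be sketched in a few lines. With an observed count of 0 and a Poisson(2) null distribution, the evidence "at least as extreme" is the lower tail, $P(X \leq 0) = e^{-2}$, which is the 0.135 quoted in step 4. The 0.05 cutoff below is an assumed, conventional significance level, not a value taken from this chapter.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

claimed_rate = 0.01        # null hypothesis: true SSR rate is 1%
draws = 200
lam = draws * claimed_rate  # expected SSR count under the null

observed_count = 0
p_value = poisson_cdf(observed_count, lam)  # lower-tail probability

alpha = 0.05  # assumed significance level for this sketch
reject_claim = p_value < alpha

print(f"p-value = {p_value:.3f}, reject claim: {reject_claim}")
```

Because the p-value is well above the assumed cutoff, the sketch reaches the same decision as step 5: there is not enough evidence to reject the company's claim.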

Why This Approach is Powerful:

  • Objective: We use precise probability calculations instead of subjective judgment
  • Calibrated: We quantify exactly how unusual our observation is
  • Fair: We give the company the “benefit of the doubt” (assume innocence first)
  • Systematic: The same process works for any claim about any population parameter

Coming Next Chapter — Formal Hypothesis Testing:

  • How to set up null and alternative hypotheses systematically
  • Decision rules: When is evidence “strong enough” to reject a claim?
  • One-tailed vs two-tailed tests: Directional vs non-directional claims

The Revolution: We’ve moved from “That seems suspicious…” to “If the claim is true, there’s a 13.5% chance of seeing a result this extreme by chance alone.” This precision transforms business decisions, scientific conclusions, and public policy!

Statistical Wisdom: You’ve now experienced the evolution from “gut feeling” → “precise probability” → “rational decision.” This is the essence of scientific thinking!