S2 Chapter 1: The Binomial Distribution
Preface: Journey to 1654 France
Welcome, fellow mathematical detectives! Today, we embark on a journey back to 1654 France, where we’ll step into the shoes of mathematicians to solve a puzzle that baffled the brightest minds of the era. This challenge not only gave birth to an entirely new branch of mathematics but also leads us directly to our chapter’s central topic: the Binomial Distribution.
The story unfolds with two equally skilled knights engaged in a contest that was abruptly interrupted, creating a problem that would revolutionize mathematical thinking forever.
Act I: The Interrupted Game — A Historical Dilemma
Setting the Stage: The Problem of Points
Picture this: Two equally skilled knights, Antoine and Blaise, are engaged in a dice-throwing competition in the royal court of France. The rules are elegantly simple:
- The first knight to win 3 rounds claims the entire prize of 64 gold coins
- Each round has an equal probability of being won by either knight
- The rounds are independent of each other
Current situation: Antoine leads with a score of 2:1.
Suddenly, a royal summons arrives! The King requires their immediate presence, and the game must be terminated at once. This creates our central dilemma: how should the 64 gold coins be divided fairly between the two knights?
Student Voting: Intuitive Approaches
Before we dive into the mathematical solution, let’s consider some intuitive approaches:
Act II: The Genius Solution — Letters Between Mathematical Giants
The Revolutionary Insight
The knight Blaise (who happened to be the mathematician Blaise Pascal) wrote to his friend Pierre de Fermat seeking a solution. Their correspondence revealed a revolutionary insight: the prize should be divided according to each knight’s chance of winning had the game continued, not according to the rounds already played.
Reframing the Problem
To apply this insight, we need to determine what each knight needs to win:
- Antoine needs to win 1 more round to reach 3 total wins
- Blaise needs to win 2 more rounds to reach 3 total wins
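Since each knight has equal skill ($p = \frac{1}{2}$ for each round), and rounds are independent, we can reframe our question: if the game had continued, what is the probability that each knight would have won?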
The Mathematical Solution
Exhaustive Case Analysis
The game will end within at most 2 more rounds. Let’s enumerate all possible sequences:
Tree Diagram:
Sequence Analysis:
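- A: Antoine wins in round 1 → Game over, Antoine wins (probability $\frac{1}{2}$)
- BA: Blaise wins round 1, Antoine wins round 2 → Antoine wins (probability $\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$)
- BB: Blaise wins both rounds → Blaise wins (probability $\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$)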
Probability Calculations
$$P(\text{Antoine wins}) = P(A) + P(BA) = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$$
$$P(\text{Blaise wins}) = P(BB) = \frac{1}{4}$$
Fair Distribution: The 64 coins should be divided in the ratio $3 : 1$
- Antoine receives: $64 \times \frac{3}{4} = 48$ coins
- Blaise receives: $64 \times \frac{1}{4} = 16$ coins
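The case analysis above can also be checked mechanically. The following is a minimal Python sketch, not a reconstruction of the original correspondence: the function names `win_probability` and `split_stakes` and their parameters are illustrative assumptions. It branches on who wins the next round, stops as soon as one knight has the wins he needs, and uses exact fractions throughout.

```python
from fractions import Fraction


def win_probability(a_needs: int, b_needs: int, p: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability that the first player wins the match.

    a_needs / b_needs: rounds each player still has to win.
    p: probability that the first player wins any single round.
    """
    if a_needs == 0:
        return Fraction(1)  # first player has already won
    if b_needs == 0:
        return Fraction(0)  # second player has already won
    # Branch on who wins the next round, as in a tree diagram.
    return (p * win_probability(a_needs - 1, b_needs, p)
            + (1 - p) * win_probability(a_needs, b_needs - 1, p))


def split_stakes(prize: int, a_needs: int, b_needs: int) -> tuple[Fraction, Fraction]:
    """Divide the prize in proportion to each player's chance of winning."""
    p_a = win_probability(a_needs, b_needs)
    return prize * p_a, prize * (1 - p_a)


# Antoine needs 1 more win, Blaise needs 2 more wins.
antoine_share, blaise_share = split_stakes(64, a_needs=1, b_needs=2)
print(antoine_share, blaise_share)  # 48 16
```

Because the function is recursive, the same sketch handles other interrupted scores, for example `win_probability(2, 3)`, without listing the sequences by hand.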
Deep Dive: Uncovering the Binomial Pattern
Guided Discovery Questions
The Binomial Distribution: Formal Framework
Historical Context
Jacob Bernoulli generalized this “fixed number of independent trials with constant success probability” model, creating what we now call the Binomial Distribution. The probabilities correspond exactly to the terms in the binomial expansion of $(q + p)^n$, where $q = 1 - p$, hence the name.
Definition: Binomial Distribution
A random variable $X$ follows a binomial distribution, denoted $X \sim B(n, p)$, if it satisfies the BINS conditions:
- Binary outcomes: Each trial has exactly two possible outcomes (success/failure)
- Independence: Trials are mutually independent
- Number fixed: The number of trials is predetermined
- Same probability: The probability of success remains constant across trials
Where:
- $n$ = number of trials
- $p$ = probability of success on each trial
- $X$ = number of successes in $n$ trials
Theorem: Binomial Probability Mass Function
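For $X \sim B(n, p)$, the probability of exactly $x$ successes is:
$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$$
where $x = 0, 1, \ldots, n$ and $\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$.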
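To make the theorem concrete, here is a small Python sketch. It assumes the standard binomial formula, the number of ways to choose which trials succeed multiplied by `p**x * (1 - p)**(n - x)`. The function name `binomial_pmf` is an illustrative choice, not notation from this chapter.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p), using C(n, x) * p^x * (1 - p)^(n - x)."""
    if not 0 <= x <= n:
        return 0.0  # outcomes outside 0..n are impossible
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Distribution of X ~ B(2, 0.5): the number of heads in two fair coin flips.
for x in range(3):
    print(x, binomial_pmf(x, 2, 0.5))
```

For two fair flips the three probabilities are 0.25, 0.5 and 0.25, and they sum to 1, as any probability distribution must.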
Theorem: Expectation and Variance
For $X \sim B(n, p)$:
- Expected value: $E(X) = np$
- Variance: $\text{Var}(X) = np(1 - p)$
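The standard results $E(X) = np$ and $\text{Var}(X) = np(1 - p)$ can be checked numerically against the definitions of expectation and variance. The sketch below is a sanity check, not a proof; the values `n = 8` and `p = 0.3` are arbitrary illustrative choices, and the helper names are assumptions.

```python
from math import comb, isclose


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mean_and_variance(n: int, p: float) -> tuple[float, float]:
    """Compute E(X) and Var(X) directly by summing over every outcome."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * pr for x, pr in enumerate(probs))
    second_moment = sum(x * x * pr for x, pr in enumerate(probs))
    return mean, second_moment - mean**2  # Var(X) = E(X^2) - E(X)^2


n, p = 8, 0.3  # illustrative values, not taken from the text
mean, var = mean_and_variance(n, p)
print(isclose(mean, n * p), isclose(var, n * p * (1 - p)))
```

Both comparisons use `math.isclose` because floating-point sums rarely match a closed form to the last digit.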
The Emergence of the Binomial Formula
Pattern Recognition: In our opening problem, Antoine winning is equivalent to him winning at least 1 round out of the next 2 possible rounds.
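If we let $X$ = number of rounds Antoine wins in the next 2 rounds, then $X \sim B(2, \frac{1}{2})$.
Using the binomial probability formula:
$$P(X \geq 1) = 1 - P(X = 0) = 1 - \binom{2}{0} \left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^2 = 1 - \frac{1}{4} = \frac{3}{4}$$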
This matches our exhaustive calculation and naturally leads us to the binomial distribution!
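As an independent cross-check, the rest of the match can be replayed many times at random. The following Python sketch is an illustration only: the function name, the number of trials and the seed are arbitrary assumptions, and a simulation gives an estimate rather than an exact answer.

```python
import random


def simulate_antoine_win_rate(trials: int = 100_000, seed: int = 1654) -> float:
    """Estimate P(Antoine wins) from a 2:1 lead by replaying the rest of the match."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    antoine_wins = 0
    for _ in range(trials):
        a, b = 2, 1  # score at the moment of interruption
        while a < 3 and b < 3:
            if rng.random() < 0.5:  # each round is a fair 50/50 contest
                a += 1
            else:
                b += 1
        if a == 3:
            antoine_wins += 1
    return antoine_wins / trials


print(simulate_antoine_win_rate())
```

The estimate should land close to 0.75, in line with Antoine winning at least 1 of the next 2 rounds.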
Guided Practice: Building Understanding
Binomial Cumulative Distribution Table (Extract)
The tabulated value is $P(X \leq x)$, where $X$ has a binomial distribution with index $n$ and parameter $p$.
| p = | 0.05 | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50 |
|---|---|---|---|---|---|---|---|---|---|---|
| n=8, x=0 | 0.6634 | 0.4305 | 0.2725 | 0.1678 | 0.1001 | 0.0576 | 0.0319 | 0.0168 | 0.0084 | 0.0039 |
| x=1 | 0.9428 | 0.8131 | 0.6572 | 0.5033 | 0.3671 | 0.2553 | 0.1691 | 0.1064 | 0.0632 | 0.0352 |
| x=2 | 0.9942 | 0.9619 | 0.8948 | 0.7969 | 0.6785 | 0.5518 | 0.4278 | 0.3154 | 0.2201 | 0.1445 |
| x=3 | 0.9996 | 0.9950 | 0.9786 | 0.9437 | 0.8862 | 0.8059 | 0.7064 | 0.5941 | 0.4770 | 0.3633 |
| x=4 | 1.0000 | 0.9996 | 0.9971 | 0.9896 | 0.9727 | 0.9420 | 0.8939 | 0.8263 | 0.7396 | 0.6367 |
| x=5 | 1.0000 | 1.0000 | 0.9998 | 0.9988 | 0.9958 | 0.9887 | 0.9747 | 0.9502 | 0.9115 | 0.8555 |
| x=6 | 1.0000 | 1.0000 | 1.0000 | 0.9999 | 0.9996 | 0.9987 | 0.9964 | 0.9915 | 0.9819 | 0.9648 |
| x=7 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9999 | 0.9998 | 0.9993 | 0.9983 | 0.9961 |
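The table entries can be reproduced by summing binomial probabilities. The sketch below is one way to do this in Python, assuming each tabulated value is the cumulative probability of at most `x` successes; the function name `binomial_cdf` is illustrative.

```python
from math import comb


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p), summing the probability of 0, 1, ..., x successes."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


n = 8
p_values = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
for x in range(n):
    # One printed row per table row, rounded to 4 decimal places like the table.
    row = " ".join(f"{binomial_cdf(x, n, p):.4f}" for p in p_values)
    print(f"x={x}: {row}")
```

Probabilities such as $P(X = 3)$ or $P(X \geq 4)$ follow from the table by subtraction, for example `binomial_cdf(3, 8, 0.5) - binomial_cdf(2, 8, 0.5)`.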
Applications
Example: CATL Battery Production
Background: CATL produces lithium-ion batteries for electric vehicles. Based on historical data, their production process has a 95% success rate, meaning each battery independently has a 95% probability of meeting quality standards.
Scenario: A batch of 50 batteries has just been produced.
Part A: Basic Probability Questions
- What’s the probability of exactly 48 working batteries?
- What’s the expected number of defective batteries in this batch?
- What’s the standard deviation of the number of defective batteries?
Part B: Quality Control Decisions
- The company’s policy is to reject a batch if it contains 4 or more defective batteries. What’s the probability that this batch will be rejected?
- If the batch is accepted, what’s the probability that it contains at most 1 defective battery?
Part C: Cost Analysis
- Each defective battery costs $20 to replace under warranty. What’s the expected warranty cost for this batch?
- If the company wants to be 90% confident that warranty costs won’t exceed $100 for this batch, is the current quality level sufficient?
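One possible way to organise these calculations is sketched below in Python. It assumes that the number of defective batteries in the batch is binomial with 50 trials and defect probability 0.05, reads "accepted" as at most 3 defective, and reads "costs won’t exceed 100 dollars" as at most 5 defective. The variable names are illustrative, and the sketch is meant as a check on hand working rather than a model answer.

```python
from math import comb, sqrt

N = 50            # batteries in the batch
P_DEFECT = 0.05   # 1 minus the 95% success rate


def pmf(d: int) -> float:
    """P(D = d), where D counts the defective batteries in the batch."""
    return comb(N, d) * P_DEFECT**d * (1 - P_DEFECT) ** (N - d)


def cdf(d: int) -> float:
    """P(D <= d)."""
    return sum(pmf(k) for k in range(d + 1))


# Part A
p_exactly_48_working = pmf(2)                        # 48 working means 2 defective
expected_defective = N * P_DEFECT                    # E(D) = np
sd_defective = sqrt(N * P_DEFECT * (1 - P_DEFECT))   # sqrt(np(1 - p))

# Part B
p_reject = 1 - cdf(3)                                # rejected when D >= 4
p_at_most_1_given_accepted = cdf(1) / cdf(3)         # P(D <= 1 | D <= 3)

# Part C
expected_cost = 20 * expected_defective              # 20 dollars per defective battery
p_cost_within_100 = cdf(5)                           # cost <= 100 dollars means D <= 5

for name, value in [
    ("P(exactly 48 working)", p_exactly_48_working),
    ("E(defective)", expected_defective),
    ("SD(defective)", sd_defective),
    ("P(reject)", p_reject),
    ("P(at most 1 defective | accepted)", p_at_most_1_given_accepted),
    ("Expected warranty cost", expected_cost),
    ("P(cost within 100)", p_cost_within_100),
]:
    print(f"{name}: {value:.4f}")
```

The last value answers the confidence question: the current quality level is sufficient if `p_cost_within_100` is at least 0.90.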
Past Paper Questions
Example (June 05 Q1):
It is estimated that 4% of people have green eyes. In a random sample of size $n$, the expected number of people with green eyes is 5.
- Calculate the value of $n$.
The expected number of people with green eyes in a second random sample is 3.
- Find the standard deviation of the number of people with green eyes in this second sample.
Example (WST02/01/Jan17/1):
The random variable $X$ has the binomial distribution .
- Find .
- Find the probability that $X$ lies within one standard deviation of its mean.
Example: Chikungunya Fever Testing
The AL High School of Guangdong Country Garden School decides to implement Chikungunya fever testing for all 1000 students. At the time of testing, the prevalence of Chikungunya fever is 0.5% (i.e., 0.005).
Test Characteristics:
- Sensitivity: 95% — If a student has Chikungunya fever, the test correctly identifies them 95% of the time
- Specificity: 98% — If a student doesn’t have Chikungunya fever, the test correctly identifies them as negative 98% of the time
- Let $X$ be the number of students who actually have Chikungunya fever. What distribution does $X$ follow? Calculate the expected number of infected students.
Given that the number of infected students is 6:
- Among the infected students, let $Y$ be the number who test positive (true positives). What distribution does $Y$ follow? Calculate .
- Among the non-infected students, let $Z$ be the number who test positive (false positives). What distribution does $Z$ follow? Calculate the expected number of false positives.
- The Paradox: If a randomly selected student tests positive, what is the probability they actually have Chikungunya fever? Use your previous results to explain why such seemingly surprising results can occur.
- The school decides to retest all positive cases with a second, independent test (same sensitivity and specificity). If a student tests positive on both tests, what is the probability they actually have Chikungunya fever?
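The expectations and the two conditional probabilities in this example can be explored with a short Python sketch. It assumes Bayes' theorem with the prevalence as the prior probability, and it treats the second test as independent of the first given the student's true status. The constant and function names are illustrative, and the sketch is a way to check reasoning rather than a model answer.

```python
PREVALENCE = 0.005    # P(infected)
SENSITIVITY = 0.95    # P(positive | infected)
SPECIFICITY = 0.98    # P(negative | not infected)
STUDENTS = 1000
INFECTED_GIVEN = 6    # the number of infected students stated in the question


def positive_predictive_value(prior: float) -> float:
    """Bayes' theorem: P(infected | positive) for a given prior probability."""
    true_pos = prior * SENSITIVITY
    false_pos = (1 - prior) * (1 - SPECIFICITY)
    return true_pos / (true_pos + false_pos)


expected_infected = STUDENTS * PREVALENCE                               # mean of B(1000, 0.005)
expected_true_pos = INFECTED_GIVEN * SENSITIVITY                        # mean of B(6, 0.95)
expected_false_pos = (STUDENTS - INFECTED_GIVEN) * (1 - SPECIFICITY)    # mean of B(994, 0.02)

ppv_one_test = positive_predictive_value(PREVALENCE)
# After one positive result, the posterior becomes the prior for the second test.
ppv_two_tests = positive_predictive_value(ppv_one_test)

print(f"Expected infected: {expected_infected:.2f}")
print(f"Expected true positives: {expected_true_pos:.2f}")
print(f"Expected false positives: {expected_false_pos:.2f}")
print(f"P(infected | one positive): {ppv_one_test:.4f}")
print(f"P(infected | two positives): {ppv_two_tests:.4f}")
```

Comparing the expected false positives with the expected true positives shows why a single positive result is weaker evidence than the 95% sensitivity suggests: the non-infected group is so much larger that even a 2% error rate produces more positives than the infected group does.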
Challenge Tasks: Probability Generating Function
Just as Pascal and Fermat used exhaustive enumeration, and Bernoulli provided us with a powerful formula, we now seek the most elegant and unified expression: the Probability Generating Function (PGF). This remarkable tool can ‘generate’ all probabilities, expectations, and variances from a single function, much as the binomial expansion generates every binomial probability.
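One way to see what "generating" means is to expand the function numerically. The sketch below assumes the standard PGF of a binomial variable, $G(t) = (1 - p + pt)^n$, which this chapter has not yet derived, and checks that the coefficient of $t^x$ equals $P(X = x)$. The function name and the values `n = 8`, `p = 0.5` are illustrative assumptions.

```python
from math import comb, isclose


def binomial_pgf_coefficients(n: int, p: float) -> list[float]:
    """Expand G(t) = (1 - p + p*t)**n and return the coefficients of t^0 .. t^n."""
    coeffs = [1.0]  # start from the constant polynomial 1
    for _ in range(n):
        # Multiply the current polynomial by (1 - p) + p*t.
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1 - p)   # term that keeps the power of t
            nxt[k + 1] += c * p     # term that raises the power of t by one
        coeffs = nxt
    return coeffs


n, p = 8, 0.5
coeffs = binomial_pgf_coefficients(n, p)
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
print(all(isclose(c, q) for c, q in zip(coeffs, pmf)))

# Differentiating G and setting t = 1 gives the sum of x * P(X = x), the mean.
mean_from_pgf = sum(x * c for x, c in enumerate(coeffs))
print(mean_from_pgf, n * p)
```

Each multiplication by $(1 - p) + pt$ corresponds to one more independent trial, which is why the expansion reproduces the binomial probabilities.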