S2 Chapter 2: The Poisson Distribution
Preface: From Battlefield Statistics to Modern Modeling
Welcome, mathematical explorers! Today we embark on a fascinating journey through time, where we’ll discover how the study of rare events (from deadly horse kicks in the Prussian army to cosmic phenomena) led to one of the most powerful tools in modern statistics: the Poisson distribution.
Our story begins with a French mathematician whose name became synonymous with rare events, and whose work continues to illuminate patterns in everything from traffic flow to radioactive decay.
1. The Quest for Modeling Events in Continuous Time
Setting the Stage: The Baozi Shop Dilemma
Imagine you’re the proud owner of a breakfast shop in Guangdong Country Garden School. Through careful observation over many weeks, you’ve discovered that on average, you sell exactly 10 baozi during the morning hour (7:00–8:00 AM).
First Instinct: “This Sounds Like Binomial!”
Your first thought might be: “I’ll use the binomial distribution!” But then you pause and ask yourself:
What exactly are my ‘trials’?
Let’s think about dividing the hour into smaller time intervals:
The Pattern:
- As we divide the hour into smaller intervals, the number of trials $n$ increases
- The probability $p$ of selling a baozi in each tiny interval decreases
- But their product $np$ remains constant at 10 (our average sales rate)
The Mathematical Insight: We’re witnessing the transition from discrete binomial trials to a continuous process!
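The pattern above can be checked numerically. The following is a minimal sketch, assuming each of the $n$ equal time slices is treated as one trial with success probability $p = 10/n$; the interval counts (minutes, seconds, hundredths of a second) and the function names are illustrative choices, not part of the original lesson.

```python
from math import comb, exp, factorial

RATE = 10  # average baozi sold per hour (from the text)


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


# Slice the hour into n equal intervals, each treated as one trial with
# p = RATE / n, so that n * p stays fixed at 10.
for n in (60, 3600, 360_000):
    p = RATE / n
    print(f"n = {n:>7}, p = {p:.6f}, P(X = 10) = {binomial_pmf(10, n, p):.5f}")

print(f"Poisson with rate 10:          P(X = 10) = {poisson_pmf(10, RATE):.5f}")
```

As $n$ grows, the binomial probabilities settle toward the Poisson value, which is the transition from discrete trials to a continuous process described above.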
Why This Matters for Your Business
This isn’t just a mathematical curiosity: it has real implications for your baozi shop:
- Customer arrivals are unpredictable: You can’t pinpoint exactly when each customer will arrive
- Sales happen continuously: A customer could arrive at any moment during the hour
- Rate is consistent: While individual sales are random, the average rate (10 per hour) is stable
This is exactly the situation where the Poisson distribution becomes our perfect tool!
Historical Context: From Battlefield to Breakfast Shop
Your baozi shop problem isn’t unique: mathematicians have been tackling similar “rare event” challenges for centuries. Let’s briefly explore how this powerful distribution was discovered:
Abraham de Moivre (1711): First discovered the mathematical pattern, though it remained largely unnoticed.
Siméon Denis Poisson (1837): Rediscovered and popularized the distribution in his work on legal statistics, modeling wrongful convictions.
Ladislaus Bortkiewicz (1898): Applied it to model Prussian cavalry deaths from horse kicks — rare, unpredictable events occurring at a steady average rate, just like your baozi sales!
2. The Mathematical Framework — Definition and Conditions
The Poisson Distribution Revealed
Definition (Poisson Distribution): A discrete random variable $X$ follows a Poisson distribution with parameter $\lambda > 0$, denoted $X \sim \text{Po}(\lambda)$, if its probability mass function is:
$$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$
Where:
- $\lambda$ represents the average rate of occurrence
- $e \approx 2.718$ is Euler’s number
- $k!$ is the factorial of $k$
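The definition translates directly into code. This is a minimal sketch of the probability mass function using only the standard library; the function name `poisson_pmf` and the cut-off of 60 terms are illustrative assumptions.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = e^(-lam) * lam^k / k! for X ~ Po(lam)."""
    if k < 0:
        return 0.0
    return exp(-lam) * lam**k / factorial(k)


# Probability of selling exactly 10 baozi when the average is 10 per hour.
print(f"P(X = 10) = {poisson_pmf(10, 10.0):.4f}")

# The probabilities over k = 0, 1, 2, ... should add up to 1;
# 60 terms is far more than enough when lam = 10.
print(f"sum of P(X = k), k = 0..59: {sum(poisson_pmf(k, 10.0) for k in range(60)):.6f}")
```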
The Conditions: When Poisson Applies
The Poisson distribution is not universal. It requires three fundamental conditions, which directly determine the model’s accuracy:
1. Independence: Events occur independently of one another; one occurrence does not change the probability of another.
Key Insight: Independence in Poisson processes leads to a fascinating property called memorylessness.
What this means: If no baozi has been sold in the last 30 minutes, this doesn’t increase the probability of selling one in the next 30 minutes. The process “forgets” its history.
Mathematical Statement: For a Poisson process, the probability of an event occurring in the next time interval is independent of how long we’ve already been waiting.
Business Implication: Even if you’ve had no customers for a while, a sudden rush is no more (and no less) likely than usual, because counts in non-overlapping time intervals are statistically independent.
2. Singly: In any infinitesimally small interval of time or space, at most one event can occur. The probability of two cars arriving at exactly the same microsecond is negligible.
3. Constant Rate: The average rate of occurrence remains constant over time. The rate doesn’t change between morning and afternoon (if we’re modeling a period with consistent conditions).
Determine whether each scenario follows a Poisson distribution:
- Number of radioactive particles emitted by a certain source in one minute
- Number of rolls until a fair die hits six
- Number of lottery jackpot winners in Guangdong over one year
- Number of phone calls received at a call center during a specific hour
- Number of phone calls from one person to a call center during a specific hour
Solving the Baozi Shop Problem
Now let’s return to our opening challenge and solve it using the Poisson distribution!
Recall the Problem: You sell an average of 10 baozi per hour. How many should you prepare to ensure 80% of the time customers don’t walk away empty-handed?
Mathematical Translation: Let $X$ = number of baozi sold per hour. We model $X \sim \text{Po}(10)$.
We want to find the minimum number $k$ such that $P(X \le k) \ge 0.8$.
Solution Strategy: We need the 80th percentile of the Poisson distribution Po(10).
Using Poisson tables for $\lambda = 10$, we find cumulative probabilities:
| $k$ | $P(X \le k)$ | Interpretation |
|---|---|---|
| 10 | 0.583 | Only 58.3% service level |
| 11 | 0.697 | Only 69.7% service level |
| 12 | 0.792 | Only 79.2% service level |
| 13 | 0.864 | 86.4% service level |
Business Decision: Prepare 13 baozi each morning.
Business Impact:
- 86.4% of days: All customers satisfied (exceeds 80% target)
- 13.6% of days: Some customers disappointed (but this is acceptable)
- Expected daily waste: baozi on average
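The table lookup can be reproduced with a short search over cumulative probabilities. This is a sketch under the model $X \sim \text{Po}(10)$ used above; the helper names `poisson_cdf` and `smallest_stock` are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), summed directly from the pmf."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def smallest_stock(lam, service_level):
    """Smallest k with P(X <= k) >= service_level."""
    k = 0
    while poisson_cdf(k, lam) < service_level:
        k += 1
    return k


# Reproduce the rows of the table for lambda = 10.
for k in range(10, 14):
    print(k, round(poisson_cdf(k, 10.0), 3))

print("baozi to prepare:", smallest_stock(10.0, 0.80))  # 13
```

Changing the second argument of `smallest_stock` shows how the required stock grows as the target service level rises.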
Fundamental Properties
Theorem (Expectation and Variance of Poisson): For $X \sim \text{Po}(\lambda)$:
- Expected value: $E(X) = \lambda$
- Variance: $\text{Var}(X) = \lambda$
- Standard deviation: $\sigma = \sqrt{\lambda}$
Key Insight: In a Poisson distribution, the mean equals the variance!
Practical Application: If you have a dataset where the sample mean approximately equals the sample variance, this suggests the data might follow a Poisson distribution.
Example: If the average number of emails received per hour is 12, then $\lambda = 12$, and the variance is also 12.
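The equality of mean and variance can be checked numerically for the email example. The sketch below sums $k \cdot P(X = k)$ and $k^2 \cdot P(X = k)$ over a truncated range; the truncation point `k_max = 100` and the helper names are assumptions made for illustration, and the probabilities are built with the recurrence $P(X = k) = P(X = k-1) \cdot \lambda / k$ to avoid large factorials.

```python
from math import exp


def poisson_pmf_table(lam, k_max):
    """Return [P(X = 0), ..., P(X = k_max)] for X ~ Po(lam)."""
    probs = [exp(-lam)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * lam / k)
    return probs


def mean_and_variance(lam, k_max=100):
    """Approximate E(X) and Var(X) by summing over k = 0..k_max."""
    probs = poisson_pmf_table(lam, k_max)
    mean = sum(k * p for k, p in enumerate(probs))
    second_moment = sum(k * k * p for k, p in enumerate(probs))
    return mean, second_moment - mean**2


mean, variance = mean_and_variance(12.0)
print(f"mean = {mean:.6f}, variance = {variance:.6f}")  # both close to 12
```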
The Additivity Property: A Powerful Tool
Theorem (Additivity of Independent Poisson Variables): If $X \sim \text{Po}(\lambda_1)$ and $Y \sim \text{Po}(\lambda_2)$ are independent, then:
$$X + Y \sim \text{Po}(\lambda_1 + \lambda_2)$$
Example (Real-World Additivity):
Scenario: A website receives an average of 15 visitors per hour from search engines ($X \sim \text{Po}(15)$) and 8 visitors per hour from social media ($Y \sim \text{Po}(8)$).
Total Traffic: The total number of visitors per hour follows $X + Y \sim \text{Po}(23)$.
Interpretation: Combining independent Poisson processes creates another Poisson process with the sum of their rates.
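One way to see additivity without a proof is to convolve the two probability tables and compare the result with $\text{Po}(23)$. This is a numerical sketch, not a derivation; the truncation at `K_MAX = 80` and the function names are illustrative assumptions.

```python
from math import exp


def poisson_pmf_table(lam, k_max):
    """Return [P(X = 0), ..., P(X = k_max)] for X ~ Po(lam)."""
    probs = [exp(-lam)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * lam / k)
    return probs


def convolve(p, q):
    """pmf of X + Y for independent X, Y, on the same range of k as p and q."""
    return [sum(p[i] * q[k - i] for i in range(k + 1)) for k in range(len(p))]


K_MAX = 80
search = poisson_pmf_table(15.0, K_MAX)   # visitors from search engines
social = poisson_pmf_table(8.0, K_MAX)    # visitors from social media
total = convolve(search, social)          # distribution of X + Y
direct = poisson_pmf_table(23.0, K_MAX)   # Po(15 + 8)

largest_gap = max(abs(a - b) for a, b in zip(total, direct))
print(f"largest difference between the two tables: {largest_gap:.2e}")
```

The difference is at the level of floating-point rounding, which is consistent with the theorem.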
Remark: Proofs of the properties above are left to the Challenge Extension, where we derive the probability generating function of the Poisson distribution and use it to prove them.
3. Guided Practice: Mastering Poisson Calculations
Poisson Cumulative Distribution Table
The tabulated value is $P(X \le x)$, where $X$ has a Poisson distribution with parameter $\lambda$.
| $x$ | $\lambda = 0.5$ | 1.0 | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 | 4.5 | 5.0 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.6065 | 0.3679 | 0.2231 | 0.1353 | 0.0821 | 0.0498 | 0.0302 | 0.0183 | 0.0111 | 0.0067 |
| 1 | 0.9098 | 0.7358 | 0.5578 | 0.4060 | 0.2873 | 0.1991 | 0.1359 | 0.0916 | 0.0611 | 0.0404 |
| 2 | 0.9856 | 0.9197 | 0.8088 | 0.6767 | 0.5438 | 0.4232 | 0.3208 | 0.2381 | 0.1736 | 0.1247 |
| 3 | 0.9982 | 0.9810 | 0.9344 | 0.8571 | 0.7576 | 0.6472 | 0.5366 | 0.4335 | 0.3423 | 0.2650 |
| 4 | 0.9998 | 0.9963 | 0.9814 | 0.9473 | 0.8912 | 0.8153 | 0.7254 | 0.6288 | 0.5321 | 0.4405 |
| 5 | 1.0000 | 0.9994 | 0.9955 | 0.9834 | 0.9580 | 0.9161 | 0.8576 | 0.7851 | 0.7029 | 0.6160 |
| 6 | 1.0000 | 0.9999 | 0.9991 | 0.9955 | 0.9858 | 0.9665 | 0.9347 | 0.8893 | 0.8311 | 0.7622 |
| 7 | 1.0000 | 1.0000 | 0.9998 | 0.9989 | 0.9958 | 0.9881 | 0.9733 | 0.9489 | 0.9134 | 0.8666 |
| 8 | 1.0000 | 1.0000 | 1.0000 | 0.9998 | 0.9989 | 0.9962 | 0.9901 | 0.9786 | 0.9597 | 0.9319 |
| 9 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9997 | 0.9989 | 0.9967 | 0.9919 | 0.9829 | 0.9682 |
| 10 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9999 | 0.9997 | 0.9990 | 0.9972 | 0.9933 | 0.9863 |
A call center receives calls at an average rate of 3 per hour. Let $X$ be the number of calls in one hour.
Given: $X \sim \text{Po}(3)$
Use the Poisson distribution table to find:
- The probability of receiving exactly the expected number of calls
Key Formulas:
- $P(X = k) = P(X \le k) - P(X \le k - 1)$ = table value at $k$ − table value at $k - 1$
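As a worked instance of the table method, the expected number of calls is 3, so the question asks for $P(X = 3)$. The sketch below copies the $\lambda = 3.0$ column of the cumulative table into a list and subtracts adjacent entries; the list name `CDF_LAMBDA_3` and the function `p_exactly` are hypothetical helpers.

```python
from math import exp, factorial

# Column lambda = 3.0 of the cumulative table: CDF_LAMBDA_3[x] = P(X <= x).
CDF_LAMBDA_3 = [0.0498, 0.1991, 0.4232, 0.6472, 0.8153, 0.9161,
                0.9665, 0.9881, 0.9962, 0.9989, 0.9997]


def p_exactly(k):
    """P(X = k) as the difference of two adjacent table values."""
    below = CDF_LAMBDA_3[k - 1] if k > 0 else 0.0
    return CDF_LAMBDA_3[k] - below


from_table = p_exactly(3)                       # 0.6472 - 0.4232
from_formula = exp(-3) * 3**3 / factorial(3)    # direct use of the pmf

print(f"from table:   {from_table:.4f}")
print(f"from formula: {from_formula:.4f}")
```

Both routes agree to four decimal places, with the table value limited by its own rounding.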
4. Real-World Applications
Example (Cybersecurity):
Background: A cybersecurity team monitors attempted intrusions on their network. Historical data shows that intrusion attempts occur at an average rate of 2.5 per day, and these attempts appear to be independent and random.
Modeling Decision: Let $X$ = number of intrusion attempts per day. We model $X \sim \text{Po}(2.5)$.
- What’s the probability of no intrusion attempts on a given day?
- What’s the probability of more than attempts in one day?
- What’s the expected number of intrusion attempts in a week?
- The security team can handle up to 5 attempts per day effectively. Calculate the probability that, in a week of 7 days, the intrusion attempts are handled effectively every day.
- If they want to be adequately prepared 95% of the time, what should be their daily capacity?
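Several of these questions can be checked with a few lines of code once you have attempted them by hand. The sketch below assumes $X \sim \text{Po}(2.5)$ as modeled above and, for the weekly question, additionally assumes that different days are independent; the helper names are hypothetical. The question whose threshold is not stated is left out.

```python
from math import exp, factorial

LAM = 2.5  # average intrusion attempts per day (from the text)


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


# No attempts on a given day: P(X = 0) = e^(-2.5).
p_none = exp(-LAM)

# Expected attempts in a week: 7 days at 2.5 per day.
expected_week = 7 * LAM

# All 7 days handled effectively: P(X <= 5) on each day, days assumed independent.
p_day_ok = poisson_cdf(5, LAM)
p_week_ok = p_day_ok**7

# Smallest daily capacity c with P(X <= c) >= 0.95.
capacity = 0
while poisson_cdf(capacity, LAM) < 0.95:
    capacity += 1

print(f"P(no attempts)        = {p_none:.4f}")
print(f"E(attempts in a week) = {expected_week}")
print(f"P(all 7 days handled) = {p_week_ok:.4f}")
print(f"capacity for 95%      = {capacity}")
```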
Example (Quality Control in Manufacturing):
Context: A textile manufacturer produces large rolls of fabric. Quality control data shows that defects appear randomly at an average rate of 0.3 defects per square meter.
Question Series:
- In a 5 square meter section, what’s the probability of finding exactly 2 defects?
- What’s the probability that a 10 square meter section has no defects?
- If defects cost $15 each to repair, what’s the expected repair cost for a 20 square meter section?
- Two independent fabric sections of 3 square meters each are inspected. What’s the distribution of the total number of defects?
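The key step in each part is rescaling the rate to the area inspected: a section of $A$ square meters has $\lambda = 0.3A$. The following sketch applies that rescaling under the Poisson model; the constant and function names are illustrative assumptions.

```python
from math import exp, factorial

DEFECTS_PER_M2 = 0.3   # average defects per square meter (from the text)
COST_PER_DEFECT = 15   # repair cost per defect (from the text)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


# 5 square meters: lambda = 1.5, exactly 2 defects.
p_two_in_5 = poisson_pmf(2, DEFECTS_PER_M2 * 5)

# 10 square meters: lambda = 3, no defects.
p_none_in_10 = poisson_pmf(0, DEFECTS_PER_M2 * 10)

# 20 square meters: expected defects = 6, so expected cost = 6 * 15.
expected_cost_20 = DEFECTS_PER_M2 * 20 * COST_PER_DEFECT

# Two independent 3 square meter sections: each Po(0.9), so by additivity
# the total is Poisson with rate 0.9 + 0.9.
total_rate = DEFECTS_PER_M2 * 3 + DEFECTS_PER_M2 * 3

print(f"P(2 defects in 5 m^2)  = {p_two_in_5:.4f}")
print(f"P(0 defects in 10 m^2) = {p_none_in_10:.4f}")
print(f"expected repair cost   = {expected_cost_20:.2f}")
print(f"total defects rate     = {total_rate:.1f}")
```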
Homework Exercises
Example (June 07 Q3): An engineering company manufactures an electronic component. At the end of the manufacturing process, each component is checked to see if it is faulty. Faulty components are detected at a rate of 1.5 per hour.
- Suggest a suitable model for the number of faulty components detected per hour. (1)
- Describe, in the context of this question, two assumptions you have made in part (a) for this model to be suitable. (2)
- Find the probability of 2 faulty components being detected in a 1 hour period. (2)
- Find the probability of at least one faulty component being detected in a 3 hour period. (3)
Example (Jan 10 Q3): A robot is programmed to build cars on a production line. The robot breaks down at random at a rate of once every 20 hours.
- Find the probability that it will work continuously for 5 hours without a breakdown. (3)
Find the probability that, in an 8 hour period,
- the robot will break down at least once, (3)
- there are exactly 2 breakdowns. (2)
In a particular 8 hour period, the robot broke down twice.
- Write down the probability that the robot will break down in the following 8 hour period. Give a reason for your answer. (2)
Example (Jan 09 Q1): A botanist is studying the distribution of daisies in a field. The field is divided into a number of equal sized squares. The mean number of daisies per square is assumed to be 3. The daisies are distributed randomly throughout the field.
Find the probability that, in a randomly chosen square there will be
- more than 2 daisies, (3)
- either 5 or 6 daisies. (2)
The botanist decides to count the number of daisies, $x$, in each of 80 randomly selected squares within the field. The results are summarised below
- Calculate the mean and the variance of the number of daisies per square for the 80 squares. Give your answers to 2 decimal places. (3)
- Explain how the answers from part (c) support the choice of a Poisson distribution as a model. (1)
- Using your mean from part (c), estimate the probability that exactly 4 daisies will be found in a randomly selected square. (2)
Example (Jan 08 Q3):
- State two conditions under which a Poisson distribution is a suitable model to use in statistical work. (2)
The number of cars passing an observation point in a 10 minute interval is modelled by a Poisson distribution with mean 1.
- Find the probability that in a randomly chosen 60 minute period there will be
- (i) exactly 4 cars passing the observation point,
- (ii) at least 5 cars passing the observation point. (5)
The number of other vehicles, other than cars, passing the observation point in a 60 minute interval is modelled by a Poisson distribution with mean 12.
- Find the probability that exactly 1 vehicle, of any type, passes the observation point in a 10 minute period. (4)
(Optional) The Binomial–Poisson Connection
The Great Revelation
The Poisson distribution emerges naturally as a limiting case of the binomial distribution under specific conditions.
The Setup: Consider a binomial distribution $X \sim B(n, p)$ where:
- $n$ becomes very large ($n \to \infty$)
- $p$ becomes very small ($p \to 0$)
- The product $np = \lambda$ remains constant
The Result: Under these conditions, $B(n, p) \to \text{Po}(\lambda)$
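Setup: Let $X \sim B(n, p)$ where $np = \lambda$ is constant. Prove that as $n \to \infty$:
$$P(X = k) \to \frac{e^{-\lambda} \lambda^k}{k!}$$
Step 1: Write the binomial probability with $p = \lambda/n$:
$$P(X = k) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}$$
Step 2: Rewrite as:
$$P(X = k) = \frac{\lambda^k}{k!} \cdot \frac{n(n-1)\cdots(n-k+1)}{n^k} \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \cdot \left(1 - \frac{\lambda}{n}\right)^{-k}$$
Step 3: Evaluate the limit of each component:
- $\frac{n(n-1)\cdots(n-k+1)}{n^k} \to$ ? as $n \to \infty$
- $\left(1 - \frac{\lambda}{n}\right)^{-k} \to$ ? as $n \to \infty$
- $\left(1 - \frac{\lambda}{n}\right)^{n} \to$ ? as $n \to \infty$ (Hint: Use $\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$)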
In practice, when $n$ is large and $p$ is small, we can approximate $B(n, p)$ with $\text{Po}(np)$.
Problem: A quality inspector examines 200 items where each has a 2% probability of being defective.
- Calculate the exact probability of finding exactly 3 defective items using the binomial distribution
- Approximate this probability using the Poisson distribution
- Compare your results and comment on the accuracy
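After working the problem by hand, the two answers can be compared with a short script. This is a minimal sketch using the values from the problem ($n = 200$, $p = 0.02$, so $\lambda = np = 4$); the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, k = 200, 0.02, 3
lam = n * p  # Poisson parameter for the approximation: 200 * 0.02 = 4

# Exact binomial probability of exactly 3 defective items.
exact = comb(n, k) * p**k * (1 - p) ** (n - k)

# Poisson approximation with the same mean.
approx = exp(-lam) * lam**k / factorial(k)

print(f"binomial (exact):      {exact:.4f}")
print(f"Poisson (approximate): {approx:.4f}")
print(f"absolute difference:   {abs(exact - approx):.4f}")
```

The two values differ by about 0.001, which suggests the approximation is reasonable here because $n$ is large and $p$ is small.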
Challenge Extension: Probability Generating Functions for Poisson
Definition: The Probability Generating Function of a discrete random variable $X$ is:
$$G_X(t) = E(t^X) = \sum_{k} P(X = k)\, t^k$$
From Binomial Chapter: You learned that for $X \sim B(n, p)$:
$$G_X(t) = (1 - p + pt)^n$$
And the magical moment formulas:
$$E(X) = G_X'(1), \qquad \text{Var}(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$$
Part I: Deriving the Poisson PGF — Two Approaches
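Challenge 1: For $X \sim \text{Po}(\lambda)$, derive $G_X(t)$ directly from the definition.
Setup: We know $P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$
Step 1: Write out the PGF definition:
$$G_X(t) = \sum_{k=0}^{\infty} t^k \, \frac{e^{-\lambda} \lambda^k}{k!}$$
Step 2: Factor out $e^{-\lambda}$:
$$G_X(t) = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda t)^k}{k!}$$
Step 3: Recognize the series! What famous expansion is $\sum_{k=0}^{\infty} \frac{x^k}{k!}$?
Step 4: Complete the derivation to show $G_X(t) = e^{\lambda(t-1)}$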
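Challenge 2: Derive the Poisson PGF as a limit of the binomial PGF.
Recall the Setup: $B(n, \lambda/n)$ converges to $\text{Po}(\lambda)$ as $n \to \infty$
Step 1: Write the binomial PGF with $p = \lambda/n$:
$$G_n(t) = \left(1 - \frac{\lambda}{n} + \frac{\lambda t}{n}\right)^n$$
Step 2: Rewrite this as:
$$G_n(t) = \left(1 + \frac{\lambda(t-1)}{n}\right)^n$$
Step 3: Apply the fundamental limit $\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$:
$$\lim_{n \to \infty} G_n(t) = e^{\lambda(t-1)}$$
Beautiful Result: Both methods give us $G_X(t) = e^{\lambda(t-1)}$!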
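Both routes to the PGF can be sanity-checked numerically. The sketch below compares a truncated version of the defining series, the binomial PGF with a large $n$, and the closed form $e^{\lambda(t-1)}$; the choice $\lambda = 4$, the test points, the truncation at 150 terms, and $n = 10^6$ are all illustrative assumptions, and a numerical match is a check rather than a proof.

```python
from math import exp


def poisson_pgf_series(t, lam, k_max=150):
    """Truncated sum of t^k * P(X = k) for X ~ Po(lam)."""
    term = exp(-lam)  # the k = 0 term
    total = term
    for k in range(1, k_max + 1):
        term *= lam * t / k  # ratio of consecutive terms of the series
        total += term
    return total


def binomial_pgf(t, n, p):
    """PGF of B(n, p): (1 - p + p t)^n."""
    return (1 - p + p * t) ** n


def poisson_pgf_closed(t, lam):
    """Closed form e^(lam (t - 1))."""
    return exp(lam * (t - 1))


lam = 4.0
for t in (0.0, 0.5, 1.0, 1.5):
    series = poisson_pgf_series(t, lam)
    limit = binomial_pgf(t, 1_000_000, lam / 1_000_000)
    closed = poisson_pgf_closed(t, lam)
    print(f"t = {t}: series = {series:.6f}, binomial n=1e6 = {limit:.6f}, closed = {closed:.6f}")
```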
Part II: Extracting Properties Using PGF Magic
Given: $G_X(t) = e^{\lambda(t-1)}$ for $X \sim \text{Po}(\lambda)$
- Evaluate $G_X'(1)$ to find $E(X)$
- Use the variance formula: $\text{Var}(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$ to verify that $\text{Var}(X) = \lambda$.
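Once the derivatives have been found by hand, they can be checked with finite differences. This is a rough numerical sketch, assuming the closed form $G_X(t) = e^{\lambda(t-1)}$ and an arbitrary example value $\lambda = 2.5$; the step sizes and function names are illustrative, and the results are approximations rather than exact derivatives.

```python
from math import exp


def pgf(t, lam):
    """PGF of Po(lam): e^(lam (t - 1))."""
    return exp(lam * (t - 1))


def first_derivative(f, t, h=1e-5):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)


lam = 2.5


def g(t):
    return pgf(t, lam)


g1 = first_derivative(g, 1.0)    # estimates G'(1) = E(X)
g2 = second_derivative(g, 1.0)   # estimates G''(1)

mean = g1
variance = g2 + g1 - g1**2

print(f"E(X)   is approximately {mean:.4f}")
print(f"Var(X) is approximately {variance:.4f}")
```

Both estimates land close to $\lambda = 2.5$, in line with the theorem stated earlier.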
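Recall from Binomial Chapter: If $X$ and $Y$ are independent, then:
$$G_{X+Y}(t) = G_X(t)\, G_Y(t)$$
Application: Let $X \sim \text{Po}(\lambda_1)$ and $Y \sim \text{Po}(\lambda_2)$ be independent.
By computing $G_{X+Y}(t)$, show the additivity property of the Poisson distribution that $X + Y \sim \text{Po}(\lambda_1 + \lambda_2)$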
What we’ve accomplished:
- Derived the Poisson PGF using two different approaches
- Computed expectation and variance using differentiation
- Proved the additivity property using PGF multiplication
The Bigger Picture: PGFs provide a unified framework for understanding discrete distributions.