S2 Chapter 4: Continuous Random Variables

Two Different Kinds of Randomness

Imagine you’re working at a customer service call center. Throughout our previous chapters, we’ve mastered the art of describing “counts” - binomial distributions and Poisson distributions tell us how many times events occur. But now, we want to ask a fundamentally different question: How long do we need to wait until the first event occurs?

The Language of Continuous Random Variables

Probability Density Function (PDF) - Continuous “Probability Mass”

The fundamental challenge with continuous random variables is that there are infinitely many possible values, so the probability of any exact value is zero. Instead, we think about probability density.

Visualizing the Transition: From Discrete to Continuous

Discrete to continuous distribution transition

Let’s understand this concept by observing what happens as we increase the number of possible values:

Discrete: Few Values — High probabilities (each bar is tall)

Discrete: More Values — Lower probabilities (bars shrink as more values share the total)

Continuous: Infinite Values — Area under the curve equals probability

$P(a < X < b) = \text{Area under } f(x) \text{ between } a \text{ and } b$

This visualization shows us why we need a new mathematical framework for continuous random variables. The “height” of the curve at any point represents the probability density, and the area under the curve between two points gives us the actual probability.

Definition (Probability Density Function):

For a continuous random variable $X$ , we describe its probability distribution using a function $f(x)$ called the probability density function (PDF). It satisfies:

$f(x) \geq 0$ for all $x$ (probability density is non-negative)
$P(a < X < b) = \int_a^b f(x) \, dx$ (probability is the area under the curve)
$\int_{-\infty}^{\infty} f(x) \, dx = 1$ (total area represents total probability = 1)

Example:

Consider a random variable $X$ with probability density function:

$f(x) = \begin{cases} \frac{2x}{9} & \text{if } 0 \leq x \leq 3 \\ 0 & \text{otherwise} \end{cases}$

Part (a): Verify this is a valid PDF

We need to check two fundamental conditions:

Condition (i): $f(x) \geq 0$ for all $x$

For $0 \leq x \leq 3$ : $f(x) = \frac{2x}{9} \geq 0$ since $x \geq 0$ ✓
For $x < 0$ or $x > 3$ : $f(x) = 0 \geq 0$ ✓

Condition (ii): $\int_{-\infty}^{\infty} f(x) \, dx = 1$

$\int_{-\infty}^{\infty} f(x) \, dx = \int_0^3 \frac{2x}{9} \, dx = \frac{2}{9} \cdot \frac{x^2}{2}\Big|_0^3 = \frac{1}{9} \cdot 9 = 1$

Part (b): Find $P(1 < X < 2)$

$P(1 < X < 2) = \int_1^2 \frac{2x}{9} \, dx = \frac{2}{9} \cdot \frac{x^2}{2}\Big|_1^2 = \frac{1}{9}(4-1) = \boxed{\frac{1}{3}}$

Cumulative Distribution Function (CDF) - Accumulation from Beginning to Now

Definition (Cumulative Distribution Function):

The cumulative distribution function is defined as:

$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t) \, dt$

This represents the probability that the random variable takes a value less than or equal to $x$ .

Theorem (Fundamental Relationship between PDF and CDF):

For continuous random variables:

$f(x) = \frac{d}{dx} F(x)$

This means:

The CDF is the integral of the PDF
The PDF is the derivative of the CDF

This is the Fundamental Theorem of Calculus perfectly embodied in probability theory!

Example (From CDF to PDF):

A continuous random variable $X$ has cumulative distribution function:

$F(x) = \begin{cases} 0 & x < 0 \\ \frac{x^3}{8} & 0 \leq x \leq 2 \\ 1 & x > 2 \end{cases}$

Part (a): Find the PDF $f(x)$

Using the fundamental relationship $f(x) = \frac{d}{dx} F(x)$ , we differentiate each piece:

For $x < 0$ : $f(x) = \frac{d}{dx}(0) = 0$
For $0 \leq x \leq 2$ : $f(x) = \frac{d}{dx}\left(\frac{x^3}{8}\right) = \frac{3x^2}{8}$
For $x > 2$ : $f(x) = \frac{d}{dx}(1) = 0$

Therefore:

$\boxed{f(x) = \begin{cases} \frac{3x^2}{8} & 0 \leq x \leq 2 \\ 0 & \text{otherwise} \end{cases}}$

Part (b): Verify this is a valid PDF

Check that $\int_{-\infty}^{\infty} f(x) \, dx = 1$ :

$\int_{-\infty}^{\infty} f(x) \, dx = \int_0^2 \frac{3x^2}{8} \, dx = \frac{3}{8} \cdot \frac{x^3}{3}\Big|_0^2 = \frac{1}{8} \cdot 8 = 1$

Part (c): Find $P(0.5 < X < 1.5)$

Using the CDF method: $P(0.5 < X < 1.5) = F(1.5) - F(0.5)$

$\begin{aligned} P(0.5 < X < 1.5) &= F(1.5) - F(0.5) \\ &= \frac{(1.5)^3}{8} - \frac{(0.5)^3}{8} \\ &= \frac{3.375 - 0.125}{8} \\ &= \frac{3.25}{8} = \boxed{0.40625} \end{aligned}$

Example (In-Class Exercise):

A continuous random variable $Y$ has the following cumulative distribution function:

$F(y) = \begin{cases} 0 & y < 1 \\ a(y-1)^2 & 1 \leq y \leq 3 \\ 1 & y > 3 \end{cases}$

where $a$ is a positive constant.

Find the value of $a$
Determine the probability density function $f(y)$
Calculate $P(Y > 2)$ using both the CDF and PDF methods

Your solutions:

Numerical Characteristics - Mean, Variance, and Transformations

From Discrete Sums to Continuous Integrals: The Natural Evolution

In our study of discrete random variables, we learned to calculate expected values using weighted sums:

$E(X) = \sum_{i} x_i \cdot P(X = x_i)$

But what happens when we transition to continuous variables where $P(X = x_i) = 0$ for any specific value? The answer lies in a beautiful mathematical evolution: sums become integrals.

Definition (Expected Value and Variance for Continuous Random Variables):

For a continuous random variable $X$ with PDF $f(x)$ :

Expected Value (Mean):

$E(X) = \int_{-\infty}^{\infty} x \cdot f(x) \, dx$

This represents the “center of gravity” or average value of the distribution.

Variance:

$\text{Var}(X) = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x) \, dx = E(X^2) - [E(X)]^2$

This measures the “spread” or dispersion around the mean.

Expected Value of a Function:

$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx$

This powerful formula allows us to find the expected value of any transformation of $X$ .

Theorem (Linear Transformations):

For a continuous random variable $X$ and constants $a$ , $b$ :

$E(aX + b) = aE(X) + b$
$\text{Var}(aX + b) = a^2 \text{Var}(X)$

Example (Beta-type Distribution):

Consider the continuous random variable $X$ with PDF:

$f(x) = \begin{cases} 6x(1-x) & 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases}$

Complete Solution:

Part (a): Verify this is a valid PDF

Check that $\int_{-\infty}^{\infty} f(x) \, dx = 1$ :

$\begin{aligned} \int_0^1 6x(1-x) \, dx &= 6\int_0^1 (x - x^2) \, dx \\ &= 6\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1 \\ &= 6\left(\frac{1}{2} - \frac{1}{3}\right) = 6 \cdot \frac{1}{6} = 1 \end{aligned}$

Part (b): Calculate $E(X)$

$E(X) = \int_0^1 x \cdot 6x(1-x) \, dx = 6\int_0^1 (x^2 - x^3) \, dx$

$= 6\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = 6\left(\frac{1}{3} - \frac{1}{4}\right) = \boxed{\frac{1}{2}}$

Part (c): Calculate $\text{Var}(X)$

Step 1: Find $E(X^2)$

$E(X^2) = \int_0^1 x^2 \cdot 6x(1-x) \, dx = 6\int_0^1 (x^3 - x^4) \, dx$

$= 6\left[\frac{x^4}{4} - \frac{x^5}{5}\right]_0^1 = 6\left(\frac{1}{4} - \frac{1}{5}\right) = \frac{3}{10}$

Step 2: Apply variance formula

$\text{Var}(X) = E(X^2) - [E(X)]^2 = \frac{3}{10} - \left(\frac{1}{2}\right)^2 = \frac{3}{10} - \frac{1}{4} = \boxed{\frac{1}{20}}$

Percentiles and Mode - Distributional Landmarks

Percentiles: Dividing Probability Mass

Definition (Percentiles and Quantiles):

The $p$ -th percentile (or quantile) of a continuous distribution is the value $q_p$ such that:

$P(X \leq q_p) = F(q_p) = \frac{p}{100}$

Interpretation: $p\%$ of the probability mass lies to the left of $q_p$ , and $(100-p)\%$ lies to the right.

Special cases:

Median ( $q_{50}$ ): The “middle” value where $F(q_{50}) = 0.5$
First Quartile ( $q_{25}$ ): 25% of values are below this point
Third Quartile ( $q_{75}$ ): 75% of values are below this point

Mode: The Peak of Probability Density

Definition (Mode):

The mode of a continuous distribution is the value of $x$ that maximizes the PDF $f(x)$ . It can be found by solving:

$\frac{d}{dx} f(x) = 0 \text{ and } \frac{d^2}{dx^2} f(x) < 0$

The mode represents the most “dense” point of the distribution - where the probability is most concentrated.

Assessing Distribution Shape: Understanding Skewness

Skewed distributions comparison

Visual Methods for Detecting Skewness

Example (Visual Skewness Detection Practice):

Look at these three distributions. Roughly denote the mode, median, and mean, and identify their skewness:

This is the most practical method for determining skewness:

Distribution Type	Relationship	Interpretation
Symmetric	Mean $=$ Median $=$ Mode	No skewness
Right-skewed	Mean $>$ Median $>$ Mode	Positive skew
Left-skewed	Mean $<$ Median $<$ Mode	Negative skew

Why this works: The mean is “pulled” toward the tail because it’s sensitive to extreme values, while the median and mode are more resistant to outliers.

Distribution A: Right-skewed (long right tail)

Distribution B: Symmetric (bell-shaped)

Distribution C: Left-skewed (long left tail)

Example (In-Class Exercise):

A continuous random variable $Z$ has probability density function:

$f(z) = \begin{cases} ce^{-2z} & z \geq 0 \\ 0 & z < 0 \end{cases}$

where $c$ is a positive constant.

Find the value of $c$
Calculate $E(Z)$ and $\text{Var}(Z)$
Find $P(Z > E(Z))$ and comment on this result
Find $E(3Z + 2)$ and $\text{Var}(3Z + 2)$ using transformation properties
Calculate $P(Z > 1 | Z > 0.5)$ and interpret this result
Calculate the median and interquartile range of $Z$
Sketch $f(z)$ , demonstrate its mode, median, and mean of this distribution on the graph, discuss the skewness of this distribution

Your solutions:

Homework Exercises

(Optional) From Poisson to Exponential Distribution

Key Insight: We know how to count events, but what about waiting times between events?

Scenario: Suppose a process (like calls to a customer service center) is a Poisson process with rate $\lambda$ .

Define: Let $T$ be the random variable representing the waiting time until the first event occurs.

Challenge Mission: Find the probability distribution of random variable $T$

Derivation Objectives

Find the cumulative distribution function $F_T(t) = P(T \leq t)$
Find the probability density function $f_T(t)$ by differentiating $F_T(t)$
Calculate the expected value of $T$

Step 1: Finding the Cumulative Distribution Function

Setup: Let $N(t)$ be the number of events that occur in the time interval $[0, t]$ .

We know: $N(t) \sim \text{Po}(\lambda t)$

Key Question: What does $F_T(t) = P(T \leq t)$ represent?

Answer: “The probability that the first event occurs within time $t$ ”

Strategic Insight: Use the complement approach!

“First event occurs within time $t$ ” $\Leftrightarrow$ “At least one event occurs within time $t$ ”

Therefore: $P(T \leq t) = 1 - P(\text{no events occur within time } t) = 1 - P(N(t) = 0)$

Your Turn: Calculate $P(N(t) = 0)$ using the Poisson PMF and find $F_T(t)$ .

For $t \geq 0$ ,

$F_T(t) = P(T \leq t) = 1 - P(N(t) = 0) = \_\_\_$

Step 2: Finding the Probability Density Function

Recall: For continuous random variables, $f_T(t) = \frac{d}{dt} F_T(t)$

Your Turn: Differentiate the CDF you found in Step 1 to find the PDF.

$f_T(t) = \frac{d}{dt} F_T(t) = \_\_\_ \quad \text{for } t \geq 0$

Step 3: Computing the Expected Value

Your Turn: Using the PDF you found in Step 2, calculate $E(T)$ .

$E(T) = \int_0^{\infty} t \cdot f_T(t) \, dt = \_\_\_$

Hint: Use integration by parts.