S2 Chapter 4: Continuous Random Variables

Imagine you’re working at a customer service call center. Throughout our previous chapters, we’ve mastered the art of describing “counts” - binomial distributions and Poisson distributions tell us how many times events occur. But now, we want to ask a fundamentally different question: How long do we need to wait until the first event occurs?

The Language of Continuous Random Variables

Probability Density Function (PDF) - Continuous “Probability Mass”

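The fundamental challenge with continuous random variables is that the possible values form a continuum (an uncountable set), so the probability of any single exact value is zero. Instead, we think about probability density: probability per unit length, which turns into an actual probability only when integrated over an interval.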

Visualizing the Transition: From Discrete to Continuous

[Figure: Discrete to continuous distribution transition]

Let’s understand this concept by observing what happens as we increase the number of possible values:

Discrete: Few Values — High probabilities (each bar is tall)

Discrete: More Values — Lower probabilities (bars shrink as more values share the total)

Continuous: Infinite Values — Area under the curve equals probability

$$P(a < X < b) = \text{Area under } f(x) \text{ between } a \text{ and } b$$

This visualization shows us why we need a new mathematical framework for continuous random variables. The “height” of the curve at any point represents the probability density, and the area under the curve between two points gives us the actual probability.

Definition (Probability Density Function):

For a continuous random variable $X$, we describe its probability distribution using a function $f(x)$ called the probability density function (PDF). It satisfies:

  1. $f(x) \geq 0$ for all $x$ (probability density is non-negative)
  2. $P(a < X < b) = \int_a^b f(x) \, dx$ (probability is the area under the curve)
  3. $\int_{-\infty}^{\infty} f(x) \, dx = 1$ (total area represents total probability = 1)

Example:

Consider a random variable $X$ with probability density function:

$$f(x) = \begin{cases} \frac{2x}{9} & \text{if } 0 \leq x \leq 3 \\ 0 & \text{otherwise} \end{cases}$$


Part (a): Verify this is a valid PDF

We need to check two fundamental conditions:

Condition (i): $f(x) \geq 0$ for all $x$

  • For $0 \leq x \leq 3$: $f(x) = \frac{2x}{9} \geq 0$ since $x \geq 0$
  • For $x < 0$ or $x > 3$: $f(x) = 0 \geq 0$

Condition (ii): $\int_{-\infty}^{\infty} f(x) \, dx = 1$

$$\int_{-\infty}^{\infty} f(x) \, dx = \int_0^3 \frac{2x}{9} \, dx = \frac{2}{9} \cdot \frac{x^2}{2}\Big|_0^3 = \frac{1}{9} \cdot 9 = 1$$

Part (b): Find $P(1 < X < 2)$

$$P(1 < X < 2) = \int_1^2 \frac{2x}{9} \, dx = \frac{2}{9} \cdot \frac{x^2}{2}\Big|_1^2 = \frac{1}{9}(4-1) = \boxed{\frac{1}{3}}$$
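Both results can be sanity-checked numerically. The sketch below is only an illustration: the helper `integrate` is a hypothetical name introduced here, and it uses a simple midpoint rule rather than any particular library routine.

```python
def f(x):
    """PDF from the example: 2x/9 on [0, 3], zero elsewhere."""
    return 2 * x / 9 if 0 <= x <= 3 else 0.0


def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


total_area = integrate(f, 0, 3)    # condition (ii): should be close to 1
prob_1_to_2 = integrate(f, 1, 2)   # part (b): should be close to 1/3

print(f"total area   = {total_area:.6f}")
print(f"P(1 < X < 2) = {prob_1_to_2:.6f}")
```

The midpoint rule is exact for linear integrands up to rounding, so the printed values should agree with the hand calculation to many decimal places.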


Cumulative Distribution Function (CDF) - Accumulation from Beginning to Now

Definition (Cumulative Distribution Function):

The cumulative distribution function is defined as:

$$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t) \, dt$$

This represents the probability that the random variable takes a value less than or equal to $x$.

Theorem (Fundamental Relationship between PDF and CDF):

For continuous random variables:

$$f(x) = \frac{d}{dx} F(x)$$

This means:

  • The CDF is the integral of the PDF
  • The PDF is the derivative of the CDF

This is the Fundamental Theorem of Calculus perfectly embodied in probability theory!

Example (From CDF to PDF):

A continuous random variable $X$ has cumulative distribution function:

$$F(x) = \begin{cases} 0 & x < 0 \\ \frac{x^3}{8} & 0 \leq x \leq 2 \\ 1 & x > 2 \end{cases}$$


Part (a): Find the PDF $f(x)$

Using the fundamental relationship $f(x) = \frac{d}{dx} F(x)$, we differentiate each piece:

  • For $x < 0$: $f(x) = \frac{d}{dx}(0) = 0$
  • For $0 \leq x \leq 2$: $f(x) = \frac{d}{dx}\left(\frac{x^3}{8}\right) = \frac{3x^2}{8}$
  • For $x > 2$: $f(x) = \frac{d}{dx}(1) = 0$

Therefore:

$$\boxed{f(x) = \begin{cases} \frac{3x^2}{8} & 0 \leq x \leq 2 \\ 0 & \text{otherwise} \end{cases}}$$

Part (b): Verify this is a valid PDF

Check that $\int_{-\infty}^{\infty} f(x) \, dx = 1$:

$$\int_{-\infty}^{\infty} f(x) \, dx = \int_0^2 \frac{3x^2}{8} \, dx = \frac{3}{8} \cdot \frac{x^3}{3}\Big|_0^2 = \frac{1}{8} \cdot 8 = 1$$

Part (c): Find $P(0.5 < X < 1.5)$

Using the CDF method: $P(0.5 < X < 1.5) = F(1.5) - F(0.5)$

$$\begin{aligned} P(0.5 < X < 1.5) &= F(1.5) - F(0.5) \\ &= \frac{(1.5)^3}{8} - \frac{(0.5)^3}{8} \\ &= \frac{3.375 - 0.125}{8} \\ &= \frac{3.25}{8} = \boxed{0.40625} \end{aligned}$$
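The PDF/CDF relationship in this example can also be checked numerically. The following is a minimal sketch; `numerical_derivative` is a hypothetical helper using a central difference, and the step size `h` is an arbitrary choice.

```python
def F(x):
    """CDF from the example: 0 below 0, x^3/8 on [0, 2], 1 above 2."""
    if x < 0:
        return 0.0
    if x <= 2:
        return x**3 / 8
    return 1.0


def f(x):
    """PDF obtained in part (a): 3x^2/8 on [0, 2], zero elsewhere."""
    return 3 * x**2 / 8 if 0 <= x <= 2 else 0.0


def numerical_derivative(G, x, h=1e-5):
    """Central-difference approximation of G'(x)."""
    return (G(x + h) - G(x - h)) / (2 * h)


# The slope of the CDF should match the PDF at interior points.
for x in (0.5, 1.0, 1.5):
    print(f"x = {x}: dF/dx ~ {numerical_derivative(F, x):.6f}, f(x) = {f(x):.6f}")

# Part (c) via the CDF method.
prob = F(1.5) - F(0.5)
print(f"P(0.5 < X < 1.5) = {prob}")
```

The comparison is only made at interior points of $[0, 2]$; at the endpoints the CDF has a kink (at $x = 2$) and a two-sided difference is not meaningful there.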


Example (In-Class Exercise):

A continuous random variable $Y$ has the following cumulative distribution function:

$$F(y) = \begin{cases} 0 & y < 1 \\ a(y-1)^2 & 1 \leq y \leq 3 \\ 1 & y > 3 \end{cases}$$

where $a$ is a positive constant.

  1. Find the value of $a$
  2. Determine the probability density function $f(y)$
  3. Calculate $P(Y > 2)$ using both the CDF and PDF methods

Your solutions:

Numerical Characteristics - Mean, Variance, and Transformations

From Discrete Sums to Continuous Integrals: The Natural Evolution

In our study of discrete random variables, we learned to calculate expected values using weighted sums:

$$E(X) = \sum_{i} x_i \cdot P(X = x_i)$$

But what happens when we transition to continuous variables where $P(X = x_i) = 0$ for any specific value? The answer lies in a beautiful mathematical evolution: sums become integrals.

Definition (Expected Value and Variance for Continuous Random Variables):

For a continuous random variable $X$ with PDF $f(x)$:

Expected Value (Mean):

$$E(X) = \int_{-\infty}^{\infty} x \cdot f(x) \, dx$$

This represents the “center of gravity” or average value of the distribution.

Variance:

$$\text{Var}(X) = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x) \, dx = E(X^2) - [E(X)]^2$$

Here $\mu = E(X)$. The variance measures the “spread” or dispersion around the mean.

Expected Value of a Function:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx$$

This powerful formula allows us to find the expected value of any transformation of $X$.

Theorem (Linear Transformations):

For a continuous random variable $X$ and constants $a$, $b$:

  • $E(aX + b) = aE(X) + b$
  • $\text{Var}(aX + b) = a^2 \text{Var}(X)$
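A short derivation sketch of both identities, using only the integral definitions of expectation and variance and writing $\mu = E(X)$ as shorthand:

$$\begin{aligned} E(aX + b) &= \int_{-\infty}^{\infty} (ax + b) f(x) \, dx = a\int_{-\infty}^{\infty} x f(x) \, dx + b\int_{-\infty}^{\infty} f(x) \, dx = aE(X) + b \\ \text{Var}(aX + b) &= E\left[(aX + b - (a\mu + b))^2\right] = E\left[a^2 (X - \mu)^2\right] = a^2 \text{Var}(X) \end{aligned}$$

The shift $b$ cancels inside the variance because it moves every value and the mean by the same amount, which is why it does not appear in the result.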

Example (Beta-type Distribution):

Consider the continuous random variable $X$ with PDF:

$$f(x) = \begin{cases} 6x(1-x) & 0 \leq x \leq 1 \\ 0 & \text{otherwise} \end{cases}$$


Complete Solution:

Part (a): Verify this is a valid PDF

Check that $\int_{-\infty}^{\infty} f(x) \, dx = 1$:

$$\begin{aligned} \int_0^1 6x(1-x) \, dx &= 6\int_0^1 (x - x^2) \, dx \\ &= 6\left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1 \\ &= 6\left(\frac{1}{2} - \frac{1}{3}\right) = 6 \cdot \frac{1}{6} = 1 \end{aligned}$$

Part (b): Calculate $E(X)$

$$\begin{aligned} E(X) &= \int_0^1 x \cdot 6x(1-x) \, dx = 6\int_0^1 (x^2 - x^3) \, dx \\ &= 6\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = 6\left(\frac{1}{3} - \frac{1}{4}\right) = \boxed{\frac{1}{2}} \end{aligned}$$

Part (c): Calculate $\text{Var}(X)$

Step 1: Find $E(X^2)$

$$\begin{aligned} E(X^2) &= \int_0^1 x^2 \cdot 6x(1-x) \, dx = 6\int_0^1 (x^3 - x^4) \, dx \\ &= 6\left[\frac{x^4}{4} - \frac{x^5}{5}\right]_0^1 = 6\left(\frac{1}{4} - \frac{1}{5}\right) = \frac{3}{10} \end{aligned}$$

Step 2: Apply variance formula

$$\text{Var}(X) = E(X^2) - [E(X)]^2 = \frac{3}{10} - \left(\frac{1}{2}\right)^2 = \frac{3}{10} - \frac{1}{4} = \boxed{\frac{1}{20}}$$
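Because this PDF is a polynomial, the three integrals can be reproduced exactly with rational arithmetic. This is a sketch under one assumption: the PDF is stored as a hypothetical `{power: coefficient}` dictionary, and each term is integrated over $[0, 1]$ using $\int_0^1 x^n \, dx = \frac{1}{n+1}$.

```python
from fractions import Fraction

# f(x) = 6x(1 - x) = 6x - 6x^2 on [0, 1], stored as {power: coefficient}.
pdf_coeffs = {1: Fraction(6), 2: Fraction(-6)}


def moment(k):
    """E(X^k): integrate x^k * f(x) over [0, 1] term by term."""
    return sum(c / (p + k + 1) for p, c in pdf_coeffs.items())


total = moment(0)                # total probability
mean = moment(1)                 # E(X)
second_moment = moment(2)        # E(X^2)
variance = second_moment - mean**2

print(f"total = {total}, E(X) = {mean}, E(X^2) = {second_moment}, Var(X) = {variance}")
```

Using `Fraction` avoids floating-point rounding, so the output can be compared directly with the boxed fractions above.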


Percentiles and Mode - Distributional Landmarks

Definition (Percentiles and Quantiles):

The $p$-th percentile (or quantile) of a continuous distribution is the value $q_p$ such that:

$$P(X \leq q_p) = F(q_p) = \frac{p}{100}$$

Interpretation: $p\%$ of the probability mass lies to the left of $q_p$, and $(100-p)\%$ lies to the right.

Special cases:

  • Median ($q_{50}$): The “middle” value where $F(q_{50}) = 0.5$
  • First Quartile ($q_{25}$): 25% of values are below this point
  • Third Quartile ($q_{75}$): 75% of values are below this point
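When $F(q_p) = \frac{p}{100}$ cannot be solved by hand, the quantile can be located numerically. The sketch below uses bisection on the CDF $F(x) = \frac{x^3}{8}$ from the earlier example; the function name `quantile` and the tolerance are illustrative choices, and the method assumes the CDF is increasing on the search interval.

```python
def F(x):
    """CDF from the earlier example: x^3/8 on [0, 2]."""
    if x < 0:
        return 0.0
    if x <= 2:
        return x**3 / 8
    return 1.0


def quantile(cdf, p, lo, hi, tol=1e-10):
    """Find q with cdf(q) = p/100 by bisection on [lo, hi]."""
    target = p / 100
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < target:
            lo = mid   # q lies to the right of mid
        else:
            hi = mid   # q lies at or to the left of mid
    return (lo + hi) / 2


q25 = quantile(F, 25, 0, 2)
q50 = quantile(F, 50, 0, 2)
q75 = quantile(F, 75, 0, 2)

print(f"Q1 = {q25:.4f}, median = {q50:.4f}, Q3 = {q75:.4f}, IQR = {q75 - q25:.4f}")
```

For this CDF the equation can also be solved directly, since $\frac{q^3}{8} = \frac{p}{100}$ gives $q = \sqrt[3]{8p/100}$, which provides a check on the numerical answer.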

Definition (Mode):

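The mode of a continuous distribution is the value of $x$ that maximizes the PDF $f(x)$. When the maximum occurs at an interior point where $f$ is twice differentiable, it can be found by solving:

$$\frac{d}{dx} f(x) = 0 \text{ and } \frac{d^2}{dx^2} f(x) < 0$$

If $f$ is monotonic on its support, the maximum sits at an endpoint instead, so the endpoint values should always be compared as well.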

The mode represents the most “dense” point of the distribution - where the probability is most concentrated.
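A crude numerical way to locate a mode is to evaluate the PDF on a fine grid and take the largest value. The sketch below applies this to two PDFs from earlier examples; `argmax_on_grid` is a hypothetical helper, and a grid search is only an approximation chosen here for illustration.

```python
def argmax_on_grid(f, lo, hi, n=100_000):
    """Return the grid point in [lo, hi] where f is largest."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return max(xs, key=f)


# Interior maximum: f(x) = 6x(1 - x) on [0, 1] peaks where f'(x) = 0.
mode_interior = argmax_on_grid(lambda x: 6 * x * (1 - x), 0, 1)

# Endpoint maximum: f(x) = 2x/9 on [0, 3] is increasing, so f'(x) = 0 has no solution.
mode_endpoint = argmax_on_grid(lambda x: 2 * x / 9, 0, 3)

print(f"mode of 6x(1 - x): {mode_interior}")
print(f"mode of 2x/9:      {mode_endpoint}")
```

The second case shows why the derivative condition alone is not enough: the density $\frac{2x}{9}$ has no stationary point, yet it still has a mode at the right end of its support.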

Assessing Distribution Shape: Understanding Skewness

[Figure: Skewed distributions comparison]

Example (Visual Skewness Detection Practice):

Look at these three distributions. Mark the approximate positions of the mode, median, and mean on each, and identify its skewness:

Distribution A: Right-skewed (long right tail)

Distribution B: Symmetric (bell-shaped)

Distribution C: Left-skewed (long left tail)

Example (In-Class Exercise):

A continuous random variable $Z$ has probability density function:

$$f(z) = \begin{cases} ce^{-2z} & z \geq 0 \\ 0 & z < 0 \end{cases}$$

where $c$ is a positive constant.

  1. Find the value of $c$
  2. Calculate $E(Z)$ and $\text{Var}(Z)$
  3. Find $P(Z > E(Z))$ and comment on this result
  4. Find $E(3Z + 2)$ and $\text{Var}(3Z + 2)$ using transformation properties
  5. Calculate $P(Z > 1 \mid Z > 0.5)$ and interpret this result
  6. Calculate the median and interquartile range of $Z$
  7. Sketch $f(z)$, mark the mode, median, and mean of the distribution on the graph, and discuss its skewness

Your solutions:

(Optional) From Poisson to Exponential Distribution

Key Insight: We know how to count events, but what about waiting times between events?

Scenario: Suppose a process (like calls to a customer service center) is a Poisson process with rate $\lambda$.

Define: Let $T$ be the random variable representing the waiting time until the first event occurs.

Challenge Mission: Find the probability distribution of random variable $T$

  1. Find the cumulative distribution function $F_T(t) = P(T \leq t)$
  2. Find the probability density function $f_T(t)$ by differentiating $F_T(t)$
  3. Calculate the expected value of $T$

Step 1: Finding the Cumulative Distribution Function

Setup: Let $N(t)$ be the number of events that occur in the time interval $[0, t]$.

We know: $N(t) \sim \text{Po}(\lambda t)$

Key Question: What does $F_T(t) = P(T \leq t)$ represent?

Answer: “The probability that the first event occurs within time $t$”

Strategic Insight: Use the complement approach!

“First event occurs within time $t$” $\Leftrightarrow$ “At least one event occurs within time $t$”

Therefore: $P(T \leq t) = 1 - P(\text{no events occur within time } t) = 1 - P(N(t) = 0)$
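The equivalence behind the complement approach can be checked by simulation without knowing the closed form. The sketch below rests on an assumption made here for illustration: the Poisson process is approximated by cutting time into small slots of length `dt`, each containing an event with probability `rate * dt`. The function name and all parameter values are hypothetical.

```python
import random


def simulate_event_times(rate, horizon, dt=0.01):
    """Approximate a Poisson process on [0, horizon].

    Assumption: each slot of length dt independently contains
    one event with probability rate * dt.
    """
    n_slots = round(horizon / dt)
    return [(i + 1) * dt for i in range(n_slots) if random.random() < rate * dt]


random.seed(0)
rate, t, horizon, runs = 2.0, 0.5, 2.0, 2000

first_within_t = 0   # runs where T <= t
at_least_one = 0     # runs where N(t) >= 1

for _ in range(runs):
    events = simulate_event_times(rate, horizon)
    waiting_time = events[0] if events else float("inf")   # T
    count_by_t = sum(1 for s in events if s <= t)           # N(t)
    first_within_t += waiting_time <= t
    at_least_one += count_by_t >= 1

print(f"estimated P(T <= t)     = {first_within_t / runs:.3f}")
print(f"estimated P(N(t) >= 1)  = {at_least_one / runs:.3f}")
```

The two counts agree in every run, because the first event falls within time $t$ exactly when at least one event has occurred by time $t$. The estimated probability can then be compared with the formula obtained for $F_T(t)$.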

Your Turn: Calculate $P(N(t) = 0)$ using the Poisson PMF and find $F_T(t)$.

For $t \geq 0$,

$$F_T(t) = P(T \leq t) = 1 - P(N(t) = 0) = \_\_\_$$

Step 2: Finding the Probability Density Function

Recall: For continuous random variables, $f_T(t) = \frac{d}{dt} F_T(t)$

Your Turn: Differentiate the CDF you found in Step 1 to find the PDF.

$$f_T(t) = \frac{d}{dt} F_T(t) = \_\_\_ \quad \text{for } t \geq 0$$

Step 3: Computing the Expected Value

Your Turn: Using the PDF you found in Step 2, calculate $E(T)$.

$$E(T) = \int_0^{\infty} t \cdot f_T(t) \, dt = \_\_\_$$

Hint: Use integration by parts.