Discrete Random Variables and PMF/CDF – One-page Summary
- Discrete random variable (X): maps outcomes of an experiment to countable values such as (0, 1, 2, \dots).
- PMF (p_X(k) = P(X = k)): probability assigned to each value (k).
- Properties: (0 \le p_X(k) \le 1), (\sum_k p_X(k) = 1).
- CDF (F(x) = P(X \le x)): for discrete variables, this is a step-shaped cumulative function.
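The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the fair-die PMF and the helper names `is_valid_pmf` and `cdf` are assumptions introduced here and do not come from the summary.

```python
# Hypothetical PMF for a fair six-sided die (illustrative values only).
pmf = {k: 1 / 6 for k in range(1, 7)}

def is_valid_pmf(p, tol=1e-12):
    """Check 0 <= p(k) <= 1 for every k and that the probabilities sum to 1."""
    in_range = all(0.0 <= v <= 1.0 for v in p.values())
    return in_range and abs(sum(p.values()) - 1.0) < tol

def cdf(p, x):
    """F(x) = P(X <= x): add up the PMF over all support points k <= x."""
    return sum(v for k, v in p.items() if k <= x)

print(is_valid_pmf(pmf))            # both PMF properties hold
print(cdf(pmf, 3), cdf(pmf, 3.5))   # same value: F is flat between jumps
```

Evaluating `cdf` between two support points returns the value at the lower one, which is the step shape described above.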
Practical tips
- Always distinguish the PMF (model) from a frequency table (sample).
- Use complements to compute cumulative probabilities quickly; for integer-valued (X):
(P(X \ge a) = 1 - P(X \le a - 1)).
Small Example (PMF vs. Observed Frequencies)
Suppose we test an antihypertensive drug on 4 patients.
The manufacturer provides an expected PMF for the possible response counts, and frequencies from 100 clinics show a similar pattern.
This qualitative agreement suggests that the model is reasonable.
Formulas for Mean, Variance, Moments, and CDF
- Expectation (distribution mean):
\[\mu = E[X] = \sum_k k \, p_X(k)\]
The sample mean is an estimator of (\mu), and by the law of large numbers (LLN) it converges to (\mu) as the sample size grows.
- Variance:
\[\sigma^2 = \sum_k (k - \mu)^2 p_X(k) = E[X^2] - (E[X])^2\]
- Moments:
- (m)-th moment: (E[X^m]).
- Central moments: (E[(X - \mu)^m]). The first central moment is 0, the second is the variance, and the standardized third central moment is a measure of skewness.
- CDF (cumulative distribution function):
\[F(x) = P(X \le x)\]
For a discrete random variable, the CDF is a step function that jumps by (p_X(k)) at each value (k) in the support.
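The formulas in this section can be evaluated directly from a PMF stored as a dictionary. The sketch below uses a made-up PMF on {0, 1, 2, 3}; the values and the helper names `raw_moment` and `central_moment` are assumptions chosen for illustration.

```python
# Hypothetical PMF (illustrative values, not taken from the summary).
pmf = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

def raw_moment(p, m):
    """m-th moment E[X^m] = sum_k k^m p(k)."""
    return sum(k**m * v for k, v in p.items())

def central_moment(p, m):
    """m-th central moment E[(X - mu)^m]."""
    mu = raw_moment(p, 1)
    return sum((k - mu)**m * v for k, v in p.items())

mu = raw_moment(pmf, 1)                            # E[X]
var_def = central_moment(pmf, 2)                   # sum (k - mu)^2 p(k)
var_shortcut = raw_moment(pmf, 2) - mu**2          # E[X^2] - (E[X])^2
skewness = central_moment(pmf, 3) / var_def**1.5   # standardized 3rd central moment

print(mu, var_def, var_shortcut, skewness)
```

For this PMF both variance expressions agree, and the positive skewness reflects the longer right tail.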
Combinatorics Refresher (Preparation for the Binomial)
- Permutation (order matters):
\[P(n, k) = \frac{n!}{(n - k)!}\]
- Combination (order does not matter):
\[\binom{n}{k} = \frac{n!}{k!(n - k)!}\]
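Both counts are available in Python's standard library (`math.perm` and `math.comb`, Python 3.8 or later). The short sketch below, with (n = 10) and (k = 3) chosen as an illustration, also checks the link between the two: each unordered selection can be ordered in (k!) ways.

```python
import math

n, k = 10, 3  # illustrative values

n_perm = math.perm(n, k)   # n! / (n - k)!        ordered selections
n_comb = math.comb(n, k)   # n! / (k! (n - k)!)   unordered selections

# Every combination corresponds to k! permutations.
same = n_perm == n_comb * math.factorial(k)

print(n_perm, n_comb, same)  # 720 120 True
```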
Class mini example
- Selecting 3 students (same role) out of 10. Probability that a particular student is selected:
\[\frac{\binom{9}{2}}{\binom{10}{3}} = \frac{36}{120} = 0.3.\]
- Assigning three different roles to 3 students out of 10. The probability that a specific student is assigned the “presentation” role is
\[\frac{P(9, 2)}{P(10, 3)} = \frac{72}{720} = \frac{1}{10}.\]
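Both answers in the mini example can be checked by brute-force enumeration. The sketch below is one way to do that; labelling the students 0 to 9, treating student 0 as the particular student, and treating the first position of each ordered triple as the “presentation” role are all assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations

students = range(10)  # student labels 0..9
target = 0            # the "particular" student

# Case 1: unordered groups of 3 (same role for everyone).
groups = list(combinations(students, 3))
p_selected = Fraction(sum(target in g for g in groups), len(groups))

# Case 2: ordered triples; position 0 stands for the "presentation" role.
assignments = list(permutations(students, 3))
p_presenter = Fraction(sum(a[0] == target for a in assignments), len(assignments))

print(p_selected, p_presenter)  # 3/10 1/10
```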
Binomial Distribution (\mathrm{Binomial}(n, p))
Definition and PMF
- Background: the binomial distribution models a sequence of trials with only two outcomes (for example, coin toss: success / failure). Under suitable conditions it can be approximated by a normal distribution, so it also serves as a bridge to the normal distribution.
- Idea: we perform (n) independent trials, each with success probability (p), and count how many times the event of interest occurs.
- Assumptions:
- (n) independent trials,
- constant success probability (p) on each trial,
- failure probability (q = 1 - p).
- PMF:
\[P(X = k) = \binom{n}{k} p^k q^{\,n - k}, \quad k = 0, 1, \dots, n.\]
- Mean (E[X] = np), variance (\mathrm{Var}(X) = npq).
Normalization check
\(\sum_{k = 0}^{n} \binom{n}{k} p^k q^{n - k} = (p + q)^n = 1.\)
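The PMF, the normalization identity, and the mean and variance formulas can be verified numerically. This is a minimal sketch; the function name `binom_pmf` and the parameters (n = 10), (p = 0.3) are assumptions picked for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) p^k q^(n - k) with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3  # illustrative parameters
probs = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(probs)                                            # should equal (p + q)^n = 1
mean = sum(k * pk for k, pk in enumerate(probs))              # should equal n p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs)) # should equal n p q

print(total, mean, var)
```

Up to floating-point rounding, the three printed values match 1, (np = 3), and (npq = 2.1).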
Simulation Histograms (Empirical Probabilities)
[Figure: simulated histograms of binomial counts, one panel each for (n = 10,\ p = 0.05), (n = 10,\ p = 0.95), and (n = 10,\ p = 0.50).]
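Empirical probabilities for these three settings can be produced with a short simulation. The sketch below uses only the standard library; the function name `simulate_binomial`, the number of repetitions, and the seed are assumptions, and the text histogram merely stands in for a plot.

```python
import random
from collections import Counter

def simulate_binomial(n, p, reps, seed=0):
    """Return empirical frequencies of the success count over `reps` runs of n trials."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = Counter(sum(rng.random() < p for _ in range(n)) for _ in range(reps))
    return {k: counts[k] / reps for k in range(n + 1)}

for p in (0.05, 0.95, 0.50):
    freqs = simulate_binomial(10, p, reps=10_000)
    print(f"n = 10, p = {p}")
    for k, f in freqs.items():
        print(f"{k:2d} {f:.3f} {'#' * round(50 * f)}")
```

With (p = 0.05) the mass piles up near 0, with (p = 0.95) near 10, and with (p = 0.50) the histogram is roughly symmetric around 5.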

Example 1 – Sex at Birth
Suppose the probability of a male birth is (p = 0.51).
What is the probability of having exactly 2 sons among 5 children?
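Treating the 5 births as independent trials with the number of sons (X \sim \mathrm{Binomial}(5, 0.51)), the question reduces to evaluating the binomial PMF at (k = 2). A minimal sketch of that calculation (variable names are chosen here for illustration):

```python
from math import comb

n, k, p = 5, 2, 0.51  # 5 children, exactly 2 sons, P(male birth) = 0.51

# P(X = 2) = C(5, 2) * 0.51^2 * 0.49^3
prob = comb(n, k) * p**k * (1 - p)**(n - k)

print(round(prob, 3))  # 0.306
```

So the probability of exactly 2 sons among 5 children is about 0.306 under this model.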
Example 2 – Infant Bronchitis (“At Least 3 Cases?”)
Assume the national average probability of infant bronchitis is (p = 0.05).
Consider 20 independent families; then the number of cases (X) follows (\mathrm{Binomial}(20, 0.05)).
The probability of observing at least 3 cases is
\[\begin{aligned} P(X \ge 3) &= 1 - \big[ P(X = 0) + P(X = 1) + P(X = 2) \big] \\ &\approx 1 - \big(0.3585 + 0.3774 + 0.1887\big) \\ &\approx \mathbf{0.075}. \end{aligned}\]
Interpretation: the chance of seeing 3 or more cases purely by random variation is about 7.5%.
Whether this is considered “unusually high” depends on the chosen significance level and the broader context (multiple comparisons, prior expectations, etc.).
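The complement calculation for this example can be reproduced at full floating-point precision rather than from rounded terms. The sketch assumes (X \sim \mathrm{Binomial}(20, 0.05)); the helper name `binom_pmf` is introduced here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.05

# P(X >= 3) = 1 - P(X <= 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
p_at_most_2 = sum(binom_pmf(k, n, p) for k in range(3))
p_at_least_3 = 1 - p_at_most_2

print([round(binom_pmf(k, n, p), 4) for k in range(3)])  # [0.3585, 0.3774, 0.1887]
print(round(p_at_least_3, 4))                            # 0.0755
```

Summing three terms and subtracting from 1 is much shorter than adding the 18 terms from (k = 3) to (k = 20) directly.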
Key Points and Cheat Sheet
- Use PMF / CDF to model discrete probabilities and compare them to sample frequencies.
- Mean and variance describe location and spread; sample statistics are their estimators.
- Moments and skewness help characterize tail behavior and asymmetry.
- The binomial distribution is the basic model for repeated “success / failure” experiments.
- For cumulative binomial probabilities, use complements whenever convenient:
(P(X \ge a) = 1 - P(X \le a - 1)).