$
\def\P{\mathsf{P}}
\def\R{\mathbb{R}}
\def\c{\,|\,}
$

Some knowledge of combinatorics is essential for probability. For example, computing the probability $\P(E)$ in the classical model on finite sample spaces $\P(E)=|E|/|\Omega|$ is equivalent to the combinatorial problem of enumerating the elements in $E$ and $\Omega$.

In the case of a two-stage experiment where stage 1 has $k$ outcomes and stage 2 has $l$ outcomes and every combination of results in the two stages is possible, the total number of combinations of results of stage 1 and 2 is $k\cdot l$. A formal generalization appears below.

expand.grid(S1 = 1:2, S2 = 1:3, S3 = 1:2)

## S1 S2 S3 ## 1 1 1 1 ## 2 2 1 1 ## 3 1 2 1 ## 4 2 2 1 ## 5 1 3 1 ## 6 2 3 1 ## 7 1 1 2 ## 8 2 1 2 ## 9 1 2 2 ## 10 2 2 2 ## 11 1 3 2 ## 12 2 3 2

We refer to the function $f(n)=n!$ as the factorial function and to $\begin{pmatrix}n\\r\end{pmatrix}$ as $n$-choose-$r$.

The factorial function grows very rapidly as $n\to\infty$. The R code below shows the considerable magnitude of $n!$ even for small $n$.

factorial(1:8)

## [1] 1 2 6 24 120 720 5040 ## [8] 40320

The following proposition shows that the growth rate of $n!$ is similar to the growth rate of $(n/e)^n$ (see Chapter B in the appendix for a definition of the limit notation below).

Proofs are available in (Feller, 1968) or (Rudin, 1976).

The proposition above is illustrated by the following graph comparing the growth rate on a log scale of the factorial with Stirling's approximation, an exponential function, and a linear function $x$. Stirling's approximation overlaps the factorial line indicating extremely good approximation.

x = 1:100 R = stack(list(`$x!$` = lfactorial(x), Stirling = log(2 * pi)/2 + (x + 1/2) * log(x) - x, `exp($x$)` = x, `$x$` = log(x))) names(R) = c("lf", "Function") R$x = x qplot(x, lf, color = Function, lty = Function, geom = "line", xlab = "$x$", ylab = "$\\log(f(x))$", data = R, size = I(2), main = "Growth Rate on Log Scale")

The following code generates all possible orderings of the letters a, b, and c. There are $3!=6$ such orderings.

# generate all 6 permutations over three letters library(gtools) permutations(3, 3, letters[1:3])

## [,1] [,2] [,3] ## [1,] "a" "b" "c" ## [2,] "a" "c" "b" ## [3,] "b" "a" "c" ## [4,] "b" "c" "a" ## [5,] "c" "a" "b" ## [6,] "c" "b" "a"

r = 1:50 p = exp(lfactorial(365) - lfactorial(365 - r) - r * log(365)) qplot(main = "Probability of $r$ People Having All Different Birthdays", x = r, y = p, size = I(2), xlab = "$r$", ylab = "$P(A_r)$")

# list all possible combinations of 3 out of 5 # letters combn(letters[1:5], 3)

## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] ## [1,] "a" "a" "a" "a" "a" "a" "b" "b" "b" ## [2,] "b" "b" "b" "c" "c" "d" "c" "c" "d" ## [3,] "c" "d" "e" "d" "e" "e" "d" "e" "e" ## [,10] ## [1,] "c" ## [2,] "d" ## [3,] "e"

The description above corresponds to choosing $r$ elements out of $n$ distinct elements, or alternatively placing $n$ distinct elements into two bins --- one with $r$ elements and one with $n-r$ elements. A useful generalization is placing $n$ distinct elements in $k$ bins with $r_i$ elements being placed at bin $i$, for $i=1,\ldots,k$.