Cumulative Distribution Function

What is a Cumulative Distribution Function?

A cumulative distribution function (CDF) gives the probability that a random variable takes a value at or below a given point. From it you can also derive the probability that the variable falls above a point or between two points. Similar to a cumulative frequency table, which counts the accumulated frequency of an occurrence up to a certain value, the CDF tracks the cumulative probability up to a certain threshold.

In algebraic terms, this function gives the probability accumulated from negative infinity up to a value x of the random variable X. It is expressed as:

F(x) = P(X≤x)
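
To make the frequency-table analogy concrete, here is a minimal sketch of an empirical CDF built from a small sample. The helper name empirical_cdf and the sample values are illustrative assumptions, not part of any particular library.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


# Hypothetical sample of five observations
F = empirical_cdf([2, 3, 3, 5, 8])
print(F(3))  # 3 of the 5 values are <= 3, so this prints 0.6
```

As with any CDF, F is 0 below the smallest observation, never decreases, and reaches 1 at the largest observation.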


How is a Cumulative Distribution Function Used?

Besides finding the probability that a random variable falls below a point or between two points, you can find the probability that it exceeds a particular threshold. This last quantity is given by the complementary cumulative distribution function, or tail distribution, and is quite useful in hypothesis testing.
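
All three quantities follow directly from the definition F(x) = P(X ≤ x). As a short worked statement, with a and b denoting two arbitrary thresholds where a < b (symbols introduced here for illustration):

```latex
\begin{aligned}
P(X \le a)     &= F(a)        && \text{(below a point)} \\
P(X > a)       &= 1 - F(a)    && \text{(above a point: the complementary CDF)} \\
P(a < X \le b) &= F(b) - F(a) && \text{(between two points)}
\end{aligned}
```

For a discrete variable, whether each endpoint is included matters: P(X < a) equals F(a - 1) for integer-valued X, not F(a).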

For a simple example, if a machine learning logistics program for a hospital used a CDF of the number of venomous snake bite patients admitted per year, you could determine:

  • The probability of receiving more than 12 snake bite patients per year.
  • The probability of receiving fewer than 12 snake bite patients per year.
  • The probability of receiving between 12 and 15 snake bite patients per year.

In this example, the hospital could more accurately predict how many antivenom doses it should keep in stock.
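
The three probabilities listed above can be sketched in a few lines of code. This sketch assumes, purely for illustration, that yearly snake bite admissions follow a Poisson distribution with a mean of 10 per year. Both the choice of distribution and the mean are hypothetical, as is the helper name poisson_cdf. It also assumes "between 12 and 15" includes both endpoints.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), summed directly from the PMF."""
    if k < 0:
        return 0.0
    return sum(
        math.exp(-mean) * mean**i / math.factorial(i)
        for i in range(k + 1)
    )


# Hypothetical assumption: an average of 10 snake bite patients per year
MEAN_PER_YEAR = 10.0

# More than 12 patients: the complementary CDF, 1 - F(12)
p_more_than_12 = 1.0 - poisson_cdf(12, MEAN_PER_YEAR)

# Fewer than 12 patients: counts are integers, so this is F(11)
p_fewer_than_12 = poisson_cdf(11, MEAN_PER_YEAR)

# Between 12 and 15 patients, inclusive: F(15) - F(11)
p_12_to_15 = poisson_cdf(15, MEAN_PER_YEAR) - poisson_cdf(11, MEAN_PER_YEAR)

print(f"P(X > 12)        = {p_more_than_12:.4f}")
print(f"P(X < 12)        = {p_fewer_than_12:.4f}")
print(f"P(12 <= X <= 15) = {p_12_to_15:.4f}")
```

In practice the distribution and its parameters would be estimated from the hospital's own admission records rather than assumed.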