# Probability Theory

## What is Probability Theory?

Probability theory is the branch of mathematics that models uncertain, or non-deterministic, processes. It describes them in terms of a probability space, which consists of a sample space (the set of all possible outcomes), a collection of events (sets of outcomes), and a probability measure that assigns each event a value between 0 and 1. Common tools in probability theory include discrete and continuous random variables, probability distributions, and stochastic processes.

## How does Probability Theory work?

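The foundation of probability theory is the idea that every event, meaning a set of outcomes drawn from the sample space, is assigned a number between 0 and 1. This number represents the likelihood that the event will occur: an impossible event has probability 0 and a certain event has probability 1. The probabilities of all the outcomes in a discrete sample space sum to 1, and the assignment of probabilities to outcomes is known as a probability distribution.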
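As a minimal sketch of these ideas, the example below builds a probability distribution for one roll of a fair six-sided die. The die, the names `sample_space`, `distribution`, and `probability`, and the use of exact fractions are all illustrative assumptions, not part of any standard API.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (an illustrative choice).
sample_space = [1, 2, 3, 4, 5, 6]

# Probability distribution: every outcome gets the same probability, 1/6.
distribution = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

def probability(event):
    """Return the probability of an event, given as a set of outcomes."""
    return sum(distribution[outcome] for outcome in event)

total = sum(distribution.values())  # probabilities over the whole sample space
p_even = probability({2, 4, 6})     # event: "the roll is even"

print(total)   # 1
print(p_even)  # 1/2
```

Each individual probability lies between 0 and 1, and together they sum to exactly 1, which is what makes the assignment a valid distribution.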

## Applications of Probability Theory

Probability is a cornerstone of mathematics and is widely used to represent outcomes in both abstract and real-world settings. Below are two examples: one that uses probability theory in an abstract sense to understand phenomena, and one that is a real-world application.

### Law of Large Numbers

The law of large numbers is a mathematical theorem that describes the outcome of an experiment repeated a large number of times. Imagine tossing a fair coin, where the probability of landing on "heads" and the probability of landing on "tails" are each 0.5. The law of large numbers states that as the number of tosses grows, the proportion of "heads" approaches 0.5, so the ratio of "heads" to "tails" approaches unity. In short, after enough tosses the shares of "heads" and "tails" are likely to be nearly even, although the absolute difference between the two counts need not shrink. Because the law of large numbers is proved mathematically from the axioms of probability rather than inferred from observation, it is a theorem rather than a theory, and it is central to modern statistics.
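The coin-toss example can be explored with a short simulation. This is a rough sketch, assuming a fair coin modeled with Python's pseudo-random number generator. The function name `heads_proportion` and the fixed seed are illustrative choices.

```python
import random

def heads_proportion(num_tosses, seed=0):
    """Simulate fair coin tosses and return the fraction that land heads."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

# The proportion of heads tends to settle near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, heads_proportion(n))
```

With only 10 tosses the proportion can be far from 0.5, while with 100,000 tosses it is typically within a fraction of a percentage point of it.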


### Probability Theory and Machine Learning

Probability theory is incorporated into machine learning, the subset of artificial intelligence concerned with predicting outcomes and making decisions. In neural networks, the softmax function converts a vector of raw scores into values between 0 and 1 that sum to 1, so they can be read as probabilities. Functions of this kind, sometimes called squashing functions, are useful when an algorithm needs to assign a probability to each possible outcome. Softmax is often the final step in a neural network used for classification, where its output values indicate how strongly the network favors each outcome.
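A minimal sketch of softmax in plain Python follows. The input scores in `logits` are hypothetical, and subtracting the maximum score before exponentiating is a common numerical-stability technique assumed here rather than something the text above prescribes.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical raw outputs of a network's last layer
probs = softmax(logits)
print(probs)
```

Larger scores receive larger probabilities, and the outputs always sum to 1 (up to floating-point rounding), so they can be treated as a probability distribution over the possible outcomes.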