What is Probability Theory?
How does Probability Theory work?
Applications of Probability Theory
Law of Large Numbers
The law of large numbers is a mathematical theorem that describes the outcome of an experiment when it is repeated a large number of times. Imagine tossing a fair coin, where the probability of landing on "heads" and on "tails" is 0.5 each. The law of large numbers states that as the number of tosses grows, the proportion of "heads" approaches 0.5, so the ratio of "heads" to "tails" approaches unity. In short, after enough tosses the shares of "heads" and "tails" are likely to be nearly even, although the difference between the raw counts need not shrink. Because the law of large numbers is proven mathematically rather than inferred from observation, it is a theorem rather than a theory, and it is central to modern statistics.
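The coin toss described above can be simulated in a few lines. This is a minimal sketch, assuming a fair coin modeled with Python's standard random module; the function name running_heads_fraction, the seed, and the number of tosses are illustrative choices and do not come from the article.

```python
import random

def running_heads_fraction(num_tosses, seed=0):
    """Return the fraction of heads observed after each toss of a fair coin."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    heads = 0
    fractions = []
    for toss in range(1, num_tosses + 1):
        if rng.random() < 0.5:  # treat values below 0.5 as "heads"
            heads += 1
        fractions.append(heads / toss)
    return fractions

fractions = running_heads_fraction(100_000)

# Print the running fraction of heads at a few checkpoints.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(fractions[n - 1], 4))
```

With only a handful of tosses the fraction of heads can be far from 0.5, but it should settle close to 0.5 as the number of tosses increases, which is the behavior the law of large numbers describes.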
By Pred - Own work, CC0, https://commons.wikimedia.org/w/index.php?curid=58536069
Probability Theory and Machine Learning
Probability theory is incorporated into machine learning, particularly in the parts of artificial intelligence concerned with predicting outcomes and making decisions. In neural networks, the softmax function converts a set of raw scores into values between 0 and 1 that sum to 1, so they can be read as probabilities. Functions of this kind, sometimes called squashing functions, are useful when an algorithm needs to assign a probability to each possible outcome. The resulting probabilities help the neural network choose among outcomes, and softmax is often the final step in a neural network.
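To make the softmax step concrete, here is a minimal sketch in plain Python. The function name softmax and the example scores are illustrative assumptions, and subtracting the largest score before exponentiating is a common numerical-stability trick rather than something the article specifies.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three possible outcomes.
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # → [0.659, 0.242, 0.099]
```

Each output lies between 0 and 1, the outputs sum to 1, and the largest score receives the largest probability, so the result can be read as the network's probability for each outcome.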