What is Logistic Regression?
Logistic regression is a statistical model that estimates the probability of a binary dependent variable from one or more independent variables. Binary means that there are only two potential outcomes for a given input. For example, it may be used to determine whether an email is spam or not, using the rate of misspelled words, a common sign of spam. Other forms of regression analysis, like linear regression, are poorly suited to this task: a linear model's output is unbounded, so it can fall below 0 or above 1 and cannot be read directly as a probability. Logistic regression instead produces a probability between 0 and 1, and a threshold is then applied to that probability to make the distinct classification (e.g. below 50% = not spam, 50% or above = spam).
How does Logistic Regression work?
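A commonly used model is the sigmoid function, also known as the logistic function. The sigmoid is a squashing function: it accepts any real number as input, and its outputs are contained between the boundaries of 0 and 1. Here, we can use the model in its standard form:
p = 1 / (1 + e^(-z))
where z is a linear combination of the inputs (for a single input x, z = b0 + b1*x) and p is the estimated probability of the positive outcome.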
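As a rough illustration, the sigmoid model can be sketched in a few lines of Python. The function name spam_probability and the coefficients b0 = -4.0 and b1 = 8.0 are assumptions made up for this example (they are not fitted to any data); they were chosen so that a misspelling rate of 0.5 maps to a probability of exactly 0.5.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number z into the range between 0 and 1."""
    # Split on the sign of z so math.exp never receives a large positive
    # argument, which would overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def spam_probability(misspelled_rate: float,
                     b0: float = -4.0,
                     b1: float = 8.0) -> float:
    """Estimated probability of spam for a given rate of misspelled words.

    b0 (intercept) and b1 (slope) are hypothetical values for illustration.
    """
    return sigmoid(b0 + b1 * misspelled_rate)


# The probability rises smoothly from near 0 to near 1 as the rate grows.
for rate in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"rate={rate:.2f} -> P(spam)={spam_probability(rate):.3f}")
```

However large or small the input becomes, the output stays between 0 and 1, which is what allows it to be read as a probability.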
Logistic Regression and Machine Learning
Because logistic regression outputs a probability, it is widely used for classification in machine learning, and the same sigmoid function often serves as the output layer of neural networks that perform binary classification. A machine learning algorithm can take a given data set, learn the weights and bias that best fit it, and then, based upon a defined decision boundary (commonly a probability of 0.5), make predictions about which class a new input belongs to.
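The learning step described above can be sketched as follows. This is a minimal example, assuming plain batch gradient descent on the average log loss with a single input feature; the toy data, the learning rate, and the number of epochs are invented for illustration, and production libraries typically use more sophisticated solvers.

```python
import math


def sigmoid(z):
    """Squash any real number z into the range between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, learning_rate=0.5, epochs=5000):
    """Learn a weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            # Gradient of the log loss for one example: (prediction - label).
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict(x, w, b, threshold=0.5):
    """Return 1 (spam) if the predicted probability reaches the threshold."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Made-up toy data: rate of misspelled words and whether the email was spam.
rates = [0.05, 0.10, 0.20, 0.30, 0.45, 0.55, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(rates, labels)

# With a 0.5 threshold, the decision boundary is where w * x + b = 0.
boundary = -b / w
print(f"w={w:.2f}, b={b:.2f}, decision boundary at rate={boundary:.2f}")
print(predict(0.05, w, b), predict(0.90, w, b))
```

The toy labels overlap slightly (one spam email at 0.45 and one legitimate email at 0.55), so the classes are not perfectly separable and the learned weights settle at finite values rather than growing without bound.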