Saturday, January 11, 2014

Linear Regression vs Logistic Regression

Linear Regression is used to establish a relationship between Dependent and Indipendent variables, which is useful in estimating the resultant dependent variable in case indipendent variable change. For example -

Using a Linear Regression, the relationship between Rain (R) and Umbrella Sales (U) is found to be - U = 2R + 5000

This equation says that for every 1mm of Rain, there is a demand for 5002 umbrellas. So, using Simple Regression, you can estimate the value of your variable.

Logistic Regression on the other hand is used to ascertain the probability of an event. And this event is captured in binary format, i.e. 0 or 1.

Example - I want to ascertain if a customer will buy my product or not. For this, I would run a Logistic Regression on the (relevant) data and my dependent variable would be a binary variable (1=Yes; 0=No).

In terms of graphical representation, Linear Regression gives a linear line as an output, once the values are plotted on the graph. Whereas, the logistic regression gives an S-shaped line.