# Understanding Logistic Regression

**Introduction**

Companies need to make data – led decisions. Often these decisions have simple Yes or No outcomes. In Data parlance we call this a Binary outcome. Some examples of this can be as below

- Will recipients open our Marketing Email OR Not?
- Is an E-mail Spam or Not?
- Should we give an applicant a loan or Not?

Basically, we need a logical method or an algorithm to *classify *the E-mail or applicant into one or two categories. But how do we do this, as there are many factors influencing the ‘Yes’ or ‘No’ decision or the binary outcome. In another words the outcome is a varies or depends on many other variables. The answer to these problems is *Logistic regression.*

**A sample problem**

Let’s take an example of whether an e-mail will be labelled as Spam or not to learn the important terms. Whether an *e-mail is spam or not* can depend on many factors which are also called *Independent Variables*:

- Whether the site sending it is secure or not
- The number of links in the email
- The kind of attachments in the email
- Whether other users have flagged that e-mail as spam
- The IP Address of the sender

In this case the whether the E-mail is opened or not is the dependent variable as it depends on variables like the five listed above.

**Terms used in Logistic Regression**

Technically, Logistic regression is a **machine learning classification** algorithm. It helps us to find the chances of probability of a **categorical dependent variable** like an E-mail being labelled Spam or not with the help of logistic functions.

The outcome of logistic regression is any **binary value** such as Male or Female, Yes or No, 1 or 0, Spam or Not Spam. Nowadays it is widely used for **classifying things**. There are many examples which are evident that machine learning algorithms have made our lives easier as an accurate Logistics regression model helps us take decisions faster once it has been trained on training data.

**Training data** is the data that the algorithm learns from. It contains sample values of dependent and independent variables that the algorithm can be trained on. Once it is trained it can be checked on the test data. This is a new data set that the algorithm is exposed to see if it can make accurate predictions.

**Business Case Study **

**Case Study Santam Insurance saves $2.4 million using Analytics **

The South African Insurance Company Santam used Analytics to process the insurance claims and determine if they are *fraudulent or not*. This is a clear Logistics Regression problem. It enabled Santam to make faster payouts for legitimate claims and reduce losses on fraudulent claims.

Se we see Logistic regression increased the speed and efficiency of the processing methodologies and improved the customer experience. Santam sets a wonderful example how the companies can protect their business from frauds as well as customers from being deceived. Another benefit was that they were able to reduce the on the ground investigators, evaluator and workforce that would manually investigate claims for genuineness.

**How does Logistic regression work?**

Logistic regression uses functions called the *logit functions* to derive a relationship between the dependent variable and independent variables by estimating the best probabilities or chances of occurrence.

The logistic functions (also known as the *sigmoid functions*) convert the probabilities into binary values which could be further used for predictions.

**Logit Functions – Don’t worry if it looks mathematical **

Logistic regression is the estimate of the logit functions which could be calculated as the logarithm of the odd ratios. There are simple functions that help us do this in Excel, R or Python. Or we can also use the formula for the function.

This is the simple logistic regression equation for prediction of probability of the occurrence of interested outcome

**Log-Likelihood Function**

The Log-Likelihood function is as below. This is the function that gives us the final likely hood or result. Such as to give the loan or not. This is the final output we desire of the Logistic regression model.

LL = ∑ Y_{i}*P(X_{i}) + (1 – Y_{i})(1 – P(X_{i}))

**Advantages**

- It is easy to implement and very efficient to train.
- Because of its simplicity it can be implemented quickly.

**Disadvantages**

It cannot solve non-linear problems or problems where we can’t use a line to plot the relationship between variables.

It needs all of its independent variables to be identified properly. This can be a big task in a business where there are 300+ variables in the data and additionally macroeconomic factors such as inflation, dollar exchange rates etc.

** ****Logistic regression using Excel**

We have dataset of 20 similar machines and we want to predict if all of them produce output as per our specifications.

Machine
meets Specification |
Machine
Age (Months) |
Average
Number of shifts per week |

1 | 57 | 4 |

0 | 73 | 5 |

1 | 22 | 5 |

0 | 59 | 4 |

1 | 15 | 4 |

1 | 36 | 2 |

0 | 68 | 5 |

0 | 49 | 5 |

0 | 27 | 7 |

1 | 59 | 3 |

1 | 10 | 6 |

0 | 78 | 8 |

1 | 22 | 6 |

1 | 36 | 4 |

0 | 57 | 7 |

0 | 73 | 8 |

1 | 38 | 5 |

0 | 71 | 7 |

0 | 35 | 4 |

1 | 44 | 5 |

We have assumed our binary output in the following manner: –

1 – Machine meets the specifications

0 – Machine does not meet the specifications

** **

**Step 1**– First we will **sort our data**

Using excel sorting tool just sort the data on the basis of dependent variable.

Machine
meets Specification |
Machine
Age (Months) |
Average
Number of shifts per week |

0 | 78 | 8 |

0 | 73 | 8 |

0 | 73 | 5 |

0 | 71 | 7 |

0 | 68 | 5 |

0 | 59 | 4 |

0 | 57 | 7 |

0 | 49 | 5 |

0 | 35 | 4 |

0 | 527 | 7 |

1 | 59 | 3 |

1 | 57 | 4 |

1 | 44 | 5 |

1 | 38 | 5 |

1 | 36 | 4 |

1 | 36 | 2 |

1 | 22 | 6 |

1 | 22 | 5 |

1 | 15 | 4 |

1 | 10 | 6 |

** **

**Step 2**– Calculating **Logit Function**

We will have a logit function with explanatory variables as **Age** and **Average number of shifts**

** **L = b0 + b1*Age + b2*Average number of shifts

We will take arbitrary values for bo, b1 and b2 as 0.1

Now we will calculate the L (logit function) value for the data

Machine
meets Specification |
Machine
Age (Months) |
Average
Number of shifts per week |
Logit values
(L) |

0 | 78 | 8 | 8.7 |

0 | 73 | 8 | 8.2 |

0 | 73 | 5 | 7.9 |

0 | 71 | 7 | 7.9 |

0 | 68 | 5 | 7.4 |

0 | 59 | 4 | 6.4 |

0 | 57 | 7 | 6.5 |

0 | 49 | 5 | 5.5 |

0 | 35 | 4 | 4 |

0 | 527 | 7 | 3.5 |

1 | 59 | 3 | 6.3 |

1 | 57 | 4 | 6.2 |

1 | 44 | 5 | 5 |

1 | 38 | 5 | 4.4 |

1 | 36 | 4 | 4.1 |

1 | 36 | 2 | 3.9 |

1 | 22 | 6 | 2.9 |

1 | 22 | 5 | 2.8 |

1 | 15 | 4 | 2 |

1 | 10 | 6 | 1.7 |

** **

**Step 3**– Calculate P(x) for each data record

P(x) =e^{L} /(1 + e^{L})

Machine
meets Specification |
Machine
Age (Months) |
Average
Number of shifts per week |
Logit values
(L) |
e^{L} |
P(x) |

0 | 78 | 8 | 8.7 | 6002.912247 | 0.990833442 |

0 | 73 | 8 | 8.2 | 3630.950324 | 0.999725422 |

0 | 73 | 5 | 7.9 | 2697.28234 | 0.999629394 |

0 | 71 | 7 | 7.9 | 2697.28234 | 0.999629394 |

0 | 68 | 5 | 7.4 | 1635.984437 | 0.999389121 |

0 | 59 | 4 | 6.4 | 601.8450401 | 0.998341199 |

0 | 57 | 7 | 6.5 | 665.1416355 | 0.998498818 |

0 | 49 | 5 | 5.5 | 244.691933 | 0.995929862 |

0 | 35 | 4 | 4 | 54.59815016 | 0.98201379 |

0 | 527 | 7 | 3.5 | 33.11545202 | 0.970687769 |

1 | 59 | 3 | 6.3 | 544.5719121 | 0.998167061 |

1 | 57 | 4 | 6.2 | 492.7490428 | 0.99797468 |

1 | 44 | 5 | 5 | 148.4131595 | 0.993307149 |

1 | 38 | 5 | 4.4 | 81.45086887 | 0.987871565 |

1 | 36 | 4 | 4.1 | 60.34028774 | 0.983697501 |

1 | 36 | 2 | 3.9 | 49.40244921 | 0.980159694 |

1 | 22 | 6 | 2.9 | 18.1741454 | 0.947846437 |

1 | 22 | 5 | 2.8 | 16.4446468 | 0.942675824 |

1 | 15 | 4 | 2 | 7.389056107 | 0.880797078 |

1 | 10 | 6 | 1.7 | 5.473947397 | 0.845534735 |

**Step4- **Calculating **Log-Likelihood Function**

We will now calculate the values

-8.700166577 |

-8.200274621 |

-7.900370679 |

-7.900370679 |

-7.40061107 |

-6.401660182 |

-6.501502314 |

-5.504078446 |

-4.01814993 |

-3.52975042 |

-0.001834621 |

-0.002027374 |

-0.006715348 |

-0.012202585 |

-0.016436847 |

-0.020039767 |

-0.053562776 |

-0.059032826 |

-0.126928011 |

-0.167786029 |

**Step5- **Calculating **Maximum Log-Likelihood** Function

Earlier we assumed the value of the coefficients for our logit function. Now we will find those values of coefficients which maximize our log-likelihood function.

For this purpose we will use an Excel tool that is **Excel Solver**. Its purpose is to adjust the numeric value in a particular cell so that we could maximize or minimize the value in desired cells.

We will take columns 2, 3 and 4 for evaluation.

After running the Excel Solver we will get the following output. According to this new values of coefficient will be b0, b1 and b2.

bo | 12.48285608 |

b1 | -0.117031374 |

b2 | -1.469140055 |

Values of all log-likelihood will change according to the values of b0, b1, and b2.

The new value of maximum log-likelihood (summation of all log-likelihood) will be: – -6.654560484.

** ****Step6**– Evaluate values

For any values of age and average number of shifts you can predict the probabilities using the new values of coefficients.

In this case the probability of the desired output being produced by the Machine is only 8% which is very low. Now, co-relation is not causation so we cannot blame this on Age and weekly shifts alone. In a full blown Logistic regression model, we would have many more variables and accuracy.

Similarly, the other values show that the logit function is consistent with the given data. Hence we can predict any probability with this equation with the help of excel solver.

### Leave A Reply

You must be logged in to post a comment.

## 15 Comments

Could you let us know the formula for Step-4. Neither LN(P(x)) or LOG(P(x)) or LOG10(P(x)) gives the desired output.

Hot galleries, thousands new daily.

http://rickardsville.hidden.camera.porn.alypics.com/?alejandra

porn free video full length jodie bailey porn fee picture porn gauge mobile porn donna marie porn vids free

Nude Sex Pics, Sexy Naked Women, Hot Girls Porn

http://niggagirlsporn.instakink.com/?willow

sex on a plane porn watchable free porn videos free flexible lesbian porn porn industry demonic petite pantie porn movies

Free Porn Galleries – Hot Sex Pictures

http://hightsvilleporn.wet.energysexy.com/?mollie

doctor strokes porn erica ellyson porn 100 porn sites in 1 most beautiful new porn star good teen porn

Sexy teen photo galleries

http://gangbang.porn.lincoln.energysexy.com/?maggie

best selling porn films free porn preview downloads porn leecher lair free sleeping teenager streaming porn videos haley watson porn surf emma

Sexy photo galleries, daily updated pics

http://blonds.sexy.fetlifeblog.com/?jaclyn

full length amateur porn videos good porn movies free dirty cartoon porn series only free porn meterial porn theater movie

Hot teen pics

http://lafayette.latina.sex.hoterika.com/?alison

young boy porn video free sexy christmas porn vids travesty porn adult porn breasst milk kathy cox porn

Nude Sex Pics, Sexy Naked Women, Hot Girls Porn

http://sexyvideoebony.xblognetwork.com/?aliya

free uk milf porn graveey porn blonde snake women vagina porn free extica girl porn porn sex clips

New super hot photo galleries, daily updated collections

http://europorntorrentantreville.miyuhot.com/?angelique

viewer submitted animal porn porn with cats swedish vintage porn movies savanah deaf porn star huge gothic pussy porn pics

Young Heaven – Naked Teens & Young Porn Pictures

http://smsdating.instasexyblog.com/?jessica

free galleries of classic porn toon porn king of the hill teen porn fucked hard lilttle girl porn esurance agent porn

Free Porn Galleries – Hot Sex Pictures

http://cedarglenlakesfirsttimporn.moesexy.com/?ivy

porn comedy videos online free cast porn lana kane porn videos snow white fantasy porn strip uniform pouty lips porn

College Girls Porn Pics

http://drebonyandivory.miaxxx.com/?janelle

free black teens porn free fantasy creature porn free my sex teacher porn pov porn on iphone free american porn videos

Teen Girls Pussy Pics. Hot galleries

http://ashemale.alexysexy.com/?yvonne

jamaican lesbian porn list of search engine porn private young porn tapes freered haired porn guy parker porn

New super hot photo galleries, daily updated collections

http://crossville.massage.porn.amandahot.com/?amanda

free porn videos expoited college girls best porn movie ever made free gay mobile porn downloads anime comics porn 70s porn golden age

https://gidra-zerkalo8.site Гидра зеркало