

Linear Regression is the basic algorithm with which most people start their learning in Data Science.

Now, what exactly is Linear Regression?

1.    Linear Regression is a supervised learning algorithm whose main aim is to find the line that best fits the given data.

2.    Here, 'fitting the best line for the given data' means finding the relation between the dependent and independent variables present in the data.

Note 1: You should use Linear Regression only when your dependent and independent variables have a linear relationship.

Note 2: The independent variables can be either discrete or continuous, but the dependent variable should be continuous.

Ok, let me explain with an example. Suppose we have a dataset with two columns, 'Years of Experience' and 'Salary'.

If we observe the data, as 'Years of Experience' increases, 'Salary' also increases. It means they have a linear relationship, so we can apply Linear Regression here.
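The example above can be sketched in a few lines of Python. The salary figures below are made up purely for illustration; `np.polyfit` with `deg=1` performs an ordinary least-squares straight-line fit.

```python
import numpy as np

# Hypothetical example data: years of experience vs. salary (values are made up)
experience = np.array([1, 2, 3, 4, 5, 6], dtype=float)
salary = np.array([40000, 45000, 52000, 58000, 63000, 70000], dtype=float)

# Fit a straight line salary ≈ m * experience + b using least squares
m, b = np.polyfit(experience, salary, deg=1)
print(f"slope m = {m:.1f}, intercept b = {b:.1f}")

# Use the fitted line to predict the salary at 7 years of experience
print(f"predicted salary at 7 years: {m * 7 + b:.1f}")
```

Notice that the fitted slope tells us roughly how much salary grows per additional year of experience under this toy data.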

Ok, we observed the linear relation. Now, how can we find the best fit line?

We know that 'Salary' (S) is directly proportional to 'Years of Experience' (Y), which means we can write it as follows:

      S = m * Y

Now we add a bias term 'b' so that the line can fit the data more accurately:

      S = m * Y + b

Ok, we got the line equation, what now?

Yes, we got a line equation, but for the same data we can draw many such lines, because 'm' and 'b' can take any values.



Now comes the ultimate part: finding 'm' and 'b' such that the resulting line best fits the data, outperforming all other candidate lines with the least error.


What exactly does 'error' mean, and how do we quantify it in Linear Regression? (Quantifying the error is nothing but the cost function; we can also call it the loss function or error function.)

·      To find the best fit line, we need to decrease the error between the original values and the predicted values.

·      While computing the errors we get both positive and negative values. To quantify both together (squaring prevents them from cancelling out), we use a cost function known as the Root Mean Squared Error (RMSE).
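Concretely, RMSE = sqrt((1/n) * Σ(yᵢ − ŷᵢ)²), where yᵢ is the original value and ŷᵢ is the predicted value. A minimal sketch (the toy numbers are made up):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: sqrt of the average squared difference."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Toy check: predictions off by +2 and -2 give an RMSE of 2, not 0 —
# squaring keeps positive and negative errors from cancelling out.
print(rmse([10, 20], [12, 18]))  # → 2.0
```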




Now, our main aim is to minimize the cost function. How do we do it? 

There come optimizers like Gradient Descent, Stochastic Gradient Descent, Adagrad, Adam, etc.

So, with the help of an optimizer we update 'm' and 'b' until we reach their optimal values, and thereby get the best fit line. With the help of that line we can predict future data.

Gradient Descent is the most basic optimizer, so let's discuss it.

We use Gradient Descent to minimize the cost function iteratively by updating the parameters 'm' and 'b'.

Step 1: Initialize 'm' and 'b' randomly.

Step 2: Update 'm' in the opposite direction of the gradient of the cost function J:

      m_new = m_old − α * (∂J/∂m)

where α is the learning rate. Similarly, we update 'b' accordingly:

      b_new = b_old − α * (∂J/∂b)

We need to repeat this iteratively until:

1.          m_new ≈ m_old

2.          b_new ≈ b_old

Thus we get the optimal parameters and thereby the best-fitted line, which we can then use for prediction.
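The two steps above can be sketched as a small Gradient Descent loop minimizing the MSE for a line y ≈ m*x + b. The learning rate, iteration count, and stopping tolerance below are arbitrary illustrative choices, and the data is synthetic:

```python
import numpy as np

def gradient_descent(x, y, lr=0.01, n_iters=10000, tol=1e-9):
    """Fit y ≈ m*x + b by iteratively stepping down the MSE gradient."""
    m, b = 0.0, 0.0           # Step 1: initialize (zeros here; random also works)
    n = len(x)
    for _ in range(n_iters):
        error = (m * x + b) - y
        # Partial derivatives of MSE = (1/n) * sum(error**2)
        dm = (2.0 / n) * np.dot(error, x)
        db = (2.0 / n) * np.sum(error)
        m_new = m - lr * dm   # Step 2: move against the gradient
        b_new = b - lr * db
        # Stop once m_new ≈ m_old and b_new ≈ b_old
        if abs(m_new - m) < tol and abs(b_new - b) < tol:
            m, b = m_new, b_new
            break
        m, b = m_new, b_new
    return m, b

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 3.0 * x + 2.0             # data generated from a known line
m, b = gradient_descent(x, y)
print(m, b)                   # should approach m = 3, b = 2
```

In practice, libraries such as scikit-learn handle this optimization for you, but writing the loop once makes the update rule concrete.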


Linear Regression makes a few assumptions about the data:

1. The dependent and independent variables are linearly related.

2. The errors are Gaussian (normally) distributed.

3. The features are not multicollinear.

4. The variance of the residuals is the same for any value of X (homoscedasticity).
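A quick way to eyeball assumptions 2 and 4 is to inspect the residuals of a fitted line. This sketch uses synthetic data (a known line plus Gaussian noise, seeded for reproducibility), so the assumptions hold by construction:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=x.size)  # linear signal + Gaussian noise

m, b = np.polyfit(x, y, deg=1)
residuals = y - (m * x + b)

# Assumption 2: residuals should be roughly Gaussian and centered near zero
print("mean residual:", residuals.mean())       # least squares forces this near 0
# Assumption 4 (homoscedasticity): spread should be similar across the range of x
print("std (first half): ", residuals[:50].std())
print("std (second half):", residuals[50:].std())
```

On real data you would plot the residuals against x: a funnel shape signals heteroscedasticity, and a curved pattern signals a non-linear relationship.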


Some real-world applications of Linear Regression:

1.    Weather forecasting

2.    Predicting house prices

3.    Predicting stock prices in the stock market

