Hello and welcome back. So far we have developed a simple regression model and solved it using linear algebra. In this article, I am going to explore an alternative approach called Gradient Descent. It is an iterative method that optimizes a cost function by searching for a local or global minimum, taking steps proportional to the negative of the gradient of the cost function. To apply the Gradient Descent algorithm, we first need to define a cost function for our machine learning problem.
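Before defining that cost function, it may help to see the bare iteration on a toy problem. The sketch below is only an illustration, not the regression solver itself: the function name `gradient_descent`, the toy cost f(x) = (x - 3)^2, the learning rate of 0.1 and the fixed number of steps are all assumptions chosen for this example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(n_steps):
        # Step size is proportional to the gradient; the minus sign
        # moves x downhill, towards a minimum of the cost.
        x = x - learning_rate * grad(x)
    return x


def toy_grad(x):
    # Gradient of the assumed toy cost f(x) = (x - 3)^2.
    return 2.0 * (x - 3.0)


# The toy cost has its minimum at x = 3, so the result should be close to 3.
x_min = gradient_descent(toy_grad, x0=0.0)
print(x_min)
```

With this learning rate, each step shrinks the distance to the minimum by a constant factor, so the iterate settles very close to 3. A learning rate that is too large would make the steps overshoot and diverge instead.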
We have a set of features for which the actual output is given, and we want to develop a model that predicts the output for a given set of features. We can represent the feature vectors of the complete training set as **matrix X**, the corresponding output values as **vector y**, and the predicted values as vector **\hat{y}**. So our prediction model is **\hat{y} = X*a**, where **a** is the vector of model coefficients, and the error vector for the training set is **err = \hat{y} - y**. We would like to choose a cost function such that the overall error of t…