
Gradient Descent in Dynamo


Hello and welcome back. So far we have developed a simple regression model and solved it using linear algebra. In this article, I am going to explore an alternative approach called Gradient Descent. It is an iterative method that optimizes a cost function by moving toward a local (or global) minimum, taking steps proportional to the negative of the gradient of the cost function at the current point. To apply the gradient descent algorithm, we first need to define the cost function for our machine learning problem.
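In symbols, if J(a) is the cost function and \alpha is the learning rate (a small positive constant), each iteration nudges the coefficient vector a little further down the slope:

a := a - \alpha \nabla J(a)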
We have a set of features for which the actual output is given, and we want to develop a model that predicts the output for a given set of features. We can represent the feature vectors of the complete training set as a matrix X, the corresponding output values as a vector y, and the predicted values as a vector \hat{y}. Our prediction model is \hat{y} = X*a, and the error vector for the training set is err = \hat{y} - y. We would like a cost function whose minimization reduces the overall error of the model, so we choose the mean squared error. The dot product of the error vector with itself gives the sum of squared errors, and dividing by the number of training examples gives the mean squared error. Here is the Dynamo graph to compute the mean squared error for a given training set.
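For readers who prefer code to node diagrams, here is a minimal Python/NumPy sketch of the same computation the graph performs; the function name compute_mse and the use of NumPy are my own choices for illustration, not part of the Dynamo graph.

```python
import numpy as np

def compute_mse(X, y, a):
    """Mean squared error of the linear model y_hat = X @ a."""
    y_hat = X @ a               # predicted values
    err = y_hat - y             # error vector
    return err @ err / len(y)   # dot product of err with itself, over m examples
```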


Since the cost function we have chosen is quadratic in the coefficients (a sum of squared errors), it has a single global minimum, and we can find it by moving along the negative gradient. The gradient of the cost function is obtained by taking the partial derivatives of the cost function with respect to the coefficient vector, which simplifies to G = \frac{1}{m} X^{T} (\hat{y} - y), where m is the number of training examples. Using this gradient and a learning rate \alpha (a numeric constant), we can update the coefficient vector 'a' as a := a - \alpha G = a - \frac{\alpha}{m} X^{T} (\hat{y} - y). The Dynamo graph to update any given coefficient vector 'a' is shown below. Note the extra iterator input node added when we wrap the graph into a custom node, LinearRegressionGradientDescent. This iterator is used to run iterations in Dynamo with the List.Reduce node.
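A sketch of what the LinearRegressionGradientDescent custom node computes, again in Python/NumPy for illustration; the function name and the default alpha are assumptions, not taken from the graph.

```python
import numpy as np

def gradient_descent_step(a, X, y, alpha=0.01):
    """One gradient descent update of the coefficient vector 'a'.

    Computes G = (1/m) * X^T (y_hat - y) and returns a - alpha * G.
    """
    m = len(y)
    y_hat = X @ a                  # current predictions
    grad = X.T @ (y_hat - y) / m   # gradient of the mean squared error
    return a - alpha * grad
```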


We now assemble these pieces to evaluate our regression model. We initialize the coefficient vector with ones and create a list that repeats an item once per iteration we want to run. This list is then fed to the List.Reduce node, with the initial/seed coefficient vector and the custom node as the reducer function. This is an interesting way to run an imperative loop and update a value using nodes in Dynamo. We could have used an imperative code block and run a loop to achieve the same result, but I prefer to use map and reduce, concepts from the functional programming paradigm, in Dynamo. For a better result and faster convergence we should also normalize our feature vectors. The complete Dynamo graph is shown below. As before, we can use test and validation sets to find the optimum \alpha for our dataset.
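The same assembly, sketched in Python: functools.reduce plays the role of the List.Reduce node, the seed is the ones-initialized coefficient vector, and the repeated list merely drives the iteration count. The sample data, iteration count, and alpha below are hypothetical, chosen only to make the sketch runnable.

```python
import numpy as np
from functools import reduce

# Hypothetical training data: an intercept column of ones plus one feature.
X = np.array([[1.0, 0.5],
              [1.0, 1.5],
              [1.0, 2.0],
              [1.0, 3.5]])
y = np.array([1.1, 2.9, 4.2, 7.1])

# Normalize the feature columns (leaving the intercept column untouched)
# for faster convergence.
X[:, 1:] = (X[:, 1:] - X[:, 1:].mean(axis=0)) / X[:, 1:].std(axis=0)

alpha = 0.1
a0 = np.ones(X.shape[1])   # coefficient vector initialized with ones

# One update step, as in the custom node above.
def step(a, _):
    return a - alpha * X.T @ (X @ a - y) / len(y)

# reduce mirrors List.Reduce: 'step' is the reducer, 'a0' is the seed,
# and range(1000) is the list of repeated items that sets the iteration count.
a = reduce(step, range(1000), a0)
print("fitted coefficients:", a)
```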


Conclusion: In this article, I have shown how to define the cost function and implement gradient descent in Dynamo. In fact, gradient descent can be used to optimize any other cost function for which we know how to compute the gradient. I have also used the List.Reduce higher-order function node to run the iterations. I hope you liked this post; please do send your feedback and comments.
