How AlmaBetter created an
IMPACT!Shubham Dubey
Trainee at AlmaBetter at almaBetter
The professor and his team are planning the heist in the RBI bank because they came to know that ..
Disclaimer: This article is written with humorous intent. The Reserve Bank of India is admired globally and is the epitome of excellence in monetary policy and economic decision-making. Indian Police officers, in fact all UPSC personnel, are crème de la crème meritocrats. We do not condone any heinous activities.
The professor and his team are planning the heist in the RBI bank because they came to know that India is developing at an incredible rate. The professor is explaining how to minimize the risk of getting caught, by the Indian police, using the Gradient Descent Algorithm.
Tokyo: What is this Gradient Descent Algorithm?
Professor: Gradient Descent Algorithm is used to minimize a function by optimizing its parameter.
Raquel: Which function does it minimize?
Professor: It minimizes the cost function.
Denver: What is the cost function, and what is the need to minimize it?
Professor: The cost function represents the error between the actual value and the predicted value. Put another way: by minimizing the cost function, we are minimizing the error between the actual value and the predicted value.
Rio: How will this FUNCTION help us not to get caught?
Professor: By minimizing the function of the risk of getting caught. The Gradient Descent Algorithm is used to minimize the given function. In our case, it is a risk function. It does so by first calculating the slope of a function by the first-order derivative of the function and, second, it moves in the opposite direction to the gradient by alpha times the gradient at that point. (Professor took the chalk and wrote this on the blackboard)
New Value = Old Value - Step Size
Professor: In the world of mathematics, the ‘Step Size’ will be equal to the product of Learning Rate and Slope. So the above equation can be written like this.
Tokyo: Professor! In a simple language, please.
Professor: (Professor looked at Tokyo and slowly nodded and took the black color remote of the projector and presented this slide) In simple language, suppose this is a function that we want to minimize. And we want to find the value of X when Y is minimum. For this, we will be using Gradient Descent Algorithm.
Denver ((hastily answered): zero, zero X value should be zero for Y to be minimum.
Raquel: Why use Gradient Descent when we can clearly see the minimum value?
Professor: In this case, we are using a single variable. So the minimum value of a function is apparent. But what if there are multiple variables in the function? Say, 10 or 50? Then, how would you inspect the minimum value of that function? And by the minimum, I mean “local minima”, not global minima; because this algorithm finds the local minima of the function. For this very reason, we are using the gradient descent algorithm. Here we will start with some approximate values, and then we will be moving towards local minima. Let’s see this diagram.
Tokyo: Hold on, Professor. Do you want to risk our life by taking some approximate values?
Rio: Tokyo… calm down Tokyo.
Raquel: I am still not satisfied with this algorithm. Ah…Just show me what the graph of risk looks like after we enter the bank.
Rio: Yeah and also the risk function that we have to minimize. We’re supposed to minimize this function with respect to what? I am just asking what is X in our risk function?
Professor(calmly stares at Rio and then Raquel for a few seconds and then again to Rio and says): Time…We have to minimize the risk function with respect to time. Here’s how the graph of risk versus time looks like (professor took the remote and changed the slide) We need to escape from the bank at point A because after that it will become almost impossible to escape.
Denver: Okay. I want to ask why we are focusing so much on the Gradient Descent Algorithm. Can you please give some of the benefits of this algorithm?
Professor: Sure. There are around 4 major benefits of using this algorithm.
Fast enough to scale on big data.
The exact method will take more time. And we will never do it exactly, so we have to go with approximation.
The loss function gives the direction of the Optimal Solution.
Easy to understand.
See this graph of what this algorithm does. (Professor took the remote and showed this slide to everyone)
Professor: It is relatively very easy to understand.
Tokyo: Easy! What do you mean that I am a fool and everyone here is a genius? Come on, Professor. We will just keep on calculating these so-called minimum values of a function, and the Indian police will get us and maybe even shoot us. The lecture is over, and I am done with this ALGORITHM (Tokyo angrily left the class).
Meanwhile, the Indian Government came to know about this plan from their intelligence agency, and they have decided to contact AlmaBetter for a better understanding of this algorithm.
Related Articles
Top Tutorials
Python
Python is a popular and versatile programming language used for a wide variety of tasks, including web development, data analysis, artificial intelligence, and more.
Javascript
JavaScript Fundamentals is a beginner-level course that covers the basics of the JavaScript programming language. It covers topics on basic syntax, variables, data types and various operators in JavaScript. It also includes quiz challenges to test your skills.
SQL
The SQL for Beginners Tutorial is a concise and easy-to-follow guide designed for individuals new to Structured Query Language (SQL). It covers the fundamentals of SQL, a powerful programming language used for managing relational databases. The tutorial introduces key concepts such as creating, retrieving, updating, and deleting data in a database using SQL queries.