Posts

(AI Blog#4): Normalization and Optimizers in a Neural Network

We are going to discuss Normalization and Optimizers, techniques used across AI; we will use them in LLMs, Agentic AI frameworks, and more. As we discussed in our previous blogs (links to the previous blogs are provided at the end of this blog), when the gradients become so small during training that the updated weights are almost identical to the previous weights, we run into a problem called Vanishing Gradient. Similarly, when the gradients grow so large that the new weights blow up far past the previous ones, we run into the Exploding Gradient problem. Using Normalization, we are going to prevent both the Vanishing and Exploding Gradient problems.

This blog's agenda:
- Normalization
  - Batch Normalization (useful in Neural Networks)
  - Layer Normalization (useful in LLMs)
- Weight Initialization
  - Xavier
  - He
- EWMA (Exponentially Weighted Moving Average): we will use it in Optimizers like Momentum, NAG, Adagrad, RMSProp, Adam

Normalization: Normalization in Neural Networks and De...
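As a rough sketch of the EWMA update that the optimizers listed above (Momentum, RMSProp, Adam) build on, here is a minimal Python version; the `beta` value of 0.9 is an illustrative default, not something specified in the post:

```python
def ewma(values, beta=0.9):
    """Exponentially Weighted Moving Average.

    Applies the recurrence v_t = beta * v_{t-1} + (1 - beta) * x_t,
    which smooths a noisy sequence (e.g. gradients in Momentum/Adam).
    """
    v = 0.0          # running average, initialized to zero
    smoothed = []
    for x in values:
        v = beta * v + (1 - beta) * x
        smoothed.append(v)
    return smoothed


# Example: smoothing a constant sequence shows the average
# warming up toward the true value from its zero start.
print(ewma([1.0, 1.0, 1.0], beta=0.5))  # → [0.5, 0.75, 0.875]
```

Note the "warm-up bias" visible in the output: because `v` starts at zero, early averages underestimate the signal, which is why Adam applies a bias correction term.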
Recent posts

(AI Blog#3) Deep Learning Foundations - Activation & Loss Functions, Gradient Descent algorithms & Optimization techniques

It is extremely important to have deep knowledge while designing a machine learning model; otherwise we will end up creating ML models that are of no use. We must have a clear understanding of certain techniques to confidently build an ML model, train it on training data, finalize the model, and deploy it to production. So far, in blogs #1 and #2, we have covered the fundamentals of Deep Learning and Neural Networks, the architecture of a Neural Network, its internal layers and components, etc.

Links to Blogs #1 and #2 are below for quick reference:
- Deep Learning & Neural Networks: https://arunsdatasphere.blogspot.com/2026/01/deep-learning-and-neural-networks.html
- Building a real-world neural network: a practical use case explained: https://arunsdatasphere.blogspot.com/2026/01/building-real-world-neural-network.html

Now let's dive into the concepts below to help you gain confidence in building your ML model: Activation Functions (Forward Propaga...
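To make the topics in this excerpt concrete, here is a minimal sketch of two common activation functions and a single gradient-descent step on a one-parameter squared-error loss; the toy loss `(w*x - y)^2` and the learning rate are illustrative assumptions, not taken from the post:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real input into (0, 1); used in forward propagation.
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Passes positives through, zeroes out negatives.
    return np.maximum(0.0, z)

def gd_step(w, x, y, lr=0.1):
    """One gradient-descent update for the toy loss L(w) = (w*x - y)^2.

    dL/dw = 2 * (w*x - y) * x, and the weight moves opposite the gradient.
    """
    grad = 2.0 * (w * x - y) * x
    return w - lr * grad


print(sigmoid(0.0))                    # → 0.5
print(relu(-3.0), relu(2.0))           # → 0.0 2.0
print(gd_step(0.0, 1.0, 1.0, lr=0.1))  # → 0.2 (w moves toward the target)
```

Repeating `gd_step` drives `w` toward `y/x`, which is the essence of the training loop the blog series builds up to.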