I was curious to learn Artificial Intelligence and was wondering where best to start, and then realized that Deep Learning and Neural Networks are the heart of AI. So I started diving into AI from this point. Starting today, I will write a continuous series of blogs on AI, especially Gen AI & Agentic AI. If you are interested in these topics, please watch this space.
What is Artificial Intelligence, Machine Learning & Deep Learning ?
AI can be described as the effort to automate intellectual tasks normally performed by humans. Is this really possible ? For example, when we see an image with our eyes, we identify it within a fraction of a second. Isn't it ? Can a computer do the same within the same time limit ? That's the power we are talking about. To be honest, things are far more advanced than we usually think when it comes to AI.
BTW, starting from this blog, this is not just a technical journal; we talk about internals here. We will see the modules, the programming involved, the concepts and connected information to gain sound knowledge of AI and its related topics.
Machine Learning :
A Machine Learning system is trained rather than explicitly programmed. In classical programming, we define a set of rules in the program, and based on the data that comes in, the program executes and gives us some results. But it won't perform anything it wasn't programmed for, right ? We need to handle every possible scenario while programming, so that when a particular piece of data or activity hits that code, it executes based on the conditions we defined and does what it is intended to do. Correct ? But in Machine Learning this is not the situation! We feed a huge amount of data to train the ML model, and it learns what it needs to learn from that data. We also feed it some answers so that our ML model can come up with results. We will see more in-depth information about this shortly. Please see the diagram below.
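The contrast can be sketched in a few lines of Python (a toy illustration with made-up numbers, not a real ML library): the classical version has the rule written by hand, while the "trained" version recovers the rule from the data and answers we feed it.

```python
# Classical programming : the rule (multiply by 2) is written by hand.
def classical_double(x):
    return x * 2

# "Machine learning" (toy) : given data and answers, estimate the rule.
# Here we fit y = w * x by closed-form least squares on the examples we feed in.
def learn_weight(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

data = [1, 2, 3, 4]
answers = [2, 4, 6, 8]           # the "answers" fed alongside the data
w = learn_weight(data, answers)  # the model learns w = 2.0 from the data

print(classical_double(10))      # 20 : the rule was programmed
print(w * 10)                    # 20.0 : the rule was learned
```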
Circles in the hidden layers are artificial neurons, where the actual processing happens. A neuron transforms input data into output through a huge number of computations and calculations. Each hidden layer has multiple neurons, depending on the complexity of the data. The layers inject both linearity and non-linearity into the data, which we will see further in this blog.
Starting from the input layer, each neuron is connected to every neuron in the next layer. Likewise, every neuron in each layer receives input from every neuron in its previous layer.
Also note that every connection has a number associated with it, as shown in the diagram above. These numbers are called WEIGHTS. For example, the weight from input X(1) to the 1st neuron in the hidden layer is 0.2. Each neuron also has a number inside it, called the BIAS. Together, weights and biases are called the parameters of the neural network.
How many neurons should be present in each layer, and how many such layers should exist, are controlled by hyperparameters of the NN. These are tunable based on the situation.
When we say an NN is trained, it means the NN has reached a point where it has certain weights and biases, learned from the input data in each layer, that can finally transform the input into a meaningful output. Training involves adjusting these weights and biases based on the error signals.
Try to digest the sentences below carefully; read them with intensity until they stick in your mind (if multiple reads are needed, please do; you have to remember this for a lifetime):
- In the input layer, X(1), X(2), X(3) are the input parameters
- At the start, before the first hidden layer, we allocate random numbers to these neurons as weights & biases, and these are adjusted down the line in each iteration based on the errors. We ask the program to initialize these numbers randomly.
- I will provide a detailed example to give a clear understanding; it is a long process, so I will write it at the end of this blog or in a new blog.
- Let's say, for example, the predicted value is 100 but the actual value is 120, so the error is (100 - 120 = -20). There is an algorithm called backpropagation which adjusts the weights based on this error, and this repeats in the next iteration of processing. I know it is still not clear. Don't worry! Every single doubt will be clarified slowly. Just keep moving forward.
- So, all the numbers in the picture above are learned and adjusted accordingly
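A minimal sketch of that adjustment, assuming a single weight, a squared-error loss, and a made-up learning rate (real backpropagation does this across every weight and bias in every layer):

```python
# Toy sketch : one weight, one input, one gradient-descent update.
# Prediction : y_hat = w * x ; loss : squared error (y_hat - y)**2
x, y = 10.0, 120.0       # input and actual (target) value
w = 10.0                 # current weight, so the prediction is 100
lr = 0.002               # learning rate (step size), a made-up hyperparameter

y_hat = w * x            # 100.0, the predicted value
error = y_hat - y        # -20.0, the same gap as in the example above
grad = 2 * error * x     # d(loss)/dw = 2 * (y_hat - y) * x = -400.0
w = w - lr * grad        # the update moves w from 10.0 to 10.8

print(w * x)             # new prediction 108.0, closer to the target 120
```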
Neural Network Parameter calculation :
- The NN below has 5 input nodes, 3 hidden layers with 5 neurons in each layer, and an output layer.
- No. of parameters in a layer is equal to
- = (neurons in previous layer * neurons in current layer) + neurons in current layer
- = number of connections + biases
- = weights + biases
- Total number of parameters = Sum of parameters in each layer
- For example, the NN below has 69 parameters
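The formula above can be applied mechanically in code. The layer sizes here are an assumption read from the description (5 inputs, three hidden layers of 5 neurons, 1 output neuron); the exact total depends on the diagram, so treat the printed number as an illustration of the formula rather than the diagram's figure.

```python
def layer_params(prev, curr):
    # (neurons in previous layer * neurons in current layer) + biases
    return prev * curr + curr

def total_params(sizes):
    # Sum the per-layer counts over consecutive layer pairs.
    return sum(layer_params(p, c) for p, c in zip(sizes, sizes[1:]))

# Assumed sizes : 5 inputs, three hidden layers of 5, 1 output neuron
sizes = [5, 5, 5, 5, 1]
print(total_params(sizes))   # (5*5+5)*3 + (5*1+1) = 96
```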
Alright, let's see what a Tensor is.
Tensor :
- A Tensor is a data container, a mathematical object that represents data in N dimensions
- Tensors are used for data processing (we will see this in programming; I will add Colab notes with proper descriptions)
- It is a structured way to store numbers for computers to process
- Core properties of a Tensor are as below :
- Rank (Order) : Number of dimensions
- Shape : Tuple indicating size along each dimension (Matrix : 3 * 4 means 3 rows, 4 columns)
- Data Type : Type of elements (float32, int, etc.)
1) Scalars (Rank-0 Tensor) : A scalar is a single value, represented by a single real or complex number.
- Magnitude only - has size but no directional component
- Rank 0 tensor - In tensor algebra, a scalar is a rank-0 tensor
- Examples : batting average, number of goals scored etc.
2) Vectors (Rank-1 Tensor) : A vector is an ordered collection of numbers that represents both magnitude and direction in space.
- It is multi-valued - represented by an ordered list of real or complex numbers
- Has both magnitude and direction
- Coordinate dependent - Components change with coordinate system transformation
- Rank-1 tensor - In tensor algebra, vector is a rank-1 tensor
- Note : 2nd image below is a 3D vector
- Notation - Tuple/List : (x1, x2, x3 ....xn) OR [x1, x2, x3, ..xn]
3) Matrices (Rank-2 Tensor) : A matrix is a collection of vectors.
- It has both row-wise and column-wise organization
- Values change based on the choice of coordinate basis
- In tensor algebra, matrices are rank-2 tensors
Types of matrices : Just refresh your cache with below information, I know you know these :)
- Rectangular matrix
- Square matrix
- Diagonal matrix
- Identity matrix
- Upper triangular matrix
- Lower triangular matrix
4) Rank-3 Tensors :
- Cube-valued - represented by a 3-dimensional array of numbers
- Multi-directional structure - has depth, height and width
- Coordinate invariant - its values don't change with coordinate system transformations
5) Rank-4 Tensors : Image data
- A batch of 128 color images could be stored in a tensor of shape (128, 256, 256, 3)
- 128 images
- 256 * 256 pixels
- 3 is the color depth (remember R G B - the depth of red, green and blue in any image ?)
6) Rank-5 Tensors : Video data
- Video data is one of the few types of real-world data for which you will need a rank-5 tensor
- A 60 second, 144 * 256 YouTube video clip sampled at 4 frames per second would have 240 frames
- A batch of 4 such video clips would be stored in a tensor of shape (4, 240, 144, 256, 3)
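The ranks and shapes discussed above can be checked directly in NumPy, where `ndim` is the rank and `shape` is the size along each dimension (uint8 is used here only to keep the big example arrays light in memory):

```python
import numpy as np

scalar = np.array(5.0)                                    # rank-0 : magnitude only
vector = np.array([1.0, 2.0, 3.0])                        # rank-1 : an ordered list
matrix = np.zeros((3, 4))                                 # rank-2 : 3 rows, 4 columns
images = np.zeros((128, 256, 256, 3), dtype=np.uint8)     # rank-4 : batch of color images
videos = np.zeros((4, 240, 144, 256, 3), dtype=np.uint8)  # rank-5 : batch of video clips

for t in (scalar, vector, matrix, images, videos):
    print(t.ndim, t.shape)   # ndim is the rank, shape is the size per dimension
```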
- The diagram below shows what happens inside a neuron
- x1, x2, x3, ..., xn are the inputs; together they form a vector of 'n' values
- Each input has a weight on the neuron
- Sigma is the summation function : Sigma = x1*w1 + x2*w2 + x3*w3 + ... + xn*wn
- The bias is added to the output of sigma; the bias is another number
- The activation function (f) adds non-linearity to the neuron by applying a non-linear function such as a sigmoid or ReLU
- Each neuron outputs another number, and similar processing happens in each neuron across the hidden layers
- Final predicted output is : the summation of all inputs * weights, plus the bias, with the activation function applied on this result to add non-linearity
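The steps above can be sketched as one neuron in plain Python, assuming made-up inputs, weights, and bias, with a sigmoid as the activation:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum (the sigma step) : x1*w1 + x2*w2 + ... + xn*wn
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation adds the non-linearity ; here a sigmoid squashes z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

x = [1.0, 2.0, 3.0]      # inputs, a vector of n values (made-up numbers)
w = [0.2, -0.1, 0.4]     # one weight per input (made-up numbers)
b = 0.5                  # the bias, another number
out = neuron(x, w, b)    # a single number, this neuron's output
print(out)
```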
- A linear transformation scales : if a vector, say (1, 2) in 2-dimensional space, is multiplied by a constant, its magnitude increases (or shrinks) but it stays on the same line
- A non-linear transformation changes/bends the shape of the original linear mapping
- For example, consider the simple formula y = mx + b, where m is the slope and b is the intercept; for any m and b this is a straight line, i.e. a linear relationship between x and y
- Scaling m only tilts the line; it is still straight. Non-linearity appears only when a non-linear function is applied to the output, for example f(mx + b)
- Linear data :
- Here the relationship between the input and the output is linear
- Real time examples :
- House price vs. house size
- If we add more space to a house, the price also increases
- If we draw a line between size and price, it is a straight line. Isn't it ? This is called linearity, or linear data.
- Non linear data :
- Deep Learning is meant for non-linear data, not for linear data. Linear data we can handle using classical Machine Learning as well.
- Real time examples :
- Let us consider : the age, salary, discount, and brand purchase history of a customer in an online store like Amazon.
- You might think that all the above variables work independently, correct ?
- But do you think one parameter is dependent on another ?
- Now understand :
- Low Salary + High Discount ==> He might buy the product
- High Salary + Low Discount ==> He might buy the product
- Med Salary + Medium Discount ==> May or May not buy
- This is not a straight line, is it ? Sometimes salary is low, medium, or high, and similarly for all other params
- This is called non-linear data
- Vector operations
- Matrix operations
- U, V are 2 different vectors; when we add them, the corresponding values in each vector are added to create a new vector
- The bias is added to this new vector (the bias is also a vector, which we add to the final summation vector)
- Scalar multiplication - c * v = [c*v1, c*v2, ...]
- We are multiplying a scalar with a vector
- This is a linear transformation
- If we multiply by a negative scalar, say -2, then the vector traverses in the reverse direction because of the '-' sign
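Vector addition and scalar multiplication can be sketched in plain Python (toy numbers):

```python
def vec_add(u, v):
    # Add the corresponding components : [u1+v1, u2+v2, ...]
    return [a + b for a, b in zip(u, v)]

def scalar_mul(c, v):
    # Scale every component : c * v = [c*v1, c*v2, ...]
    return [c * x for x in v]

u, v = [1, 2, 3], [4, 5, 6]
print(vec_add(u, v))       # [5, 7, 9]
print(scalar_mul(-2, v))   # [-8, -10, -12] : doubled length, reversed direction
```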
- This is the most important operation that we want to learn
- We have 2 kinds of vector multiplications
- Dot Product
- Cross Product
- From Neural Networks perspective, we are interested in "Dot Product"
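A minimal dot product in plain Python; note that a neuron's weighted sum is exactly this operation applied to the weight and input vectors:

```python
def dot(u, v):
    # Sum of the element-wise products : u1*v1 + u2*v2 + ... + un*vn
    return sum(a * b for a, b in zip(u, v))

print(dot([1, 2, 3], [4, 5, 6]))   # 1*4 + 2*5 + 3*6 = 32
# A neuron's weighted sum is exactly this : dot(weights, inputs) + bias
```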
- Please see the images below for information on matrix transformations; it is easier to understand via images than via text
- Rank = No. of linearly independent rows or columns in a matrix
- Rank tells you how much unique information a matrix contains
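NumPy can compute the rank directly; here A's second row is a multiple of its first, so A carries only one row's worth of unique information (made-up matrices for illustration):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 4]])    # second row = 2 * first row : only 1 independent row
B = np.array([[1, 0],
              [0, 1]])    # the rows are independent

print(np.linalg.matrix_rank(A))   # 1
print(np.linalg.matrix_rank(B))   # 2
```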
- Reshaping a tensor means rearranging its rows and columns to match a target shape.
- Naturally, the reshaped tensor has the same total number of coefficients as the initial tensor.
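A quick reshape sketch in NumPy, showing that the total coefficient count is preserved:

```python
import numpy as np

t = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3), 6 coefficients
r = t.reshape(3, 2)                # same 6 coefficients, new arrangement
f = t.reshape(6)                   # flattened into a rank-1 tensor

print(r.shape, f.shape)            # (3, 2) (6,)
print(t.size == r.size == f.size)  # True : the coefficient count is preserved
```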
- Please see the images below for the limitations of linear transformations and the need for non-linear transformations
- ReLU is a famous non-linear transformation which we use in Deep Learning and Neural Networks
- Read the activation function formulas from the images below and memorize them
- Sigmoid
- Softmax
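Minimal sketches of the three activation functions named above (the standard formulas, in plain Python):

```python
import math

def relu(x):
    # ReLU : pass positive values through, clamp negatives to zero
    return max(0.0, x)

def sigmoid(x):
    # Sigmoid : squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    # Softmax : turns a list of scores into probabilities that sum to 1
    m = max(xs)                            # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

print(relu(-3.0), relu(2.5))             # 0.0 2.5
print(sigmoid(0.0))                      # 0.5
print(round(sum(softmax([1.0, 2.0, 3.0])), 6))   # 1.0, a probability distribution
```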