
(AI Blog#1) Deep Learning and Neural Networks

I was curious to learn Artificial Intelligence and wondered what the best place to start would be, and then realized that Deep Learning and Neural Networks are the heart of AI. Hence I started diving into AI from this point. Starting from today, I will write a continuous series of blogs on AI, especially Gen AI & Agentic AI. In case you are interested in these topics, please watch this space.


What are Artificial Intelligence, Machine Learning & Deep Learning?

AI can be described as the effort to automate intellectual tasks normally performed by humans. Is this really possible? For example, when we see an image with our eyes, we identify it within a fraction of a second. Isn't it? For a computer, is it possible to do the same within the same time limit? That's the power we are talking about. To be honest, things are far more advanced than we usually think when it comes to AI.

By the way, starting from this blog, this is not just a technical journal; we talk about internals here. We will see the modules, the programming involved, the concepts, and connected information to gain sound knowledge of AI and its related topics.

Machine Learning : 

A Machine Learning system is trained rather than explicitly programmed. In classical programming, we define a set of rules in the program, and based on the data that comes in, the program executes and gives us some result. But it won't perform anything it wasn't programmed to do, right? We need to handle each and every possible scenario while programming, so that when particular data or activity hits that piece of code, it executes based on the conditions we defined and performs what it was intended to perform. Correct? But in Machine Learning this is not the situation! We feed a huge amount of data to train the ML model, and it learns what it needs to learn from that data. We also feed it some answers, so that our ML model can come up with results of its own. We will see more in-depth information about this later. Please see the diagram below.

Deep Learning :

                  Deep Learning is a subset of Machine Learning that uses neural networks with many layers (deep networks) to learn patterns from large amounts of data automatically.

Deep Learning = Machine Learning using multi-layered Neural Networks.

So, how many layers does it use? It depends on the complexity of the input data: the more complex the data, the more layers are needed to analyze the patterns. We will see more technical detail on this aspect later.

The diagram below shows a 4-layered Deep Learning Neural Network (NN), which analyzes a given image and comes up with an output after passing it through all the NN layers.

Think of this deep network as a multistage information-distillation process.



Structure of Neural Network :

            The diagram below shows the structure of a simple Neural Network with a single hidden layer. Every NN has 3 types of layers.

1. Input Layer (which provides the input)
2. Hidden Layer(Single or multiple based on the complexity of use case)
3. Output Layer 

The circles in the hidden layers are artificial neurons, where the actual processing happens. They transform the input data into output after a huge number of computations and calculations. Each hidden layer has multiple neurons, based on the complexity of the data. The network applies both linearity and non-linearity to the data, which we will see further on in this blog.

Starting from the input layer, each neuron is connected to every neuron in the next layer. Likewise, every neuron in each layer receives input from every neuron in its previous layer.

Also note that every connection has a number associated with it, as shown in the diagram above. These numbers are called WEIGHTS. For example, the weight from input X(1) to the 1st neuron in the hidden layer is 0.2. Each neuron also has a number inside it, called the BIAS. Together, weights and biases are called the parameters of the Neural Network.

How many neurons should be present in each layer, and how many such layers should exist, are controlled by the hyperparameters of the NN. These are tunable based on the situation.

When we say an NN is trained, it means the NN has reached a point where the weights and biases in each layer, learned from the input data, can finally transform an input into a meaningful output. Training involves adjusting these weights and biases based on error signals.

Try to digest the sentences below carefully; read them with intensity so they stick (if multiple reads are needed, please do; you should remember this for a lifetime):

  • In the input layer, X(1), X(2), X(3) are the input parameters
  • At the start, before the first hidden layer, we allocate some random numbers to these neurons as weights & biases, and these are adjusted along the way in each iteration based on the errors. We ask the program to initialize these numbers randomly.
  • I will provide a detailed example to give a clear understanding; it is a long process, so I will write it at the end of this blog or in a new blog.
  • Let's say, for example, the predicted value is 100 but the actual value is 120, so the error is (100 - 120 = -20). There is an algorithm called backpropagation which adjusts the weights based on this loss, and this repeats through the next rounds of processing. I know it is still not clear; don't worry! Every single doubt will be clarified slowly. Just keep moving.
  • So, all the numbers in the picture above are learned and adjusted accordingly
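The error-driven adjustment described above can be sketched as a toy loop on a single weight. This is a minimal illustration of the idea (gradient descent on one parameter), not the full backpropagation algorithm; the input value, target, and learning rate below are made-up illustration numbers.

```python
# Toy sketch: start from a random weight, measure the error,
# and nudge the weight to reduce it. Not full backpropagation.
import random

random.seed(0)
w = random.uniform(-1, 1)        # random initial weight, as described above
x, target = 5.0, 120.0           # one made-up training example

lr = 0.01                        # learning rate (a hyperparameter)
for step in range(200):
    predicted = w * x            # forward pass
    error = predicted - target   # e.g. 100 - 120 = -20
    w -= lr * error * x          # adjust the weight against the error

print(round(w * x, 2))           # prediction is now very close to 120
```

After enough iterations the weight settles at a value whose prediction matches the target; a real network does the same thing simultaneously for every weight and bias.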


Neural Network Parameter calculation :

  • The NN below has 5 input nodes, 3 hidden layers with 5 neurons in each layer, and an output layer.
  • No. of parameters in a layer is equal to
    • = (neurons in previous layer * neurons in current layer) + neurons in current layer
    • = number of connections + biases
    • = weights + biases
  • Total number of parameters = Sum of parameters in each layer
  • For example, the NN below has 69 parameters
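The counting rule above is easy to turn into a few lines of Python. Note: the diagram isn't reproduced here, so the layer sizes below ([5, 4, 4, 4, 1], i.e. 5 inputs, three hidden layers of 4 neurons, 1 output) are an assumption, chosen as one configuration that yields the stated 69 parameters.

```python
# Sum (weights + biases) over consecutive layer pairs,
# exactly as in the formula above.
def count_parameters(layer_sizes):
    total = 0
    for prev, curr in zip(layer_sizes, layer_sizes[1:]):
        total += prev * curr + curr   # weights + biases for this layer
    return total

# Assumed sizes; one configuration that gives the 69 mentioned above.
print(count_parameters([5, 4, 4, 4, 1]))  # → 69
```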



Alright, let's see what a Tensor is.


Tensor : 

  • A Tensor is a data container, a mathematical object that represents data in N dimensions
  • Tensors are used for data processing (we will see this in programming; I will add Colab notes with proper descriptions)
  • It is a structured way to store numbers for computers to process
  • The core properties of a Tensor are as below :
    • Rank (Order) : Number of dimensions
    • Shape : Tuple indicating the size along each dimension (e.g. a 3 * 4 matrix has 3 rows, 4 columns)
    • Data Type : Type of the elements (float32, int, etc.)
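The three core properties above can be inspected directly in code. Here is a small NumPy sketch (PyTorch tensors expose the same information through `.ndim`, `.shape`, and `.dtype`):

```python
import numpy as np

m = np.zeros((3, 4), dtype=np.float32)  # a 3 x 4 matrix of zeros

print(m.ndim)   # rank (order): 2
print(m.shape)  # shape: (3, 4) - 3 rows, 4 columns
print(m.dtype)  # data type: float32
```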

Now let's understand the types of tensors available.

1) Scalars (Rank-0 Tensor) : 
  • It is single-valued, represented by a single real or complex number.
  • Magnitude only - has size but no directional component
  • Rank-0 tensor - in tensor algebra, a scalar is a rank-0 tensor
  • Examples : a batting average, the number of goals scored, etc.

2) Vectors (Rank-1 Tensor) : A vector is an ordered collection of numbers that represents both magnitude and direction in space.

  • It is multi-valued - represented by an ordered list of real or complex numbers
  • Has both magnitude and direction
  • Coordinate dependent - components change with coordinate system transformations
  • Rank-1 tensor - in tensor algebra, a vector is a rank-1 tensor
  • Note : the 2nd image below is a 3D vector
  • Notation - Tuple/List : (x1, x2, x3, ..., xn) OR [x1, x2, x3, ..., xn]


3) Matrices (Rank-2 Tensor) : Matrices are rank-2 tensors represented by a 2-dimensional array of numbers

  • A collection of vectors
  • It has both row-wise and column-wise organization
  • Its values change based on the choice of coordinate basis
  • In tensor algebra, matrices are rank-2 tensors


Types of matrices : Just refresh your cache with the list below, I know you know these :)

  • Rectangular matrix
  • Square matrix
  • Diagonal matrix
  • Identity matrix
  • Upper triangular matrix
  • Lower triangular matrix

4) Rank-3 Tensor : 
  • Cube-valued - represented by a 3-dimensional array of numbers
  • Multi-directional structure - has depth, height and width
  • Like vectors and matrices, its components depend on the choice of coordinate basis, though the underlying object itself is coordinate-independent

5) Rank-4 Tensor : 
  • Typically image data
  • A batch of 128 color images could be stored in a tensor of shape (128, 256, 256, 3)
    • 128 images
    • 256 * 256 pixels
    • 3 is the color depth (remember RGB, the red, green, and blue channels of an image?)

6) Rank-5 Tensor : 
  • Typically video data
  • Video data is one of the few types of real-world data for which you will need a rank-5 tensor
  • A 60-second, 144 * 256 YouTube video clip sampled at 4 frames per second would have 240 frames
  • A batch of 4 such video clips could be stored in a tensor of shape (4, 240, 144, 256, 3)
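All the ranks discussed above can be built and inspected with NumPy; the rank-4 and rank-5 shapes mirror the image and video examples from the text:

```python
import numpy as np

scalar = np.array(3.0)                              # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])                  # rank 1: ordered list
matrix = np.zeros((3, 4))                           # rank 2: rows x columns
cube   = np.zeros((2, 3, 4))                        # rank 3: depth x height x width
images = np.zeros((128, 256, 256, 3), np.uint8)     # rank 4: batch of color images
videos = np.zeros((4, 240, 144, 256, 3), np.uint8)  # rank 5: batch of video clips

for t in (scalar, vector, matrix, cube, images, videos):
    print(t.ndim, t.shape)
```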




Tensor recap :



Processing inside a Neuron :
  • The diagram below shows what happens inside a neuron
  • x1, x2, x3, ..., xn are the inputs; together they form a vector of 'n' values
  • Each input has a weight on its connection to the neuron
  • Sigma is the summation function: Sigma = x1*w1 + x2*w2 + x3*w3 + ... + xn*wn
  • The bias, which is just another number, is added to the output of Sigma
  • The activation function (f) adds non-linearity to the neuron, for example by applying a sigmoid or ReLU function
  • Each neuron outputs another number, and similar processing happens in each neuron across the hidden layers
  • The final predicted output is: the activation function applied to (the summation of all inputs * weights, plus the bias), which adds non-linearity
    • A linear transformation only scales: if a vector in 2-dimensional space, say (1, 2), is multiplied by a constant, its magnitude increases but it stays on the same line
    • A non-linear transformation changes or bends that straight-line relationship
      • For example, consider the simple formula y = mx + b, which draws a straight line in 2D space
      • Passing that output through a non-linear function (squaring it, or applying a sigmoid) curves the line, and this is what we mean by adding non-linearity
Note : For now, just try to understand what non-linearity is from the example above. That is good enough.
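The weighted-sum, bias, and activation pipeline described above can be sketched for one neuron with 3 inputs. Sigmoid is used here as the example activation; the input, weight, and bias values are arbitrary illustration numbers:

```python
import numpy as np

def sigmoid(z):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0])    # inputs x1, x2, x3 (a rank-1 tensor)
w = np.array([0.2, -0.4, 0.1])   # one weight per input connection
b = 0.5                          # the bias, just another number

z = np.dot(x, w) + b             # Sigma = x1*w1 + x2*w2 + x3*w3, plus bias
output = sigmoid(z)              # activation adds the non-linearity last

print(round(float(output), 4))
```

Every neuron in every hidden layer repeats exactly this computation with its own weights and bias.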



More examples to get some idea of non-linearity :
  • Linear data : 
    • Here the relationship between the input and the output is linear
    • Real-world examples : 
      • House Price vs House Size
      • If we add more space to a house, the price also increases
      • If we draw a line between Size & Price, it is a straight line. Isn't it? This is called linearity, or linear data.
  • Non-linear data :
    • Deep Learning is meant for non-linear data, not for linear data. Linear data can be handled with classical Machine Learning as well.
    • Real-world examples :
      • Let us consider the Age, Salary, Discount, and brand purchase history of a customer in an online store like Amazon.
      • You might think all the above variables work independently, correct?
      • But do you think one parameter is dependent on another?
      • Now understand 
        • Low Salary + High Discount ==> He might buy the product
        • High Salary + Low Discount ==> He might buy the product
        • Medium Salary + Medium Discount ==> May or may not buy
        • This is not a straight-line relationship, is it? Sometimes Salary is low, medium, or high, and similarly for all the other params
        • This is called non-linear data

Tensor Operations : The gears of neural networks
  • Vector operations
  • Matrix operations

Vector Operations (Addition):
  • U and V are 2 different vectors; adding them adds the corresponding values in each vector and creates a new vector
  • The bias is then added to this new vector (the bias is also a vector, which we add to the final summation vector)

Vector Operations (Scalar multiplication):
  • Scalar multiplication - c * v = [c*v1, c*v2, ...]
  • We are multiplying a scalar with a vector
  • This is a linear transformation
  • If we multiply by a negative scalar, say -2, the result points in the reverse direction because of the '-' sign

Vector Operations (Dot product (Inner product)):
  • This is the most important operation that we want to learn
  • We have 2 kinds of vector multiplications
    • Dot Product
    • Cross Product
  • From Neural Networks perspective, we are interested in "Dot Product"
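The three operations above, spelled out with NumPy:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

print(u + v)         # addition: element-wise -> [5. 7. 9.]
print(-2 * v)        # scalar multiplication: doubles length, flips direction
print(np.dot(u, v))  # dot product: 1*4 + 2*5 + 3*6 = 32.0
```

The dot product is the one a neuron performs on its input and weight vectors, which is why it matters most here.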


We are done with vector operations; now let's see matrix operations.


Matrix Operations :
  • Please see the images below for information on matrix transformations; it is easier to understand them through images than through text










Rank of a Matrix :
  • Rank = No. of linearly independent rows or columns in a matrix 
  • Rank tells you how much unique information a matrix contains



How to find the rank, the Python way :

import numpy as np

# Row 2 is 2 * row 1, so the rows are linearly dependent
A = np.array([[1, 2],
              [2, 4]])
print(np.linalg.matrix_rank(A))  # prints 1


Tensor reshaping :
  • Reshaping a tensor means rearranging its rows and columns to match a target shape.
  • Naturally, the reshaped tensor has the same total number of coefficients as the initial tensor.
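Both points above can be seen in a two-line NumPy example: the coefficients are rearranged, and the total count (here 6) never changes:

```python
import numpy as np

t = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3), 6 coefficients

r = t.reshape(3, 2)         # same 6 values, now shape (3, 2)
flat = t.reshape(-1)        # -1 lets NumPy infer the size: shape (6,)

print(r)
print(flat)
```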


Non Linear Activations :
  • Please see the images below for the limitations of linear transformations and the need for non-linear transformations
  • ReLU is a popular non-linear activation which we use in Deep Learning and Neural Networks
  • Read the activation function formulas from the images below and memorize them
    • Sigmoid 
    • Softmax
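The activations named above (plus ReLU) are short enough to implement directly. This softmax subtracts the maximum before exponentiating, a standard numerical-stability trick:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)        # negatives clipped to 0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # each value squashed into (0, 1)

def softmax(z):
    e = np.exp(z - np.max(z))        # shift by the max for stability
    return e / e.sum()               # outputs sum to 1

z = np.array([-1.0, 0.0, 2.0])
print(relu(z))
print(sigmoid(z))
print(softmax(z))                    # a probability distribution
```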



We are done with basics of Neural Networks and Deep Learning. 

I will be adding the Colab notebooks for tensor programming, showcasing some important operations that we can perform using PyTorch, and will also come up with an example Neural Network on some raw data in the next few days.


Thanks,
Arun Mathe
Email ID : arunkumar.mathe@gmail.com
