
(AI #21) Agentic AI Design Patterns

Agentic AI systems are evolving from simple prompt-response applications into autonomous systems capable of reasoning, planning, and taking actions using tools and external knowledge sources. Depending on the complexity of the workflow, these systems can be designed using either single-agent or multi-agent architectures. A single-agent system centralizes reasoning and decision-making within one intelligent agent, making it suitable for simpler workflows and lightweight automation. In contrast, multi-agent systems distribute responsibilities across specialized agents that collaborate to solve complex tasks more efficiently. Modern production-grade AI platforms increasingly adopt multi-agent and graph-based orchestration patterns to improve scalability, reliability, and observability.


Large Language Models (LLMs)

LLMs are AI models trained on vast amounts of text data to understand and generate human-like text. They power chatbots, code assistants, translation tools, content generation, and more. We have also discussed how input text is converted into tokens through tokenization and then into embeddings for further processing by the model. We have covered the Transformer architecture, encoding & decoding methods, and attention mechanisms such as masked multi-head attention.

We also need to discuss the limitations of LLMs:

  • May generate incorrect or misleading information (hallucination)
  • Lacks real-time knowledge unless connected to external tools
  • May produce biased outputs if the training data is biased
  • High computational cost for training and serving
  • Doesn't truly understand language the way humans do


Retrieval Augmented Generation (RAG)

We have discussed that, before constructing a RAG system, we need to prepare our knowledge base: extract data from the source, chunk it if needed using available chunking strategies, convert the chunks into embeddings, and finally store them in a vector store/database. This entire process is called Indexing. Once indexing is done, we can build the RAG system using intent validation, query expansion, query reformulation, pre/post filtering, semantic/keyword search, re-ranking, etc.
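As a quick illustration, here is a minimal sketch of that indexing stage in Python. This is an assumption-heavy toy, not a real pipeline: `fake_embed` stands in for a real embedding model, and a plain dict stands in for a real vector database.

```python
# Minimal sketch of the indexing stage: extract -> chunk -> embed -> store.
# fake_embed is a stand-in for a real embedding model; a plain dict is a
# stand-in for a real vector database.
import hashlib

def fake_embed(text: str, dim: int = 8) -> list[float]:
    """Deterministic stand-in for a real embedding model."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size chunking with overlap -- one of several chunking strategies."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def index_document(doc_id: str, text: str, vector_store: dict) -> None:
    """Chunk -> embed -> store: the 'Indexing' process described above."""
    for i, chunk in enumerate(chunk_text(text)):
        vector_store[f"{doc_id}-{i}"] = {"text": chunk, "vector": fake_embed(chunk)}

store: dict = {}
index_document("loan-policy", "Personal loans require a minimum income..." * 50, store)
```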

Now, let's start our journey into Agentic AI.


Agentic AI

Agentic AI systems are autonomous agents that perceive their environment, reason, make decisions, take actions using tools, and learn from outcomes to achieve goals with minimal human intervention.

Simply:

  • LLMs (Reasoning/Thinking power)
  • Tools (RAG, MCP)
  • Memory (Agentic AI memory - part of architecture)
  • Observability (Tracing entire Agentic AI execution)
  • Guardrails (Enable security)
Combined together, these are nothing but Agentic AI.


Typical Agent Loop

Goal (input) -> Planning -> Retrieve -> Act -> Observe -> Reflect (validation) -> Repeat until goal achieved

These seven steps are important to follow whether you build a single-agent or a multi-agent system.
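A minimal skeleton of this loop, assuming hypothetical `plan`, `retrieve`, `act`, and `reflect` callables (each typically backed by an LLM or a tool):

```python
# Skeleton of the seven-step agent loop. All helper callables here are
# hypothetical placeholders, not a specific framework's API.
from typing import Callable

def run_agent(
    goal: str,
    plan: Callable[[str], str],
    retrieve: Callable[[str], str],
    act: Callable[[str, str], str],
    reflect: Callable[[str, str], bool],
    max_iterations: int = 5,
) -> str:
    """Goal -> Plan -> Retrieve -> Act -> Observe -> Reflect -> Repeat."""
    result = ""
    for _ in range(max_iterations):          # cap iterations to avoid infinite loops
        step = plan(goal)                    # Planning: decide the next step
        context = retrieve(step)             # Retrieve: pull supporting knowledge
        result = act(step, context)          # Act: call a tool / LLM
        if reflect(goal, result):            # Observe + Reflect: validate result
            return result                    # goal achieved
    return result                            # best effort after max iterations
```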

  • Start small, define clear goals
  • Provide high-quality tools & data
  • Set guardrails & monitor closely
  • Iterate, learn & scale

LLM vs RAG vs Agentic AI

Please observe the image below carefully to understand the differences between LLM, RAG, and Agentic AI.




Design Patterns of Agentic AI

There are a lot of design patterns for building agentic AI systems, but the three below are proven patterns.
  • ReAct Pattern (Reason + Act in a loop)
  • Hierarchical Pattern (Delegate & Decompose, or Supervisor-Worker)
  • Planner - Executor - Reviewer Pattern (Plan, execute & self-critique)


ReAct Agent Pattern (Reason + Act in a loop)


The agent reasons about the current state, decides on an action, executes it in the environment, observes the result, and repeats until the goal is achieved.

THOUGHT  -> ACTION -> OBSERVATION -> Repeat till goal achieved.
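A rough sketch of this loop, assuming a hypothetical `llm` callable that replies with either an `ACTION:` line or an `ANSWER:` line (real frameworks parse model output far more robustly):

```python
# Minimal single-agent ReAct loop. The `llm` callable is a hypothetical
# placeholder that replies with "ACTION: <tool> <input>" or "ANSWER: <text>".
def react_agent(question: str, tools: dict, llm, max_steps: int = 8) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):                      # cap steps to avoid infinite loops
        response = llm(transcript)                  # THOUGHT: reason over the transcript
        transcript += response + "\n"
        if response.startswith("ANSWER:"):
            return response.removeprefix("ANSWER:").strip()
        if response.startswith("ACTION:"):
            parts = response.split(" ", 2)          # "ACTION:", tool name, tool input
            tool_name = parts[1]
            tool_input = parts[2] if len(parts) > 2 else ""
            observation = tools[tool_name](tool_input)      # ACTION: run the tool
            transcript += f"OBSERVATION: {observation}\n"   # feed result back
    return "Max steps reached without a final answer"
```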

This pattern is not feasible for complex agentic AI solutions; it suits simple workflows. If we try to fit a travel planner agent into this design pattern, we can't even fit all the required actions into one agent (remember, a ReAct agent is a single agent, and its execution is essentially a straight line). So we can't use the ReAct design pattern alone for building complex agentic AI systems.

ReAct is fundamentally a single-agent reasoning-and-tool-use design pattern where the agent iteratively thinks, acts, and observes. It works well for simple to moderately complex workflows. However, for large-scale production agentic AI systems, a pure single-agent ReAct architecture can become difficult to scale due to context growth, tool overload, latency, and reliability concerns. Modern systems therefore extend ReAct using graph-based orchestration, supervisor-worker multi-agent architectures, memory layers, and guardrails. Even in multi-agent systems, many individual agents still internally use the ReAct pattern.


Hierarchical Agent Design Pattern

Delegate & Decompose - Break down complex goals into subgoals and delegate them to specialized agents organized in a hierarchy.


A top-level manager agent receives a goal, decomposes it into subgoals, and delegates them to specialized sub-agents. Sub-agents may further decompose and delegate, forming a hierarchy. Results are aggregated bottom-up to produce the final response.
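A toy sketch of this delegate-and-decompose flow; `decompose` and the sub-agents are hypothetical placeholders:

```python
# Toy hierarchy: a manager decomposes the goal, delegates subgoals to named
# sub-agents, then aggregates results bottom-up. All callables here are
# hypothetical placeholders.
from typing import Callable

def manager_agent(
    goal: str,
    decompose: Callable[[str], dict[str, str]],          # goal -> {agent_name: subgoal}
    sub_agents: dict[str, Callable[[str], str]],
) -> str:
    subgoals = decompose(goal)                           # break the goal down
    results = {name: sub_agents[name](subgoal)           # delegate each subgoal
               for name, subgoal in subgoals.items()}
    return "\n".join(f"{name}: {r}" for name, r in results.items())  # aggregate
```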

The main drawback of this design pattern is a single point of failure: the manager is the bottleneck. One mitigation is to write intermediate data into shared memory instead of routing everything through the manager agent.

Note :

Assume we have a requirement where two sub-agents need to interact with each other. This is where the Agent-to-Agent (A2A) protocol comes into the picture. It is an additional integration for multi-agent systems. Google introduced this protocol on April 9th, 2025; it is useful for communication between local agents and agents residing in the cloud (AWS, GCP, Azure).


Planner - Executor - Reviewer Agentic AI Design Pattern

The Planner creates a plan to achieve the goal. The Executor carries out the plan using tools and data. The Reviewer evaluates the result, suggests improvements, and decides whether to approve or iterate.
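A compact sketch of this loop, assuming hypothetical `planner`, `executor`, and `reviewer` callables (each typically backed by an LLM):

```python
# Sketch of the Planner-Executor-Reviewer loop. planner, executor, and
# reviewer are hypothetical callables, not a specific framework's API.
def plan_execute_review(goal: str, planner, executor, reviewer,
                        max_iterations: int = 3) -> str:
    result, feedback = "", ""
    for _ in range(max_iterations):                  # bound the loop
        plan = planner(goal, feedback)               # Planner: (re)plan using feedback
        result = executor(plan)                      # Executor: run the plan with tools
        approved, feedback = reviewer(goal, result)  # Reviewer: critique the result
        if approved:
            return result
    return result                                    # or escalate to a human
```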



Agentic AI Memory

Agentic AI memory enables agentic AI systems to retain information, leverage past experiences, and continuously improve decision-making and task execution. In simple terms, we are making our agentic AI systems remember, learn, and act smarter over time.



Types of Agentic AI memory:

  • Short-Term memory (Working memory)
    • Holds the information in the current context or conversation
  • Long-Term memory (Episodic/Semantic memory)
    • Stores information across sessions, includes facts, preferences, interactions & experiences
    • Episodic memory - Past data
    • Semantic memory - Facts
  • User/Entity memory (Profile memory)
    • Stores knowledge specific to a user, entity or a domain
  • Procedural memory (Skill memory)
    • Stores procedures, workflows, and how-to knowledge
    • Example : skills like cycling or walking are stored permanently
  • Reflective memory (Insights/Lessons)
    • Stores warnings, feedback, and self-reflections
    • Stores lessons learned
    • Example : when a program failure happens, the fix for that failure is stored for future use
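A toy illustration of these memory types, using plain in-memory Python structures (a production system would typically back long-term memory with a vector DB or key-value store):

```python
# Toy illustration of short-term vs. long-term agent memory. In-memory
# structures stand in for real persistence layers.
from collections import deque

class AgentMemory:
    def __init__(self, short_term_size: int = 10):
        self.short_term = deque(maxlen=short_term_size)  # working memory: current conversation
        self.long_term: dict[str, list[str]] = {         # persists across sessions
            "episodic": [],    # past interactions / experiences
            "semantic": [],    # facts and preferences
            "procedural": [],  # workflows and how-to knowledge
            "reflective": [],  # lessons learned, e.g. fixes for past failures
        }

    def remember_turn(self, message: str) -> None:
        self.short_term.append(message)       # oldest turns drop off automatically

    def store(self, kind: str, item: str) -> None:
        self.long_term[kind].append(item)     # e.g. store("reflective", "retry API on 503")
```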


A2A protocol :

To establish a connection between two agents using A2A, we need an Agent Skill and an Agent Card. If we need to describe a person, we need their skills and some personal information, right? In a similar way, we prepare a couple of JSON files about an agent, called the Agent Skills and the Agent Card.


In the Agent-to-Agent protocol :

  • The Agent Card describes who an agent is and how to talk to it
  • The Agent Skill describes what the agent can do

These two JSON files need to be created once the agent code is ready, based on the skill set of that particular agent. Note that we are not going to create them manually: once your agent is ready, you can simply give it to an LLM, which can generate the Agent Card & Skills JSON files.
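For illustration only, here is roughly what such a file might contain, written as a Python dict and serialized to JSON. The field names below are assumptions for the sake of the example; consult the official A2A specification for the exact schema.

```python
# Hypothetical agent card for the loan example. Field names are illustrative
# only -- the real A2A schema is defined in the official specification.
import json

agent_card = {
    "name": "loan-eligibility-agent",
    "description": "Checks loan eligibility based on income and credit score",
    "url": "https://agents.example.com/loan-eligibility",  # where to reach the agent
    "skills": [
        {
            "id": "check_eligibility",
            "description": "Validates income, credit score, and requested amount",
        }
    ],
}

print(json.dumps(agent_card, indent=2))
```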


Common problems in Agentic AI Systems



Let's discuss the agentic AI design patterns in detail using a bank loan processing use case.


REACT AGENT

User input - "I want a personal loan of $25k. My annual income is $85k and my SSN is 123-456-789." Similarly, every day the bank will receive n number of such applications.

It is extremely important to understand that we shouldn't pass the above input straight to the processing layer. The first and foremost thing we need to do is enable guardrails for this user input. Let's see what that means.

Guardrails (refer to point #2 in the image below)

  • Input validation
    • Block disallowed patterns (e.g. prompt-injection attempts)
    • Input length check (<= 5000 chars)
  • Domain validation
    • Loan amount > 0 AND <= $1,000,000
    • Credit score between 300 and 850
    • Minimum income >= $12,000
  • Output sanitization
    • Mask sensitive data
    • SSN, passwords, keys, etc.
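A minimal sketch of these guardrails in Python, using the loan-domain thresholds above; `BLOCKED_PATTERNS` is an illustrative placeholder list:

```python
# Sketch of the three guardrail layers above. Thresholds come from this
# example; BLOCKED_PATTERNS is an illustrative placeholder.
import re

BLOCKED_PATTERNS = [r"(?i)ignore previous instructions"]  # example jailbreak pattern
SSN_RE = re.compile(r"\b\d{3}-\d{2,3}-\d{3,4}\b")

def validate_input(text: str) -> bool:
    if len(text) > 5000:                                   # input length check
        return False
    return not any(re.search(p, text) for p in BLOCKED_PATTERNS)

def validate_domain(amount: float, credit_score: int, income: float) -> bool:
    return (0 < amount <= 1_000_000                        # loan amount bounds
            and 300 <= credit_score <= 850                 # valid credit score range
            and income >= 12_000)                          # minimum income

def sanitize_output(text: str) -> str:
    return SSN_RE.sub("***-**-****", text)                 # mask SSNs before responding
```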



Only once the user input passes through the proper input guardrails should we allow the request into step 3 for processing.

Create your own user queries, both valid and invalid, and present them to the customer; only then will the client understand the value of having guardrails in our agentic AI system. Always create more queries and save them in a DB for future reference. All these queries depend on which guardrails you implement in your multi-agent system. Guardrails need to validate input in both a keyword way and a semantic way.

For user query validation, ask your customer for the relevant documentation and create a knowledge base out of it. Once this knowledge base/graph is ready, for every input query you search the knowledge base/graph for a particular keyword, or at least semantic similarity, based on the user input. Then validate the user input against this knowledge; only once all input guardrails have passed do you allow the input to be processed further. Otherwise, communicate the situation to the end user and ask them to re-validate their input. We can take the help of an LLM to create these knowledge graphs.

Once the guardrails step has passed, the next step is the ReAct agent, where we use a system prompt built with the Chain-of-Thought prompting technique, as mentioned in the image above.

The flow then moves through the Thought -> Action -> Observation -> Answer layers. The developer needs to decide what kind of systems and tools are required here during ACTION. All the policy-related documents and required data will be stored in the RAG system, specifically in a vector DB. The retrieval process then runs using keyword (BM25) + semantic techniques, followed by re-ranking; the user query + context pulled from the tools is sent to the LLM, which produces the final answer.
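A sketch of that retrieval step; `bm25_score`, `embed`, and `rerank` are placeholders for real components (a BM25 library, an embedding model, and a cross-encoder re-ranker), and the 50/50 fusion weights are an arbitrary choice:

```python
# Hybrid retrieval sketch: fuse BM25 keyword scores with semantic similarity,
# then re-rank the shortlist. bm25_score, embed, and rerank are hypothetical
# placeholders for real components.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def hybrid_retrieve(query, documents, bm25_score, embed, rerank, top_k=5):
    query_vec = embed(query)
    scored = []
    for doc in documents:                                 # doc: {"text": ..., "vector": ...}
        keyword = bm25_score(query, doc["text"])          # keyword (BM25) signal
        semantic = cosine(query_vec, doc["vector"])       # semantic signal
        scored.append((0.5 * keyword + 0.5 * semantic, doc))  # simple score fusion
    scored.sort(key=lambda pair: pair[0], reverse=True)
    candidates = [doc for _, doc in scored[:top_k * 4]]   # shortlist for re-ranking
    return rerank(query, candidates)[:top_k]              # e.g. a cross-encoder
```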

Real-time data will be pulled via MCP tools; this is the responsibility of ACTION.

OBSERVATION verifies whether the results are grounded or not. These are nothing but evaluation metrics.

Once the LLM generates output, we need to apply output guardrails to mask sensitive data, etc. Then we produce the final response to the user.

Important Note :
  • Implement a fallback mechanism
  • Handle errors gracefully
  • Collect data wherever possible along this entire path and store it in a database
  • Set a maximum number of retries to avoid infinite loops
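A sketch of bounded retries with a fallback, so a failing tool call cannot loop forever; `call_tool` and `fallback_answer` are hypothetical placeholders:

```python
# Bounded retries with a fallback: the system degrades gracefully instead of
# failing or retrying forever. call_tool and fallback_answer are placeholders.
def call_with_retries(call_tool, fallback_answer, max_retries: int = 3):
    for attempt in range(1, max_retries + 1):
        try:
            return call_tool()
        except Exception as exc:                       # handle errors per attempt
            print(f"attempt {attempt} failed: {exc}")  # log to a DB in production
    return fallback_answer()                           # fallback instead of failing
```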


SUPERVISOR + WORKER multi-agent system



We will discuss the same example, i.e., the bank loan processing system. In the previous ReAct agent design pattern we had only one agent, but here we have multiple agents. We also have the guardrails layer here, which we have already discussed.

As you can see in the above image, the Supervisor Agent acts as an orchestrator; it is also called the Root Agent.
  • The first responsibility of the supervisor agent is to clearly understand the goal of the user
  • Orchestrates the workflow and assigns tasks to specialized workers
  • Monitors progress, aggregates results, and makes routing decisions
  • Handles escalations, guardrail checks, and the final decision
  • Maintains shared state across the system
Look at the sub-agents 3.1 through 3.6, which are specialized in their specific work. The supervisor agent must be aware of the specialization of the sub-agents.

The shared state block & supervisor decision logic handle the task of tracking sub-agent skills. Generally, we can create a skills.md file containing the details of each sub-agent's specialization. Based on the skills mentioned in skills.md, the supervisor decision logic assigns tasks to the sub-agents.

The formal way of doing this is by maintaining a proper agent skill registry in your company. You can put that information in GitHub or Confluence, or use a Google agent registry. Maintaining this information is extremely important: whenever you create an agent with some skills, you should register it in the agent registry. This provides information to other teams and programmers and avoids duplicate tools and agents.

After this, whenever a request comes in from a user, our supervisor agent will refer to the agent registry and select tools and agents from it.

Handshake/handoff between sub-agents happens based on the state information. Each agent is associated with tools, and tools are nothing but Python functions. These functions return the current state information to the SHARED STATE. The entire loan application state lives here and is accessible to all the sub-agents.
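A toy sketch of this routing-over-shared-state idea. The skill registry mirrors the skills.md concept, the worker agents are plain Python functions that read and write the shared state, and all names and values are illustrative:

```python
# Toy supervisor routing over a shared state. All names are illustrative.
skill_registry = {                 # task -> worker agent (the skills.md idea)
    "verify_documents": "document_agent",
    "check_credit": "credit_agent",
    "assess_risk": "risk_agent",
}

def document_agent(state: dict) -> dict:
    state["documents_ok"] = True                # pretend document verification
    return state

def credit_agent(state: dict) -> dict:
    state["credit_score"] = 720                 # pretend credit bureau lookup
    return state

def risk_agent(state: dict) -> dict:
    state["risk"] = "low"                       # pretend risk assessment
    return state

workers = {"document_agent": document_agent,
           "credit_agent": credit_agent,
           "risk_agent": risk_agent}

def supervisor(task: str, state: dict) -> dict:
    worker_name = skill_registry[task]          # route based on registered skills
    state["history"].append((task, worker_name))
    return workers[worker_name](state)          # worker updates the shared state

shared_state = {"application_id": "LN-001", "history": []}
for task in ["verify_documents", "check_credit", "assess_risk"]:
    shared_state = supervisor(task, shared_state)
```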
  


Planner-Executor-Reviewer loan processing system




Everything is the same except the flow of data from one step to another. Here the first step is the Planner, then the Executor, followed by the Reviewer.

Based on the user request, your agent (internally calling an LLM) needs to take care of planning.

Don't go with one single plan; always have a fallback plan as well. DO NOT let the system fail; instead, maintain a fallback.

We have to incorporate plan-B in the prompt itself during planning.
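For example, an illustrative planner prompt that asks the LLM for a primary plan and a plan-B in one shot (the wording here is an assumption, not a prescribed template):

```python
# Illustrative planner prompt that bakes a fallback (plan-B) into planning.
PLANNER_PROMPT = """You are a loan-processing planner.
Goal: {goal}

Produce:
1. PRIMARY PLAN: numbered steps using the available tools.
2. FALLBACK PLAN (plan-B): what to do if a step fails
   (e.g. credit bureau API is down -> use cached credit data
   and flag the application for manual review).
"""

def build_planner_prompt(goal: str) -> str:
    return PLANNER_PROMPT.format(goal=goal)
```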

The responsibility of the Executor is simply to execute things step by step: run the MCP server, use tools to get data from external sources, pull data from RAG, etc. Have a fallback mechanism here as well.

Finally, the Reviewer reviews the data and makes decisions from it.

Always include a maximum number of iterations. Otherwise the loop can run forever, which burns cost, adds latency, and causes every possible issue we can't even think of.

In case the agent is unable to make a decision, redirect those requests to a human-in-the-loop. In production, always try to minimize routing to the human-in-the-loop.


Conclusion : 

That's all for the theory. Feel free to download the code from the following repo: https://github.com/amathe1/AI-code/tree/main/8_AgenticAI_DesignPatterns


Thank you for reading this blog !

Arun Mathe
