
Introduction:

AI agents build on many concepts and models from Generative AI, so let's first understand what Generative AI is and, in a nutshell, what makes it different from traditional AI.

Artificial Intelligence (AI) is the general domain that covers all the methods and techniques that help machines mimic human intelligence.
Generative AI is a specialized branch of AI used to generate creative content and information for users. A well-known, everyday example of Gen-AI is ChatGPT.
Its function is to create new data or content that is similar to what it was trained on, often indistinguishable from human-made content.

Okay, so let's move forward.

Let's understand what an AI agent is.

An agent is an entire system or entity that performs tasks autonomously (independently) based on its environment and input data.
It may or may not use AI techniques to perform specific tasks.
Agents can be rule-based or decision-tree-based.

Agents have some standard characteristics:

  1. Autonomy – Operates independently and makes decisions based on its environment and input data.
  2. Predefined Rules – Works on predefined rules or algorithms.
  3. Limited Decision-Making – Decisions are based on specific rules and fixed logic.
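To make these characteristics concrete, here is a minimal sketch of a rule-based agent that uses no AI model at all. The thermostat scenario, the `Rule` and `ThermostatAgent` names, and the temperature thresholds are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A predefined rule: if the condition holds, take the action."""
    condition: Callable[[dict], bool]
    action: str


class ThermostatAgent:
    """Hypothetical rule-based agent; it acts on observations without any AI model."""

    def __init__(self, rules: list):
        self.rules = rules

    def act(self, observation: dict) -> str:
        # Autonomy: the agent picks an action from its environment reading alone.
        for rule in self.rules:
            if rule.condition(observation):
                return rule.action
        # Limited decision-making: nothing outside the fixed rules is handled.
        return "do_nothing"


rules = [
    Rule(lambda obs: obs["temperature_c"] < 18.0, "turn_heating_on"),
    Rule(lambda obs: obs["temperature_c"] > 24.0, "turn_cooling_on"),
]
agent = ThermostatAgent(rules)
print(agent.act({"temperature_c": 15.0}))
```

The agent decides on its own what to do for each reading, but only within the fixed rules it was given.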

Now the question arises: if this is an agent, it seems roughly the same as a program, an application, or an algorithm in computer science. How can we tell an agent apart from an application, a program, or an algorithm if they all seem to do the same things?

Here are the simple definitions:

AIAgentsbyGoogle-image1

  1. Agent: An autonomous entity that performs tasks or makes decisions based on its environment.
  2. Application: Software designed to help users perform specific tasks or functions.
  3. Program: A set of instructions written to perform a particular task when executed by a computer.
  4. Algorithm: A step-by-step procedure or set of rules used to solve a problem or perform a task.

Now let's understand the basic architecture of agents:

AIAgentsbyGoogle-image2

 

Let's start with Model.

  1. Model – In the scope of agents, the model refers to the language model (LM) that interprets the user's language and makes the appropriate decisions.
    An agent can use a single model or several, depending on its overall structure and use case. Choosing a model of the right size for the use case is very important for accuracy.
  2. Tools – The model is the actual problem solver: it understands the user's query and makes the decisions. But as a purely mathematical system, it cannot connect to the external world on its own.
    This is where tools come in!

Tools bridge the gap between the external world and the model, empowering agents to interact with external data and services while unlocking a wider range of actions beyond that of the underlying model alone.

Tools can take a variety of forms and have varying depths of complexity, but they typically align with common web API methods like GET, POST, PATCH, and DELETE, which are easy to understand.

  3. Orchestration Layer – The orchestration layer describes a cyclical process that governs how the agent takes in information, performs some internal reasoning, and uses that reasoning to inform its next action or decision.

Point to remember – None of the three components above is fixed. Each can change from system to system depending on the situation.
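The three components can be sketched together as a small loop. This is only an illustration under stated assumptions: `fake_model` is a hard-coded stand-in for a language model, `get_weather` is a made-up tool, and the dictionary format of the decisions is invented for this example.

```python
from typing import Callable


def fake_model(query: str, observations: list) -> dict:
    """Stand-in for a language model: decides the next step (hypothetical logic)."""
    if not observations:
        # No information yet, so ask for a tool call.
        return {"type": "tool", "name": "get_weather", "args": {"city": "Paris"}}
    # Enough information gathered, so produce the final answer.
    return {"type": "final", "answer": f"Answer based on: {observations[-1]}"}


def get_weather(city: str) -> str:
    """Made-up tool; a real one would call an external API."""
    return f"Weather in {city}: sunny"


TOOLS: dict = {"get_weather": get_weather}


def run_agent(query: str, max_steps: int = 5) -> str:
    """Orchestration layer: take in information, reason, act, and repeat."""
    observations: list = []
    for _ in range(max_steps):
        decision = fake_model(query, observations)
        if decision["type"] == "final":
            return decision["answer"]
        tool: Callable[..., str] = TOOLS[decision["name"]]
        observations.append(tool(**decision["args"]))
    return "Stopped: step limit reached"


print(run_agent("What is the weather in Paris?"))
```

The step limit is a design choice for this sketch, so that a model that never produces a final answer cannot loop forever.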

Differences between models and agents at a glance:

Model | Agent
Knowledge is limited to the corpus of data the model was trained on. | Knowledge can be extended because the agent is connected to the external world.
Makes a single prediction per user query. There is no history retention unless an external system manages it. | Has a built-in history management system that retains the predictions made for user queries and the decisions made in the orchestration layer, so this data can be fetched at any time.
No native implementation of tools. | Tools are natively implemented in the agent architecture.
No native logic layer. | Native cognitive architecture that uses reasoning frameworks like CoT and ReAct, or pre-built agent frameworks like LangChain.

We have now covered most of the conceptual part of the document! The next part defines some concepts that come up often with AI agents. Each concept is explained with a very simple example.

Let's go!!!

1] Extensions
An extension is simply a bridge between an API and an AI agent.
Example – Booking a flight using the Google Flights API via an extension that understands inputs like departure and destination cities.

AIAgentsbyGoogle-image3
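As a rough sketch of the idea, an extension can be pictured as a description of an API plus the parameters it needs, which the agent fills in from the user's request. The `Extension` class, its fields, and the flight parameters below are assumptions for illustration only; this is not the Google Flights API or any real extension format.

```python
from dataclasses import dataclass, field


@dataclass
class Extension:
    """Hypothetical extension: tells the agent what an API does and what it needs."""
    name: str
    description: str
    required_params: list
    examples: list = field(default_factory=list)

    def call(self, **params) -> dict:
        missing = [p for p in self.required_params if p not in params]
        if missing:
            raise ValueError(f"Missing parameters: {missing}")
        # A real extension would send the HTTP request to the API here.
        return {"extension": self.name, "request": params}


flights = Extension(
    name="flights",
    description="Search flights between two cities.",
    required_params=["departure_city", "destination_city"],
    examples=["Book a flight from Austin to Zurich"],
)
request = flights.call(departure_city="Austin", destination_city="Zurich")
print(request)
```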

2] Sample Extensions
These are pre-built extensions provided by Google for common tasks such as data fetching.
Example – Using a code interpreter extension to generate Python code for inverting a binary tree.

3] Functions

  • Definition: Functions are reusable modules that agents invoke to perform specific tasks, executed client-side.
  • Example: A function outputs city names for ski trips, formatted in JSON for easier parsing by another system.

AIAgentsbyGoogle-image4
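A minimal sketch of this flow, assuming the model returns a JSON string that names a function and its arguments: the client parses the JSON and runs the function itself. The JSON shape, the `display_cities` function, and the city names are hypothetical.

```python
import json

# Hypothetical JSON a model might emit instead of calling an API itself.
model_output = (
    '{"function": "display_cities", '
    '"args": {"cities": ["Aspen", "Vail", "Park City"], "preferences": "skiing"}}'
)


def display_cities(cities: list, preferences: str) -> str:
    """Client-side function; the model only suggests the call, the client runs it."""
    return f"Suggested cities for {preferences}: " + ", ".join(cities)


CLIENT_FUNCTIONS = {"display_cities": display_cities}


def handle_function_call(raw: str) -> str:
    call = json.loads(raw)                    # parse the structured model output
    func = CLIENT_FUNCTIONS[call["function"]]  # look up the requested function
    return func(**call["args"])               # execute it on the client side


result = handle_function_call(model_output)
print(result)
```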

 

4] Use Cases of Functions

  • Functions are used when API calls need to be made on the client side, for example because of security, timing, or ordering constraints.
  • Example: Filtering results from an API that doesn’t support filtering directly.
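The filtering example can be sketched as follows. Here `fetch_hotels` is a made-up stand-in for an API that returns everything and has no filter parameter, so the client-side function applies the filter after the call. The hotel data is invented.

```python
def fetch_hotels() -> list:
    """Stand-in for an API that offers no filtering; returns invented data."""
    return [
        {"name": "Alpine Lodge", "price": 220, "city": "Aspen"},
        {"name": "Budget Inn", "price": 90, "city": "Aspen"},
        {"name": "Summit Hotel", "price": 150, "city": "Vail"},
    ]


def hotels_under(max_price: int, city: str) -> list:
    """Client-side function that filters the raw API results."""
    return [
        hotel["name"]
        for hotel in fetch_hotels()
        if hotel["city"] == city and hotel["price"] <= max_price
    ]


print(hotels_under(100, "Aspen"))
```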

5] Data Stores

  • Definition: Data stores enable agents to access and use additional dynamic and real-time information.
  • Example: An agent retrieves specific data from a vector database for a user query.

6] Implementation of Data Stores

  • Definition: Data is stored as vector embeddings in a database, enabling agents to retrieve and use it dynamically.
  • Example: Accessing pre-indexed PDFs or spreadsheets for retrieval-augmented generation (RAG).
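The retrieval step can be sketched with a tiny in-memory store. The three-dimensional vectors below are hand-made stand-ins for real embeddings, and the documents, `STORE`, `retrieve`, and `build_rag_prompt` are all hypothetical. A real system would use an embedding model and a vector database instead.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made "embeddings"; a real data store would compute these with a model.
STORE = [
    ("Refund policy: refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    ("Shipping: orders ship within 2 business days.", [0.1, 0.9, 0.0]),
    ("Support hours: 9am to 5pm on weekdays.", [0.0, 0.1, 0.9]),
]


def retrieve(query_embedding: list, top_k: int = 1) -> list:
    """Return the top_k stored texts most similar to the query embedding."""
    ranked = sorted(
        STORE,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


def build_rag_prompt(question: str, query_embedding: list) -> str:
    """Put the retrieved text in front of the question before it goes to the model."""
    context = "\n".join(retrieve(query_embedding))
    return f"Context:\n{context}\n\nQuestion: {question}"


print(build_rag_prompt("How do refunds work?", [0.8, 0.2, 0.0]))
```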

7] Tools Recap

  • Definition: Tools like extensions, functions, and data stores enable agents to interact with the world effectively.
  • Example: Extensions for API calls, functions for client-side logic, and data stores for retrieving additional data.

8] Enhancing Model Performance with Targeted Learning

  • Definition: Approaches like in-context learning, retrieval-based learning, and fine-tuning improve model performance.
  • Example: Using retrieval-based learning to dynamically update a model’s prompt with relevant tools and data.
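One way to picture the retrieval-based example is a prompt that is rebuilt for every query with only the most relevant tool descriptions. The sketch below uses simple word overlap as the relevance score, which is an assumption chosen to keep it self-contained. The tool names and descriptions are hypothetical too.

```python
# Hypothetical tool catalog; names and descriptions are made up.
TOOL_DESCRIPTIONS = {
    "flights": "search and book flights between cities",
    "weather": "get the weather forecast for a city",
    "calculator": "evaluate arithmetic expressions",
}


def select_tools(query: str, top_k: int = 1) -> list:
    """Rank tools by how many words their description shares with the query."""
    query_words = set(query.lower().split())

    def overlap(name: str) -> int:
        return len(query_words & set(TOOL_DESCRIPTIONS[name].split()))

    ranked = sorted(TOOL_DESCRIPTIONS, key=overlap, reverse=True)
    return ranked[:top_k]


def build_tool_prompt(query: str) -> str:
    """Insert only the selected tools into the prompt sent to the model."""
    lines = [f"- {name}: {TOOL_DESCRIPTIONS[name]}" for name in select_tools(query)]
    return "Available tools:\n" + "\n".join(lines) + f"\n\nUser: {query}"


print(build_tool_prompt("what is the weather forecast for paris"))
```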

9] Agent Quick Start with LangChain

  • Definition: LangChain simplifies agent development by chaining reasoning and tool calls for multi-step queries.
  • Example: An agent uses SerpAPI and Google Places API to fetch a team’s football schedule and stadium address.

10] Production Applications with Vertex AI Agents

  • Definition: Vertex AI provides a managed platform for building, deploying, and maintaining production-ready agents.
  • Example: Using Vertex AI’s Agent Builder to define tools, sub-agents, and goals for a complete agent system.