Ravi Srivastava: Agentic AI Framework

Autogen Framework Overview

Autogen is an open-source framework from Microsoft built around an asynchronous, event-driven architecture. It addresses observability, flexibility, control, and scalability in multi-agent systems.

Core Concepts and Components of Autogen:

Autogen Core:

Autogen Core is a generic, scalable runtime for building multi-agent AI systems.
It manages messaging and interactions between agents.
Its agents can be distributed across different processes or machines.
It provides an agent runtime for running agents together.
It provides the essential architecture for:

Agents
Messaging
Memory
Orchestration

It is the backbone of Microsoft's agentic AI framework.

Autogen Agent Chat:

Provides a lightweight abstraction for constructing agent-based workflows with LLMs and tool integrations.
Supports conversational single-agent and multi-agent applications.
Similar in scope to the OpenAI Agents SDK and CrewAI.
Lets agents use tools and interact with each other.
Built on top of Autogen Core.
Core Components

Assistant Agent:

Provides analysis, solutions, and code.
Represents an LLM-powered autonomous agent whose job is to reason, respond, and collaborate with other agents.

Generates responses: It uses an LLM to produce messages, solutions, or reasoning steps.
Maintains conversation state: It tracks the dialogue history and context across turns.
Collaborates with other agents: asks the User Proxy Agent for clarification, requests actions from a Tool Agent, and coordinates with other Assistant Agents.
Executes reasoning loops: self-reflects, revises answers, chains multiple reasoning steps, and follows the system rules defined in its configuration.
Enforces constraints: system prompts, tool access, termination conditions, memory behavior, etc.

User Proxy Agent:

The human's representative agent, which sends user instructions into the Autogen system and receives responses back. It is the "bridge" between the human and the multi-agent system.
Injects user messages into the agent conversation
Approves or rejects actions (if configured)
Acts as the human in multi‑agent workflows

Critic Agent:

Suggests improvements.
Reviews another agent's output and provides corrections, feedback, or refinements.
Ensures the Assistant Agent's answer is logically valid, accurate, and aligned with constraints.

Messenger Layer:

Handles back-and-forth communication between agents. It is the communication channel for the Agent Chat system.
Acts as a transport system that moves messages between agents.
Routes messages between the User Proxy Agent, Assistant Agent, Critic Agent, Tool Agents, etc.

Memory:

Stores, retrieves, and manages conversation history and context so agents can remember past interactions and use them in future reasoning.
Gives agents short-term or long-term memory.

Studio:

Studio is a low-code/no-code visual app for building agent workflows.
Used for prototyping and managing AI agents.
A web-based UI for quick prototypes.
Supports configuring and managing agents without writing code.
Built on Autogen Agent Chat, the conversational framework for single and multi-agent systems.

Magentic One CLI:

Magentic One CLI is a command-line application for running agents. Like Studio, it is positioned as a research tool rather than a production-ready solution.
A console-based assistant.
Runs multi-agent systems from the command line or terminal.
Built on Autogen Agent Chat.
Runs Magentic One agents directly from the local terminal.

Open Source and Research Focus: Autogen is developed as a Microsoft Research community project, with contributions from a broad base and a focus on open-source research rather than commercialization.
Key Focus Areas: These notes work primarily with Autogen Core and Agent Chat and leave aside the low-code/no-code tools.

Building Blocks: Models, Messages, and Agents:

Model Abstraction: The model abstraction in Autogen wraps LLMs such as GPT-4o mini, or other models such as Llama (served here through Ollama).
Example:

from autogen_ext.models.openai import OpenAIChatCompletionClient
model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")

from autogen_ext.models.ollama import OllamaChatCompletionClient
ollamamodel_client = OllamaChatCompletionClient(model="llama3.2")

Message Objects: Messages are core objects representing communication between agents, users, or internal tool calls, and can be simple text or multimodal (including images)

from autogen_agentchat.messages import TextMessage
message = TextMessage(content="I'd like to go to London", source="user")

Agent Creation: Agents are instantiated with a model client, a name, and a system message, and can be configured to stream results. The Assistant Agent class is the primary agent type used.

from autogen_agentchat.agents import AssistantAgent
agent = AssistantAgent(
    name="airline_agent",
    model_client=model_client,
    system_message="act as a helpful assistant for an airline. give short, humorous answers.",
    model_client_stream=True,
)

Agent Interaction via on_messages: The on_messages method is used to pass messages to agents asynchronously

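from autogen_core import CancellationToken
# Top-level await works in a notebook; in a script, run this inside an async function via asyncio.run().
response = await agent.on_messages([message], cancellation_token=CancellationToken())
response.chat_message.content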

Tool Integration and Database Access:

Database Setup and Query Tool: A SQLite database was created and populated with city and ticket price data. A Python function was implemented to query ticket prices by city, serving as a tool for the agent.
Tool Integration Simplicity: Autogen allows direct passing of Python functions as tools without decorators or wrappers, simplifying the process and reducing boilerplate.
Agent Tool Usage Example: An agent was configured with the ticket price lookup tool and demonstrated querying the database and returning a humorous, context-aware response to a user message.
Reflect on Tool Use Attribute: The reflect_on_tool_use attribute ensures that agents can process tool results and continue the conversation, rather than stopping after a tool call.
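
A minimal sketch of the ticket price tool described above, using only Python's standard library. The table name, column names, sample prices, and the in-memory database are assumptions for illustration; the schema used in the original session is not shown in these notes.

```python
import sqlite3
from typing import Optional

# Assumption: an in-memory database stands in for the SQLite file used in the session.
conn = sqlite3.connect(":memory:")


def setup_database() -> None:
    """Create the cities table and load a few made-up prices."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cities ("
        "city_name TEXT PRIMARY KEY, round_trip_price REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO cities (city_name, round_trip_price) VALUES (?, ?)",
        [("london", 299.0), ("paris", 399.0), ("rome", 499.0)],
    )
    conn.commit()


def get_city_price(city_name: str) -> Optional[float]:
    """Get the round-trip ticket price to travel to the given city."""
    row = conn.execute(
        "SELECT round_trip_price FROM cities WHERE city_name = ?",
        (city_name.lower(),),  # parameterized query, case-insensitive lookup
    ).fetchone()
    return row[0] if row else None


setup_database()
print(get_city_price("London"))
```

With Autogen installed, the plain function would be handed to the agent as-is, for example AssistantAgent(..., tools=[get_city_price], reflect_on_tool_use=True). To my understanding, the function's docstring and type hints are what the model sees as the tool description, so they are worth writing carefully.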

Advanced Features: Multimodal Messages and Structured Outputs:

Multimodal Message Handling: Autogen supports multimodal messages, allowing users to send images alongside text.
Structured Output with Pydantic: Structured outputs are easily achieved by specifying a Pydantic model as the expected output type.

Langchain Tool Integration:

Langchain Tool Adapter Usage: The Langchain tool adapter allows any Langchain tool to be wrapped and used as an Autogen tool, facilitating seamless integration between the two ecosystems.
Agent Task Execution: An agent was tasked with finding flights, using the integrated tools to search online, write results to a file, and select the best option, demonstrating the practical utility of tool integration.

Introduction to MCP Tools in Autogen:

Integration with Autogen: Autogen provides wrappers that enable users to easily incorporate any MCP-compliant tool, such as MCP Server Fetch, into their workflows, allowing for seamless tool usage without requiring additional glue code.
MCP Server Fetch Example: The session included a practical example where the MCP Server Fetch tool, which runs a headless Playwright browser to scrape web pages, was run locally and used within Autogen to review and summarize a website, with the assistant replying in Markdown.
Open Ecosystem and Community Tools: MCP's open standard allows anyone to write and share tools, creating a large, public, and open-source ecosystem accessible from within Autogen.

Comparison of Microsoft Semantic Kernel and Autogen Core:

Semantic Kernel Overview: Microsoft Semantic Kernel is a framework similar to Langchain, focusing on wrapping calls to large language models (LLMs), handling memory, tool calling, plugins, and prompt templating for business-oriented tasks.
Autogen Core Focus: Autogen Core is more agent-focused and is designed for building autonomous agentic applications. This sets it apart from Semantic Kernel, which orchestrates LLM calls for business logic.
Overlap and Use Cases: There is some overlap between Semantic Kernel and Autogen Core.

Interactions and Multi-Agent Workflows:

Agent Roles: Multiple agents (e.g., Primary and Evaluator) were created with distinct roles and prompts, collaborating to find and evaluate flight options in a round-robin group chat.
Termination Conditions: The workflow ends when the Evaluator agent replies with 'approve'. More robust conditions are advisable for production use.
Managing Conversation Flow: Agents can enter infinite loops or excessive back-and-forth. Prompt tuning and kernel restarts help manage runaway conversations, since Autogen does not impose a recursion limit by default.
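
The round-robin flow and its stopping rules can be illustrated with a toy loop in plain Python. This is not the Autogen API: run_round_robin, the stand-in agent functions, and the max_turns guard are hypothetical names invented for this sketch. It shows the idea of combining a keyword-based stop with a hard turn limit so a conversation cannot run forever.

```python
from typing import Callable, List, Tuple

# A toy "agent" is a name plus a function from the transcript so far to a reply.
Agent = Tuple[str, Callable[[List[str]], str]]


def run_round_robin(
    agents: List[Agent],
    task: str,
    stop_word: str = "approve",
    max_turns: int = 10,
) -> Tuple[List[str], str]:
    """Let agents take turns until a reply contains stop_word or max_turns is reached."""
    transcript = [task]
    for turn in range(max_turns):
        name, reply_fn = agents[turn % len(agents)]  # round-robin selection
        reply = reply_fn(transcript)
        transcript.append(f"{name}: {reply}")
        if stop_word in reply.lower():
            return transcript, "approved"
    return transcript, "max_turns"  # safety net against runaway conversations


def primary(history: List[str]) -> str:
    # Stand-in for an LLM-backed agent proposing a flight.
    return f"Option {len(history)}: direct flight"


def evaluator(history: List[str]) -> str:
    # Stand-in evaluator: rejects the first proposal, approves the second.
    return "APPROVE" if len(history) >= 4 else "Too expensive, try again"


transcript, reason = run_round_robin(
    [("primary", primary), ("evaluator", evaluator)], "Find a flight to London"
)
print(reason, len(transcript))
```

In Agent Chat itself, the same two guards correspond, as far as I know, to the TextMentionTermination and MaxMessageTermination conditions, which can be combined when building a RoundRobinGroupChat.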

Introduction to Autogen Core and Its Architecture:

Agent Interaction Framework: Autogen Core is a framework for managing interactions between agents, regardless of the underlying platform, programming language, or abstraction used to implement the agents.
Standalone vs Distributed Runtimes: Autogen Core supports two runtime types:

standalone (local, single-machine)
distributed (enabling remote, cross-process agent interactions).

Decoupling Logic and Messaging: The framework separates agent logic from message delivery, handling agent lifecycle and communication, while developers are responsible for the agent's internal logic.
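
The separation between agent logic and message delivery can be sketched with a toy runtime in plain asyncio. ToyRuntime and the two agent functions are hypothetical and are not Autogen classes; in Autogen Core the standalone role is played by SingleThreadedAgentRuntime, whose API is richer than this sketch.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Handler = Callable[[str], Awaitable[str]]


class ToyRuntime:
    """Owns registration and delivery; knows nothing about what agents do."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self.delivered: List[str] = []

    def register(self, agent_id: str, handler: Handler) -> None:
        self._handlers[agent_id] = handler

    async def send_message(self, message: str, recipient: str) -> str:
        self.delivered.append(recipient)  # the runtime tracks delivery
        return await self._handlers[recipient](message)


# Agent logic is written by the developer and never touches delivery details.
async def echo_agent(message: str) -> str:
    return f"echo: {message}"


async def shout_agent(message: str) -> str:
    return message.upper()


async def main() -> List[str]:
    runtime = ToyRuntime()
    runtime.register("echo", echo_agent)
    runtime.register("shout", shout_agent)
    return [
        await runtime.send_message("hello", "echo"),
        await runtime.send_message("hello", "shout"),
    ]


results = asyncio.run(main())
print(results)
```

Because the agents only receive a message and return a reply, the same agent code could in principle be moved behind a different delivery mechanism without change, which is the point of the decoupling.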
Comparison between Autogen Core and LangGraph:

Both manage agent interactions.
LangGraph emphasizes robustness and replayability.
Autogen Core focuses on supporting diverse, distributed agent interactions.

Overview of Autogen Core Distributed Runtime:

Distributed Runtime Architecture: The distributed runtime comprises a host service that manages connections and message delivery, and one or more worker runtimes that register and execute agents.
Session and Message Management: Direct messages are handled via gRPC sessions, with the framework managing the complexities of remote message delivery between processes, potentially written in different languages.
Experimental Nature and Use Cases: The distributed runtime is experimental. It is an architectural preview and is not intended for production use.

Autogen Core Distributed Runtime:

It is not ready for production.
It is still a conceptual model for running agents across multiple processes.
It handles messaging across process boundaries.
It is not limited to a single thread running on one machine.
It has two core components:

Host Service:

Connects to the Worker Runtimes.
Handles message delivery.
Creates sessions for direct messaging.

Sessions are managed through gRPC.
Sending a message from one system or process to another is taken care of by the Autogen framework.

Works as the central orchestrator.
Runs on one machine and knows all the agents in the system.
Keeps track of which agents are registered, where they live, and how to route messages to any of them.

Worker Runtime:

Advertises its agents to the Host Service.
Executes the agents' code.
Plays the role that the runtime plays in the single-threaded case.
Hosts one or more agents that are registered with it.
Can run on the local machine, on a separate machine, or in a separate process.
Connects back to the Host Service so it can send and receive messages.
Does not call another Worker Runtime directly. Worker Runtimes communicate through the Host Service, which works as the central orchestrator.

Ex 1: Search Agent

Runs in a Worker Runtime inside one container.

Ex 2: Summarize Agent

Runs in another Worker Runtime inside a different container.

Agents:

Agents are the actual workers that do the tasks.
Each agent is registered with a Worker Runtime, which advertises it to the Host Service.
The Worker Runtime handles communication on the agent's behalf.

Explanation:

There are three agents: Agent A, Agent B, and Agent C.

Agent A - hosted in Worker Runtime 1
Agent B - hosted in Worker Runtime 1
Agent C - hosted in Worker Runtime 2

Suppose Agent A wants to send a message to Agent C. The message goes from Worker Runtime 1 to the Host Service.
The Host Service looks up where Agent C lives and forwards the message to Worker Runtime 2, which delivers it to Agent C.

Notes:

All communication goes through the Host Service, whether the two agents are:

in the same Worker Runtime,
in different Worker Runtimes,
or on different machines.

The communication path is always: Agent A → Worker Runtime → Host Service → Worker Runtime → Agent B
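
The routing path described in these notes can be modeled with a small in-process toy. The HostService and WorkerRuntime classes below are hypothetical stand-ins written for this sketch; they use direct method calls instead of gRPC and are not the Autogen distributed runtime classes.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


class HostService:
    """Central router: knows which worker hosts each agent."""

    def __init__(self) -> None:
        self._location: Dict[str, "WorkerRuntime"] = {}
        self.log: List[str] = []

    def advertise(self, agent_name: str, worker: "WorkerRuntime") -> None:
        self._location[agent_name] = worker

    def route(self, sender: str, from_worker: str, recipient: str, message: str) -> str:
        target = self._location[recipient]  # look up where the recipient lives
        self.log.append(
            f"{sender} -> {from_worker} -> host -> {target.name} -> {recipient}"
        )
        return target.deliver(recipient, message)


class WorkerRuntime:
    """Hosts agents locally and talks only to the host, never to other workers."""

    def __init__(self, name: str, host: HostService) -> None:
        self.name = name
        self.host = host
        self._agents: Dict[str, Handler] = {}

    def register(self, agent_name: str, handler: Handler) -> None:
        self._agents[agent_name] = handler
        self.host.advertise(agent_name, self)  # tell the host where the agent lives

    def send(self, sender: str, recipient: str, message: str) -> str:
        return self.host.route(sender, self.name, recipient, message)

    def deliver(self, recipient: str, message: str) -> str:
        return self._agents[recipient](message)  # run the agent's own code


host = HostService()
worker1 = WorkerRuntime("worker1", host)
worker2 = WorkerRuntime("worker2", host)
worker1.register("A", lambda m: f"A received: {m}")
worker1.register("B", lambda m: f"B received: {m}")
worker2.register("C", lambda m: f"C received: {m}")

reply_c = worker1.send("A", "C", "hello")  # crosses workers
reply_b = worker1.send("A", "B", "hello")  # same worker, still via the host
print(host.log)
```

The log makes the rule from the notes visible: even when A and B share a worker, the message is still recorded as passing through the host.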

Autogen Documentation Confusion: The AG2 and Microsoft Autogen projects maintain separate documentation, and the differences between them can confuse users.

Still in progress

Ravi Srivastava

Friday, January 2, 2026

Agentic AI Framework - Autogen
