// Pick the right one for the right job. Here’s how.
Python AI Agent
Frameworks:
The Honest Guide
They’re not competitors — they’re tools with different depths for different problems. Just like you don’t debate NumPy vs Pandas, you ask: what am I building? This is the engineering breakdown that cuts through the noise.
One Language. Five Libraries. Different Depths.
The AI agent framework landscape exploded in 2024 and consolidated in 2025. It was a Cambrian explosion: the number of frameworks with 1,000+ GitHub stars jumped from 14 to over 200 in that period — and most of them will disappear or be absorbed within 12 months of release. What remains is a stable tier of tools that have proven themselves in production, each with a genuinely distinct value proposition.
The mistake most developers make is approaching this as a competition — trying to determine which framework “wins.” This framing produces bad decisions. LangGraph and CrewAI are not fighting for the same user. PydanticAI and Swarm serve different moments in a developer’s journey. MCP is not a framework at all — it is the protocol layer that all of these frameworks increasingly depend on.
What follows is an engineering guide to five tools in the Python agent ecosystem. For each one: when to use it, why it was designed the way it was, what kinds of problems it is genuinely suited for, and what the honest trade-offs are. No hype. No rankings. Just the breakdown you need to make the right call for your specific build.
Five Tools. Different Jobs. All Working Together.
LangGraph
LangGraph reached v1.0 in late 2025 and is now the default runtime for all LangChain agents. It takes a fundamentally different approach to agent orchestration: rather than writing sequential code, you define agents as nodes in a directed graph, with edges controlling data flow, branching, and transitions. This makes complex state management explicit and debuggable in ways that sequential agent frameworks cannot match.
The graph-based architecture is not an aesthetic choice — it is what makes production-grade agents possible. When your agent needs to loop until a confidence threshold is met, branch to different processing paths based on intermediate results, or recover from failures without restarting from scratch, the graph model gives you the precise control required. LangGraph supports durable execution: agents can persist through failures and resume automatically. Human-in-the-loop support lets you inspect and modify state at any point in the workflow.
Real production users: AppFolio and Vizient deploy domain-specific copilots with LangGraph for complex state management. Norwegian Cruise Line uses it for guest-facing AI with multi-step personalised interactions. Uber implements code generation workflows with multi-step reasoning and human approval checkpoints. The learning curve is real — LangGraph’s abstraction layering and documentation require investment. But for teams building systems that need to run reliably at scale, that investment pays off.
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

# State persists across nodes and loops
def diagnose(state):
    if state["confidence"] < 0.9:
        return "gather_more_data"  # loops back
    return END

graph = StateGraph(MedicalState)
graph.add_node("diagnose", diagnose_fn)
graph.add_conditional_edges("diagnose", diagnose)
app = graph.compile(checkpointer=MemorySaver())  # durable, resumable state
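The loop-until-threshold control flow that LangGraph's conditional edges make explicit can be sketched in plain Python. This is a framework-free illustration of the pattern, not LangGraph itself; the confidence increments are invented for the sketch:

```python
# Framework-free sketch of a conditional-edge loop:
# run a node, then route back until a confidence threshold is met.
END = "__end__"

def diagnose(state):
    # Stand-in for an LLM call: each pass raises confidence.
    state["confidence"] = state.get("confidence", 0.0) + 0.35
    state["passes"] = state.get("passes", 0) + 1
    return state

def route(state):
    # The "conditional edge": loop back while below the threshold.
    return "diagnose" if state["confidence"] < 0.9 else END

def run(state):
    node = "diagnose"
    while node != END:
        state = diagnose(state)
        node = route(state)
    return state

final = run({})
```

The point of the graph model is that this loop, which would otherwise be buried in a `while` statement, becomes an inspectable edge in the workflow.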
CrewAI
CrewAI is the answer to a different question than LangGraph. Where LangGraph asks “how do I control this workflow exactly?”, CrewAI asks “how do I get multiple agents collaborating intelligently with minimal setup?” The mental model is a workplace team: each agent has a Role, Goal, and Backstory. Assign a task to the crew and they autonomously collaborate to complete it.
At 44.6k GitHub stars, CrewAI has the largest ecosystem of any AI agent framework. The abstraction layer is genuinely designed to minimise setup cost — teams regularly go from idea to production in under a week. The built-in Agent Delegation mechanism is particularly well-designed: when an Agent encounters a task it cannot handle, it proactively delegates to a more capable Agent, without you writing that routing logic.
CrewAI’s enterprise tier (CrewAI AMP) includes Gmail, Slack, and Salesforce trigger integrations — making it genuinely production-deployable for content, research, and business workflow automation without custom integration work. It works especially well for content generation, research pipelines, and analysis workflows, where the role-based abstraction maps naturally to how the work is structured.
from crewai import Agent, Task, Crew

writer = Agent(
    role="Content Writer",
    goal="Write engaging blog posts",
    backstory="Expert at clear technical writing"
)
editor = Agent(
    role="Editor",
    goal="Polish and fact-check all content",
    backstory="20 years in editorial"
)
crew = Crew(agents=[writer, editor], tasks=[...])
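The delegation mechanic described above (an agent recognising a task is outside its role and handing it to a teammate) can be sketched framework-free; the roles and skill lists below are invented for illustration:

```python
# Minimal sketch of role-based delegation: an agent that cannot
# handle a task forwards it to a teammate whose skills match.
class SimpleAgent:
    def __init__(self, role, skills, team=None):
        self.role = role
        self.skills = set(skills)
        self.team = team or []

    def handle(self, task):
        if task in self.skills:
            return f"{self.role} completed: {task}"
        # Delegation: find a teammate who can do it.
        for mate in self.team:
            if task in mate.skills:
                return f"{self.role} delegated to {mate.role}: {mate.handle(task)}"
        return f"{self.role} could not complete: {task}"

editor = SimpleAgent("Editor", ["fact-check"])
writer = SimpleAgent("Writer", ["draft"], team=[editor])
result = writer.handle("fact-check")
```

CrewAI ships this routing logic for you; the sketch only shows why not having to write it yourself is the selling point.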
PydanticAI
PydanticAI is built by the Pydantic team — the same validation layer used by OpenAI SDK, Anthropic SDK, LangChain, and essentially every Python AI library in the ecosystem. They turned that validation expertise into an agent framework with a specific value proposition: type safety catches agent logic errors at development time, before they surface in production.
PydanticAI shipped V1 in September 2025 with an API stability commitment, quickly narrowing the gap with more established frameworks. It is genuinely model-agnostic — supporting 20+ model providers — which means switching LLM vendors does not require rewriting business logic. Published cost benchmarks show significant savings: $390 in testing costs versus $1,088 for an equivalent CrewAI implementation.
For teams that already live in Pydantic’s ecosystem — building FastAPI services, data validation pipelines, or structured data workflows — PydanticAI feels like a natural extension of existing code rather than a separate framework. The mental model is Python-native, the decorators and schema definitions are familiar, and the validation guarantees extend from your existing data layer into your agent layer without a seam.
from pydantic_ai import Agent
from pydantic import BaseModel

class FinancialReport(BaseModel):
    revenue: float  # validated, not guessed
    currency: str
    period: str
    confidence: float

# Output is always a validated FinancialReport
agent = Agent("openai:gpt-4o", output_type=FinancialReport)
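Under the hood, "validated, not guessed" means parsing the model's raw output against a schema and failing fast on any mismatch. A stdlib-only sketch of that check, reusing the field names from the example above (the JSON payload is invented; Pydantic's real validation is far richer):

```python
import json

# Expected field -> type for the report schema.
SCHEMA = {"revenue": float, "currency": str, "period": str, "confidence": float}

def validate_report(raw: str) -> dict:
    """Parse LLM output and enforce the schema, roughly as Pydantic would."""
    data = json.loads(raw)
    for field, typ in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], typ):
            raise TypeError(f"{field} must be {typ.__name__}")
    return data

good = validate_report('{"revenue": 1.2, "currency": "USD", '
                       '"period": "Q1", "confidence": 0.95}')
```

The value is in when the error surfaces: a malformed model response raises here, at the boundary, instead of propagating a bad value into downstream business logic.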
Swarm
OpenAI Swarm is not a production framework — and it is honest about that. It is an educational library designed to help developers understand the fundamental mechanics of multi-agent systems: how agents hand off to each other, how context transfers between agents, what the actual primitives underneath the higher-level frameworks are doing.
The value proposition is transparency. Where LangGraph abstracts state management into a graph model and CrewAI abstracts collaboration into crew roles, Swarm shows you the raw handoff mechanics directly. If you have ever wondered what is actually happening inside a multi-agent workflow — what data transfers, what instructions pass, how agents determine when to delegate — Swarm shows you that in the clearest possible way.
Use Swarm when you are new to agent development and want to build genuine intuition before committing to a heavier framework. Use it for rapid lightweight prototypes where you want minimal abstraction overhead. Do not use it for production systems — it was explicitly not designed for that purpose, and the lack of production safeguards is a design choice, not an oversight.
from swarm import Swarm, Agent

def transfer_to_specialist():
    return specialist_agent  # raw handoff — transparent

triage_agent = Agent(
    name="Triage",
    functions=[transfer_to_specialist]
)
# You see exactly what's happening. No magic.
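The raw handoff primitive Swarm exposes is essentially a loop: call the current agent, and if it returns another agent, switch to it and continue. A framework-free sketch (the agent names and routing rule are invented):

```python
# The primitive under multi-agent handoffs: a loop that swaps
# the active agent whenever the current one returns another agent.
class TinyAgent:
    def __init__(self, name, step):
        self.name = name
        self.step = step  # fn(message) -> reply str, or another TinyAgent

def run(agent, message, trace):
    while True:
        trace.append(agent.name)
        result = agent.step(message)
        if isinstance(result, TinyAgent):
            agent = result  # handoff: the message stays, the agent changes
            continue
        return result

specialist = TinyAgent("Specialist", lambda m: f"Specialist answer to: {m}")
triage = TinyAgent(
    "Triage",
    lambda m: specialist if "billing" in m else "Triage handled it",
)

trace = []
reply = run(triage, "billing question", trace)
```

That is the entire trick: once this loop is obvious, the heavier frameworks read as elaborations on it.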
MCP — Model Context Protocol
MCP is not an agent framework. It is the protocol that all agent frameworks are converging on for one critical problem: how do AI agents connect to external tools, databases, and APIs without every team writing custom connectors from scratch?
Think of MCP as USB-C for AI integrations. Before USB-C, every device needed its own cable. Before MCP, every AI integration needed a custom connector. Anthropic introduced MCP in November 2024. By March 2026, it had crossed 97 million SDK installs, with OpenAI, Google DeepMind, Microsoft, AWS, and Cloudflare all backing the protocol through the Linux Foundation’s Agentic AI Foundation. There are now over 10,000 active MCP servers in production use.
The practical implication: a LangGraph agent, a CrewAI agent, and a PydanticAI agent can all use the same MCP server to query a database, access a file system, or call an external API. You write the connector once as an MCP server, and every framework in your stack consumes it. This is why MCP belongs in this guide even though it is not a framework — because in 2026, it is increasingly the connective tissue underneath all of the frameworks above.
# LangGraph agent using MCP
from langchain_mcp_adapters import MCPToolkit
toolkit = MCPToolkit(server="my-database-server")

# CrewAI agent using the SAME MCP server
from crewai_tools import MCPTool
db_tool = MCPTool(server="my-database-server")

# Same connector. Different frameworks. Zero duplication.
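The write-once, consume-everywhere property can be sketched without any of the libraries: one shared connector object, two thin per-framework wrappers around it. All class names below are invented stand-ins; a real MCP server speaks JSON-RPC over stdio or HTTP rather than direct method calls:

```python
# One connector, many consumers: the property MCP standardises.
class DatabaseConnector:
    """Stand-in for an MCP server exposing a single query tool."""
    def query(self, sql):
        return [("row", sql)]  # canned result for the sketch

# Two "frameworks" wrap the SAME connector instead of each
# shipping its own bespoke database integration.
class GraphToolWrapper:
    def __init__(self, connector):
        self.connector = connector
    def call(self, sql):
        return self.connector.query(sql)

class CrewToolWrapper:
    def __init__(self, connector):
        self.connector = connector
    def call(self, sql):
        return self.connector.query(sql)

shared = DatabaseConnector()
a = GraphToolWrapper(shared).call("SELECT 1")
b = CrewToolWrapper(shared).call("SELECT 1")
```

The wrappers are deliberately trivial: the whole point of a shared protocol is that the framework-specific layer shrinks to almost nothing.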
“The AI agent framework market of 2026 has moved past its most chaotic phase. You don’t need to chase every new framework — most of them will disappear or be absorbed within 12 months. What remains are tools with genuinely distinct value propositions that don’t compete with each other.”
— DEV Community, 2026 AI Agent Framework Decision Guide
Pick Your Scenario. Pick Your Tool.
Different libraries. Different depths. Same goal — working agents.
2026: Consolidation, Not Competition
The ecosystem is maturing from chaotic experimentation into stable, complementary tools with clear use cases.
Start with the Problem. Let the Tool Follow.
The developers who pick frameworks badly are the ones who start with the framework question. They read a blog post, pick the most hyped option, and then try to fit their problem into the framework’s model. This produces either under-engineered solutions (when the framework is more complex than the problem needs) or painful re-engineering (when the framework’s constraints don’t match what the problem actually requires).
The developers who pick frameworks well start with the problem structure. Does this workflow loop? Does it branch based on conditions? Does it need multiple collaborating agents? Do the outputs need to be validated against a schema? These questions point directly to the right tool without needing to compare feature checklists.
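Those structural questions can even be encoded as a literal first-pass routing function. This is just the article's guidance written as code, not an official decision tree, and the precedence order is a judgment call:

```python
def suggest_framework(loops_or_branches=False, multi_agent=False,
                      schema_outputs=False, learning=False):
    """First-pass tool suggestion driven by problem structure."""
    if learning:
        return "Swarm"          # build intuition first
    if loops_or_branches:
        return "LangGraph"      # explicit graph control
    if schema_outputs:
        return "PydanticAI"     # validated, typed outputs
    if multi_agent:
        return "CrewAI"         # role-based collaboration
    return "start simple; add a framework when the problem demands one"

choice = suggest_framework(multi_agent=True)
```

The function is the argument in miniature: the inputs are properties of the problem, and the framework name falls out as a consequence.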
The honest answer for most teams building production agents in 2026 is to start with Swarm to build intuition, move to CrewAI for fast multi-agent iteration, graduate to LangGraph when production requirements demand precise state control, and add PydanticAI wherever output schemas are first-class concerns. Layer MCP underneath all of them as your integration strategy matures. These tools are not competing — they are a progression.
LangGraph when you need the graph.
CrewAI when you need the crew.
PydanticAI when you need the schema.
Swarm when you need to learn.
MCP underneath all of them.
Different libraries. Different depths. Same goal — working agents.