Multi-Agent Systems (MAS)

A Multi-Agent System (MAS) is an approach to designing AI applications in which, rather than sending every instruction to a single AI in one prompt, the developer builds several independent AI agents. Each agent is given its own persona (e.g., engineer, designer, checker, leader) together with specialized knowledge and tools. The agents interact autonomously over a virtual messaging environment or message bus: they negotiate, divide responsibilities, and repeatedly review and correct each other's outputs (loop verification) while working toward a single complex objective.
By having the AIs discuss the work through rounds of proposal, refinement, and bug checking, this approach can substantially reduce hallucinations (fabricated output), one of the most serious weaknesses of current AI. It is spreading rapidly as a next-generation architecture for autonomous execution, capable of generating high-quality program code and documents with little manual direction.
- Autonomous Coordination and Role Division: For instance, when instructed to "develop a web application," Agent A (planner) organizes the requirements, Agent B (coder) writes the code, and Agent C (tester) runs the tests. If errors are found, C sends the task back to B for correction, automatically reproducing a real-world development process.
- Hallucination Suppression through Mutual Verification: A single LLM often struggles to recognize its own incorrect outputs. Adding a second agent, started with a separate prompt that instructs it to act as a "critical auditor," gives each output an independent review and can noticeably improve accuracy.
- Frameworks like AutoGen and CrewAI: The emergence of Python libraries that simplify the description of conversation routing, data passing, and execution conditions (state management) among AI agents is driving practical implementation.
Why Does a Multi-Agent Design Improve "Quality"?
The principle is the same as in a human organization. If one individual (a single LLM) tries to handle planning, design, development, testing, and customer support alone, the cognitive load of context switching becomes overwhelming, and oversights and errors multiply. Giving each role a narrow, specialized remit, separating the tasks, and then passing work along in a "bucket brigade" (sequential handover) with a double-check at each independent step tends to produce more refined outputs with fewer logical inconsistencies.
Specific Use Cases and Conversation Examples for "Multi-Agent Systems"
Head of Development A: "When we have AI write source code, we often receive deliverables containing non-functional code or library import errors, forcing engineers to debug manually in the end."
AI System Architect B: "Let's move the development workflow to a Multi-Agent System. We'll integrate a 'Development Agent,' a 'Security Audit Agent,' and a 'Tester Agent' (which actually runs tests with a local compiler), all based on Llama. When the tester executes the code and finds errors in the logs, it will autonomously chat with the Development Agent to request corrections. This internal debugging loop repeats until the compilation errors are gone, so by the time a human reviews the result, only code that compiles cleanly is delivered."
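The tester-to-developer loop that the architect describes might look like the sketch below. Several things here are assumptions for illustration: Python's built-in `compile()` stands in for the "local compiler," `stub_developer` is a hypothetical placeholder for the Llama-based Development Agent, and the round limit is an arbitrary choice.

```python
from typing import Callable, Optional, Tuple


def compile_check(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise a short error report."""
    try:
        compile(source, "<agent-output>", "exec")
        return None
    except SyntaxError as exc:
        return f"SyntaxError on line {exc.lineno}: {exc.msg}"


def debug_loop(
    develop: Callable[[str, Optional[str]], str],
    task: str,
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Ask the developer agent for code until it compiles or the rounds run out."""
    error: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        source = develop(task, error)   # the error report is passed back as feedback
        error = compile_check(source)
        if error is None:
            return source, round_no
    raise RuntimeError(f"still failing after {max_rounds} rounds: {error}")


def stub_developer(task: str, error: Optional[str]) -> str:
    # Hypothetical agent: the first draft is missing a colon, the retry fixes it.
    if error is None:
        return "def greet(name)\n    return 'hello ' + name\n"
    return "def greet(name):\n    return 'hello ' + name\n"
```

A compile check only shows that the code parses. It says nothing about whether the code behaves correctly, so a Tester Agent in practice would also run real tests, ideally in a sandbox.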
Comparison of "Single-Prompt AI" and "Multi-Agent Systems"
| Comparison Metric | Single-Prompt AI (Single Agent) | Multi-Agent System |
|---|---|---|
| Hallucination Rate | Higher (prone to producing output based on unchecked assumptions, since nothing reviews it). | Lower (agents with different roles audit the output and reject logical flaws). |
| Task Complexity Handling | Low to Medium (prone to failure with lengthy texts or instructions involving complex dependencies). | High (tasks are divided and processed in parallel and cooperatively). |
| Execution Cost & Token Consumption | Low (a single API round trip). | High (the agents may exchange dozens of messages and loops in the background, increasing token consumption and execution time). |
Frequently Asked Questions (FAQ)
Q: When operating a multi-agent AI, is there a risk of agents falling into an "infinite loop," leading to exorbitant API charges?
A: Yes, this is a very common issue, and the remedy is to design explicit termination conditions (a maximum iteration count and/or human-in-the-loop). For example, guardrails in the program such as "limit agent-to-agent chat exchanges to a maximum of 15 rounds, then stop automatically and notify a human" or "terminate the system when a specific string (e.g., 'TERMINATE') is output" are crucial for preventing cloud bill shock.
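The two guardrails from the answer above, a round limit and a stop string, can be combined in one loop. This is a sketch under stated assumptions: agents are modeled as plain functions from message to reply, one reply counts as one round, and `notify_human` is a hypothetical hook that defaults to `print`.

```python
from typing import Callable, List, Tuple

MAX_ROUNDS = 15           # upper bound on agent-to-agent exchanges
STOP_TOKEN = "TERMINATE"  # string that ends the conversation early

Agent = Callable[[str], str]


def run_chat(
    agent_a: Agent,
    agent_b: Agent,
    opening: str,
    max_rounds: int = MAX_ROUNDS,
    notify_human: Callable[[str], None] = print,
) -> Tuple[str, List[str]]:
    """Alternate between two agents until a stop token appears or rounds run out."""
    transcript = [opening]
    speakers = [agent_a, agent_b]
    message = opening
    for round_no in range(max_rounds):
        reply = speakers[round_no % 2](message)
        transcript.append(reply)
        if STOP_TOKEN in reply:
            return "terminated", transcript
        message = reply
    # Round limit hit: stop spending tokens and hand the case to a person.
    notify_human(f"Stopped after {max_rounds} rounds without {STOP_TOKEN}")
    return "max_rounds_exceeded", transcript
```

Returning a status string alongside the transcript lets the caller tell a normal finish apart from a forced stop, which is useful when deciding whether a human needs to look at the conversation.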
Etiquette for Human Intervention in Autonomous System Design
When fully automating complex business processes with a multi-agent system, an extreme design in which the AI agents resolve everything themselves with no human check at all is irresponsible, because it abandons accountability for what the system does. This matters most for actions (tool executions) with real-world side effects, such as automatically replying to customer emails or changing internal server settings. As a matter of professional practice, anyone deploying AI in business should always include a human approval check at the final stage of such a design.
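A human approval check of this kind can be expressed as a gate in front of tool execution. The sketch below is one possible shape, not a prescribed design: the tool names in `SIDE_EFFECT_TOOLS`, the `ToolCall` type, and the `approve` callback (which would be backed by a real review step such as a ticket or chat prompt) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Hypothetical names of tools that change something in the real world.
SIDE_EFFECT_TOOLS = {"send_email", "update_server_config"}


@dataclass
class ToolCall:
    name: str
    args: Dict[str, Any]


def execute(
    call: ToolCall,
    tools: Dict[str, Callable[..., Any]],
    approve: Callable[[ToolCall], bool],
) -> Any:
    """Run a tool, but require human approval first if it has side effects."""
    if call.name in SIDE_EFFECT_TOOLS and not approve(call):
        return f"BLOCKED: {call.name} rejected by human reviewer"
    return tools[call.name](**call.args)
```

Read-only tools pass straight through, so the reviewer is asked only about calls that can actually affect customers or infrastructure.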