Emerging Risks in Deploying Multiple AI Agents

Organisations are starting to adopt AI agents based on large language models (LLMs) to automate complex tasks, with deployments evolving from single agents towards multi-agent systems. While this shift promises significant efficiency and productivity gains, a new Gradient Institute report shows that multi-agent systems can introduce entirely new failure modes that single-agent testing fails to reveal. The report, supported by the Australian Government Department of Industry, Science and Resources (DISR), provides practical tools for the early stages of managing these emerging risks and ensuring that deployments of multiple AI agents are safe and trustworthy.

While single AI assistants are beginning to find their place in business operations, a growing number of organisations are starting to explore networks of AI agents that communicate and coordinate with each other – a trend that’s expected to continue. Examples include HR agents that interact with IT agents for employee onboarding, and customer service systems where multiple agents specialise in different aspects of customer enquiries.

However, this evolution brings unexpected challenges, according to Gradient Institute’s Chief Scientist and co-author of the report, Dr Tiberio Caetano.

“The deployment of LLM-based multi-agent systems represents a fundamental shift in how organisations need to approach AI risk and governance,” said Dr Caetano. “As businesses move towards adopting collaborative agent architectures to automate complex workflows and decision-making processes, the risk landscape transforms in ways that cannot be adequately addressed through traditional single-agent approaches.”

“In short: A collection of safe agents does not make a safe collection of agents. As multi-agent systems become more prevalent, the need for risk analysis methods that account for agent interactions will only grow.”

Six Failure Modes Identified in Multi-Agent Systems

The report, “Risk Analysis Tools for Governed LLM-based Multi-Agent Systems,” identifies six key failure modes that emerge specifically when AI agents work together:

  • Inconsistent performance of a single agent derailing complex multi-step processes
  • Cascading communication breakdowns as agents misstate or misinterpret messages (a dynamic sketched in the toy example below)
  • Shared blind spots and repeated mistakes when a team of agents all use similar AI models
  • Groupthink dynamics where agents reinforce, rather than critique, each other’s errors
  • Coordination failures when agents don’t understand what their peers know or need
  • Competing agents optimising for individual goals that undermine organisational outcomes
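
To make these interaction effects concrete, the toy simulation below (an illustrative sketch written for this article, not code or results from the report) shows how a modest per-agent error rate compounds when a task description is relayed through a chain of agents. The error rate, chain length and trial count are assumptions chosen purely for illustration.

    # Toy simulation: cascading breakdowns in a chain of message-passing agents.
    # All numbers are illustrative assumptions, not measurements from the report.
    import random

    AGENT_ERROR_RATE = 0.05   # assumed chance an agent misstates the message it relays
    CHAIN_LENGTH = 6          # number of agents the task description passes through
    TRIALS = 10_000           # Monte Carlo repetitions

    def relay_intact(chain_length: int, error_rate: float) -> bool:
        """Return True if the message survives the whole chain uncorrupted."""
        for _ in range(chain_length):
            if random.random() < error_rate:
                return False  # one mis-statement corrupts everything downstream
        return True

    successes = sum(relay_intact(CHAIN_LENGTH, AGENT_ERROR_RATE) for _ in range(TRIALS))
    print(f"Per-agent reliability: {1 - AGENT_ERROR_RATE:.0%}")
    print(f"End-to-end reliability across {CHAIN_LENGTH} agents: {successes / TRIALS:.1%}")

Under these assumptions, an agent that is 95% reliable in isolation yields a six-agent chain that completes correctly only about 74% of the time, which is why testing agents individually understates system-level risk.
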
Dr Caetano said the risks go beyond hallucinations and can be catastrophic for a business or the general public.

“For example, organisations that run critical infrastructure, such as major technology companies, government departments, banks, energy providers and healthcare networks, are likely to progressively deploy multi-agent systems within their organisational boundaries,” Dr Caetano said. “If failures occur in these settings, the consequences could disrupt essential services for millions of people due to the scale and criticality of these operations.”

The research highlights that traditional approaches to software testing, or to testing single agents in isolation, are insufficient once agents begin to coordinate. Instead, it recommends progressive testing stages – from controlled simulations through sandboxed testing to carefully monitored pilot programs – that gradually increase exposure to potential impacts, enabling practitioners to identify failure modes early, while consequences are still contained and reversible.
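
One way such a progression could be operationalised is sketched below. This is a minimal, hypothetical harness (the stage names, metrics and thresholds are assumptions made for this article, not the report’s methodology): each stage widens real-world exposure only if the previous stage passes an explicit go/no-go gate.

    # Minimal sketch of a staged rollout with go/no-go gates between stages.
    # Stage runners, metrics and thresholds are placeholders for illustration only.
    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class Stage:
        name: str
        exposure: str                             # what the agents can actually affect here
        run: Callable[[], Dict[str, float]]       # executes the stage and returns metrics
        gate: Callable[[Dict[str, float]], bool]  # decides whether to proceed

    def staged_rollout(stages: List[Stage]) -> None:
        for stage in stages:
            metrics = stage.run()
            if not stage.gate(metrics):
                print(f"Stopped at '{stage.name}' ({stage.exposure}): {metrics}")
                return  # impacts stay contained to the earlier, reversible stage
            print(f"Passed '{stage.name}' ({stage.exposure}): {metrics}")
        print("All stages passed; expand deployment under continued monitoring.")

    # Hypothetical usage with placeholder stage runners:
    staged_rollout([
        Stage("controlled simulation", "no real systems or data",
              lambda: {"task_success": 0.97}, lambda m: m["task_success"] >= 0.95),
        Stage("sandboxed testing", "synthetic accounts, real tools",
              lambda: {"task_success": 0.93}, lambda m: m["task_success"] >= 0.90),
        Stage("monitored pilot", "a small slice of live work with human oversight",
              lambda: {"task_success": 0.91}, lambda m: m["task_success"] >= 0.90),
    ])
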
Gradient Institute’s Head of Research Engineering, Dr Alistair Reid, lead author of the report, emphasises how risk management requirements change when agents interact.

“Just as a well-functioning team requires more than having competent individuals, reliable multi-agent AI systems need more than individually competent AI agents,” he said. “Our report provides a toolkit for organisations to identify and assess key risks that emerge when multiple AI agents work together.”

The report provides foundational tools, including guidance on simulation approaches for observing agent interactions over time, red teaming strategies to uncover hidden vulnerabilities, conceptual frameworks for understanding measurement validity given the limitations of current AI science, and pointers to specific measures and experiments for analysing multi-agent failure modes. It focuses on the early stages of risk management – risk identification and analysis – acknowledging that risk evaluation and treatment require further contextual understanding of each organisation’s specific use case.
As AI Adoption Grows, Risk Assessment Becomes Vital

Bill Simpson-Young, CEO of Gradient Institute, highlights that the research comes at a critical inflection point in AI implementation among Australian organisations.

“Australian businesses are accelerating their AI adoption, including greater use of AI agents. By providing practical tools grounded in rigorous science, we’re enabling organisations to better understand the novel risks that emerge when AI agents work together – and how to start addressing them,” said Mr Simpson-Young. “The path forward isn’t about avoiding this technology; it’s about deploying it responsibly, with awareness of both its potential and its pitfalls.”

The report applies to AI agents operating within a single organisation’s governance, where the organisation controls how the agents are configured and deployed. This scope represents a critical foundation as businesses accelerate the automation of their organisational processes.

The full report is available for download here.
