Introduction

In recent years, we have witnessed a paradigm shift in Artificial Intelligence. Large Language Models (LLMs) like GPT-4 and LLaMA have moved beyond simple text generation to becoming the brains of autonomous agents—digital entities capable of perceiving environments, making decisions, and taking actions. We have seen agents simulate software development companies and inhabit virtual “Sims-like” towns. However, most of these simulations have focused on positive, cooperative behaviors.

But human society isn’t just about holding hands and working together. It is a complex web of negotiation, confrontation, deception, and trust. To truly understand the potential (and risks) of LLM-based societies, we need to see how they handle conflict and incomplete information.

This brings us to a fascinating research paper: “LLM-Based Agent Society Investigation: Collaboration and Confrontation in Avalon Gameplay.” The researchers chose Avalon (a standalone variant of The Resistance)—a social deduction game requiring high levels of strategic communication and deception—as a testbed. Unlike chess or Go, where the board state is fully visible, Avalon relies on hidden roles, persuasion, and intuition.

In this deep dive, we will explore how the researchers constructed a novel multi-agent framework that allows LLMs to play this complex game. We will look at how these agents learn from experience, how they lie to protect their identities, and how they form alliances to win.

The Challenge: Why Avalon?

Before dissecting the AI architecture, it is essential to understand the environment. Avalon is a game of hidden loyalty played by 5 to 10 players. For this study, the researchers focused on the 6-player variant.

The players are divided into two factions:

  1. The Good Side (Loyal Servants): Their goal is to complete three quests successfully. Key roles include Merlin (knows who the evil players are but must remain hidden) and Percival (knows who Merlin might be).
  2. The Evil Side (Minions of Mordred): Their goal is to fail three quests or assassinate Merlin. Key roles include Morgana (impersonates Merlin) and the Assassin.
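For concreteness, the setup above can be written down as a small data structure. This is only an illustration following the standard 6-player Avalon composition; it is not code from the paper.

```python
# Standard 6-player Avalon composition (illustrative; not the paper's code).
ROLES = {
    "Merlin":          "good",  # knows who the evil players are, must stay hidden
    "Percival":        "good",  # knows who Merlin might be (sees Merlin and Morgana)
    "Loyal Servant 1": "good",
    "Loyal Servant 2": "good",
    "Morgana":         "evil",  # impersonates Merlin
    "Assassin":        "evil",  # can win the game by assassinating Merlin
}

GOALS = {
    "good": "Complete three quests successfully.",
    "evil": "Fail three quests, or assassinate Merlin at the end of the game.",
}
```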

The core challenge for an AI agent here is incomplete information. A “Loyal Servant” does not know who their teammates are. They must deduce allegiance based on how others vote and speak. Conversely, “Evil” players must actively camouflage themselves, pretending to be good while subtly sabotaging the team. This requires social intelligence: leadership, persuasion, and the ability to detect lies.

The Framework: How to Build a Deceptive Agent

To enable LLMs to handle this complexity, the researchers couldn’t simply feed the game rules into ChatGPT and hope for the best. They needed a structured cognitive architecture. They proposed a novel framework consisting of six distinct modules designed to mimic human decision-making processes.

Figure 1: Our framework has six modules: summary, analysis, planning, action, response, and experiential learning. This design follows human thinking, helps LLM agents play Avalon effectively, and reveals their social behaviors.

As shown in Figure 1, the framework operates in a loop. Let’s break down each component of this “digital brain.”

1. Memory and Summarization

In a text-heavy game like Avalon, conversation history grows rapidly. Feeding the entire transcript into an LLM would quickly exhaust its context window (token limit) and confuse the model.

The solution is a Memory Module combined with a Summarizer. The agent doesn’t store every word; it stores a structured summary of the previous round.

\[
M_t = \mathrm{Summarize}(M_{t-1},\, R_t)
\]

In this equation, \(M_t\) is the current memory. It is a combination of the summarized memory from the previous turn (\(M_{t-1}\)) and the specific responses and instructions from the current turn (\(R_t\)). This allows the agent to retain critical context—“Player 3 voted no on the last quest”—without getting bogged down in noise.
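As a minimal sketch, the update might look like the following in code. The `call_llm` helper is a placeholder for whatever backend model is used (GPT-3.5 in the experiments), and the prompt wording is illustrative rather than the paper's.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a call to the backend LLM (e.g., GPT-3.5); returns the model's text reply."""
    raise NotImplementedError

def update_memory(prev_memory: str, new_responses: str) -> str:
    """Compress the previous summary M_{t-1} plus this turn's responses R_t into a fresh memory M_t."""
    prompt = (
        "You are playing Avalon. Summarize the game so far for your own future reference.\n"
        f"Previous summary (M_t-1):\n{prev_memory}\n\n"
        f"What happened this turn (R_t):\n{new_responses}\n\n"
        "Keep only facts that matter for deducing roles: votes, quest results, and claims."
    )
    return call_llm(prompt)
```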

2. Analysis

Once the agent has the memory, it needs to interpret it. The Analysis Module is responsible for “reading the room.” It takes the game history and the agent’s own role information (\(RI\)) to generate hypotheses about other players.

\[
H_t = \mathrm{Analyze}(M_t,\, RI)
\]

Here, \(H_t\) represents the analysis. For example, if the agent is a Loyal Servant, the Analysis Module might output: “Player 2 is acting suspicious because they rejected a team that included a confirmed good player.” This step is crucial for turning raw data into social intuition.
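A sketch of the same idea in code, reusing the hypothetical `call_llm` helper from the earlier sketch; the prompt text is again illustrative.

```python
def call_llm(prompt: str) -> str: ...  # placeholder LLM call, as in the earlier sketch

def analyze(memory: str, role_info: str) -> str:
    """Produce hypotheses H_t about the other players from memory M_t and role information RI."""
    prompt = (
        f"Your role and private knowledge: {role_info}\n"
        f"Game summary so far: {memory}\n\n"
        "For each other player, say how likely they are to be Good or Evil and why, "
        "citing specific votes or statements."
    )
    return call_llm(prompt)
```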

3. Planning

Understanding the board is one thing; deciding what to do is another. The Planning Module formulates a high-level strategy (\(P_t\)).

\[
P_t = \mathrm{Plan}(M_t,\, H_t,\, P_{t-1},\, G,\, S)
\]

The plan is derived from the memory, the analysis, the previous plan, and—crucially—the agent’s goal (\(G\)) and role-specific strategy (\(S\)). If the agent is Morgana, the plan might be: “I need to gain Percival’s trust by voting for the first quest, but I will sabotage the second one.”
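Again as a hedged sketch, with the same placeholder `call_llm` helper and illustrative prompt wording:

```python
def call_llm(prompt: str) -> str: ...  # placeholder LLM call, as in the earlier sketches

def plan(memory: str, analysis: str, prev_plan: str, goal: str, strategy: str) -> str:
    """Derive a high-level plan P_t from M_t, H_t, P_{t-1}, the goal G, and the role strategy S."""
    prompt = (
        f"Your goal: {goal}\n"
        f"Role-specific strategy hints: {strategy}\n"
        f"Game summary: {memory}\n"
        f"Your analysis of the other players: {analysis}\n"
        f"Your previous plan: {prev_plan}\n\n"
        "Update your plan for the coming rounds. Be specific about who to trust, "
        "which teams to support, and what to claim publicly."
    )
    return call_llm(prompt)
```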

4. Action

The Action Module translates the high-level plan into concrete moves. In Avalon, actions include selecting a team, voting on a team, or determining the outcome of a quest (Success/Fail).

\[
A_t \sim \pi(\,\cdot \mid M_t,\, H_t,\, P_t\,)
\]

The agent samples an action based on all previous inputs. This probabilistic approach allows for variation and unpredictability—vital traits in a bluffing game.
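Here is one way such constrained, sampled action selection could look; the action set, temperature value, and fallback behavior are illustrative assumptions, not the paper's implementation.

```python
import random

def call_llm(prompt: str, temperature: float = 1.0) -> str: ...  # placeholder LLM call

VOTE_ACTIONS = ["approve", "reject"]

def choose_vote(memory: str, analysis: str, plan: str) -> str:
    """Pick a concrete vote A_t that is consistent with the current plan P_t."""
    prompt = (
        f"Game summary: {memory}\nYour analysis: {analysis}\nYour plan: {plan}\n\n"
        f"Vote on the proposed team. Answer with exactly one word from {VOTE_ACTIONS}."
    )
    # Sampling at a non-zero temperature keeps the agent slightly unpredictable,
    # which matters in a bluffing game; fall back to a random legal action if the
    # model answers with something else.
    answer = call_llm(prompt, temperature=0.7).strip().lower()
    return answer if answer in VOTE_ACTIONS else random.choice(VOTE_ACTIONS)
```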

5. Response Generation

Avalon is a game of talk. The Response Module generates the natural language explanation for the agent’s action. If the Action Module decides to vote “Reject,” the Response Module generates the excuse: “I don’t trust Player 4’s voting history, so I cannot support this team.”
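A minimal sketch of this step, with the same placeholder helper and an illustrative prompt:

```python
def call_llm(prompt: str) -> str: ...  # placeholder LLM call, as in the earlier sketches

def generate_response(memory: str, plan: str, action: str, role_info: str) -> str:
    """Turn the chosen action into the public, in-character statement the agent will say."""
    prompt = (
        f"Your private role: {role_info}\n"
        f"Your plan: {plan}\n"
        f"The action you just took: {action}\n"
        f"Game summary: {memory}\n\n"
        "Explain your action to the other players in one or two sentences. Do not reveal "
        "private information unless your plan calls for it."
    )
    return call_llm(prompt)
```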

6. Experience Learning

Perhaps the most innovative part of this framework is the Experience Learning module. The agents don’t just play; they improve.

  • Self-Role Strategy Learning: After a game, the agent reviews the game log and generates suggestions for itself. For example, “I revealed my identity too early as Merlin; next time I should be more subtle.”
  • Other-Role Strategy Learning: The agent also analyzes what other players did. “The Assassin won by pretending to be a confused servant. I should watch out for that strategy.”

These insights are fed back into the system as “Initial Strategy” guidelines for future games, creating a feedback loop of continuous improvement.
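A rough sketch of how such a post-game reflection loop could be wired up; the `strategy_store` structure and the prompt wording are assumptions made for illustration.

```python
def call_llm(prompt: str) -> str: ...  # placeholder LLM call, as in the earlier sketches

def learn_from_game(game_log: str, my_role: str, strategy_store: dict) -> None:
    """Post-game reflection: distill self-role and other-role lessons and keep them
    as 'Initial Strategy' hints for the next game."""
    self_tips = call_llm(
        f"Full game log:\n{game_log}\n\nYou played {my_role}. "
        "List two or three concrete things you should do differently next time."
    )
    other_tips = call_llm(
        f"Full game log:\n{game_log}\n\n"
        "For each other role, note one strategy that player used which you should watch for or imitate."
    )
    strategy_store.setdefault(my_role, []).append(self_tips)
    strategy_store.setdefault("others", []).append(other_tips)
```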

Experimental Results: Domination and Evolution

To test this framework, the researchers pitted their agents against a strong baseline (adapted from previous “Werewolf” game agents). They ran matches using GPT-3.5 as the backend model. The results were stark.

Win Rates

The proposed framework significantly outperformed the baseline.

Table 2: Results of gameplay between our agents and the baseline. We present the win rates (WR) of our method when playing the Good and the Evil side.

As Table 2 shows, the proposed agents achieved a 90% win rate when playing as the Good side and a 100% win rate as the Evil side against the baseline. This suggests that the structured cognitive process—specifically the separation of Analysis and Planning—provides a massive tactical advantage over simpler architectures.

Aggression and Impact

Why were the Evil agents so successful? The data points to “aggressive” gameplay.

Figure 2: (a) Comparison of the quest engagement rate when playing the Evil side; a higher engagement rate means more opportunities for the player to influence the outcome of the game. (b) Comparison of the failure vote rate when playing the Evil side; the baseline performs worse.

Figure 2 reveals that the proposed agents (the dashed lines) were much more proactive.

  • Quest Engagement (Left): They actively tried to get themselves on quest teams (higher engagement rate). You can’t sabotage a quest if you aren’t on the team.
  • Failure Vote Rate (Right): Once on the team, they were decisive about failing the quest (near 100% for the Assassin). They didn’t hesitate or play too passively.

Deep Dive: Social Behaviors

The most fascinating part of this paper is not just that the AI won, but how it won. The researchers used ChatGPT to analyze the logs and categorize the social behaviors of the agents.
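To give a feel for how that kind of log annotation can be scripted, here is a small sketch; the category list mirrors the behaviors discussed below, but the prompt and labels are illustrative, not the paper's exact protocol.

```python
def call_llm(prompt: str) -> str: ...  # placeholder LLM call, as in the earlier sketches

BEHAVIORS = ["leadership", "persuasion", "camouflage", "teamwork", "confrontation"]

def label_statement(statement: str, context: str) -> str:
    """Ask an LLM to tag a single in-game statement with one social-behavior category."""
    prompt = (
        f"Game context: {context}\n"
        f'Player statement: "{statement}"\n\n'
        f"Which one of these behaviors does the statement best show? {BEHAVIORS}\n"
        "Answer with a single word from the list."
    )
    label = call_llm(prompt).strip().lower()
    return label if label in BEHAVIORS else "other"
```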

1. Leadership

Effective leadership in Avalon means proposing teams that get approved.

Figure 3: (a) Leadership behavior. Players with a higher Leader Approval Rate win more agreement from other players when a quest team is being decided.

Figure 3(a) shows that the proposed agents (light blue) consistently achieved higher Leader Approval Rates. When they spoke, other agents listened and voted with them. This indicates they were better at constructing logical arguments and building consensus.

Here is an example of an agent exhibiting strong leadership:

Figure 10: Leadership example

The Loyal Servant here clearly states their priority and proposes a team, grounding their decision in the “good side’s victory.”

2. Persuasion and Deception

The game requires agents to convince others of their utility. This is measured by the “Self-recommendation Rate.”

Looking at Figure 3(c) above, we see that Loyal Servant 1 had a massive self-recommendation success rate (90%). However, look at Morgana (an Evil role). Morgana also maintained a high success rate, successfully deceiving players into trusting her.

Below is a concrete example of Persuasion by a Loyal Servant, followed by Deception by Morgana.

Figure 7: Persuasion example

Figure 8: Camouflage example

In the deception example, Morgana subtly pushes for Player 3 and 4. It looks like a helpful suggestion, but in the context of the game, it is a calculated move to manipulate the team composition.

3. Camouflage

How do agents hide their identity? The researchers analyzed the behavior of agents in the first round.

Figure 4: Camouflage behavior when playing different roles: in the first round of each game, the distribution of players choosing Self-Disclosure, Camouflage, or Withholding Identity.

In Figure 4(a) (Ours), look at Morgana and Assassin. They have a significant portion of “Camouflage” behavior (pink bars). They actively pretend to be someone else. Interestingly, the Assassin also shows “Withholding Identity,” choosing to stay silent or vague to avoid detection. This behavior wasn’t hard-coded; it emerged from the agent’s planning module realizing that silence is sometimes the best defense.

4. Teamwork vs. Confrontation

The social dynamic shifts depending on who the agent is talking to.

Figure 5: Teamwork and confrontation behaviors when playing different roles. Each subfigure shows the attitude distribution of the player portraying a specific role (on the top) towards players in other roles (on the left).

In Figure 5(a), observe the Merlin column (3rd from left).

  • When Merlin talks to Servants (top rows), the bar is mostly blue (Teamwork).
  • When Merlin talks to Morgana or Assassin (bottom rows), the bar turns red (Confrontation) or orange (Ambivalence).

This shows that the agents correctly identified their enemies and adjusted their tone accordingly. They “play nice” with allies and attack their enemies.

Here is a dialogue excerpt showing this dynamic in action:

Figure 9: Teamwork and confrontation examples

The Loyal Servant in the bottom panel actively confronts Player 2 and Player 4, citing “suspicious behavior.” This is a high-level social deduction skill—using past actions to justify present hostility.

Conclusion and Implications

This research demonstrates that LLM-based agents are capable of far more than just following instructions. When equipped with a framework that supports memory, analysis, and planning, they can:

  1. Formulate complex strategies to win incomplete information games.
  2. Exhibit distinct social traits, such as leadership and camouflage.
  3. Adapt their behavior based on the role they are playing and who they are interacting with.
  4. Learn from experience to become more effective over time.

The comparison with other works highlights the comprehensiveness of this approach:

Table 1: Comparison between our work and related works in both agent framework and social behaviour analysis

As shown in Table 1, this framework (“Ours”) is unique in that it covers every aspect of social agent design—from memory and planning to leadership, persuasion, and confrontation.

What does this mean for the future?

While this study was conducted in a game, the implications extend to real-world simulations. If we can model agents that effectively negotiate, deceive, and lead, we can build better simulations for economics, social science, and organizational psychology. We can train humans to detect deception or simulate the spread of misinformation in a controlled society of agents.

However, it also raises ethical questions. As AI becomes better at persuasion and camouflage, the line between a helpful assistant and a manipulative actor blurs. Understanding these behaviors in a game like Avalon is the first step toward understanding—and managing—them in the real world.