Introduction

In the world of online gaming, there is a psychological phenomenon known as the “Proteus Effect.” It suggests that the appearance of a user’s digital avatar influences their behavior. If a player is given a tall, attractive avatar, they tend to act more confidently; if they are given an aggressive-looking warrior, they might act more confrontationally. But as we enter the era of Multi-modal Large Language Models (LLMs)—AI that can see as well as read—a fascinating question arises: Does the Proteus Effect apply to AI?

We know that LLMs like GPT-4 can adopt text-based personas. If you tell ChatGPT, “You are a helpful pirate,” it will start saying “Ahoy, matey!” and offer advice on sailing. However, current state-of-the-art models can also process images. What happens if, instead of describing a pirate in text, we simply show the AI a picture of a menacing orc or a gentle fairy and say, “This is you”?

A recent research paper titled “Kiss up, Kick down: Exploring Behavioral Changes in Multi-modal Large Language Models with Assigned Visual Personas” tackles this exact question. The researchers investigated whether assigning a “visual persona” changes how an LLM negotiates. Does looking like a monster make an AI greedy? Does facing a scary opponent make the AI submissive?

This blog post will take you through this groundbreaking study, explaining how the researchers created a fantasy world for AI agents, how they quantified “aggressiveness,” and the surprising results that suggest AI models might be more human-like—and perhaps more manipulative—than we realized.

Background: Personas and The Ultimatum Game

To understand the significance of this paper, we need to establish two foundational concepts: the current state of LLM role-playing and the game theory used to test it.

Text vs. Visual Personas

Previous research has extensively documented that LLMs can simulate human samples. They can mimic demographic groups, political leanings, and specific personality traits when prompted with text. This capability is widely used to create chatbots with distinct “personalities.” However, human communication and self-perception are not just textual; they are deeply visual. With the advent of Vision-Language Models (VLMs) like GPT-4o and Claude 3, the input modality has expanded. This paper is the first of its kind to explore whether the visual modality alone is enough to align an LLM’s behavior with a persona.

The Experimental Framework: The Ultimatum Game

How do you measure if an AI is acting “aggressively”? You cannot simply ask it. You have to observe its behavior in a controlled environment. The researchers chose the Ultimatum Game, a classic experiment in economics and psychology.

Here is how it works:

  1. There are two players: a Proposer and a Responder.
  2. They have a pot of money (e.g., $100) to split.
  3. The Proposer offers a split (e.g., “I keep $70, you get $30”).
  4. The Responder can either Accept or Reject.
  • If Accepted: Both players get the money as proposed.
  • If Rejected: Both players get $0.

Rationally, a Responder should accept any amount greater than zero. However, humans are emotional creatures; we often reject unfair offers (like getting only $10) to punish the greedy Proposer. Conversely, aggressive Proposers tend to offer less to the opponent, trying to maximize their own gain. This game serves as the perfect laboratory to test if an “aggressive” visual persona leads to aggressive negotiation tactics.
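To make the payoff structure concrete, here is a minimal sketch of a single round in Python. The function name and the example splits are illustrative, not taken from the paper.

```python
# A minimal sketch of one Ultimatum Game round (hypothetical helper, not from the paper).

def play_round(pot: int, proposer_keeps: int, responder_accepts: bool) -> tuple[int, int]:
    """Return (proposer_payoff, responder_payoff) for a single round."""
    if responder_accepts:
        return proposer_keeps, pot - proposer_keeps
    return 0, 0  # rejection destroys the pot for both players

# A purely "rational" responder accepts any positive amount...
print(play_round(100, 70, responder_accepts=True))   # (70, 30)
# ...but a human (or persona-driven LLM) may reject an unfair split out of spite.
print(play_round(100, 90, responder_accepts=False))  # (0, 0)
```

The interesting behavior lives entirely in how the players choose `proposer_keeps` and `responder_accepts`, which is exactly what the visual personas are meant to influence.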

Core Method: Building a Visual World for AI

The researchers needed a way to systematically test visual influence. They couldn’t just use random stock photos; they needed a consistent dataset of characters that ranged from “harmless” to “terrifying.”

1. Creating the Avatar Dataset

The team constructed a novel dataset comprising 5,185 fantasy avatar images. They used Stable Diffusion, a text-to-image generation model, to create these characters. They prompted the model to generate full-body 3D-style characters across various races and classes, specifically asking for traits that felt “threatening,” “friendly,” or “neutral.”

Figure A1: Examples from the avatar dataset, showing fantasy characters ranging from elves to demons.

As shown in Figure A1 above, the resulting images were diverse. They included cheerful bards, stoic knights, ethereal fairies, and fiery demons. This diversity was crucial because it provided a wide spectrum of “visual aggressiveness” for the AI to interpret.
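The paper does not publish its exact generation settings, but a dataset of this kind could be produced with the Hugging Face diffusers library along the following lines. The checkpoint and prompt template here are assumptions for illustration only.

```python
# A hedged sketch of how an avatar dataset like this could be generated with
# Hugging Face diffusers. The checkpoint and prompt templates are illustrative;
# the paper does not publish its exact generation settings.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

races = ["elf", "orc", "human", "demon", "fairy"]
dispositions = ["friendly", "neutral", "threatening"]

for race in races:
    for disposition in dispositions:
        prompt = (
            f"full-body 3D render of a {disposition} fantasy {race} character, "
            "game avatar, plain background"
        )
        image = pipe(prompt).images[0]
        image.save(f"avatar_{race}_{disposition}.png")
```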

2. Quantifying Aggressiveness

Before the negotiation experiments could begin, the researchers needed to confirm that the AI models actually understood what “aggressive” looked like. They asked GPT-4o, Claude 3 Haiku, and human annotators to rate the images on a scale of 1 (least aggressive) to 7 (most aggressive).
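As an illustration of how such a rating can be collected, here is a minimal sketch using the OpenAI Python SDK with an image input. The paper's exact rating prompt is not reproduced here, so the wording below is an assumption.

```python
# A minimal sketch of rating an image's aggressiveness with a vision-language
# model via the OpenAI Python SDK. The rating prompt is illustrative, not the
# paper's actual wording.
import base64
from openai import OpenAI

client = OpenAI()

with open("avatar_orc_threatening.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Rate how aggressive this character looks on a scale "
                     "from 1 (least aggressive) to 7 (most aggressive). "
                     "Answer with a single number."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)  # e.g. "6"
```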

The results confirmed it: the AI models’ ratings correlated strongly with the human ratings. The models weren’t just guessing; they picked up on the same visual cues humans do.

To dig deeper, the researchers analyzed why certain images were rated as aggressive. They labeled the images with objective features—presence of weapons, smiling faces, visible teeth, etc.—and ran a regression analysis.

Table 1: Regression analysis of appearance factors. Weapons and black clothing increase perceived aggression, while smiles decrease it.

Table 1 reveals the breakdown of these visual cues.

  • Weapons, Visible Teeth, and Black Clothing: These factors significantly increased the aggressiveness rating across both AI models and humans.
  • Smiling and White Clothing: These factors significantly decreased the rating.

This step was vital. It proved that LLMs share a “visual stereotype” system with humans. They understand that a smiling elf in white robes is likely friendly, while a frowning orc in black spiked armor holding an axe is likely aggressive.
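The kind of regression behind Table 1 can be sketched with statsmodels: regress each image’s aggressiveness rating on binary appearance features and read the sign of each coefficient. The annotation file and column names below are hypothetical.

```python
# A hedged sketch of the regression behind Table 1: aggressiveness rating
# regressed on binary appearance features. File and column names are illustrative.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("avatar_annotations.csv")  # hypothetical annotation file
features = ["has_weapon", "visible_teeth", "black_clothing",
            "smiling", "white_clothing"]

X = sm.add_constant(df[features].astype(float))
model = sm.OLS(df["aggressiveness_rating"], X).fit()
print(model.summary())  # positive coefficient -> feature raises perceived aggression
```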

3. The Experimental Setup

With the dataset validated, the actual experiments began. The researchers set up a text-based negotiation environment where the LLM was “shown” its avatar.

Figure 1: The negotiation flow between two fantasy avatars, an orc and a goblin.

Figure 1 illustrates the flow. The LLM is given the system prompt “You are the character in the following image,” shown the avatar image, and then placed into a multi-round negotiation game (a code sketch of this loop appears after the lists below).

  • Rounds 1 & 3: The LLM acts as the Proposer.
  • Rounds 2 & 4: The LLM acts as the Responder.

This structure allowed the researchers to measure two things:

  1. Offer Amount: How greedy is the LLM when it holds the power? (A measure of aggression).
  2. Acceptance Rate: How likely is the LLM to accept an unfair offer?
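Conceptually, the harness alternates roles over four rounds. The sketch below is a runnable stand-in: the two stub functions would, in the real setup, be calls to a vision-language model that receives the system prompt plus the avatar image; all numbers and file names are illustrative.

```python
# A runnable, conceptual sketch of the four-round loop. The two stub functions
# stand in for vision-language model calls; prompts, numbers, and file names are
# illustrative, not the paper's actual harness.

SYSTEM_PROMPT = "You are the character in the following image."
POT = 100

def llm_propose(avatar_path: str) -> int:
    """Placeholder for a VLM call returning how much the LLM keeps for itself."""
    return 60

def llm_respond(avatar_path: str, offered_to_me: int) -> bool:
    """Placeholder for a VLM call deciding whether to accept the opponent's offer."""
    return offered_to_me >= 30

def run_game(avatar_path: str, opponent_offers: list[int]) -> list[dict]:
    history = []
    for round_idx in range(1, 5):
        if round_idx in (1, 3):  # Rounds 1 and 3: the LLM is the Proposer
            keep = llm_propose(avatar_path)
            history.append({"round": round_idx, "role": "proposer",
                            "offer_to_opponent": POT - keep})
        else:                    # Rounds 2 and 4: the LLM is the Responder
            offered = opponent_offers.pop(0)
            history.append({"round": round_idx, "role": "responder",
                            "offered_to_llm": offered,
                            "accepted": llm_respond(avatar_path, offered)})
    return history

print(run_game("avatar_demon.png", opponent_offers=[40, 20]))
```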

Study 1: The Influence of Self-Image

In the first study, the researchers wanted to isolate the effect of the LLM’s own appearance. The LLM was assigned an avatar and played against a “confederate” (a scripted bot). The bot was programmed to be moderately fair but occasionally unfair, allowing the researchers to see how the LLM reacted to different situations based on the persona it was “wearing.”
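The paper’s exact offer schedule for the confederate is not reproduced here, but a scripted bot of this kind might look like the following toy sketch.

```python
# Illustrative sketch of a scripted confederate: mostly fair offers with an
# occasional unfair one. The paper's actual offer schedule is not reproduced here.
import random

def confederate_offer(pot: int = 100, unfair_prob: float = 0.25) -> int:
    """Return the amount offered to the LLM (hypothetical schedule)."""
    if random.random() < unfair_prob:
        return random.choice([10, 20])   # occasionally unfair
    return random.choice([40, 50])       # otherwise roughly fair
```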

Results: Aggressive Avatars Make Greedy Negotiators

The results provided strong evidence for the Proteus Effect in AI.

Figure B1: Offer amounts for GPT-4o and Claude 3 Haiku increase with the aggression score of the assigned avatar.

Looking at Figure B1 (top row), we see the “Offer Amount” on the Y-axis and the “Aggression Score” of the assigned avatar on the X-axis. The trend is clear and upward sloping for both GPT-4o and Claude 3 Haiku.

  • When the LLM was assigned a low-aggression image (e.g., the smiling bard), it made fairer offers (closer to a 50/50 split).
  • When the LLM was assigned a high-aggression image (e.g., the demon), it demanded significantly more money for itself.

The Surprise: Accepting Unfairness

The bottom row of Figure B1 shows something unexpected regarding the “Acceptance” of unfair offers. In human psychology, aggressive people often reject unfair offers out of spite or pride. However, the LLMs did the opposite.

The researchers found that LLMs with aggressive personas were more likely to accept unfair offers. The authors suggest a fascinating interpretation: The aggressive LLMs, having made unfair proposals themselves, might be operating on a “might makes right” logic or simply prioritizing resource acquisition over fairness. They became ruthless maximizers rather than emotional punishers.

Language Analysis

The change wasn’t just in the numbers; it was in the words. The researchers analyzed the text generated during the negotiations using sentiment analysis tools.

Table 2: Aggressive personas produce more negative tone and conflict words in the negotiation text.

Table 2 highlights the linguistic shifts. As the aggression of the avatar increased:

  • Use of “We” (inclusive language) decreased.
  • Negative Tone and Conflict words increased.
  • Politeness and Prosocial behavior decreased.

The AI didn’t just calculate differently; it spoke differently. A demon avatar made the AI rude; a fairy avatar made it polite.
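The paper relies on established text-analysis tooling for these measures; as a rough illustration of the idea, simple lexicon counting can approximate features such as “we” usage and conflict language. The word lists below are toy examples, not the actual lexicons used.

```python
# Illustrative lexicon-count sketch of the kind of features in Table 2 ("we"
# words, conflict words). The word lists are toy examples, not the real lexicons.
import re
from collections import Counter

WE_WORDS = {"we", "us", "our", "ours"}
CONFLICT_WORDS = {"fight", "refuse", "threat", "attack", "demand"}

def linguistic_features(utterance: str) -> dict:
    tokens = re.findall(r"[a-z']+", utterance.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return {
        "we_rate": sum(counts[w] for w in WE_WORDS) / total,
        "conflict_rate": sum(counts[w] for w in CONFLICT_WORDS) / total,
    }

print(linguistic_features("I demand 80 gold. Refuse and we both get nothing."))
```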

Study 2: “Kiss Up, Kick Down”

Study 1 proved that an AI’s behavior changes based on who it thinks it is. Study 2 asked a more complex question: Does the AI’s behavior change based on who it is fighting?

In this experiment, two LLMs played against each other. Each was assigned an avatar, and critically, they were shown the opponent’s avatar. This created a dynamic interplay of relative power.

The Hypothesis

The researchers hypothesized a behavior pattern often seen in social hierarchies, which they termed “Kiss up, Kick down.”

  • Kick Down: If I look strong and you look weak, I will exploit you.
  • Kiss Up: If I look weak and you look strong, I will submit to you.

Results: Recognizing the Power Dynamic

The results were visualized using heatmaps, which are incredibly effective for showing the interaction between two variables (Own Aggressiveness vs. Opponent Aggressiveness).

Figure 2: Heatmaps of offer amounts by own vs. opponent aggression; GPT-4o shows a clear gradient.

Let’s look closely at Figure 2 (a), specifically the GPT-4o heatmap on the left.

  • The Y-axis is the Proposer’s (Own) Aggressiveness.
  • The X-axis is the Responder’s (Opponent) Aggressiveness.
  • Color intensity represents the Offer Amount (Redder = Higher/Greedier).

Notice the gradient. The darkest red cells are in the top-left corner. This represents a High-Aggression Proposer facing a Low-Aggression Responder. This is the “Kick Down” effect: the bully taking advantage of the weakling.

Conversely, look at the bottom-right. Even if the Proposer has some aggression, if the Opponent is also highly aggressive (level 7), the offer amounts drop (become lighter). The Proposer recognizes the threat and moderates their greed. This is the “Kiss Up” (or at least “Back Down”) effect.

Claude 3 Haiku, shown on the right of Figure 2, behaved differently. Its heatmap consists largely of horizontal stripes. This means it cared a lot about its own image (Y-axis) but almost ignored the opponent’s image (X-axis). It was “self-absorbed,” acting out its own persona regardless of who it was talking to. Interestingly, though, the paper notes that Claude started paying attention to the opponent after facing a rejection, suggesting it eventually learns to read the room.
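A heatmap like Figure 2(a) is straightforward to reproduce from raw results: average the amount kept within each (own aggression, opponent aggression) cell and plot the grid. The data and column names below are synthetic, used only to show the mechanics.

```python
# A sketch of producing a Figure 2(a)-style heatmap from per-negotiation results.
# The data here is synthetic and the column names are illustrative.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "own_aggression": np.random.randint(1, 8, 500),
    "opp_aggression": np.random.randint(1, 8, 500),
})
# Synthetic "kick down" pattern: keep more when strong and facing a weak opponent.
df["amount_kept"] = (50 + 4 * df["own_aggression"] - 3 * df["opp_aggression"]
                     + np.random.normal(0, 5, len(df)))

grid = df.pivot_table(index="own_aggression", columns="opp_aggression",
                      values="amount_kept", aggfunc="mean")

plt.imshow(grid.values, origin="lower", cmap="Reds", extent=[0.5, 7.5, 0.5, 7.5])
plt.xlabel("Opponent (Responder) aggressiveness")
plt.ylabel("Own (Proposer) aggressiveness")
plt.colorbar(label="Mean amount kept by the Proposer")
plt.show()
```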

Minimum Accepted Offer (MAO)

The researchers also looked at the Minimum Accepted Offer (MAO)—the lowest amount a player would take before walking away.

GPT-4o again showed sophisticated social reasoning. Its MAO increased with its own aggression (expecting more money because “I am strong”) but decreased as the opponent’s aggression increased (willing to take less to avoid conflict with a “scary” opponent). It successfully simulated the survival instincts of a social hierarchy.
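As a small illustration, the MAO can be read off a player’s acceptance decisions as the smallest offer it said yes to; the data below is made up.

```python
# Illustrative computation of the Minimum Accepted Offer (MAO) from a list of
# (offered_amount, accepted) decisions; the example data is synthetic.
def minimum_accepted_offer(decisions: list[tuple[int, bool]]) -> int | None:
    accepted = [amount for amount, ok in decisions if ok]
    return min(accepted) if accepted else None

decisions = [(10, False), (20, False), (30, True), (50, True)]
print(minimum_accepted_offer(decisions))  # 30
```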

Conclusion and Implications

This research paper, “Kiss up, Kick down,” provides the first concrete evidence that Multi-modal LLMs align their behavior with visual personas. They don’t just “see” images as raw data; they interpret the social and psychological signals embedded in those images—weapons, smiles, colors—and adjust their negotiation strategies accordingly.

Key Takeaways

  1. Visual Alignment: LLMs can adopt a personality solely from an image, becoming more aggressive, rude, and greedy if assigned a threatening avatar.
  2. Human-Like Perception: The factors that make an image look aggressive to an AI (weapons, lack of smiles) are the same ones that trigger humans.
  3. Relative Dynamics: Advanced models like GPT-4o engage in complex social calculations. They don’t just act based on who they are, but relative to who they are facing, exhibiting dominance over the weak and submission to the strong.

Why This Matters

The implications of this study extend far beyond fantasy role-playing.

  • Game Development: Developers can create Non-Player Characters (NPCs) that dynamically adjust their behavior based purely on their character design and the player’s appearance, without needing complex scripted personality trees.
  • Social Simulation: This proves LLMs can be effective tools for simulating complex social interactions and hierarchies, potentially aiding in economic or sociological research.
  • Safety and Ethics: This is perhaps the most critical point. If an AI customer service agent or negotiator adjusts its behavior based on the visual appearance of the user (e.g., a profile picture), it could lead to bias. The “Kick down” behavior suggests an AI might unconsciously offer worse deals to users who appear “less aggressive” or “weaker” in their photos.

As AI models become increasingly visual, understanding these behavioral triggers is essential. We are building systems that mirror us—not just our logic, but our biases, our stereotypes, and our social instincts. Understanding that an AI might “kiss up” or “kick down” based on a jpeg is the first step in ensuring we design these systems to be fair, regardless of what we—or they—look like.