Have you ever noticed that the more you use a streaming service or a shopping app, the more it seems to recommend the same few popular things? You watch one blockbuster, and suddenly your entire feed is dominated by the “Top 10” list, pushing niche indie films or unique products into obscurity.
This phenomenon is known as the Matthew Effect, derived from the biblical adage: “For to every one who has will more be given… but from him who has not, even what he has will be taken away.” In the context of Artificial Intelligence, it means the popular items get more exposure, while the unpopular ones (the long-tail items) get buried.
While this is a known problem in static recommendation lists, it becomes significantly more dangerous in Conversational Recommender Systems (CRSs). In a CRS, you are chatting with a bot. If the bot only talks about popular items, and you only reply to what the bot says, you enter a dynamic feedback loop that narrows your world view rapidly. This creates “echo chambers” and “filter bubbles” that are hard to break.
In this post, we will take a deep dive into a new research paper titled “Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation”. The researchers propose a novel framework called HiCore. It’s a complex name for a brilliant idea: using multi-layered hypergraphs to understand the subtle, multi-faceted interests of a user, ensuring that recommendations remain diverse and fair even as the conversation goes on.
The Problem: Dynamic Feedback Loops
Existing methods to fight the Matthew Effect usually look at static data. They analyze a dataset, see that item X is too popular, and try to mathematically penalize it.
However, a Conversational Recommender System is not static. It is a time-evolving process.
- The user asks for a movie.
- The system suggests a popular hit.
- The user accepts (because it’s the only option presented).
- The system records this interaction as “positive feedback” for the popular item.
- The system becomes even more likely to suggest that item to the next person.
To break this loop, we need a system that understands Multi-Level User Interests. A user isn’t just “someone who likes Action movies.” They are a complex entity who might like Action movies, starring specific actors, containing certain keywords, and sharing viewing patterns with specific groups of friends.
Enter HiCore: The High-Level Overview
The proposed solution, HiCore, is built on Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning. That is a mouthful, so let’s break down the architecture visually before digging into the math.

As shown in Figure 1 above, the framework is split into two major phases:
- Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning (Left Side): This is the “brain” of the operation. It constructs complex graphs from historical data to learn deep representations of items, entities, and words.
- Interest-Boosted CRS (Right Side): This is the “mouth” and “hand” of the system. It uses those learned interests to actually chat with the user (Conversation Module) and pick items to show (Recommendation Module).
The core innovation here is the move from standard Graphs to Hypergraphs, and the use of Triple-Channel settings.
What is a Hypergraph?
In a standard graph, an edge connects two nodes (e.g., User A — connects to — Movie B). This is a pairwise relationship. In a Hypergraph, a “hyperedge” can connect any number of nodes at once. This allows the system to model high-order relationships, such as a group of three users who all watched the same set of movies, or a user, an entity, and a keyword appearing together.
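To make this concrete, here is a minimal sketch (my own illustration, not code from the paper) of how a hypergraph is typically stored as a node-by-hyperedge incidence matrix, where a single hyperedge column can touch any number of nodes:

```python
import numpy as np

# 5 nodes, 2 hyperedges. In an ordinary graph, each edge column could touch
# at most two nodes; a hyperedge column can touch any number of them.
H = np.zeros((5, 2))
H[[0, 1, 2], 0] = 1   # hyperedge 0: a group of three nodes that share a behavior
H[[2, 3, 4], 1] = 1   # hyperedge 1: another group, overlapping at node 2

node_degree = H.sum(axis=1)   # hyperedges each node belongs to -> [1, 1, 2, 1, 1]
edge_degree = H.sum(axis=0)   # nodes inside each hyperedge     -> [3, 3]
```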
HiCore constructs three specific types of hypergraphs to capture different semantic nuances:
- Item-oriented: Focuses on the products/movies themselves.
- Entity-oriented: Focuses on knowledge graph entities (actors, directors, genres) derived from DBpedia.
- Word-oriented: Focuses on the actual words spoken in the conversation, derived from ConceptNet.
The Core Method: Triangles and Channels
To build these hypergraphs effectively, the authors use Network Motifs. Motifs are small, recurring sub-graphs or patterns that describe how nodes interact. Think of them as the “Lego bricks” of social interaction.
The researchers categorized these interactions into three specific Channels:
- Group Channel (g): Captures social relations and shared preferences among groups.
- Joint Channel (j): Captures shared behaviors, like friends buying the same item.
- Purchase Channel (p): Captures the direct transaction or interaction between a user and an item.
Let’s look at the motifs used to build these channels:

In Figure 2, you can see the complexity increasing:
- Group Motifs (M1-M7): These triangles represent various social dynamics where users (circles) are connected to each other or to an anchor user (striped circle).
- Joint Motifs (M8-M9): These represent “joint” actions where users interact with both each other and an item (green circle).
- Purchase Motif (M10): The direct link between a user and an item.
Constructing the Item-Oriented Hypergraphs
The construction of these hypergraphs is mathematically rigorous. For the Group Channel, the system calculates adjacency matrices based on the seven group motifs shown above.
The hypergraph for the Group Channel is defined as:

Here, \(\mathcal{V}\) represents the items, and \(\mathcal{N}\) represents the hyperedges derived from the motifs. But how do we turn those triangle pictures into math? We calculate the adjacency matrix for each motif type.
For example, the matrices for the first seven motifs (Group interactions) are calculated using bidirectional (\(J\)) and unidirectional (\(I\)) relation matrices:

Once these matrices (\(H\)) are computed, they are combined to form the final adjacency matrix for the group hypergraph. If two nodes appear in a specific triangle formation (like \(M_1\) or \(M_3\)), the matrix captures that connection:

This math essentially digitizes the social structure. If User A and User B are friends and both follow User C, that’s a specific “shape” of interaction that the model now understands.
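As a rough illustration of how a triangle picture becomes a matrix, here is a small sketch assuming the common motif-counting trick of masking a matrix product (the paper's exact formulas for \(M_1\)–\(M_7\) may differ):

```python
import numpy as np

# Toy directed "follow" graph over 4 users: A[u, v] = 1 means u follows v.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
])

J = A * A.T          # bidirectional relations: both users follow each other
U = A - J            # unidirectional relations (the post's I)

# A fully mutual triangle (an M1-style group motif): (J @ J)[u, v] counts
# common mutual friends w on the path u <-> w <-> v; masking with J keeps
# only pairs (u, v) that are themselves mutual, i.e. pairs closing a triangle.
H_m1 = (J @ J) * J   # entry (u, v) = number of such triangles containing u and v
```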
The Joint and Purchase Channels
The system doesn’t stop at social grouping. It also looks at Joint behaviors—where social connections overlap with item consumption. This helps distinguish between “friends who just chat” and “friends who actually buy similar things.”

Finally, the Purchase Channel looks at the implicit high-order relationships of users who might not be friends but buy the same things (the classic “People who bought X also bought Y” logic, but boosted by hypergraphs).

Entity and Word Orientations
To fight the “sparsity” problem (where we don’t have enough data on a specific user), HiCore expands its view beyond just items.
It builds Entity-oriented Hypergraphs using external knowledge bases (DBpedia). If you mention “Interstellar,” the system pulls in entities like “Matthew McConaughey” or “Sci-Fi.”

It also builds Word-oriented Hypergraphs using ConceptNet to understand the semantic meaning of the conversation history.

By processing Items, Entities, and Words through Group, Joint, and Purchase channels, HiCore creates a massive, multi-dimensional view of the user’s intent.
Learning the Multi-Level Interests
Once the hypergraphs are built, how does the system learn from them? It uses Hypergraph Convolutional Networks.
The convolution operation propagates information across the hyperedges. If you are connected to a “Group” hyperedge, you absorb information from the other members of that group.

This propagation happens for every channel. The system then aggregates the learned features. It separates the “noise” from the signal by using specific summations for Group, Joint, and Purchase interests (\(X_g\), \(X_j\), \(X_p\)):
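In standard hypergraph neural networks, one propagation layer gathers node features onto hyperedges and scatters them back, normalised by degrees. A minimal sketch of that layer (my illustration of the general HGNN-style operation, not HiCore's exact implementation):

```python
import numpy as np

def hypergraph_conv(X, H, W):
    """One hypergraph convolution layer: node -> hyperedge -> node propagation,
    normalised by node and hyperedge degrees, followed by a ReLU."""
    Dv_inv = np.diag(1.0 / np.maximum(H.sum(axis=1), 1))   # inverse node degrees
    De_inv = np.diag(1.0 / np.maximum(H.sum(axis=0), 1))   # inverse hyperedge degrees
    # H.T @ X sums member-node features onto each hyperedge (the degree matrices
    # turn the sums into averages); H @ ... spreads each hyperedge summary back
    # to every node that belongs to it.
    return np.maximum(Dv_inv @ H @ De_inv @ H.T @ X @ W, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                     # 5 nodes, 8-dim embeddings
H = np.zeros((5, 2)); H[[0, 1, 2], 0] = 1; H[[2, 3, 4], 1] = 1
W = rng.normal(size=(8, 8))
X_next = hypergraph_conv(X, H, W)               # run once per channel (group/joint/purchase)
```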

Feature Fusion and Self-Supervision
At this stage, we have distinct interest representations for Items, Entities, and Words. To make a final prediction, we need to fuse them. The researchers use an Attention Network to weigh which interest is most important for the current context.

Here, \(X_m\) is the final Multi-Interest representation.
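A generic additive-attention fusion over the three views (a sketch of the standard formulation; the paper's exact parameterisation may differ) looks like:

\[
\alpha_{c} = \frac{\exp\!\big(q^{\top}\tanh(W x_{c} + b)\big)}{\sum_{c' \in \{item,\, entity,\, word\}} \exp\!\big(q^{\top}\tanh(W x_{c'} + b)\big)},
\qquad
X_m = \sum_{c} \alpha_{c}\, x_{c},
\]

where \(x_{c}\) is the interest representation learned from the item-, entity-, or word-oriented hypergraphs, and \(W\), \(b\), \(q\) are learned attention parameters.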
To ensure these representations are high-quality, the model uses Self-Supervised Learning (SSL) via InfoNCE loss. This is a technique where the model tries to maximize the agreement between its learned representation and a “ground truth” derived from the data itself, without needing human labels.
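In its generic form (shown for intuition; the paper's exact choice of positive and negative views may differ), the InfoNCE objective pulls a representation \(z_i\) towards its positive counterpart \(z_i^{+}\) and pushes it away from other samples:

\[
\mathcal{L}_{\text{InfoNCE}} = -\sum_{i} \log \frac{\exp\!\big(\mathrm{sim}(z_i, z_i^{+}) / \tau\big)}{\sum_{j} \exp\!\big(\mathrm{sim}(z_i, z_j) / \tau\big)},
\]

where \(\mathrm{sim}(\cdot,\cdot)\) is usually cosine similarity and \(\tau\) is a temperature hyperparameter.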

The Interest-Boosted CRS Modules
Now that HiCore has a deep understanding of user interests (\(X_m\)), it applies this knowledge to the two main tasks of a Conversational Recommender System.
1. Recommendation Module
This module predicts which item the user actually wants. It takes the multi-interest vector \(X_m\) and compares it against all candidate items (\(V_{cand}\)). The goal is to minimize the difference between the prediction and the actual user choice using standard cross-entropy loss.
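Schematically (a sketch of the standard setup described above, not necessarily the paper's exact notation), each candidate item \(v \in V_{cand}\) is scored against the multi-interest vector, and the model is trained with cross-entropy on the item the user actually chose:

\[
P(v \mid u) = \mathrm{softmax}_{v \in V_{cand}}\big(X_m \, e_{v}^{\top}\big),
\qquad
\mathcal{L}_{rec} = -\log P(v^{*} \mid u),
\]

where \(e_{v}\) is the embedding of candidate item \(v\) and \(v^{*}\) is the ground-truth item, with the loss summed over all conversation turns during training.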

By using the rich \(X_m\) vector (which contains social, joint, and purchase signals), the recommendation is less likely to simply default to the “most popular” item, thus mitigating the Matthew Effect.
2. Conversation Module
This module generates the text response (e.g., “How about you try watching Inception?”). It uses a Transformer-based architecture.
The equation below shows how the system uses Multi-Head Attention (MHA) to combine the current conversation context (\(X_{cur}\)) and historical context (\(X_{his}\)) with the learned multi-interests (\(X_m\)).
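One generic way to combine these signals with stacked cross-attention (a sketch, not necessarily the paper's exact equation) is:

\[
R = \mathrm{MHA}\big(\mathrm{MHA}(X_{cur},\, X_{his},\, X_{his}),\; X_{m},\; X_{m}\big),
\]

where \(\mathrm{MHA}(Q, K, V)\) denotes standard multi-head attention: the current context first attends over the history, and the result then attends over the learned multi-interests before being passed to the Transformer decoder that generates the reply.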

The generated response is trained to look like natural human dialogue:

Experiments and Results
Does this complex architecture actually work? The researchers tested HiCore against state-of-the-art baselines (like KGSF, BART, GPT-3, and UniCRS) on four major datasets: REDIAL, TG-REDIAL, OpenDialKG, and DuRecDial.
Recommendation Performance
First, let’s look at how well it recommends items. The metrics used are Recall (did it find the right item?) and NDCG (was the right item near the top?).
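For reference, here is a minimal sketch of these two metrics, assuming the common CRS setup of a single ground-truth item per recommendation turn (my illustration, not the paper's evaluation code):

```python
import numpy as np

def recall_at_k(ranked_items, true_item, k=10):
    """Recall@k with one ground-truth item: 1 if it appears in the top-k, else 0.
    Scores are averaged over all recommendation turns in the test set."""
    return float(true_item in ranked_items[:k])

def ndcg_at_k(ranked_items, true_item, k=10):
    """NDCG@k with one ground-truth item: rewards ranking it near the top."""
    if true_item in ranked_items[:k]:
        rank = ranked_items.index(true_item)   # 0-based position in the list
        return 1.0 / np.log2(rank + 2)         # the ideal DCG is 1 here
    return 0.0
```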

As seen in Table 1, HiCore (bottom row) consistently outperforms all baselines. In the REDIAL dataset, it achieves a Recall@10 of 0.2192, significantly higher than KGSF (0.1785) or standard BERT (0.1608). This proves that the multi-hypergraph approach captures user intent better than standard graph or text-based methods.
This dominance holds true across other datasets as well, as shown in the table below covering OpenDialKG and DuRecDial:

The Matthew Effect Analysis
This is the most critical part of the study. High accuracy is great, but if we are just recommending the same 5 movies to everyone, we haven’t solved the Matthew Effect.
To measure this, the authors used Coverage@k (what percentage of total available items are being recommended?). A higher coverage means the system is exploring the “long tail” of niche items.
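A minimal sketch of the metric as described above (my illustration; the paper may aggregate it per test split or per turn):

```python
def coverage_at_k(recommendation_lists, catalog_size, k=10):
    """Coverage@k: fraction of the whole item catalog that shows up in at
    least one user's top-k list. Higher coverage means more of the long
    tail is actually being surfaced."""
    shown = {item for recs in recommendation_lists for item in recs[:k]}
    return len(shown) / catalog_size

# Toy example: three users, a catalog of 100 items, only 5 distinct items shown.
recs = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(coverage_at_k(recs, catalog_size=100, k=3))   # -> 0.05
```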

Figure 3 is the smoking gun.
- Red Line (Ours/HiCore): Shows significantly higher coverage than all other methods.
- Blue Line (KBRD): Shows very low coverage, indicating it suffers heavily from popularity bias.
The researchers also measured Average Popularity (A@K) and Long Tail Ratio (L@K).
- Lower Average Popularity is better (it means we aren’t just suggesting hits).
- Higher Long Tail Ratio is better (it means we are suggesting niche items).
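Sketches of these two metrics, under my reading of the definitions above (the paper's exact formulas, such as its long-tail cutoff, may differ):

```python
import numpy as np

def avg_popularity_at_k(recommendation_lists, item_popularity, k=10):
    """A@K: mean training-set popularity of all recommended items (lower is better)."""
    pops = [item_popularity[i] for recs in recommendation_lists for i in recs[:k]]
    return float(np.mean(pops))

def long_tail_ratio_at_k(recommendation_lists, tail_items, k=10):
    """L@K: share of recommended items that fall in the long-tail set,
    e.g. items outside the most popular slice of the catalog (higher is better)."""
    shown = [i for recs in recommendation_lists for i in recs[:k]]
    return sum(i in tail_items for i in shown) / len(shown)
```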

In Table 4, HiCore achieves the lowest Average Popularity scores while maintaining high Long Tail Ratios. This confirms that HiCore isn’t just accurate; it is fairer and more diverse.
Hyperparameter and Ablation Studies
Finally, the authors checked if all these complex parts were necessary.
Do hyperparameters matter? Yes. In Figure 4, we see that the dimension size (\(d\)) and the number of layers (\(N\)) significantly impact Recall. Specifically, a 2-layer network seems to be the sweet spot for balancing complexity and performance.

Do we need all the hypergraphs? The authors performed an ablation study, removing specific components (like the Group channel or the Word-oriented hypergraph) to see what would happen.

Table 5 shows that removing any single component leads to a drop in performance. Removing the Item-oriented Purchase channel (\(G_p^{(i)}\)) caused the biggest drop, which makes sense as purchase history is a strong signal. However, the Group and Joint channels also contribute significantly, proving that social dynamics matter in recommendation.
Conclusion
The Matthew Effect is a “rich get richer” problem that plagues recommender systems, turning them into echo chambers that stifle discovery. As AI becomes more conversational, this loop tightens, making it harder for users to discover new, niche interests.
HiCore offers a robust solution by acknowledging that user interests are not one-dimensional. By building Multi-Hypergraphs across Items, Entities, and Words, and analyzing them through Group, Joint, and Purchase channels, HiCore creates a rich, multi-textured map of user preferences.
The results are clear: HiCore not only predicts what you want better than current state-of-the-art models (like GPT-3 or BART-based systems), but it also digs deeper into the catalog, surfacing hidden gems and breaking the popularity loop. For students and researchers in AI, HiCore demonstrates the power of looking beyond simple user-item pairs and embracing the complex, high-order relationships that define how we actually interact with the world.