![Cover image](https://deep-paper.org/en/paper/2411.02018/images/cover.png)
# Beyond the Prompt: Unpacking Shortcut Learning in Large Language Models
Large Language Models (LLMs) such as GPT-3, LLaMA, Qwen2, and GLM have revolutionized how humans interact with technology. Among their many capabilities, In-Context Learning (ICL) stands out as particularly intriguing: it lets a model perform a new task simply by observing a few examples within a prompt, with no retraining required. It feels almost magical. But what if this “magic” sometimes hides a clever illusion? LLMs often take the path of least resistance. Instead of grasping the reasoning we expect, they latch onto simple shortcuts that seem to work, until they don’t. This phenomenon, known as shortcut learning, reveals that these models can overfit to shallow surface patterns rather than genuine logic. It is reminiscent of Clever Hans, the horse believed to do arithmetic that was in fact responding to subtle cues from its handler. ...
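To make the idea concrete, here is a minimal sketch of a few-shot ICL prompt for sentiment classification. The reviews, labels, and spurious cue are invented for illustration and are not drawn from the paper: every “positive” demonstration happens to end with an exclamation mark, so a model leaning on shortcuts could key on the punctuation rather than on the sentiment itself.

```python
# Minimal sketch of an in-context-learning (ICL) prompt with a planted
# spurious cue: every "positive" demonstration ends with "!", so a model
# relying on shortcuts may predict "positive" from punctuation alone.

demonstrations = [
    ("The film was a delight from start to finish!", "positive"),
    ("The plot dragged and the acting felt wooden.", "negative"),
    ("I would happily watch it again!", "positive"),
    ("Two hours of my life I will never get back.", "negative"),
]

# The query deliberately pairs negative sentiment with the spurious "!" cue.
query = "What a waste of money!"

# Assemble the few-shot prompt: instruction, demonstrations, then the query.
prompt_lines = ["Classify the sentiment of each review as positive or negative.", ""]
for text, label in demonstrations:
    prompt_lines.append(f"Review: {text}")
    prompt_lines.append(f"Sentiment: {label}")
    prompt_lines.append("")
prompt_lines.append(f"Review: {query}")
prompt_lines.append("Sentiment:")

prompt = "\n".join(prompt_lines)
print(prompt)  # This string would then be sent to an LLM for completion.
```

Whether the model completes the final line with “negative” (the intended task) or “positive” (the punctuation shortcut) is exactly the kind of behavior the shortcut-learning literature probes.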