Squeezing 7B Models onto Consumer GPUs: A Deep Dive into Immediate Compensation Pruning (ICP)
If you have ever tried to run a state-of-the-art Large Language Model (LLM) like Llama-2 or a vision model like Segment Anything (SAM) on a single consumer-grade GPU, you know the struggle. These models are massive. A 7-billion parameter model is often the upper limit of what a decent desktop GPU can handle for inference, let alone fine-tuning.

To deploy these models efficiently, we often turn to pruning—the process of removing unnecessary weights to make the model smaller and faster. However, there is a catch. Current “one-shot” pruning methods (which are fast and don’t require expensive retraining) work great when you remove 20% or 30% of the weights. But if you try to push the sparsity to 50% or 70% to significantly reduce the model size, performance collapses. ...
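To make the idea of one-shot pruning concrete, here is a minimal sketch of the simplest variant, magnitude pruning: zero out the smallest-magnitude fraction of weights in a single pass, with no retraining. This is a generic illustration of the baseline technique, not the ICP method from the paper; the function name and NumPy-based setup are my own choices for the example.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """One-shot magnitude pruning: zero out the fraction `sparsity`
    of weights with the smallest absolute value (no retraining)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 50% of a small random weight matrix
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = magnitude_prune(W, 0.5)
```

At 20–30% sparsity this kind of one-shot approach typically preserves accuracy well; the article's point is that naively pushing it to 50–70% is where quality collapses, which is the regime ICP targets.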