The Sanitized Web: Unpacking the Risks of Synthetic Data in Hate Speech Detection
The explosion of Generative AI has handed researchers and engineers a “magic wand” for data creation. Facing a shortage of labeled training data? Just ask a Large Language Model (LLM) to generate it for you. This promise of infinite, privacy-compliant, and low-cost data is revolutionizing Natural Language Processing (NLP). But when we move away from objective tasks—like summarizing a news article—and into the murky, subjective waters of hate speech detection, does this magic still hold up? ...