How Data Poisoning Compromises AI Models

Abstract: Artificial intelligence models are only as trustworthy as the data used to train them. Yet the growing presence of poisoned and fabricated data, originating both from malicious human actors and from AI systems themselves, poses a critical challenge. Traditionally, concerns about data poisoning centered on deliberate attacks, in which adversaries injected misleading or biased information into training datasets to manipulate model behavior. Today, the problem is compounded by the sheer volume of unverified content circulating on the web, much of it generated by AI tools capable of producing convincing but inaccurate material. As models increasingly learn from online sources, they risk absorbing and reinforcing these distortions, creating a feedback loop in which fabricated data trains new models, which in turn generate more unreliable outputs. This cycle undermines the reliability, safety, and fairness of AI systems. Addressing it requires not only technical defenses, such as robust data validation and anomaly detection, but also a cultural shift toward accountability in data curation. By recognizing the intertwined roles of human error, malicious intent, and synthetic content, the research community can begin to chart pathways toward more resilient and trustworthy AI.
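
The abstract names anomaly detection as one technical defense against poisoned training data. As a minimal illustrative sketch (not a method proposed here), the following Python example uses scikit-learn's IsolationForest to flag statistically anomalous samples before they enter a training pipeline; the synthetic dataset, feature dimensionality, and contamination rate are all assumptions chosen for demonstration.

```python
# Minimal sketch: anomaly-based filtering of a training set.
# Assumes poisoned samples are statistical outliers relative to
# the clean data distribution; real attacks may be far subtler.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Simulated clean feature vectors plus a small block of injected
# (poisoned) points drawn from a shifted distribution.
clean = rng.normal(loc=0.0, scale=1.0, size=(950, 16))
poison = rng.normal(loc=4.0, scale=0.5, size=(50, 16))
data = np.vstack([clean, poison])

# IsolationForest scores points by how easily they can be isolated;
# `contamination` is the assumed fraction of anomalous samples.
detector = IsolationForest(contamination=0.05, random_state=0)
labels = detector.fit_predict(data)  # -1 = flagged outlier, 1 = inlier

filtered = data[labels == 1]  # keep only samples judged to be inliers
print(f"kept {len(filtered)} of {len(data)} samples")
```

A filter of this kind is only a first line of defense: it catches distributional outliers, whereas carefully crafted poisoning samples are often designed to sit close to the clean distribution, which is why the abstract pairs such validation with broader accountability in data curation.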