Poisoning Attacks on LLMs: A Direct Attack on LLMs with Less
than 250 Samples
Anthropic, in collaboration with the United Kingdom’s Artificial Intelligence Security Institute and the Alan Turing Institute, recently published an intriguing paper showing that as few as 250 malicious documents can create a “backdoor” vulnerability in a large language model, regardless of the model’s size or the volume of training data! We’ll explore these results in […]
The post Poisoning Attacks on LLMs: A Direct Attack on LLMs with Less than 250 Samples appeared first on Analytics Vidhya.
21