Anthropic reveals LLM vulnerability via malicious documents

This title was summarized by AI from the post below.
View profile for Pascal Finette

radical✦25K followers

Another day, another LLM vulnerability: The team at Anthropic (the folks behind Claude) showed that a small number of samples is all it takes to poison an LLM of any size. > "As few as 250 malicious documents can produce a 'backdoor' vulnerability in a large language model—regardless of model size or training data volume. […] Even though our larger models are trained on significantly more clean data, the attack success rate remains constant across model sizes." What this means in practical terms is that large language models can be fairly easily backdoored; all it takes is a small stash of malicious documents in the training set. As AI companies are gobbling up data left, right, and center, it is close to impossible to ensure training data isn't tainted. ↗ https://lnkd.in/ggiD6VHn

Sean Lemson, ACC CPCC

Motivated Outcomes, LLC1K followers

5mo

This is looking more and more like a dot com bubble inflating every day.

Arif K.

Ingenuitive Capital2K followers

4mo

Not unlike the idea of human minds vulnerable to a small set of “mind viruses” when it comes to influence! LLMs are amazing metaphors for the mind.

Susanne Siebrecht

Ecclesia Gruppe867 followers

5mo

Good2Know 🤓

See more comments

To view or add a comment, sign in

Explore content categories