Big Data

This new AI jailbreaking technique let’s hackers crack models in just three interactions

[ad_1]

A new jailbreaking technique could be used by threat actors to gradually bypass safety guardrails in popular LLMs to draw them into generating harmful content, a new report warns.

The ‘Deceptive Delight’ technique, exposed by researchers at Palo Alto Networks’ Unit 42, was able elicit unsafe responses from models in just three interactions.

[ad_2]

READ SOURCE

This website uses cookies. By continuing to use this site, you accept our use of cookies.