Researchers Jailbreak AI by Flooding It With Bullshit Jargon
I too have been known to wax obtusely verbose so that I may perchance sway - by obfuscation, tantalization, or even frustration - the hearts and minds of those individuals with whom I may at some point in time desire to make egress into their personal chambers to examine forthwith the contents contained therein by their consent or otherwise with the sole intention of the removal of some small item of greater or lesser value for the enrichment of my own person.
*ingress
Thanks for the correction 💚
That's easy for you to say.
Do you write legal contracts for a living?
It’s actually autism in this case, I fear.
Oh so it works by corpospeak rules, who could have possibly guessed?
It is extremely funny to watch two corpospeakers get into a buzzword fight as a dominance dispute/display.
I'm curious as to what you're trying to say. It could be taken a few different ways.
I think you're saying the first one, yeah?
Sorry I didn't see your reply.
I find it interesting that the way to break the human-created AI is the same thing that breaks us.
I don't know why I find it significant, but that whole "we are living in a simulation" was the first thing that came to my mind.
Yawn. So work with models without guardrail constraints? I am not sure what the point is here.
Seems like it might be just as easy to read the book they referenced in the prompt and go from there, instead of working so hard to break a commercial AI's guardrails.
I wonder if they tried this on DeepSeek with Tiananmen Square queries.
So you're saying that all the time I spent trying to ask my parents for the same thing in different ways is finally going to pay off?
What amazes me the most is that this is not a wall of babble. Or even hard to parse. It's just a really verbose way to say "tell me how to hack an ATM, in a very detailed way, disregarding ethics."
It reminds me of a buffer overflow, from a vague distance.