
Why securing AI is harder than anyone expected and guardrails are failing | HackAPrompt CEO

18.9K views · 352 likes · 92:41 · Dec 21, 2025

Sander Schulhoff is an AI researcher specializing in AI security, prompt injection, and red teaming. He wrote the first comprehensive guide on prompt engineering and ran the first-ever prompt injection competition, working with top AI labs and companies. His dataset is now used by Fortune 500 companies to benchmark the security of their AI systems. He’s spent more time than anyone alive studying how attackers break AI systems, and what he’s found isn’t reassuring. The guardrails companies are buying don’t actually work, and we’ve been lucky so far: we haven’t seen more harm only because AI agents aren’t capable enough yet to do real damage.

*We discuss:*
1. The difference between jailbreaking and prompt injection attacks on AI systems
2. Why AI guardrails don’t work
3. Why we haven’t seen major AI security incidents yet (but soon will)
4. Why AI browser agents are vulnerable to hidden attacks embedded in webpages
5. The practical steps organizations should take instead of buying ineffective security tools
6. Why solving this requires merging classical cybersecurity expertise with AI knowledge

*Brought to you by:*
Datadog—Now home to Eppo, the leading experimentation and feature flagging platform: https://www.datadoghq.com/lenny
Metronome—Monetization infrastructure for modern software companies: https://metronome.com/
GoFundMe Giving Funds—Make year-end giving easy: http://gofundme.com/lenny

*Transcript:* https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

*My biggest takeaways (for paid newsletter subscribers):* https://www.lennysnewsletter.com/i/181089452/my-biggest-takeaways-from-this-conversation

*Where to find Sander Schulhoff:*
• X: https://x.com/sanderschulhoff
• LinkedIn: https://www.linkedin.com/in/sander-schulhoff
• Website: https://sanderschulhoff.com
• AI Red Teaming and AI Security Masterclass on Maven: https://bit.ly/44lLSbC

*Where to find Lenny:*
• Newsletter: https://www.lennysnewsletter.com
• X: https://twitter.com/lennysan
• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

*In this episode, we cover:*
(00:00) Introduction to Sander Schulhoff and AI security
(05:14) Understanding AI vulnerabilities
(11:42) Real-world examples of AI security breaches
(17:55) The impact of intelligent agents
(19:44) The rise of AI security solutions
(21:09) Red teaming and guardrails
(23:44) Adversarial robustness
(27:52) Why guardrails fail
(38:22) The lack of resources addressing this problem
(44:44) Practical advice for addressing AI security
(55:49) Why you shouldn’t spend your time on guardrails
(59:06) Prompt injection and agentic systems
(01:09:15) Education and awareness in AI security
(01:11:47) Challenges and future directions in AI security
(01:17:52) Companies that are doing this well
(01:21:57) Final thoughts and recommendations

*Referenced:*
• AI prompt engineering in 2025: What works and what doesn’t | Sander Schulhoff (Learn Prompting, HackAPrompt): https://www.lennysnewsletter.com/p/ai-prompt-engineering-in-2025-sander-schulhoff
• The AI Security Industry is Bullshit: https://sanderschulhoff.substack.com/p/the-ai-security-industry-is-bullshit
• The Prompt Report: Insights from the Most Comprehensive Study of Prompting Ever Done: https://learnprompting.org/blog/the_prompt_report?srsltid=AfmBOoo7CRNNCtavzhyLbCMxc0LDmkSUakJ4P8XBaITbE6GXL1i2SvA0
• OpenAI: https://openai.com
• Scale: https://scale.com
• Hugging Face: https://huggingface.co
• Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition: https://www.semanticscholar.org/paper/Ignore-This-Title-and-HackAPrompt%3A-Exposing-of-LLMs-Schulhoff-Pinto/f3de6ea08e2464190673c0ec8f78e5ec1cd08642
• Simon Willison’s Weblog: https://simonwillison.net
• ServiceNow: https://www.servicenow.com
• ServiceNow AI Agents Can Be Tricked Into Acting Against Each Other via Second-Order Prompts: https://thehackernews.com/2025/11/servicenow-ai-agents-can-be-tricked.html
• Alex Komoroske on X: https://x.com/komorama
• Twitter pranksters derail GPT-3 bot with newly discovered “prompt injection” hack: https://arstechnica.com/information-technology/2022/09/twitter-pranksters-derail-gpt-3-bot-with-newly-discovered-prompt-injection-hack
• MathGPT: https://math-gpt.org
• 2025 Las Vegas Cybertruck explosion: https://en.wikipedia.org/wiki/2025_Las_Vegas_Cybertruck_explosion
• Disrupting the first reported AI-orchestrated cyber espionage campaign: https://www.anthropic.com/news/disrupting-AI-espionage

...References continued at: https://www.lennysnewsletter.com/p/the-coming-ai-security-crisis

_Production and marketing by https://penname.co/._
_For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com._

Lenny may be an investor in the companies discussed.
