Mark Russinovich on AI Safety and Security at BlueHat


Microsoft Security Response Center

73,250 followers

Mark Russinovich's BlueHat keynote this morning was practical and inspiring at the same time. Mark went deep into jailbreaks, prompt injection attacks, and hallucinations, and walked us through what these attacks look like in practice with multiple live demos and examples from both his personal experience and recent news. Most importantly, he walked through mitigation strategies and the latest research on how to defend against them, including FIDES (Flow Integrity Deterministic Enforcement System), a deterministic Information-Flow Control approach for prompt injection mitigation that lets us balance autonomy and security, and his RefChecker tool for catching hallucinated citations. He closed by reminding us that AI safety becomes security, and we must build defenses now or we will get "more OpenClaw at scale." #BlueHat

8 Comments

Ram Shankar Siva Kumar 🦝 3w

I cannot think of a more holistic deep dive on prompt injection, hallucination, and the fallibility of agents than this keynote! Thank you all for organizing this!!

7 Reactions

Eva Benn 3w

This was one of the best keynotes I've ever seen. Every single minute was jam-packed with value!

1 Reaction

Luck Suknimit 3w

Thanks for sharing

Nelson Kauley 3w

Very insightful presentation. Gave me some interesting threads to pull. Thanks again MSRC!

1 Reaction

Michal Pristas

Senior Software Engineer @ Elastic | Go Development, Open Telemetry

is there a recording?

1 Reaction


More Relevant Posts

Ronaldo Smith Jr.
3w
This is exactly the kind of clarity the AI security conversation needs. Mark's demos made the risks tangible, but the real value was in showing that practical defenses already exist. Tools like FIDES and RefChecker signal a shift toward engineering AI systems with security as a first‑class requirement, not an afterthought. AI safety is #cybersecurity now! The organizations that internalize this early will be the ones ready for what’s coming next in this AI world. #MSFTAdvocate #Microsoft #Employee #AISecurity #AISafety #Segurança #DigitalTransformation #AITransformation
Mark Russinovich
3w
It’s becoming increasingly clear that as we move from simple prompts to autonomous agents, the boundary between AI Safety and AI Security is disappearing. I had a great time this morning discussing this shift and why the risks we’ve been tracking, such as jailbreaks, indirect prompt injections, and hallucinations, are no longer just academic edge cases. As we grant agents greater access to our personal data, business systems, and decision-making loops, these vulnerabilities become direct threats to our security posture. Understanding these risks is the first step toward building the resilient, secure-by-design agentic systems that the next era of computing requires.
23 Comments
Gabriel N.
3w Edited
Just attended one of the best keynotes I've seen in a while. The AI security talk at #BlueHat 2026 did not disappoint. Amazing talk, Mark Russinovich 👏👏👏 In the village, the AeGIS AI Medusa's Memory Heist security game was very clever: a hands-on challenge that tests your knowledge of AI vulnerabilities and how to defend against them. It's the kind of experience that makes you realize pretty quickly where your gaps are. #BlueHat #AISecurity #MSRC #AeGIS Microsoft Security Response Center Security conferences are really just reunions in disguise. #BlueHat 2026 was exactly that. Good vibes, familiar faces, and a few new ones. Already looking forward to the MSRC reunion in Vegas!!!
2 Comments
Alex Trafton
3w
I have been saying this for quite a while. This boundary is absolutely collapsing. Autonomous agents are a new risk vector, but time-tested first principles are the key to managing the risk.
3 Comments
Steve Lamb
2w
It’s great to see open discussion by thought leaders like Mark on the need to pay careful attention to the degree to which AI security breaches can lead to safety breaches.
TheNextGenTechInsider.com

753 followers
2w
Anthropic Increases API Usage Limits for Claude Models 📌 Anthropic is boosting API usage limits for its Claude models, removing previous bottlenecks for developers and enterprises. This expansion enables the seamless scaling of agentic workflows, allowing organizations to deploy high-throughput tasks like automated malware analysis and complex binary deobfuscation. By facilitating more intensive, iterative prompting, this update moves AI integration from experimental testing into robust, large-scale production environments. 🔗 Read more: https://lnkd.in/dtc3cgKP #Anthropic #Claudemodels #Apiusagelimits #Largelanguagemodels
ClearTrust

3,588 followers
3w
Is your 5% CTR a performance win or a crime scene? 🕵️ In the programmatic ecosystem, high engagement without conversions is the ultimate red flag for SIVT. If you aren’t auditing the silent signs, you’re likely optimizing for botnets instead of humans. 👉 Check how much ad fraud is costing you: https://lnkd.in/gUCQdvUS
The Red Flags You Can’t Ignore:
• The 3 AM Surge
• Predictable Timing
• Signature Loops
Stop the Bleed Before the Bid. ClearTrust automates your IVT filtering with pre-bid protection to ensure your budget reaches verified humans. 🛡️ Clean your traffic. Protect your ROAS. #ProgrammaticAdOps #AdTech #SIVT #AdFraud #MediaBuying #ClearTrust #MarTech
Ryan Dobson
4d
Pen testing isn’t dying. It’s becoming accessible at a scale it never was before. Watching an autonomous AI system reason its way to an ASLR bypass is one of those moments where you realise offensive security is changing permanently. Federico Kirschbaum (founder of Ekoparty, now leading Security Lab at XBOW) talks through what happens when AI can discover and exploit real vulnerabilities autonomously and what that means for pentesting, software liability, and the future role of humans in the loop. https://lnkd.in/e_KQkgM8

Federico Kirschbaum on XBOW, AI Hackers, and the Future of Pen Testing

securityconversations.fireside.fm
Raul Cachola
1w
I just built safety guardrails for Claude Code on @NextWorkHQ! In this project, I configured three layers of security to protect an AI coding agent:
- ✔️ Permission deny rules to block access to sensitive files
- ✔️ Deterministic hook scripts that intercept dangerous commands
- ✔️ A CLAUDE.md security policy for advisory guardrails
- ✔️ Red-teamed all three layers to verify defense-in-depth
https://lnkd.in/gZweV24b #ClaudeCode #AISecurity #NextWork

1 Comment
Shivam Mishra
3w
Every employee is using ChatGPT, Copilot, Gemini... but do you know what they're pasting into it? Customer data. Credentials. Patient records. Confidential strategies. Unsanctioned. Unmonitored. Every single day. 𝗔𝗰𝗿𝗼𝗻𝗶𝘀 𝗚𝗲𝗻𝗔𝗜 𝗣𝗿𝗼𝘁𝗲𝗰𝘁𝗶𝗼𝗻 𝗶𝘀 𝗵𝗲𝗿𝗲 ✔️ 1. 🔍 Visibility into every AI tool your clients use 2. 🚫 Block sensitive data from leaking into public AI 3. 🛡️ Stop prompt injection attacks Natively built into Acronis. No new tools. No new vendors. 𝗦𝗵𝗮𝗱𝗼𝘄 𝗔𝗜 𝗶𝘀 𝗮𝗹𝗿𝗲𝗮𝗱𝘆 𝗶𝗻 𝘆𝗼𝘂𝗿 𝗰𝗹𝗶𝗲𝗻𝘁𝘀' 𝗯𝘂𝘀𝗶𝗻𝗲𝘀𝘀𝗲𝘀: 𝘁𝗵𝗲 𝗾𝘂𝗲𝘀𝘁𝗶𝗼𝗻 𝗶𝘀, 𝗮𝗿𝗲 𝗬𝗢𝗨 𝗶𝗻 𝗰𝗼𝗻𝘁𝗿𝗼𝗹? 🔗 https://lnkd.in/gU_-SMVa #Acronis #GenAIProtection #ShadowAI #Cybersecurity #MSP #DataProtection https://lnkd.in/gTuk2NVd

Acronis GenAI Protection - Dashboard and Reporting

https://www.youtube.com/
