As multi-modal models become the norm, I think we are going to see a lot more impact from these "old school" adversarial ML attacks. If your whole AI red-team approach is natural-language prompt injection and you don't understand how these models actually work, you are going to miss a large attack surface.
The old magic still works. It's not all prompts... there are still gradients to follow and decision boundaries to jump. I talk about PGD (Projected Gradient Descent) here, but I also succeeded with C&W (Carlini-Wagner) and HSJ (HopSkipJump).
Here's my latest from the NVIDIA AI Red Team: https://lnkd.in/gfiziis6
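For readers who haven't seen PGD before, here is a minimal sketch of the idea on a toy logistic-regression classifier, in pure NumPy with an analytic gradient. The model, epsilon, and step sizes are illustrative placeholders (real attacks target full neural networks via autodiff), but the loop shows the core move: ascend the loss gradient with respect to the input, then project back into an L-infinity ball around the original sample.

```python
import numpy as np

def pgd_attack(x, y, w, b, eps=2.0, alpha=0.2, steps=20):
    """L-infinity PGD against a toy logistic-regression model.

    Repeatedly takes a signed gradient-ascent step on the BCE loss
    w.r.t. the input, then projects back into the eps-ball around x.
    """
    x_adv = x.copy()
    for _ in range(steps):
        z = w @ x_adv + b
        p = 1.0 / (1.0 + np.exp(-z))           # sigmoid probability of class 1
        grad = (p - y) * w                      # analytic dLoss/dx for BCE
        x_adv = x_adv + alpha * np.sign(grad)   # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
    return x_adv

# A point confidently classified as class 1 gets pushed across the boundary.
x = np.array([1.0, -1.0])
w = np.array([2.0, -1.0])
x_adv = pgd_attack(x, y=1.0, w=w, b=0.0)
print(w @ x, w @ x_adv)  # original logit positive, adversarial logit negative
```

The same structure carries over to deep models: swap the analytic gradient for a backward pass and the logistic loss for the network's loss.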
OpenAI just detached from the Nvidia lifeline for inference, and the global model war has shifted from "biggest parameters" to "lowest latency." While Silicon Valley slept, Beijing and New Delhi executed massive releases that redefine cost-efficiency and sovereign data control.
1. OpenAI Breaks Hardware Monogamy with Cerebras Deployment
In a strategic pivot away from total Nvidia dependency, OpenAI has rolled out GPT-5.3-Codex-Spark exclusively on Cerebras Wafer-Scale Engine 3 (WSE-3) chips. This research preview is designed specifically for "interruptible" coding workflows.
Technical Impact: By utilizing wafer-scale integration (massive on-chip memory), OpenAI has bypassed the memory bandwidth bottlenecks typical of GPU clusters. The result is over 1,000 tokens per second per stream. This shifts the bottleneck from model inference time to the developer's ability to read code.
2. MiniMax M2.5: The Price-Performance Assassin
Chinese lab MiniMax dropped M2.5 (Open Weights) and its "Lightning" variant, claiming a massive victory in cost efficiency. The model boasts an 80.2% score on SWE-Bench Verified, putting it within striking distance of closed-source western giants.
Technical Impact: The model utilizes a highly aggressive Mixture-of-Experts (MoE) architecture optimized for "agent-native" tasks via extensive Reinforcement Learning (RL). The architectural efficiency allows them to offer pricing at roughly $1/hour for continuous 100 tok/sec usage, a figure that undercuts AWS Bedrock and Azure pricing structures significantly.
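A quick back-of-the-envelope conversion puts that quoted price in the more familiar per-million-token units (the $1/hour and 100 tok/sec figures are from the post; the conversion itself is just arithmetic):

```python
# Convert "$1/hour for continuous 100 tok/sec" into $ per 1M tokens.
price_per_hour = 1.00                      # USD, figure quoted above
tokens_per_sec = 100
tokens_per_hour = tokens_per_sec * 3600    # 360,000 tokens per hour
price_per_million = price_per_hour / tokens_per_hour * 1_000_000
print(f"${price_per_million:.2f} per 1M tokens")  # ≈ $2.78 per 1M tokens
```

That is the rate assuming the stream is saturated around the clock; any idle time raises the effective per-token cost.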
3. India's Sarvam Wins on Regional Specialization
Ahead of today's AI Impact Summit in New Delhi, Indian startup Sarvam released Saaras V3. The model has successfully outperformed Gemini 3 Pro and Deepgram on the IndicVoices and olmOCR-Bench (84.3% accuracy) benchmarks.
Technical Impact: This proves that generalized foundational models (like GPT-5 series or Gemini) suffer from the "curse of multilinguality" where tokenization remains inefficient for non-Latin scripts. Sarvam’s specialized encoders for Indian languages and OCR noise reduction provide higher fidelity at a fraction of the parameter count of US models.
4. ByteDance Seedance 2.0 vs. Hollywood
ByteDance released Seedance 2.0, a text-to-video model capable of generating "cinematic" outputs in seconds. The release was met with immediate legal threats from US entertainment guilds (MPA, SAG-AFTRA) regarding copyright data usage.
Technical Impact: While the diffusion pipelines have improved for temporal consistency, the real story is the lack of "safety filters" compared to OpenAI’s Sora or Google’s Veo. Seedance 2.0 appears to run with minimal RAG-based censorship layers, allowing for faster generation but significantly higher liability.
#CerebrasWSE3 #InferenceLatency #SovereignAI #GenerativeAI #TechNews
Pure Storage integrates its KVA with NVIDIA Dynamo to speed up #AI inference and reduce latency for large language models. This helps enterprises run #AIWorkloads faster and receive results in seconds. The integration gives enterprises more control over their AI pipelines and helps them scale projects with confidence. #DataAcceleration
Read the article to learn more about the integration: https://hubs.la/Q040k1kc0
Founder & CTO | Building and teaching AI Agents | CITA Member | France’s Top 2% voice in AI
🚨 Breaking: First Andrew Ng, now NVIDIA’s own researchers are backing the same two conclusions:
- Small language models are better than LLMs for agentic AI
- Better orchestration leads to cost-efficient AI agents
Here are two papers explaining this 👇
📌 Paper 1: "Small Language Models are the Future of Agentic AI"
- NVIDIA argues that agents don't need general conversational genius 100% of the time.
- Actual agentic workflows are mostly narrow tasks:
• "Format this JSON"
• "Call the weather API"
• "Check if this date is valid"
The SLM Advantage
- For these tasks, Small Language Models (SLMs, <10B params) are not just "good enough"; they are better
• Latency: They react instantly (critical for multi-step agents)
• Cost: 10-30x cheaper per token
• Control: Easier to fine-tune for strict protocols than a stubborn giant model
But wait...
"If we use small models, won't the agent get dumber?"
This is where the second paper changes the game
"Intelligence isn't just raw parameter count; it's about organization"
🔗 https://lnkd.in/dajzm3TV
📌 Paper 2: "ToolOrchestra"
NVIDIA introduced Orchestrator-8B
It’s not a genius at everything, but it’s a genius at management.
It acts as a "Router"
It analyzes a request and decides: "Do I need the expensive reasoning model for this? Or can I use a cheap calculator tool?"
- The Results are Wild
They tested this on the "Humanity's Last Exam" (HLE) benchmark.
• GPT-5 (Monolithic): 35.1% accuracy
• Orchestrator-8B (Router): 37.1% accuracy
The kicker?
The Orchestrator system was 2.5x more efficient and cost ~30% of the monolithic baseline
TLDR:
Building "Agentic Systems" by just wrapping a prompt around a massive API is a financial dead end
As usage scales, your margins vanish
The orchestrated approach keeps intelligence high but slashes the compute bill by ~70%
• Monolithic Agents: Expensive, slow, wasteful
• The NVIDIA Way: Orchestrate specialized SLMs.
🔗 https://lnkd.in/dJc8-Res
📌 Key Insight: Intelligence is about routing to the right tool, not being the tool
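The routing pattern above can be sketched in a few lines. The heuristic and the function names here (`classify_difficulty`, `handle`) are illustrative placeholders, not the actual Orchestrator-8B policy, which the paper trains via reinforcement learning; the point is only the control flow: a cheap gatekeeper decides whether a request needs the expensive model at all.

```python
# Minimal sketch of the orchestrator/router pattern described above.
# Keyword heuristic and model labels are placeholders for illustration.

def classify_difficulty(request: str) -> str:
    """Toy stand-in for the router's learned decision policy."""
    hard_markers = ("prove", "derive", "multi-step", "why")
    if any(marker in request.lower() for marker in hard_markers):
        return "hard"
    return "easy"

def handle(request: str) -> str:
    """Route easy requests to a cheap SLM/tool, hard ones to a frontier LLM."""
    if classify_difficulty(request) == "easy":
        return f"[SLM] {request}"           # cheap, fast path
    return f"[frontier LLM] {request}"      # expensive reasoning path

print(handle("Format this JSON"))            # goes to the small model
print(handle("Prove this invariant holds"))  # escalates to the big model
```

In production the router itself would be a small fine-tuned model and the branches would be real API calls, but the cost savings come from exactly this asymmetry: most traffic never touches the expensive path.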
I really don't understand why most enterprises burn through so many costly tokens using OpenAI's and Claude's big models to build agents.
Small reasoning and language models can be deployed to handle sub-tasks in an orchestrator-worker framework.
Why can't they deploy small open-source models on AWS and make the orchestration intelligent enough (following RL concepts)?
If the agentic-AI momentum keeps building only on models from a few superpowers, they are just powering the centralisation of data into a few hands (plus the CO2 released), and India becomes their datacentre.
Very informative. Small LLMs act like the fast working memory of an agent, while big LLMs act like the deep thinker.
They handle repetitive, frequent, low-risk tasks: tool selection, routing, classification, summarization, extracting fields, maintaining conversation state, and validating outputs. Because they’re cheap and fast, the agent can think step-by-step without calling an expensive model every time.
The large model is only called for hard reasoning.
So:
Small LLM = reflexes
Big LLM = brain
This makes agents faster, cheaper, and more reliable.
You don't just pop antibiotics for a minor headache. The logic of the assumption sounds about right, especially when backed by two industry giants.
The smartest agent isn’t the biggest - it’s the best organized.
Orchestrator-8B beats GPT-5 on accuracy at 30% of the cost. Scale ≠ intelligence. Routing = intelligence.
Great insight. Using expensive models everywhere increases latency and cost. Depending on the task, an SLM router plus regex-based tools can often deliver the same result, faster and cheaper.