The Context Engineering Framework is quickly becoming one of the most important tools for anyone building reliable LLM systems.

Getting the model to respond is the easy part. The real challenge is:
→ What should the model know right now?
→ Where should that info come from?
→ How should it be structured, stored, retrieved, or compressed?

That’s exactly what this framework solves.

🧠 𝗪𝗵𝗮𝘁 𝗶𝘀 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴
Context engineering = designing dynamic systems that deliver the right info, in the right structure, at the right time, so models can reason, retrieve, and respond effectively. This matters most in agents, copilots, retrieval-augmented pipelines, and anything with memory or tools.

⚙️ 𝗜𝗻𝘀𝗶𝗱𝗲 𝘁𝗵𝗲 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 𝗙𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸
Here’s the 3-layer system I use when designing end-to-end LLM workflows 👇

1️⃣ Context Retrieval & Generation
→ Prompt Engineering & Context Generation
→ External Knowledge Retrieval
→ Dynamic Context Assembly

2️⃣ Context Processing
→ Long Sequence Processing
→ Self-Refinement & Adaptation
→ Structured + Relational Information Integration

3️⃣ Context Management
→ Fundamental Constraints (tokens, latency, structure)
→ Memory Hierarchies & Storage Architectures
→ Context Compression & Trimming

🧱 All of this feeds into the Context Engine, which handles:
→ User Prompts
→ Retrieved Info
→ Available Tools
→ Long-Term Memory

This is what gives your system continuity, task awareness, and reasoning depth across steps.

⚙️ Tools I would recommend:
→ LangGraph for orchestration + memory
→ Fireworks AI for fast, open-weight inference
→ LlamaIndex for modular retrieval
→ Redis & vector DBs for scoped memory recall
→ Claude/Mistral for summarization and compression

If your system is hallucinating, drifting, or missing the mark, it’s likely a context failure, not a prompt failure.

📌 Save this framework.
📩 Share it with your team before your next agent or RAG deployment.
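As a toy illustration of the Context Engine's assembly step, here is a minimal Python sketch. The `ContextEngine` class, the section headers, and the character budget are my own stand-ins, not part of the framework; a real system would count tokens and delegate orchestration to something like LangGraph:

```python
from dataclasses import dataclass, field

@dataclass
class ContextEngine:
    """Toy context engine: merges user prompt, retrieved info,
    tool descriptions, and long-term memory into one model input."""
    max_chars: int = 2000          # crude stand-in for a token budget
    memory: list = field(default_factory=list)

    def assemble(self, user_prompt: str, retrieved: list, tools: list) -> str:
        sections = [
            "## Long-term memory\n" + "\n".join(self.memory[-3:]),
            "## Retrieved info\n" + "\n".join(retrieved),
            "## Available tools\n" + "\n".join(tools),
            "## User prompt\n" + user_prompt,
        ]
        context = "\n\n".join(sections)
        # Context management layer: trim oldest material first if over budget
        return context[-self.max_chars:]

engine = ContextEngine()
engine.memory.append("User prefers concise answers.")
prompt = engine.assemble(
    user_prompt="Summarize our Q3 retrieval latency issues.",
    retrieved=["Doc 12: p95 latency rose 40% after index migration."],
    tools=["search(query) -> docs"],
)
print(prompt)
```

The point of the sketch is the separation of the four inputs: each layer of the framework decides *what* goes into its section before the engine concatenates and trims.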
〰️〰️〰️ Follow me (Aishwarya Srinivasan) for real-world GenAI system breakdowns, and subscribe to my Substack for deep dives and weekly insights: https://lnkd.in/dpBNr6Jg
Designing For Multimodal Interactions
-
𝗗𝗲𝘀𝗶𝗴𝗻𝗶𝗻𝗴 𝗖𝗼𝗻𝘁𝗲𝘅𝘁-𝗔𝘄𝗮𝗿𝗲 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁𝘀: 𝗧𝗵𝗲 𝟲 𝗗𝗶𝗺𝗲𝗻𝘀𝗶𝗼𝗻𝘀 𝗼𝗳 𝗖𝗼𝗻𝘁𝗲𝘅𝘁

Building AI agents isn’t just about fine-tuning prompts or plugging in APIs. The real differentiator lies in how effectively we design and manage context. Context defines the agent’s role, behavior, reasoning, and decision-making. Without it, even the best models act inconsistently. With it, agents become reliable, explainable, and enterprise-ready.

Here are the 6 essential types of context for AI agents:

𝟭. 𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀 – Define the who, why, and how:
• Role (persona, e.g., PM, coding assistant, researcher)
• Objective (business value, outcomes, success criteria)
• Requirements (steps, constraints, formats, conventions)

𝟮. 𝗘𝘅𝗮𝗺𝗽𝗹𝗲𝘀 – Demonstrate desired (and undesired) patterns:
• Behavior examples (step sequences, workflows)
• Response examples (positive/negative outputs)

𝟯. 𝗞𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 – Embed domain and system understanding:
• External context (business model, strategy, systems)
• Task context (workflows, procedures, structured data)

𝟰. 𝗠𝗲𝗺𝗼𝗿𝘆 – Extend reasoning across time:
• Short-term memory (chat history, state, reasoning steps)
• Long-term memory (facts, episodic experiences, procedural instructions)

𝟱. 𝗧𝗼𝗼𝗹𝘀 – Extend capability beyond training data:
• Tool descriptions act as micro-prompts
• Parameters and examples guide usage

𝟲. 𝗧𝗼𝗼𝗹 𝗥𝗲𝘀𝘂𝗹𝘁𝘀 – Close the loop by feeding outputs back into reasoning:
• Orchestration layers attach results
• Enables agents to adapt dynamically

𝗪𝗵𝘆 𝗶𝘁 𝗺𝗮𝘁𝘁𝗲𝗿𝘀: By designing across all six dimensions, we move beyond “prompt engineering” into structured context engineering. This makes agents:
• More autonomous
• More explainable
• Easier to scale across enterprise systems

In practice, this framework underpins everything from agent orchestration protocols (MCP, A2A) to multi-agent architectures in production.

Question for you: When building AI agents, which of these six contexts have you found most challenging to implement at scale?
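The six dimensions can be made concrete as one explicit data structure that is serialized into the model's input each turn. This is a minimal sketch under my own naming (the `AgentContext` class and block headers are illustrative, not a standard API):

```python
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    # The six dimensions from the post, held as one explicit structure
    instructions: str                                  # 1. role, objective, requirements
    examples: list = field(default_factory=list)       # 2. desired/undesired patterns
    knowledge: list = field(default_factory=list)      # 3. domain/system understanding
    memory: list = field(default_factory=list)         # 4. short- and long-term
    tools: list = field(default_factory=list)          # 5. tool descriptions
    tool_results: list = field(default_factory=list)   # 6. fed back each turn

    def to_prompt(self) -> str:
        blocks = {
            "INSTRUCTIONS": [self.instructions],
            "EXAMPLES": self.examples,
            "KNOWLEDGE": self.knowledge,
            "MEMORY": self.memory,
            "TOOLS": self.tools,
            "TOOL RESULTS": self.tool_results,
        }
        # Emit only non-empty blocks, in a fixed order
        return "\n\n".join(
            f"# {name}\n" + "\n".join(items)
            for name, items in blocks.items() if items
        )

ctx = AgentContext(instructions="You are a PM assistant. Objective: draft release notes.")
ctx.tools.append("jira_search(query) -> issues")
ctx.tool_results.append("jira_search: 3 issues closed this sprint")
print(ctx.to_prompt())
```

Dimension 6 is the loop-closing one: an orchestration layer appends each tool output to `tool_results` before the next model call, so the agent can adapt to what it just observed.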
-
People think RAG is just “retrieve → generate.” That version is already outdated.

As models get stronger, the real bottleneck isn’t generation. It’s how, when, and why you retrieve. That’s why RAG is evolving fast.

Here are 12 advanced RAG patterns that show where things are heading and what problems teams are actually solving now:

1. Mindscape-Aware RAG
Builds a global view first, then retrieves with intent. Useful when long context matters more than isolated chunks.

2. Hypergraph Memory RAG
Stores facts as connected graphs so multi-hop reasoning works across retrieval steps.

3. QUCO-RAG
Triggers retrieval based on suspicious or rare entities, reducing confident hallucinations.

4. HiFi-RAG
Uses cheap models to filter early and strong models later, cutting cost without losing quality.

5. Bidirectional RAG
Writes verified answers back into memory, but only after grounding checks pass.

6. TV-RAG
Adds time awareness for video and long media, aligning text, frames, and events.

7. MegaRAG
Uses multimodal knowledge graphs to reason across books, visuals, and long documents.

8. AffordanceRAG
Retrieves only actions that are physically possible, designed for robots and agents.

9. Graph-01
Agent-driven GraphRAG that explores paths step by step using planning and search.

10. SignRAG
Vision + retrieval for recognizing signs without training new models.

11. Hybrid Multilingual RAG
Handles noisy OCR and multilingual data with query expansion and grounded fusion.

12. RAGPART + RAGMASK
Defends against poisoned corpora by masking suspicious tokens and similarity shifts.

The big shift is clear: RAG is no longer a single pipeline. It’s becoming a design space.

The question isn’t “Should we use RAG?” It’s “Which RAG pattern matches our failure mode?”

Which one of these do you think will become mainstream first?
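To make the "when to retrieve" shift concrete, here is a toy sketch of the core idea behind pattern 3: retrieve only when the query mentions an entity the model is unlikely to know. Everything here is illustrative (the frequency table, the capitalization-based entity stand-in, and the made-up entity "Zephyrnet-9X"); it is not the actual QUCO-RAG algorithm:

```python
# Familiarity scores for entities; a real system would estimate these from
# pretraining-corpus statistics or model confidence. "zephyrnet-9x" is a
# deliberately fictional, rare entity.
ENTITY_FREQ = {"python": 0.9, "linux": 0.8, "zephyrnet-9x": 0.001}

def extract_entities(query: str) -> list:
    # Stand-in for real NER: treat capitalized words as entities
    return [w for w in query.split() if w[0].isupper()]

def needs_retrieval(query: str, threshold: float = 0.01) -> bool:
    """Trigger retrieval iff the query contains a rare/unknown entity."""
    entities = extract_entities(query)
    return any(ENTITY_FREQ.get(e.lower(), 0.0) < threshold for e in entities)

print(needs_retrieval("how do Python decorators work"))   # common entity
print(needs_retrieval("what does Zephyrnet-9X do"))       # rare entity
```

The design point: retrieval becomes a gated decision driven by an estimated failure mode (confident hallucination on rare entities), instead of a step that runs on every query.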
-
The Future of Immersion is Headset-Free? 😇

We often talk about the Metaverse being accessible via VR/AR headsets, but what happens when the most powerful immersive experience is shared, device-free, and right in front of you?

The Shanghai Natural History Museum’s “China’s Dinosaur World” exhibition offers a powerful answer. They’re using large-scale projection mapping and physical space to bring 118 dinosaur specimens to life. Visitors literally walk through a prehistoric world, without a single tether or headset on their face.

This isn’t just a cool effect; it’s a profound demonstration of how to scale presence and communal engagement.

The key insight? True immersion isn’t always about personal isolation. It’s about collective experience.

We need to stop framing immersive tech solely through the lens of hardware. The real innovation lies in the experience design: leveraging technologies like projection mapping and spatial AR to make digital content accessible and communal for a massive audience. It democratizes the experience, making the ‘Metaverse’ a space for everyone, not just early adopters.

Think about corporate training, product showcases, or massive-scale entertainment. The museum’s approach proves that “shared reality” is perhaps the most impactful reality of all.

What’s the most compelling headset-free immersive experience you’ve encountered? 💡
#AugmentedReality #VirtualReality #SpatialComputing #ExperientialDesign #Museums #EmergingTechnology
-
Excited to share a groundbreaking advancement in Retrieval-Augmented Generation! Researchers from KAIST and DeepAuto.ai have just released their paper on “UniversalRAG,” a novel framework that addresses a critical limitation in current RAG systems.

While traditional RAG approaches have shown promise in improving factual accuracy, they’re typically limited to a single modality (text-only) or corpus. UniversalRAG changes the game by dynamically retrieving and integrating knowledge from heterogeneous sources with diverse modalities (text, images, videos) and granularities.

>> Technical Innovation

The key insight behind UniversalRAG is addressing the “modality gap” problem. When forcing all modalities into a unified representation space, retrieval tends to favor items from the same modality as the query, overlooking relevant content from other modalities.

Instead of using a unified embedding space, UniversalRAG introduces a modality-aware routing mechanism that:
1. Maintains separate embedding spaces for each modality (text, image, video)
2. Dynamically identifies the most appropriate modality for each query
3. Routes the query to the corresponding modality-specific corpus

Beyond modality, UniversalRAG also organizes each modality into multiple granularity levels:
- Text: paragraph-level and document-level
- Video: clip-level and full-length video
- Images: kept intact as they’re inherently piecemeal

>> Implementation Details

The framework includes a Router component that predicts the most suitable retrieval type from six options: None, Paragraph, Document, Image, Clip, or Video. The researchers explored both training-free routers (using pretrained LLMs) and trained routers optimized specifically for the routing task.

When validated across 8 benchmarks spanning multiple modalities, UniversalRAG consistently outperformed modality-specific and unified baselines, demonstrating the effectiveness of its adaptive corpus selection approach.
This research opens exciting possibilities for more flexible and accurate information retrieval systems that can truly understand and respond to the diverse knowledge requirements of real-world queries.
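The routing idea is easy to sketch: one corpus per retrieval type, and a router that picks exactly one type per query. The keyword heuristic below is purely a stand-in for the paper's LLM-based or trained router, and the corpus contents are placeholders:

```python
# Separate corpora per retrieval type (contents are placeholders).
# "none" means: answer from parametric knowledge, retrieve nothing.
CORPORA = {
    "paragraph": ["..."], "document": ["..."],
    "image": ["..."], "clip": ["..."], "video": ["..."],
}

def route(query: str) -> str:
    """Pick one of the six retrieval types. A keyword heuristic stands in
    for a real (LLM-based or trained) router here."""
    q = query.lower()
    if any(w in q for w in ("show", "look like", "diagram")):
        return "image"
    if any(w in q for w in ("scene", "moment", "happens at")):
        return "clip"
    if "throughout" in q or "entire" in q:
        return "video"
    if len(q.split()) > 15:              # broad question -> coarse granularity
        return "document"
    if q.startswith(("who", "when", "what year")):
        return "paragraph"               # simple factoid
    return "none"

def retrieve(query: str):
    kind = route(query)
    return kind, CORPORA.get(kind, [])   # "none" maps to an empty result

print(retrieve("What does a transformer block diagram look like?"))
```

The structural point matches the paper's setup: because each modality keeps its own embedding space and corpus, the router decision happens *before* any similarity search, so cross-modality score comparison (the modality gap) never arises.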
-
I scrapped an entire video yesterday.

Not because the content was bad. The writing was clear. The visuals were polished. The delivery was energetic.

But I realized it was solving the wrong problem.

Our customers don’t struggle with WHAT to do in our software. They struggle with WHEN and WHY to use certain features.

Context is everything in learning design. You can create the most beautiful explanation of a feature, but if learners don’t understand when to apply it in their workflow, you’ve failed.

Great instructional designers don’t just organize information: they organize relevance. They don’t just deliver content. They deliver the situational awareness to make that content useful.

This is why we need to stop obsessing over content creation and start obsessing over context creation.

Ask yourself: Do your learners understand not just the “how” but the “when” and “why”?

What’s one way you’ve improved the context in your learning experiences?
-
Enterprises today are drowning in multimodal data - text, images, audio, video, time-series, and more. Large multimodal LLMs promise to make sense of this, but in practice, embeddings alone often collapse nuance and context. You get fluency without grounding, answers without reasoning, “black boxes” where transparency matters most.

That’s why the new IEEE paper “Building Multimodal Knowledge Graphs: Automation for Enterprise Integration” by Ritvik G, Joey Yip, Revathy Venkataramanan, and Dr. Amit Sheth really resonates with me. Instead of forcing LLMs to carry the entire cognitive burden, their framework shows how automated Multimodal Knowledge Graphs (MMKGs) can bring structure, semantics, and provenance into the picture.

What excites me most is the way the authors combine two forces that usually live apart. On one side, bottom-up context extraction: pulling meaning directly from raw multimodal data like text, images, and audio. On the other, top-down schema refinement: bringing in structure, rules, and enterprise-specific ontologies. Together, this creates a feedback loop between emergence and design: the graph learns from the data but also stays grounded in organizational needs.

And this isn’t just theoretical elegance. In their Nourich case study, the framework shows how a food image, ingredient list, and dietary guidelines can be linked into a multimodal knowledge graph that actually reasons about whether a recipe is suitable for a diabetic vegetarian diet, and then suggests structured modifications. That’s enterprise relevance in action.

To me, this signals a bigger shift: LLMs alone won’t carry enterprise AI into the future. The future is neurosymbolic, multimodal, and automated. Enterprises that invest in these hybrid architectures will unlock explainability, scale, and trust in ways current “all-LLM” strategies simply cannot.
Link to the paper -> https://lnkd.in/gv93znbQ #KnowledgeGraphs #MultimodalAI #NeurosymbolicAI #EnterpriseAI #KnowledgeGraphLifecycle #MMKG #AIResearch #Automation #EnterpriseIntegration
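In the spirit of the Nourich example, here is a toy triple store where graph nodes can reference text or image artifacts, and a simple rule layer checks diet suitability. The schema, recipe, and rules are entirely illustrative, not the paper's implementation:

```python
# Toy multimodal knowledge graph: (subject, predicate, object) triples,
# where objects may be other nodes or media artifacts (e.g. an image path).
triples = [
    ("recipe:pad_thai", "hasImage", "img/pad_thai.jpg"),
    ("recipe:pad_thai", "hasIngredient", "ingredient:chicken"),
    ("recipe:pad_thai", "hasIngredient", "ingredient:sugar"),
    ("ingredient:chicken", "isA", "meat"),
    ("ingredient:sugar", "isA", "high_glycemic"),
]

def objects(subj: str, pred: str) -> list:
    return [o for s, p, o in triples if s == subj and p == pred]

def diet_violations(recipe: str, diet: set) -> list:
    """Top-down schema rules applied to bottom-up extracted facts."""
    violations = []
    for ing in objects(recipe, "hasIngredient"):
        kinds = objects(ing, "isA")
        if "vegetarian" in diet and "meat" in kinds:
            violations.append((ing, "contains meat"))
        if "diabetic" in diet and "high_glycemic" in kinds:
            violations.append((ing, "high glycemic"))
    return violations

print(diet_violations("recipe:pad_thai", {"diabetic", "vegetarian"}))
```

The `hasImage` edge is what makes this "multimodal" in miniature: a vision pipeline would populate `hasIngredient` edges from the photo (bottom-up), while the diet rules come from the ontology (top-down), and each violation is traceable to a specific triple, which is the provenance/explainability win.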
-
Shopping centres must become experiential arenas!

The term ‘experiential arenas’ comes from Diana Teixeira Pinto and aligns with my view of how to design worlds, not spaces. So how do we transform spaces into worlds?

Here are some of my top design principles for executing successful Experiential Arenas:

Build Worlds, Not Spaces
Design destinations that transport people into new realities, not just corridors of commerce.

Colour as Energy
Bold, surprising palettes and patterns that lift mood and inject personality into every corner.

Wellness in Motion
Seating that heals, greenery that breathes, zones that invite pause and reset through biophilic design. Shopping should restore, not exhaust.

Fill the Forgotten
Atriums, rooftops, stairwells, and voids become playgrounds for art, light, and imagination.

Sensory Immersion
Use sound, scent, light, and texture as storytelling layers to spark memory and emotion.

Everywhere’s a Canvas
Turn escalators, walkways, and food courts into theatres for entertainment, surprise, and play.

Participation Over Passivity
Invite people to co-create through interactive art, digital play, gamified shopping, and communal rituals.

Play is Serious Business
Design joy into the architecture: swings as benches, slides as shortcuts, playful touchpoints everywhere.

Local Stories, Global Scale
Embed local culture, artists, and narratives, then amplify them into experiences with global resonance.

Micro-Magic
Surprise through small details like bins that talk, ceilings that glow, restrooms that delight.

Fluid & Ever-Changing
Keep spaces alive with rotating installations, seasonal scenography, and pop-up moments of wonder.

Sustainable Spectacle
Awe doesn’t need waste: design modular, reusable, and eco-conscious experiences that wow responsibly.

Community as Stage
Curate experiences where people become part of the show, from live performance to collaborative design.
Memory is the Metric
Success isn’t footfall, it’s stories: people leave with moments worth retelling, not just receipts.

Elena Knezović #retail #architecture #interior #design
-
How do materials fail, and how can we design stronger, tougher, and more resilient ones?

Published in #PNAS, our physics-aware AI model integrates advanced reasoning, rational thinking, and strategic planning capabilities with the ability to write and execute code, perform atomistic simulations to solicit new physics data from “first principles”, and conduct visual analysis of graphed results and molecular mechanisms. By employing a multiagent strategy, these capabilities are combined into an intelligent system designed to solve complex scientific analysis and design tasks, as applied here to alloy design and discovery.

This is significant because our model overcomes the limitations of traditional data-driven approaches by integrating diverse AI capabilities (reasoning, simulations, and multimodal analysis) into a collaborative system, enabling autonomous, adaptive, and efficient solutions to complex, multiobjective materials design problems that were previously slow, expert-dependent, and domain-specific.

Wonderful work by my postdoc Alireza Ghafarollahi!

Background: The design of new alloys is a multiscale problem that requires a holistic approach involving retrieving relevant knowledge, applying advanced computational methods, conducting experimental validations, and analyzing the results, a process that is typically slow and reserved for human experts. Machine learning can help accelerate this process, for instance through deep surrogate models that connect structural and chemical features to material properties, or vice versa. However, existing data-driven models often target specific material objectives, offer limited flexibility to integrate out-of-domain knowledge, and cannot adapt to new, unforeseen challenges. Our model overcomes these limitations by leveraging the distinct capabilities of multiple AI agents that collaborate autonomously within a dynamic environment to solve complex materials design tasks.
The proposed physics-aware generative AI platform, AtomAgents, synergizes the intelligence of LLMs and the dynamic collaboration among AI agents with expertise in various domains, including knowledge retrieval, multimodal data integration, physics-based simulations, and comprehensive results analysis across modalities. The concerted effort of the multiagent system allows for addressing complex materials design problems, as demonstrated by examples that include autonomously designing metallic alloys with enhanced properties compared to their pure counterparts. We demonstrate accurate prediction of key characteristics across alloys and highlight the crucial role of solid-solution alloying in steering the development of new alloys.

Paper: https://lnkd.in/enusweMf
Code: https://lnkd.in/eWv2eKwS

MIT Schwarzman College of Computing
MIT Civil and Environmental Engineering
MIT Department of Mechanical Engineering (MechE)
MIT Industrial Liaison Program
MIT School of Engineering
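The propose-simulate-analyze loop at the heart of such multiagent systems can be shown in a generic toy sketch. To be clear, this is my own minimal stand-in for the pattern (the agent functions, the fake "strength" curve, and the thresholds are all invented), not the AtomAgents implementation:

```python
def planner(history: list) -> float:
    # Planner agent: propose the next solute fraction to try (toy heuristic:
    # step up from the last proposal).
    last = history[-1][0] if history else 0.0
    return min(1.0, last + 0.25)

def simulator(fraction: float) -> float:
    # Stand-in for a physics-based simulation: a fake strength score
    # peaking at a 0.5 solute fraction.
    return 1.0 - (fraction - 0.5) ** 2

def analyst(history: list, target: float = 0.9) -> bool:
    # Analyst agent: decide whether any result meets the design target.
    return any(score >= target for _, score in history)

history = []
for _ in range(5):                     # bounded collaboration loop
    frac = planner(history)
    history.append((frac, simulator(frac)))
    if analyst(history):
        break

best = max(history, key=lambda h: h[1])
print(f"best solute fraction {best[0]:.2f}, score {best[1]:.2f}")
```

The design point is the division of labor: no single agent needs the whole picture, and the loop terminates either on success (analyst) or on a hard iteration budget, which is what keeps autonomous collaboration bounded.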
-
The best design will soon be invisible.

Interfaces used to ask: what do you need? Agentic AI flips it to: what have I already handled?

The surface shrinks. We’ll gesture less and grant more permission. Location, calendar, biometrics, preference history: these signals replace tap-and-type. The UI only shows up when confidence drops and the agent needs clarity.

The foreground becomes explanation: “Here’s what I did, veto if wrong.” The background is silent execution.

Multimodal stops being a demo trick. Voice for speed. Text for precision. Glanceable cards for audit. Users glide across modes instead of switching apps.

Design shifts from fetching tasks to negotiating autonomy. Micro-copy matters more than motion. Reversible actions matter more than dark-mode flair. If an agent moves money or publishes words, it owes the user a trail they can scan in seconds.

Solving for who makes invisible work feel trustworthy is the edge. Build the layer that hides the work and surfaces the proof.

Boundless Ventures
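The "UI only shows up when confidence drops" rule fits in a few lines. This is a conceptual sketch with invented names and thresholds, not any product's API:

```python
audit_trail = []   # the user-scannable trail the post calls for

def act(action: str, confidence: float, reversible: bool,
        threshold: float = 0.8) -> str:
    """Execute silently only when the agent is confident AND the action
    is reversible; otherwise foreground the decision to the user."""
    if confidence >= threshold and reversible:
        audit_trail.append(f"DID: {action} (veto if wrong)")
        return "executed"
    # Low confidence or irreversible (money, publishing): ask first
    audit_trail.append(f"ASKED: {action}?")
    return "needs_user"

print(act("archive 12 read newsletters", confidence=0.95, reversible=True))
print(act("wire $4,000 to contractor", confidence=0.97, reversible=False))
print(audit_trail)
```

Note the second call: high confidence alone is not enough. Irreversible actions always surface to the foreground, which is exactly the "negotiating autonomy" boundary the post describes.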