As LLMs power more sophisticated systems, we need to think beyond prompting. Here are the 3 core strategies every AI builder should understand:

🔵 Fine-Tuning
You’re not just training a model, you’re permanently altering its weights to learn domain-specific behaviors. It’s ideal when:
• You have high-quality labeled data
• Your use case is stable and high-volume
• You need long-term memory baked into the model

🟣 Prompt Engineering
Often underestimated, but incredibly powerful. It’s not about clever phrasing, it’s about mapping cognition into structure. You’re reverse-engineering the model’s “thought process” to:
• Maximize signal-to-noise
• Minimize ambiguity
• Inject examples (few-shot) that shape response behavior

🔴 Context Engineering
This is the game-changer for dynamic, multi-turn, and agentic systems. Instead of changing the model, you change what it sees (see the sketch below). It relies on:
• Chunking, embeddings, and retrieval (RAG)
• Injecting relevant context at runtime
• Building systems that can “remember,” “reason,” and “adapt” without retraining

If you’re building production-grade GenAI systems, context engineering is fast becoming non-optional. Prompting gives you precision. Fine-tuning gives you permanence. Context engineering gives you scalability.

Which one are you using today?
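To make the context-engineering idea concrete, here is a minimal sketch of retrieval plus runtime context injection. The embed() function is a toy bag-of-words stand-in and the documents are invented; a real system would use an embedding model, a vector store, and a proper chunking step.

```python
# Minimal sketch of runtime context injection (the "RAG" piece of context
# engineering). embed() is a toy bag-of-words stand-in for a real embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: lowercase bag-of-words counts (placeholder only).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Inject only the most relevant chunks into the model's context window.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    "Refund requests must be filed within 30 days of purchase.",
    "Our API rate limit is 100 requests per minute per key.",
    "Support is available Monday through Friday, 9am to 5pm.",
]
print(build_prompt("How long do customers have to request a refund?", docs))
```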
Cognitive Computing Strategies
Explore top LinkedIn content from expert professionals.
Summary
Cognitive computing strategies refer to methods and systems that enable computers to mimic human-like reasoning, adapt to complex tasks, and solve problems using advanced AI techniques. These approaches blend machine learning, natural language processing, and cognitive modeling to create AI systems that can plan, remember, and interact in dynamic environments.
- Apply prompt engineering: Structure prompts and examples carefully to guide AI thinking, improve clarity, and generate more reliable responses for different tasks.
- Build agentic systems: Design AI workflows that plan, use tools, and verify results so your system can automate complex business processes and convert goals into real actions.
- Blend cognitive modeling: Use models based on human thought processes to test design assumptions and ensure your AI systems remain transparent, trustworthy, and easy for people to use.
-
𝐎𝐧𝐞 𝐭𝐡𝐢𝐧𝐠 𝐰𝐢𝐥𝐥 𝐬𝐞𝐩𝐚𝐫𝐚𝐭𝐞 𝐭𝐡𝐞 𝐖𝐢𝐧𝐧𝐞𝐫𝐬 𝐨𝐟 𝐀𝐈: 𝐭𝐡𝐞𝐢𝐫 𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐭𝐨 𝐝𝐞𝐬𝐢𝐠𝐧 𝐚𝐠𝐞𝐧𝐭𝐢𝐜 𝐬𝐲𝐬𝐭𝐞𝐦𝐬.

If you only know models, you’re one step behind. Agents turn models into outcomes.

1/ 𝐑𝐮𝐥𝐞 → 𝐌𝐋 → 𝐋𝐋𝐌 → 𝐀𝐠𝐞𝐧𝐭𝐢𝐜 → 𝐂𝐨𝐠𝐧𝐢𝐭𝐢𝐯𝐞
↳ Rule-based systems were explicit and brittle.
↳ ML added prediction but little orchestration.
↳ LLMs gave language and context.
↳ Agentic systems plan, call tools, retry, and compose results.
↳ Cognitive agents add memory, reflection, and long-term goals.

2/ 𝐖𝐡𝐚𝐭 𝐚𝐠𝐞𝐧𝐭𝐢𝐜 𝐬𝐲𝐬𝐭𝐞𝐦𝐬 𝐝𝐨
↳ Plan a multi-step strategy from a goal.
↳ Orchestrate tools: search, DB, code execution, APIs.
↳ Aggregate, verify, and present an outcome.
↳ Recover from errors and ask humans when needed.

3/ 𝐖𝐡𝐲 𝐢𝐭 𝐦𝐚𝐭𝐭𝐞𝐫𝐬
↳ Agents convert intent into work, not just answers.
↳ They automate complex business flows end-to-end.
↳ They scale human knowledge and reduce toil.

4/ 𝐑𝐞𝐚𝐥 𝐰𝐨𝐫𝐥𝐝 𝐩𝐚𝐭𝐭𝐞𝐫𝐧𝐬
↳ Research agent: crawl, extract, compare, summarize.
↳ MLOps agent: detect drift, retrain, deploy, roll back.
↳ Support swarm: classify, retrieve, draft, escalate.

5/ 𝐖𝐡𝐚𝐭 𝐭𝐨 𝐥𝐞𝐚𝐫𝐧 𝐧𝐨𝐰 (𝐚𝐜𝐭𝐢𝐨𝐧𝐚𝐛𝐥𝐞)
↳ Prompt engineering and chain-of-thought design
↳ Tool integration patterns: APIs, DBs, search, function calling
↳ State & memory: snapshotting, vector stores, TTL strategies
↳ Orchestration: retries, idempotency, task queues, human gates
↳ Safety & provenance: input validation, audit trails, verifiability

6/ 𝐖𝐡𝐚𝐭 𝐭𝐨 𝐛𝐮𝐢𝐥𝐝 𝐟𝐢𝐫𝐬𝐭
↳ Start with a planner + one tool agent + verifier (see the sketch below).
↳ Keep loops explicit: Plan → Act → Observe → Revise.
↳ Measure business outcomes, not prompt perplexity.

𝐓𝐢𝐦𝐞𝐥𝐢𝐧𝐞
Rule-Based (1980s) → ML (2010s) → LLM (2020s) → Agentic (2023+) → Cognitive (Future)

𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞𝐬
↳ 𝗟𝗮𝗻𝗴𝗖𝗵𝗮𝗶𝗻 docs: https://lnkd.in/g-8CSSgC
↳ 𝗔𝘂𝘁𝗼𝗚𝗲𝗻 repo: https://lnkd.in/gtTgQt-k
↳ 𝗔𝗻𝗱𝗿𝗲𝗷 𝗞𝗮𝗿𝗽𝗮𝘁𝗵𝘆 channel: https://lnkd.in/g222AND5

𝐌𝐢𝐧𝐢 𝐂𝐚𝐬𝐞 (𝐚𝐩𝐩𝐥𝐢𝐜𝐚𝐛𝐥𝐞)
↳ Goal: convert product feedback into roadmap items.
↳ Planner: decompose goal to discovery tasks.
↳ Tool agents: run semantic search on feedback, cluster issues, estimate effort via model.
↳ Verifier: human-in-loop reviews prioritized list.
↳ Outcome: triage time drops from days to minutes.

𝐓𝐋;𝐃𝐑
Agents are the bridge from language to action. Master planners, tool integration, state design, and safety. Build small, measure outcomes, iterate. That’s how you move from demos to systems that run companies.

---
📕 400+ 𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗥𝗲𝘀𝗼𝘂𝗿𝗰𝗲𝘀: https://lnkd.in/gv9yvfdd
📘 𝗣𝗿𝗲𝗺𝗶𝘂𝗺 𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄 𝗥𝗲𝘀𝗼𝘂𝗿𝗰𝗲𝘀: https://lnkd.in/gPrWQ8is
📙 𝗣𝘆𝘁𝗵𝗼𝗻 𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗟𝗶𝗯𝗿𝗮𝗿𝘆: https://lnkd.in/gHSDtsmA
📸/ @Piyush
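To ground point 6, here is a deliberately tiny Plan → Act → Observe → Revise loop with one tool agent and a verifier. The plan/act/verify functions are hypothetical stubs standing in for an LLM planner, real tools, and a human or model gate.

```python
# A minimal, illustrative Plan -> Act -> Observe -> Revise loop with one tool
# and a verifier. The plan/act/verify functions are hypothetical stubs; a real
# system would back them with an LLM and real tools.

def plan(goal: str, notes: list[str]) -> str:
    # Stub planner: decide the next action from the goal and what we know so far.
    return "search_feedback" if not notes else "summarize"

def act(action: str) -> str:
    # Stub tool agent: execute the chosen action and return an observation.
    tools = {
        "search_feedback": lambda: "Top complaint cluster: slow export of large reports.",
        "summarize": lambda: "Roadmap item: optimize report export pipeline.",
    }
    return tools[action]()

def verify(result: str) -> bool:
    # Stub verifier: in production this could be a human gate or a checker model.
    return result.startswith("Roadmap item:")

def run(goal: str, max_steps: int = 5) -> str:
    notes: list[str] = []
    for _ in range(max_steps):            # keep the loop explicit and bounded
        action = plan(goal, notes)        # Plan
        observation = act(action)         # Act
        notes.append(observation)         # Observe
        if verify(observation):           # Revise or stop
            return observation
    raise RuntimeError("No verified outcome; escalate to a human reviewer.")

print(run("Convert product feedback into roadmap items"))
```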
-
Turns out, the longer your model thinks, the better it performs. One of the tricks is leveraging test-time compute scaling. Here’s how it works.

This approach shifts the focus from building ever-larger models to making smaller models that reason more effectively at inference time. Instead of brute-forcing performance through costly pretraining, test-time compute allows models to search, verify, and refine responses dynamically, allocating just the right amount of computational effort based on task difficulty.

OpenAI’s o1 system exemplifies this approach, using reinforcement learning to optimize test-time compute. It enables the model to “think longer” through chain-of-thought reasoning before responding, dynamically adjusting computational resources to the complexity of the task. This process, somewhat like AlphaGo’s Monte Carlo Tree Search, generates valuable search traces directly when a request (inference) is made, creating a feedback loop for future improvements.

Meanwhile, frameworks like QwQ take a different angle, emphasizing experimental philosophical inquiry and contemplative reasoning. This approach fosters a more questioning, doubtful (critical) thinking process; QwQ excels at math and reasoning tasks, showing the diversity of thought strategies that can emerge in optimized inference.

The result is compelling: smaller models augmented with test-time compute can outperform those up to fourteen times their size on most tasks.

Another option is the DiVeRSe framework. It enhances reasoning by generating multiple reasoning paths through diverse prompts, employing a voting verifier to evaluate those paths, and using step-aware verification to identify and correct errors at each step. This is my favorite approach and the one I use for my symbolic prompts (see the simplified sketch below).

Why does this matter? Smarter reasoning, not bigger brains. By allowing models to “think longer” and adapt their thought processes, we’re entering a new era where inference strategies, not brute parameter scaling, define the cutting edge of performance. This fundamental rethink enables smaller, compute-optimized models to challenge the dominance of monolithic architectures while maintaining efficiency and adaptability.
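For illustration, here is a heavily simplified sketch of the "diverse prompts + vote" idea. It uses plain majority voting over stubbed generations; the actual DiVeRSe framework trains a voting verifier and applies step-aware verification, which this toy does not implement.

```python
# Hedged sketch of "sample diverse reasoning paths, then vote on the answer".
# generate() is a stand-in for a real LLM call; this is NOT DiVeRSe itself.
from collections import Counter
import random

def generate(prompt: str) -> str:
    # Stub "model": returns a reasoning trace ending in "Answer: <x>".
    answer = random.choice(["42", "42", "41"])   # noisy but mostly right
    return f"Reasoning about: {prompt}\nAnswer: {answer}"

def extract_answer(trace: str) -> str:
    return trace.rsplit("Answer:", 1)[-1].strip()

def vote(question: str, prompt_styles: list[str], samples_per_style: int = 3) -> str:
    answers = []
    for style in prompt_styles:                  # diversity comes from varied prompts
        for _ in range(samples_per_style):
            answers.append(extract_answer(generate(f"{style}\n{question}")))
    winner, count = Counter(answers).most_common(1)[0]
    print(f"Votes: {Counter(answers)} -> picking '{winner}' ({count} votes)")
    return winner

styles = ["Think step by step.", "List the knowns, then solve.", "Check your work as you go."]
print(vote("What is 6 * 7?", styles))
```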
-
Cognitive modeling has become central to cognitive engineering, and it carries lessons that are directly relevant for UX research. The value of a model isn’t in abstract theory but in whether it helps improve performance, usability, safety, workload, or trust. This marks a shift from experimental psychology.

While cognitive science aims to explain mechanisms in depth, cognitive engineering prioritizes utility. Models are built to address operational needs. They often arise in emerging domains - autonomous driving, wearable computing, AI copilots - where prior data and theory may not exist. In these cases, cognitive engineers rely on domain experts, and the measure of success is whether the model helps a system work better, not whether it explains every cognitive mechanism.

Early approaches set the tone. The GOMS framework (Goals, Operators, Methods, Selection rules) showed how tasks could be broken down into structured hierarchies. With GOMS, it became possible to estimate how long a process should take, spot bottlenecks, and anticipate errors (a quick illustration follows below). Later extensions, such as CPM-GOMS, emphasized that human behavior is not strictly linear but parallel and embodied - we perceive, act, and think at the same time. For UX, this matters when we design systems that demand attention across multiple channels, like driving while navigating or interacting with a voice assistant.

Building on this, more advanced formalisms like ConcurTaskTrees and Enhanced Operator Function Models allowed richer representations of workflows, nondeterminism, and concurrency. At the same time, cognitive architectures such as ACT-R, EPIC, Soar, and QN-MHP expanded the ability to simulate memory, attention, multitasking, and motor actions. For UX researchers, these approaches provide structured ways to test design assumptions, model user journeys under stress, and anticipate where cognitive bottlenecks will emerge.

The real power of cognitive modeling shows up in complex systems. In aviation, healthcare, and autonomous vehicles, many failures attributed to “human error” are actually design problems. Mode confusion, automation surprises, and workload overload occur when system design doesn’t align with cognition.

Looking ahead, there are both opportunities and risks. Machine learning and AI bring predictive power but can create brittle and opaque systems. History shows that unexplained automation often leads to confusion and breakdowns. Cognitive models, in contrast, emphasize transparency and grounding in human abilities. The strongest path forward is hybrid: use AI for large-scale prediction, but pair it with cognitive modeling to keep outcomes interpretable and trustworthy.
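As a concrete taste of GOMS-style analysis, here is a back-of-the-envelope Keystroke-Level Model estimate. The operator times are the commonly cited KLM averages and the two interaction sequences are invented; treat the numbers as rough planning figures, not measurements.

```python
# A GOMS-flavoured back-of-the-envelope: the Keystroke-Level Model (KLM)
# estimates task time by summing operator times. K=keystroke, P=point with
# mouse, H=home hands between devices, M=mental preparation. Values below are
# the commonly cited KLM averages.

KLM_SECONDS = {"K": 0.2, "P": 1.1, "H": 0.4, "M": 1.35}

def estimate(sequence: str) -> float:
    """Sum operator times for a sequence like 'MHPKK' (one letter per operator)."""
    return sum(KLM_SECONDS[op] for op in sequence)

# Example: compare two hypothetical designs for applying a filter in a dashboard.
menu_path = "MHPKMPK"   # think, home to mouse, point to menu, click, think, point option, click
shortcut  = "MHKK"      # think, home to keyboard, two-key shortcut

print(f"Menu path : {estimate(menu_path):.2f}s")
print(f"Shortcut  : {estimate(shortcut):.2f}s")
```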
-
Automating Problem Solving with AI: New Research Leverages Language Models to Specify Problems ...

Researchers have developed a novel approach that uses large language models (LLMs) to automatically translate natural language problem descriptions into formal specifications that can be solved by AI systems. The implications are significant: this innovation could dramatically speed up the development of AI applications by reducing the need for manual problem formulation.

👉 Cognitive Task Analysis with LLMs
Traditionally, specifying problems for AI systems to solve has required significant human effort through a process called cognitive task analysis. The new research shows how LLMs can be leveraged to automate much of this process:
- LLMs analyze natural language problem descriptions
- Key elements like initial states, goal states, operators, and constraints are extracted
- A formal problem specification is output that can be ingested by an AI architecture
By automating the translation from problem description to specification, this approach could make it much faster and easier to apply AI to a wide variety of domains and use cases.

👉 Integrating LLMs with Cognitive Architectures
The researchers demonstrate how the LLM-based problem specification can be integrated with cognitive architectures like Soar. This enables a powerful combination:
- LLMs handle the natural language interpretation and problem formulation
- The cognitive architecture provides domain-general problem-solving strategies
- Together, they can tackle complex problems that are specified in plain language
This type of tight integration between language models and reasoning engines points to a promising future direction for creating more versatile and capable AI systems.

👉 Boosting Efficiency with Search Control
Another key finding is the importance of search-control knowledge in making the problem-solving process more efficient. By identifying unproductive paths and dead ends, the amount of computation required can be significantly reduced. Some examples of eliminating undesirable states and actions (see the sketch below):
- Avoiding loops that return to previously visited states
- Pruning sequences of actions that undo each other
- Detecting when actions are not making progress toward the goal
Incorporating this type of search control allows the AI to find solutions much faster while using fewer resources. This will be critical for applying these techniques to large-scale, real-time applications.

The research also outlines a number of directions to extend and enhance the approach, such as defining hierarchies of problem spaces and integrating with interactive task learning. As these techniques continue to mature, they could enable faster development of more capable cognitive systems that can be applied to an ever-expanding range of domains.

What potential applications of AI-automated problem solving are you most excited about? Let me know your thoughts in the comments!
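The search-control ideas can be illustrated with a toy search. The state space below is made up (it is not the paper's formulation); the point is the pruning: skip previously visited states and skip actions that merely undo the last one.

```python
# Illustrative breadth-first search with two search-control rules:
# (1) never revisit a state, (2) never apply the action that undoes the last one.
from collections import deque

# Toy problem: move from state 0 to state 4 by +1 / -1 steps.
ACTIONS = {"inc": +1, "dec": -1}
UNDOES = {"inc": "dec", "dec": "inc"}

def solve(start: int, goal: int):
    frontier = deque([(start, [])])
    visited = {start}                              # search control: no loops
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for name, delta in ACTIONS.items():
            if path and UNDOES[name] == path[-1]:  # search control: don't undo the last action
                continue
            nxt = state + delta
            if nxt in visited:                     # search control: prune revisited states
                continue
            visited.add(nxt)
            frontier.append((nxt, path + [name]))
    return None

print(solve(0, 4))   # ['inc', 'inc', 'inc', 'inc']
```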
-
LLMs are like operating systems where the context window is RAM. "Context engineering" is now the #1 job for AI engineers building agents, according to Cognition (makers of Devin).

Agents make hundreds of tool calls that can exceed context windows. For example, Lance Martin's (LangChain) deep research agent hit 500k tokens and cost several dollars per run because of token-heavy search APIs.

Three strategies for managing agent context:
- Compress: Claude Code auto-compacts at 95% capacity.
- Persist: Claude.md files store memories across sessions.
- Isolate: split work across sub-agents with separate contexts.

Anthropic's multi-agent researcher beat a single-agent baseline by 90.2% by running parallel sub-agents. But Cognition warns against multi-agent systems because coordination is hard and prompting gets complex.

The bitter lesson still applies: don't over-engineer context management that will become obsolete as models improve, but do instrument token usage, design clean state schemas, and compress at tool boundaries (a minimal sketch of that last idea follows below).
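A minimal sketch of "compress at tool boundaries": wrap each tool call so oversized results are reduced before they enter the agent's context. The summarize() and rough_tokens() helpers are crude placeholders; a production agent might call a cheap summarizer model and a real tokenizer.

```python
# Compress tool output at the boundary, before it ever enters agent context.

def rough_tokens(text: str) -> int:
    # Crude token estimate (~4 chars per token); instrument with a real tokenizer in practice.
    return max(1, len(text) // 4)

def summarize(text: str, budget_tokens: int) -> str:
    # Placeholder compressor: keep the head of the text within budget.
    return text[: budget_tokens * 4] + " ...[truncated]"

def call_tool(tool, *args, budget_tokens: int = 200) -> str:
    result = tool(*args)
    if rough_tokens(result) > budget_tokens:       # compress only when needed
        return summarize(result, budget_tokens)
    return result

def noisy_search(query: str) -> str:
    # Stand-in for a token-heavy search API response.
    return f"Results for {query}: " + "lorem ipsum " * 500

snippet = call_tool(noisy_search, "agent memory strategies", budget_tokens=50)
print(rough_tokens(snippet), "tokens kept:", snippet[:80], "...")
```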
-
Earlier today, Professor Amit Sheth shared with me a paper from nearly a decade ago. It struck me as potentially valuable in addressing some current limitations of Gen-AI LLMs. For instance, the paper's discussion of perceptual computing for grounding GenAI in reality, semantic computing for providing knowledge and meaning, and cognitive computing for enhancing reasoning suggests a path toward overcoming issues like a lack of real-world understanding, poor multimodal reasoning, hallucinations, and biases, ultimately leading to more truthful, context-aware, and capable AI.

———
Title: Semantic, Cognitive, and Perceptual Computing: Advances toward Computing for Human Experience
URL: arXiv:1510.05963

Key Points:
* Emphasizes a human-centric view of computing, focusing on serving human needs, empowering individuals, and improving the quality of life.
* Highlights the importance of making the Web more intelligent to better serve people by endowing it with sophisticated, human-like capabilities to reduce information overload.
* Introduces the computing paradigms of semantic computing, cognitive computing, and perceptual computing, explaining their distinctions and complementary capabilities in supporting Computing for Human Experience (CHE).
  * Semantic Computing: focuses on the meaning and context of data.
  * Cognitive Computing: aims for human-like understanding and reasoning.
  * Perceptual Computing: emphasizes interpreting sensory input and exploring the environment, actively interacting with the surroundings to collect data that is relevant and useful for understanding the world around us.
* Explains that by using semantic, cognitive, and perceptual computing synergistically, computers can not only provide answers to complex questions but also ask follow-up questions and interact with the environment to collect relevant data.

Relevance Today (May 13, 2025):
* Highly relevant due to the continued explosion of data and advancements in AI.
* Principles are crucial for improving user experience, enabling ambient intelligence, and addressing information overload.
* Essential for developing more intuitive, personalized, and ethically sound AI systems that understand and respond to human context.
———
-
🧠 Overthinking: No Longer Just a Human Problem

We trained LLMs with the best of our cognitive abilities, and unfortunately, some of the worst too. One of those bad habits? Overthinking. Ask a simple question and watch your model go down a rabbit hole of unnecessary Chain-of-Thought (CoT) reasoning. Feels familiar, doesn’t it?

Now imagine a system that does what smart humans do:
💡 Answer simply when the task is simple.
🧠 Think deeper only when required.

Enter OThink-R1, a dual-mode reasoning framework from Zhejiang University and OPPO that introduces cognitive efficiency to LLMs. Inspired by System 1 vs. System 2 thinking, OThink-R1 allows LLMs to dynamically switch between fast (intuitive) and slow (analytical) reasoning modes, all based on task complexity.

How it works (a simplified routing sketch follows below):
🧑‍⚖️ A secondary LLM acts as a reasoning judge
✂️ Redundant reasoning steps are pruned
⚖️ The model is trained with a dual-reference KL-divergence loss to optimize both modes

What’s the impact?
🔻 23% fewer tokens, no accuracy drop
⚡ Faster, cheaper inference
🧠 Better performance on tasks like OpenBookQA and GSM8K
💪 Outperforms static models like DualFormer

Why it matters: In real-world AI systems, reducing unnecessary reasoning isn’t just smart. It’s operational efficiency. It’s reduced latency. It’s greener compute. If you’re building production LLM pipelines, this one is worth a look.

Paper: https://lnkd.in/dcE4AcUA
GitHub: https://lnkd.in/dUuh6hjW

Innovation Hacks AI Inc.

#OThinkR1 #LLMOptimization #ReasoningEfficiency #AIInfrastructure #CognitiveComputing #TokenEfficiency #FutureOfAI
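Here is a simplified routing sketch in the spirit of dual-mode reasoning. It is not OThink-R1 itself (no judge LLM, no KL-based training); a heuristic stands in for the judge and the two answer paths are stubs.

```python
# Simplified dual-mode routing: a stub "judge" decides whether a query goes to
# a fast direct-answer path or a slower chain-of-thought path.

def judge_is_complex(question: str) -> bool:
    # Stub heuristic judge; in the paper this role is played by a second LLM.
    hard_markers = ("prove", "why", "derive", "optimize")
    return len(question.split()) > 20 or any(m in question.lower() for m in hard_markers)

def fast_answer(question: str) -> str:
    return f"[fast] Direct answer to: {question}"

def deliberate_answer(question: str) -> str:
    return f"[slow] Step-by-step reasoning, then answer to: {question}"

def respond(question: str) -> str:
    route = deliberate_answer if judge_is_complex(question) else fast_answer
    return route(question)

print(respond("What is the capital of France?"))
print(respond("Prove that the sum of two even numbers is even."))
```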
-
🤗 Why make models bigger when you can make them smarter at test time?

🔗 Check out "Scaling Test-Time Compute with Open Models" by Ed Beeching, Lewis Tunstall, and Sasha Rush (https://lnkd.in/gdz7WzKH). The article explores how scaling test-time compute can boost large language model (LLM) performance without the massive costs of training ever-larger models.

🔹 Key Takeaways:

Test-Time Compute Strategies:
- Iterative correction of outputs for refining responses (requires specialized mechanisms).
- Generating and scoring multiple solutions (see the Best-of-N sketch below) using methods like:
  - Best-of-N Sampling: prioritizes high-scoring responses.
  - Beam Search: optimizes reasoning step-by-step using process reward models (PRMs).
  - Diverse Verifier Tree Search (DVTS): enhances diversity, excelling with large compute budgets.

Experimental Wins:
- Smaller models (Llama-1B, 3B) outperform much larger counterparts (Llama-70B) on math benchmarks by simply “thinking longer.”
- DVTS shines in diversity-driven tasks, particularly in complex scenarios.

Compute-Optimal Scaling:
- Combines strategies dynamically based on problem difficulty and compute budget, achieving optimal performance without brute-force scaling.
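As a small illustration of the first family of methods, here is a weighted Best-of-N sketch. The sample() and score() functions are stubs standing in for a real generator and a (process) reward model.

```python
# Hedged sketch of weighted Best-of-N sampling: draw N candidates, score each
# with a verifier/reward model, and return the answer whose candidates
# accumulate the highest total score.
from collections import defaultdict
import random

def sample(question: str) -> str:
    return random.choice(["27", "27", "24"])        # stub generator

def score(question: str, answer: str) -> float:
    return 0.9 if answer == "27" else 0.4           # stub reward model

def weighted_best_of_n(question: str, n: int = 8) -> str:
    totals: dict[str, float] = defaultdict(float)
    for _ in range(n):
        answer = sample(question)
        totals[answer] += score(question, answer)   # aggregate scores per distinct answer
    return max(totals, key=totals.get)

print(weighted_best_of_n("What is 3^3?"))
```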
-
Test/Inference-Time Compute vs. System 2 Thinking

Scaling test-time (or inference-time) compute has been an emerging research direction in the past few months. The core focus of such research is to figure out how an LLM’s performance can improve if it’s given a fixed, substantial amount of compute during prediction. Google DeepMind’s latest research ("Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters") makes some promising progress on this front.

Humans, when faced with complex problems, tend to think longer and deeper. In such scenarios, our minds perform additional processing compared to when we’re solving simpler, straightforward problems. These two types of cognitive processing are popularly known as System 1 and System 2 thinking, introduced by Daniel Kahneman in his book Thinking, Fast and Slow. A good analogy (but not an equivalence) for test-time compute is System 2 thinking, which refers to deep, conscious deliberation and is better suited to complex problem-solving tasks. DeepMind's research explores whether such capabilities can be instilled in LLMs.

Test-time compute can be effective over pre-training, especially because of its practical advantages. At inference time:
↗️ a) the model can iterate and self-improve (no need to retrain models)
↗️ b) smaller on-device models can achieve performance comparable to those deployed in data centers
↗️ c) compute can be allocated depending on problem difficulty

Here’s what the authors observed in their experiments:
1️⃣ In some settings, it is more effective to pre-train smaller models and apply test-time compute to generate better results. This is, however, limited to easy and intermediate questions and some types of hard questions.
2️⃣ For extremely challenging questions, test-time compute barely shows any advantage; it is more effective to invest in additional pre-training compute in such scenarios. This is expected, given the limitations autoregressive LLMs have in reasoning.
3️⃣ There is no winner-takes-all approach when it comes to test-time compute. Different test-time strategies work better in different settings. For example, on easier problems, letting the model refine its initial answer through n sequential revisions worked well. For harder problems, either generating multiple answers in parallel or running a tree search against a process-based reward model works better (a toy allocation sketch follows below).

Planning is often formulated as a search problem, so it’s encouraging to see that search approaches are more promising for complicated problems. We’re likely to see more compute spent on inference vs. pre-training. Architectural improvements in OpenAI’s o1 models and Ilya Sutskever’s “pre-training as we know it will unquestionably end” remark at NeurIPS a few months back are other signals that support this hypothesis. This, however, does not mean that test-time compute replaces pre-training.
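To make observation 3 tangible, here is a toy "allocate by difficulty" sketch: easy prompts get sequential self-revisions, harder ones get parallel samples scored by a verifier. Every helper here is an invented stub, not DeepMind's method.

```python
# Toy compute-allocation sketch: route easy questions to sequential revision
# and hard questions to parallel sampling + verifier selection.
import random

def estimate_difficulty(question: str) -> float:
    return min(1.0, len(question) / 200)             # stub: longer question = harder

def generate(question: str) -> str:
    return f"candidate-{random.randint(0, 3)}"       # stub generator

def revise(question: str, draft: str) -> str:
    return draft + "+rev"                            # stub sequential self-revision

def verify_score(question: str, answer: str) -> float:
    return random.random()                           # stub verifier / reward model

def answer(question: str, budget: int = 8) -> str:
    if estimate_difficulty(question) < 0.5:
        draft = generate(question)                   # easy: spend budget on sequential revisions
        for _ in range(budget):
            draft = revise(question, draft)
        return draft
    candidates = [generate(question) for _ in range(budget)]   # hard: parallel samples
    return max(candidates, key=lambda c: verify_score(question, c))

print(answer("What is 2 + 2?"))
print(answer("Plan a fault-tolerant, zero-downtime rollout of a multi-region database migration, including failover drills and rollback criteria."))
```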