Everyone's talking about LLMs. I went a different direction 🧠 While everyone's building RAG systems with document chunking and vector search, I got curious about something else after Prof Alsayed Algergawy and his assistant Vishvapalsinhji Parmar's Knowledge Graphs seminar. What if the problem isn't just retrieval - but how we structure knowledge itself? 🤔 Traditional RAG's limitation: Chop documents into chunks, embed them, hope semantic search finds the right pieces. But what happens when you need to connect information across chunks? Or when relationships matter more than text similarity? 📄➡️❓ My approach: Instead of chunking, I built a structured knowledge graph from Yelp data (220K+ entities, 555K+ relationships) and trained Graph Neural Networks to reason through connections. 🕸️ The attached visualization shows exactly why this works - see how information naturally exists as interconnected webs, not isolated chunks. 👇🏻 The difference in action: ⚡ Traditional RAG: "Find similar text about Italian restaurants" 🔍 My system: "Traverse user→review→business→category→location→hours and explain why" 🗺️ Result: 94% AUC-ROC performance with explainable reasoning paths. Ask "Find family-friendly Italian restaurants in Philadelphia open Sunday" and get answers that show exactly how the AI connected reviews mentioning kids, atmosphere ratings, location data, and business hours. 🎯 Why this matters: While others optimize chunking strategies, maybe we should question whether chunking is the right approach at all. Sometimes the breakthrough isn't better embeddings - it's fundamentally rethinking how we represent knowledge. 💡 Check my script here 🔗: https://lnkd.in/dwNcS5uM The journey from that seminar to building this alternative has been incredibly rewarding. Excited to continue exploring how structured knowledge can transform AI systems beyond what traditional approaches achieve. ✨ #AI #MachineLearning #RAG #KnowledgeGraphs #GraphNeuralNetworks #NLP #DataScience
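To make the "traverse and explain why" idea concrete, here is a minimal sketch of a constraint-checking walk over a tiny typed graph. It is plain traversal, not the trained Graph Neural Network the post describes, and every node id, edge label, and function name below is invented for illustration rather than taken from the Yelp data or the linked script.

```python
from collections import defaultdict

# Toy typed graph. All node ids and edge labels are hypothetical.
EDGES = [
    ("user:ana", "WROTE", "review:r1"),
    ("review:r1", "ABOUT", "business:luigis"),
    ("review:r1", "MENTIONS", "topic:kids"),
    ("business:luigis", "IN_CATEGORY", "category:italian"),
    ("business:luigis", "LOCATED_IN", "city:philadelphia"),
    ("business:luigis", "OPEN_ON", "day:sunday"),
    ("user:ben", "WROTE", "review:r2"),
    ("review:r2", "ABOUT", "business:sushi_go"),
    ("business:sushi_go", "IN_CATEGORY", "category:japanese"),
    ("business:sushi_go", "LOCATED_IN", "city:philadelphia"),
]

# Index edges in both directions so we can walk forwards and backwards.
out_edges = defaultdict(list)
in_edges = defaultdict(list)
for s, p, o in EDGES:
    out_edges[s].append((p, o))
    in_edges[o].append((p, s))


def has_edge(s, p, o):
    """True if the exact triple (s, p, o) exists in the graph."""
    return (p, o) in out_edges[s]


def family_friendly(category, city, day):
    """Return (business, reasoning_path) pairs that satisfy every constraint."""
    results = []
    # Walk backwards from the category node to candidate businesses.
    for pred, biz in in_edges[category]:
        if pred != "IN_CATEGORY":
            continue
        if not has_edge(biz, "LOCATED_IN", city):
            continue
        if not has_edge(biz, "OPEN_ON", day):
            continue
        # Require at least one review about the business that mentions kids.
        for pred2, review in in_edges[biz]:
            if pred2 == "ABOUT" and has_edge(review, "MENTIONS", "topic:kids"):
                path = [
                    (review, "MENTIONS", "topic:kids"),
                    (review, "ABOUT", biz),
                    (biz, "IN_CATEGORY", category),
                    (biz, "LOCATED_IN", city),
                    (biz, "OPEN_ON", day),
                ]
                results.append((biz, path))
    return results


matches = family_friendly("category:italian", "city:philadelphia", "day:sunday")
for business, path in matches:
    print(business)
    for triple in path:
        print("  because", triple)
```

The returned path is the explanation: each triple is one hop that justified the match, which is something a chunk-similarity score cannot provide on its own.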
How Knowledge Graphs Improve AI
Summary
Knowledge graphs are structured networks that connect different pieces of information, helping artificial intelligence systems understand relationships, context, and meaning rather than just retrieving similar facts. By organizing data in this way, knowledge graphs allow AI to reason, explain answers, and adapt to new situations with greater accuracy and trustworthiness.
- Build meaningful connections: Structuring information as a network lets AI connect concepts and trace how answers are formed, making responses more reliable and understandable.
- Support real reasoning: Knowledge graphs enable AI to go beyond memorization and pattern matching by allowing step-by-step logic and inference, similar to how experts solve problems.
- Maintain clear context: Treating context, definitions, and relationships as first-class data ensures AI understands not just what data means, but why answers matter in business, medicine, or any complex domain.
-
A lot of AI engineers (even sharp ones) get seduced by the cool factor of vector databases. Cosine similarity, ANN search... it all sounds cutting-edge. But when you're building a Retrieval-Augmented Generation (RAG) pipeline, you're not just doing retrieval. You're orchestrating a semantic symphony between memory, context, and reasoning. And that's where many go off the rails. ❌ The Mistake: Vector First, Think Later Vector DBs are fantastic if: • Your knowledge is flat, unstructured, and mostly text • You want fast nearest-neighbor search over embeddings • You're okay with opaque black-box retrieval But the moment your domain knowledge has structure, hierarchies, relationships, or rules that need to be preserved across hops... vector search starts hallucinating. Hard. Because embedding space flattens knowledge. It smears out the sharp logic. It doesn't understand that "Paris is the capital of France and a city in Europe and has museums related to Impressionism." Vector DB just knows "Paris" is semantically close to "Eiffel Tower." Wow. Groundbreaking. 🧭 What You Should Be Using: Knowledge Graphs If your use case has: • Ontologies (types, classes, hierarchies) • Multi-hop reasoning (A→B→C) • Causality or directionality (X leads to Y, not just related to) • Entity disambiguation (which "Apple" are we talking about?) • Need for traceability and explainability (the why behind the answer) Then a Knowledge Graph (KG) is your divine weapon. Graphs don't just store facts. They encode logic, preserve causality, and let you do symbolic + neural hybrid search. They let you model the world like the world actually works... not just as a soup of cosine-clustered tokens. 🧪 Real-World Case: Ask a medical LLM powered by a vector DB: Can ibuprofen be taken with aspirin? You might get a generic answer scraped from a webpage. Ask the same question in a KG-powered RAG. The graph knows: Ibuprofen is an NSAID. Aspirin is an antiplatelet. 
There's a potential drug interaction due to increased bleeding risk. This depends on patient profile → age → comorbidities → other meds It can trace a path through nodes and edge types to construct a reasoned answer. This is not just retrieval. This is inference. 🔮 Where This Is Going The future of RAG is hybrid: 🔸️Embeddings for semantic breadth 🔸️Graphs for logical depth You'll embed the leaves of the tree... but you'll walk the branches with graph logic. 🎯 TLDR for the Impatient: Vector DBs are great for fuzzy recall. Knowledge Graphs are necessary for precise reasoning. And most AI engineers forget that precision is not optional in high-stakes domains like medicine, law, or finance. If your system needs to think, not just parrot, start with the graph. #database #vector #embeddings #knowledgegraphs #algorithms #computerscience #software #tech #medicine #law #finance #AI #RAG #LLM
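The "trace a path through nodes and edge types" step can be sketched as a breadth-first search that returns the triples it crossed. This is a toy under stated assumptions: the four triples loosely encode the post's wording, the edge labels `IS_A` and `INCREASES` and the function names are hypothetical, and the graph is far too small to stand in for a real clinical knowledge base.

```python
from collections import deque

# Illustrative triples only, loosely based on the post's wording.
TRIPLES = [
    ("ibuprofen", "IS_A", "NSAID"),
    ("aspirin", "IS_A", "antiplatelet"),
    ("NSAID", "INCREASES", "bleeding_risk"),
    ("antiplatelet", "INCREASES", "bleeding_risk"),
]


def build_adjacency(triples):
    """Map each node to (subject, predicate, object, neighbor) entries.

    Edges are walkable in both directions, but the original triple is
    kept so the explanation preserves edge direction and type.
    """
    adj = {}
    for s, p, o in triples:
        adj.setdefault(s, []).append((s, p, o, o))
        adj.setdefault(o, []).append((s, p, o, s))
    return adj


def explain_link(triples, start, goal):
    """Breadth-first search returning the list of triples linking start to goal."""
    adj = build_adjacency(triples)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o, neighbor in adj.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(s, p, o)]))
    return None  # no connection found


path = explain_link(TRIPLES, "ibuprofen", "aspirin")
for s, p, o in path:
    print(f"{s} -[{p}]-> {o}")
```

In a RAG pipeline the printed triples would be handed to the LLM as grounded context, so the generated answer can cite the shared `bleeding_risk` node instead of paraphrasing a web page.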
-
🚀 Demo Highlight: Ontology on Snowflake ❄️ Why do enterprise AI agents struggle with trust? They’re good at finding data — but they’re guessing the meaning. In the latest demo, I showed how Ontology on Snowflake changes that. By embedding an ontology-driven knowledge graph directly inside Snowflake, we unify facts, business semantics, governance, and AI reasoning in a single architecture. Instead of querying disconnected semantic views, agents reason over a shared business map — grounded in real-world concepts and relationships. What this enables: 🔎 Agents understand abstract business concepts (customer, employee, asset, supplier) across roles and structures — without hardcoded logic 🕸️ Knowledge graph relationships power deeper insights across operations, performance, and risk 🛡️ Built-in governance allows agents to respect access controls and detect data integrity issues 🧠 Cortex agents orchestrate SQL, graph analytics, and semantic models on top of a durable business foundation This isn’t just another semantic layer. It’s a five-layer, Snowflake-native architecture where data, meaning, and intelligence operate as one. No brittle SQL. No duplicated logic. No semantic guessing. Just grounded, explainable AI — powered by ontology and knowledge graph, natively on Snowflake. Blog series: Part 1: https://lnkd.in/geiM_faj Part 2: https://lnkd.in/gbSYyUWv Part 3: https://lnkd.in/gqx7A626 Repo: https://lnkd.in/ev3tgDEY #Snowflake #Ontology #KnowledgeGraph #EnterpriseAI #Cortex #DataArchitecture #AI #ArtificialIntelligence #AIAgent #AgenticAI #semanticmodel
-
Lesson 1: Knowledge Graphs Make LLMs Accurate Knowledge graphs and ontologies are suddenly a hot topic, and that’s not a coincidence. LLMs exposed a problem that everyone has been dealing with since the early days of ChatGPT. LLMs don’t know what your data actually means. They don’t understand your: - business context - definitions - semantics - metrics In the early days of ChatGPT (first half of 2023), there was a lot of talk about “KGs + LLMs.” Mostly intuition. Mostly speculation. I remember being at the Snowflake Summit and telling their product managers that they needed to invest in semantics and knowledge graphs. They were skeptical and asked if we had evidence. We didn’t. What changed was when Dean Allemang, Bryon Jacob and I published our paper “A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases” in Nov 2023 and put numbers behind it. The benchmark research concretely showed how semantics and knowledge graphs improve LLM accuracy by 3X. That research helped kick off the serious industry focus we’re seeing now on semantics and knowledge graphs, and it reframed the conversation around LLMs and AI. The first team to independently validate the results was dbt Labs, led by Jason Ganz. Then it snowballed from there. GraphRAG came out. Neo4j's Graph Manifesto cited our work as the first reason for “why graphs”. The following year, semantic layers were on every keynote stage. Dean and I continued our research based on what we had learned in the first six months of deployment and published “Increasing the LLM Accuracy for Question Answering: Ontologies to the Rescue!”, where we show that the ontology itself can further increase accuracy; on our benchmark it raised the improvement to 4X. Most people entering the knowledge graph and ontology space today aren’t coming from traditional data integration or semantics. 
They’re coming through LLMs because the gap is now obvious and there is a principled approach to bridge it. Enterprises need to invest in treating knowledge, semantics, metadata, and context as a first class asset. Knowledge graphs provide that missing context. That’s why this isn’t a “nice to have” anymore. My strong position: Enterprise AI will not succeed without an investment in knowledge graphs because - agents will not be accurate - outputs won’t be trustworthy because they can’t explain where an answer came from, how it was derived, which definitions were used. AI became the forcing function that finally aligned incentives. LLMs didn’t replace knowledge graphs. They finally made them unavoidable. (I’m starting a new series of posts based on my talk “20 Lessons from 20 Years of Building Ontologies and Knowledge Graphs”. Every day I will be sharing a lesson. Next lesson: “You can’t automate understanding.”)
-
How Agentic Graph Systems Transform AI from Pattern Matching to Intelligent Reasoning 💡 Sutskever's observation about reaching "peak data" - having only one internet worth of training data - highlights the unsustainability of current approaches. Meanwhile, Ng's emphasis on agentic workflows shows how AI can operate more efficiently and intelligently with available data. Agentic graph systems emerge as a paradigm that addresses core limitations of traditional deep learning while creating a sustainable path toward more sophisticated AI capabilities. The first significant observation is the fundamental difference in learning approach. Traditional deep learning systems operate like sophisticated pattern-matching engines, similar to a student who memorizes answers without understanding underlying principles. In contrast, agentic graph systems function more like human experts, actively engaging with problems and building structured understanding through experience. Second, the implementation of a knowledge graph structure fundamentally changes how information is processed and stored. Rather than treating each piece of information as an isolated data point, the system creates a rich network of interconnected concepts. This mirrors how human experts organize their knowledge, with each piece of information contextually linked to related concepts and principles. Third, the system's ability to create a self-improving data flywheel represents a breakthrough in sustainable AI development. Unlike traditional systems that require complete retraining to incorporate new information, agentic graph systems continuously integrate new knowledge and experiences into their existing structure, becoming more capable with each interaction. 
By structuring knowledge in a graph format and employing active learning agents, these systems overcome three critical limitations of current deep learning approaches: the dependence on massive training datasets, the black-box nature of decision making, and the inability to reason about new situations. The system's ability to reflect on its actions, plan multi-step approaches, and collaborate with specialized agents creates a form of artificial intelligence that more closely resembles human expert thinking. This is particularly evident in how it handles new information - not simply storing it, but analyzing its relationships to existing knowledge, identifying implications, and integrating it into a broader understanding. The data flywheel mechanism represents perhaps the most significant advancement. By creating a positive feedback loop where each interaction enhances the system's capabilities, it establishes a sustainable path for AI development that doesn't rely on the increasingly problematic requirement for massive training datasets. This represents a fundamental shift from what Sutskever calls the "pre-training era" to what we might call the "continuous learning era."
-
🚀 Graph-Enhanced RAG Architecture

Most RAG systems rely only on vector similarity. That works for relevance, but it breaks when questions require relationships, dependencies, or multi-hop reasoning. This is where many enterprise RAG systems fail in production. This architecture shows how combining Knowledge Graphs, graph embeddings, and LLMs enables deeper reasoning.

High-level flow in this system:
- User submits a query
- The AI application constructs a context-aware prompt
- Data is processed using NER, enrichment, and embeddings
- Neo4j stores entities, relationships, and graph embeddings
- Retrieval combines semantic similarity with graph traversal
- The LLM receives a grounded and enriched prompt
- The response is accurate, explainable, and context-aware

💡 Why graphs matter in RAG:
- Vector search answers what is similar
- Knowledge graphs answer how things are connected
- Graph embeddings bridge symbolic structure and neural search

👉 Example: Question: “What downstream systems are affected if this policy changes?”
- Vector search retrieves policy documents
- Graph traversal identifies related teams, owners, systems, and dependencies
- The LLM synthesizes an answer grounded in real organizational structure

This approach reduces hallucinations and improves reasoning quality in enterprise use cases like compliance analysis, impact assessment, fraud detection, and knowledge discovery. RAG becomes significantly more powerful when retrieval understands relationships, not just text. If you are building production GenAI systems, graph-augmented RAG is a pattern worth mastering.

➕ Follow Shyam Sundar D. for practical learning on Data Science, AI, ML, and Agentic AI 📩 Save this post for future reference ♻ Repost to help others learn and grow in AI #GenerativeAI #RAG #KnowledgeGraph #GraphEmbeddings #Neo4j #LLM #AIArchitecture #EnterpriseAI #AgenticAI #GenAIArchitecture
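The "retrieval combines semantic similarity with graph traversal" step can be sketched in a few lines. This is only an illustration under loud assumptions: the three-dimensional vectors stand in for real embeddings, the `AFFECTS` dictionary stands in for a graph store such as Neo4j, and all ids and function names are made up.

```python
import math

# Hypothetical embeddings; a real system would get these from an embedding model.
DOC_VECTORS = {
    "policy:data_retention": [0.9, 0.1, 0.0],
    "policy:travel": [0.1, 0.9, 0.0],
    "doc:onboarding": [0.0, 0.2, 0.9],
}

# Hypothetical dependency edges: node -> things directly affected by it.
AFFECTS = {
    "policy:data_retention": ["system:billing_db", "team:compliance"],
    "system:billing_db": ["system:reporting"],
    "policy:travel": ["team:finance"],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, k=1):
    """Step 1: rank documents by similarity to the query (what is similar)."""
    ranked = sorted(
        DOC_VECTORS,
        key=lambda doc: cosine(query_vec, DOC_VECTORS[doc]),
        reverse=True,
    )
    return ranked[:k]


def downstream(start, max_hops=2):
    """Step 2: breadth-first expansion along AFFECTS edges (how things connect)."""
    frontier, found, seen = [start], [], {start}
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for child in AFFECTS.get(node, []):
                if child not in seen:
                    seen.add(child)
                    found.append(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return found


def hybrid_retrieve(query_vec, k=1, max_hops=2):
    """Combine both steps into the context that would be passed to the LLM."""
    return {doc: downstream(doc, max_hops) for doc in vector_search(query_vec, k)}


context = hybrid_retrieve([1.0, 0.0, 0.0])
print(context)
```

Vector search picks the entry point and the graph supplies the multi-hop consequences, so the prompt contains `system:reporting` even though no document about it was ever similar to the query.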
-
If you want to understand why your data needs a semantic layer, look at what happens to AI without one. Without semantics, your best case is forcing a massive JSON payload into an LLM to explain what your data means. The worst case? AI blindly wanders through undocumented data, guessing based on statistical probability rather than actual logic. Neither outcome is acceptable for organizations building reliable, production-grade AI. As Meagan Palmer recently noted, the semantic layer has historically had two meanings: one from BI/data management, and one from the Semantic Web. Yesterday Jessica Talisman did a deep dive on the history of semantic layers cracking open the fine-grained detail. (See comments for links). To truly support AI, you must unify both types of semantics. Here's a Crawl → Walk → Run progression for evolving your metadata stack into a unified form: Crawl: Structure and Best Effort Semantics — Centralize metadata into JSON Schemas and glossaries. Move away from fragmented silos by adopting a unified metadata model capturing first-class entities like tables, dashboards, and pipelines. This establishes clearer definitions across your organization. But this is still just best effort semantics. Flat definitions create dangerous ambiguity AI cannot resolve alone. Ask what revenue means and Finance says "Net", Sales says "Gross", Marketing says "Attributed". Without explicit architectural meaning, AI fills that gap with probability — delivering confident but wrong answers. Walk: From Metadata Graph to RDF — Make your metadata machine-understandable. Translate JSON schemas into an RDF graph of subjects, predicates, and objects. Starting with schema.org means analysts work in familiar JSON while that structure translates into formal formats like JSON-LD without complex context switching. The result is a knowledge graph built on standard vocabularies like DCAT for datasets and PROV for lineage, layered with data quality, ownership, and usage context. 
This enables GraphRAG, SPARQL queries, and cross-system connectivity. Run: Ontologies and AI Reasoning — Evolve from a flat glossary to a full knowledge ontology. Instead of defining a customer as simply a person who buys goods, map exactly how that entity relates to domains, metrics, revenue streams, and orders. Connect that ontology to your physical data estate by tagging real tables and columns to concepts in your OWL ontology. The result: you move beyond context-driven approximations like vector similarity to a true semantics-driven system. AI agents consume structured semantic context to execute cognitive logic. Definitions and relationships are explicitly governed, so answers are precise, consistent, and explainable. Not statistical guesses. You've stopped standardizing metadata. You've started standardizing meaning. #DataStrategy #SemanticLayer #AIData #KnowledgeGraph #DataManagement #EnterpriseAI #Ontology #DataGovernance #RDF
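The "Walk" step, turning JSON metadata into subject-predicate-object triples, can be illustrated with the standard library alone. Treat this as a simplified sketch: it does not perform real JSON-LD processing (no `@context` expansion, no distinction between literals and IRIs), the `ex:` identifiers are invented, and the helper names `to_triples` and `match` are hypothetical. Only the vocabulary terms (`dcat:Dataset`, `dct:title`, `dct:publisher`, `prov:wasDerivedFrom`) come from the published DCAT, Dublin Core, and PROV vocabularies.

```python
import json

# A made-up metadata record written in a JSON-LD-like shape.
METADATA_JSON = """
{
  "@id": "ex:orders_table",
  "@type": "dcat:Dataset",
  "dct:title": "Orders",
  "dct:publisher": "ex:sales_team",
  "prov:wasDerivedFrom": ["ex:raw_orders", "ex:raw_customers"]
}
"""


def to_triples(record):
    """Flatten one JSON object into (subject, predicate, object) tuples."""
    subject = record["@id"]
    triples = []
    for key, value in record.items():
        if key == "@id":
            continue
        predicate = "rdf:type" if key == "@type" else key
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((subject, predicate, item))
    return triples


def match(triples, s=None, p=None, o=None):
    """Tiny pattern query: None acts as a wildcard, like a SPARQL variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


triples = to_triples(json.loads(METADATA_JSON))
lineage = match(triples, s="ex:orders_table", p="prov:wasDerivedFrom")
print(lineage)
```

Once metadata is in triple form, a lineage question becomes a pattern match rather than a guess, which is the property the post relies on for GraphRAG and SPARQL queries. A production system would more likely use an RDF library and a triple store than hand-rolled tuples.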
-
A brilliant idea isn’t a fact—until it is. Many groundbreaking discoveries seem obvious only in hindsight, once they unify a web of seemingly isolated facts into a general principle. Before we connected the dots between evolution, genetics & material science, silk was just a thread, proteins were just biological molecules, & genes were just codes. But once we saw their relationships, we unlocked deep truths about how nature builds materials at every scale. What If AI Could Think in Relationships Instead of Just Memorizing? Most AI today doesn’t work this way. It merely predicts the next token, unaware of whether its own output is meaningful, correct, or groundbreaking. They: ❌ Lack true reasoning—they do not verify if their responses make sense. ❌ Cannot correct themselves—once they generate something, they have no mechanism to reflect and refine their own ideas. ❌ Do not connect ideas deeply—they retrieve, not discover. 💡 SciAgents does something different. Rather than treating knowledge as isolated facts, it builds a massive relational graph, connecting every concept and idea to others. Then, a team of AI agents explores this graph, not just by taking the shortest path between ideas, but by wandering through unexpected links. How SciAgents Reasons over Graphs ▶️Instead of taking the shortest path between two ideas (which can be too direct & limiting), SciAgents samples diverse paths through a powerful algorithm that explores ever-growing sets of diverse waypoints. This allows it to natively explore broader, richer relationships—leading to unexpected discoveries. ▶️For example, to explore the connection between silk and energy efficiency, SciAgents didn’t just look at direct links. It uncovered intermediate concepts like biocompatibility, multifunctionality & structural coloration, revealing new ways to design bioinspired materials that human researchers might have overlooked. Why does this matter for building better AI for science and beyond? 
1⃣Generalization is the key to intelligence. Memorization alone won’t get AI to true reasoning—but structuring knowledge in a relational way can. 2⃣SciAgents goes beyond predicting words. It constructs maps of ideas by conceptual blueprints, from genes encoding proteins to evolutionarily refined materials like silk, and extrapolates new designs. 3⃣It refines its own outputs. Rather than passively generating text, SciAgents’ multi-agent system debates, critiques, and improves hypotheses, making its discoveries deeper and more reliable. Graph-based reasoning plus multi-agent collaboration is not just a better way for AI to think—it’s likely on a critical path towards AGI. The ability to form deep, structured insights from sparse information is what separates mere computation from true intelligence. A. Ghafarollahi, M.J. Buehler, SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning, Adv. Materials, DOI: 10.1002/adma.202413523, 2025
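The contrast between taking the shortest path and sampling more exploratory paths can be shown on a toy concept graph. The sketch below swaps in a generic self-avoiding random walk; it is not the path-sampling algorithm from the SciAgents paper, which the post does not spell out. The edges are invented for illustration, with concept names borrowed from the post's silk example.

```python
import random
from collections import deque

# Hypothetical undirected concept links.
EDGE_LIST = [
    ("silk", "energy efficiency"),
    ("silk", "biocompatibility"),
    ("silk", "multifunctionality"),
    ("biocompatibility", "multifunctionality"),
    ("multifunctionality", "structural coloration"),
    ("structural coloration", "energy efficiency"),
]

GRAPH = {}
for a, b in EDGE_LIST:
    GRAPH.setdefault(a, []).append(b)
    GRAPH.setdefault(b, []).append(a)


def shortest_path(graph, source, target):
    """Breadth-first search: the most direct route between two concepts."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in graph[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def sample_path(graph, source, target, rng, max_tries=100):
    """Self-avoiding random walk; restarts whenever it hits a dead end."""
    for _ in range(max_tries):
        path, visited = [source], {source}
        while path[-1] != target:
            options = [n for n in graph[path[-1]] if n not in visited]
            if not options:
                break  # dead end, try again from the source
            step = rng.choice(options)
            path.append(step)
            visited.add(step)
        if path[-1] == target:
            return path
    return None


direct = shortest_path(GRAPH, "silk", "energy efficiency")
rng = random.Random(0)  # seeded so repeated runs give the same samples
samples = [sample_path(GRAPH, "silk", "energy efficiency", rng) for _ in range(5)]
print("direct:", direct)
for p in samples:
    print("sampled:", p)
```

The direct route contains no intermediate concepts at all, while sampled walks can pass through nodes such as `multifunctionality` or `structural coloration`. Those waypoints are the raw material a hypothesis-generating agent would then reason over.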
-
Earlier this summer, Gartner released its Hype Cycle for AI 2025, and it continues to come up in my conversations with leaders. Some placements were expected: • Generative AI is in the Trough of Disillusionment. Enterprises are still investing heavily, but without organizational context, results often disappoint. • AI Agents sit at the Peak of Inflated Expectations: full of promise, but not yet delivering at scale in many organizations. More telling is the placement of Knowledge Graphs on the Slope of Enlightenment, signaling proven value. By mapping relationships between information — like who created a document, what project it supports, or how it ties to a customer — Knowledge Graphs give AI the context to deliver more accurate, useful results. But traditional Knowledge Graphs aren’t enough. They capture how information is connected, but not how employees actually work: their tasks, priorities, who they collaborate with, how they make decisions. At Glean, we call this the Enterprise Graph: the combination of the information your company has and how your people get work done. What excites me most is that by capturing the how of work, AI can shift from reactive — answering questions and following directions — to proactive: flagging risks before they surface, pulling in the right teammates based on their strengths, and recommending the next best action before you even need to ask.
-
GraphRAG: Teaching LLMs to Connect the Dots 📚 Ever felt like your AI assistant just doesn't get the big picture? Traditional RAG systems are like that friend who remembers random facts but can't quite piece them together. Meet GraphRAG, Microsoft's clever solution to help LLMs see the forest, not just the trees. Imagine trying to solve a puzzle with pieces scattered across different rooms. That's what traditional RAG does - it finds individual pieces but struggles to put them together. GraphRAG creates a map of how all the information fits together. This means LLMs can now understand connections and context in ways they never could before. What can GraphRAG do? 1. Uncover Hidden Connections GraphRAG is like a detective, finding links between facts even when they're spread out. It helps LLMs tackle complex questions that require understanding how different pieces of info relate to each other. 2. Pinpoint Accuracy GraphRAG uses its knowledge map to find answers that are spot-on and make sense in context. Plus, you can trace each part of an answer back to its source. 3. Unlock Meaningful Insights GraphRAG doesn't just fetch facts, it sees the big picture. It can spot trends, identify themes, and offer insights that would be nearly impossible to find otherwise. Why does this matter for you? Think about how often you've asked an AI a question and gotten a response that's... close, but not quite right. Or worse, an answer that's just plain wrong. GraphRAG could change all that. It's about making AI assistants that truly understand what you're asking and can give you answers that actually help. What's Next? As GraphRAG-like approaches mature, we might see: • More intuitive AI assistants that can handle complex, multi-step questions • Better automated research tools that can draw insights from vast databases • AI systems that can explain their reasoning, making them more trustworthy and useful in fields like medicine or law.