Ever wondered how Large Language Models (LLMs) like ChatGPT actually learn to talk like humans? It all comes down to a multi-stage training process, from raw-data learning to human-feedback fine-tuning. Here's a quick breakdown of the 4 stages of LLM training:

Stage 0: Untrained LLM. At this stage, the model produces random outputs; it has no understanding of language yet.
Stage 1: Pre-training. The model learns from massive text datasets, recognizing language patterns and structure, but it's still not conversational.
Stage 2: Instruction Fine-Tuning. Now it's trained on question–answer pairs to follow instructions and provide more useful, context-aware responses.
Stage 3: Reinforcement Learning from Human Feedback (RLHF). The model learns to rank responses based on human preference, improving response quality and helpfulness.
Stage 4: Reasoning Fine-Tuning. Finally, the model is trained on reasoning and logic tasks, refining its ability to produce factual and well-structured answers.

Understanding how LLMs evolve helps you build, prompt, and use them better.
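The stages above can be sketched as a simple pipeline. This is an illustrative toy only; every function and the skills-list "state" are made up for the sketch, since real training updates network weights, not a list:

```python
# Illustrative toy of the four training stages as a pipeline over a
# model "state". Every name here is invented for the sketch; real
# training operates on network weights, not a skills list.

def pretrain(model):
    model["skills"].append("language patterns")            # Stage 1
    return model

def instruction_tune(model):
    model["skills"].append("instruction following")        # Stage 2
    return model

def rlhf(model):
    model["skills"].append("preference-ranked responses")  # Stage 3
    return model

def reasoning_tune(model):
    model["skills"].append("step-by-step reasoning")       # Stage 4
    return model

model = {"skills": []}  # Stage 0: untrained, no language ability yet
for stage in (pretrain, instruction_tune, rlhf, reasoning_tune):
    model = stage(model)

print(model["skills"])
```

Each stage builds on the previous one, which is why the order matters: instruction tuning and RLHF refine a model that already knows language from pre-training.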
How LLMs Model Human Language Abilities
Explore top LinkedIn content from expert professionals.
Summary
Large language models (LLMs) simulate human language abilities by processing vast amounts of text data, recognizing patterns, and learning to generate responses that resemble human conversation. While they can closely mimic how people use language, LLMs rely on statistical modeling rather than genuine understanding or human-like reasoning.
- Understand model limitations: Remember that LLMs generate answers based on patterns in data, not personal experiences or emotions, so their responses may lack deeper meaning or context.
- Use critical thinking: Always evaluate LLM outputs yourself, as these models can produce plausible-sounding answers that aren't necessarily true or grounded in real-world understanding.
- Explore new research: Keep up with studies comparing LLMs and human brain activity to appreciate the strengths and weaknesses of AI in language processing.
-
🧠 LLMs and the Frontier of Understanding: Unveiling a Theory of Mind

In a groundbreaking study led by Michal Kosinski at Stanford, the spotlight turns to an intriguing aspect of artificial intelligence: can large language models (LLMs) understand that others may hold beliefs different from factual reality? This concept, known as theory of mind, is a fundamental psychological construct that humans typically navigate with ease. The findings? The larger the model, the more adept it becomes at mirroring this uniquely human cognitive ability. 🔗 https://lnkd.in/e4YMMhgH

Exploring the theory of mind in LLMs: The study meticulously evaluated LLMs (ranging from GPT-1 through GPT-4, plus BLOOM) against 40 tasks designed to test human theory-of-mind capabilities. These tasks, split between "unexpected transfers" and "unexpected content," challenge the models to recognize that characters in the narratives may believe factually incorrect information.

Impressive outcomes: The leap in performance is striking. While GPT-1, with its 117 million parameters, struggled to grasp these concepts, GPT-4, a behemoth rumored to exceed 1 trillion parameters, solved an impressive 90% of unexpected-content tasks and 60% of unexpected-transfer tasks. Astonishingly, this surpasses the performance of 7-year-old children on similar tests.

Why this matters: This research doesn't just push the boundaries of what we expect from AI; it redefines them. By applying tests traditionally used to assess cognitive development in children, we now have a metric to compare aspects of intelligence between humans and deep learning models. This opens a fascinating dialogue on the evolving capabilities of AI and its potential to understand, predict, and interact with human mental states more effectively than ever before.
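To make the task format concrete, here is a minimal sketch of how an unexpected-content (false-belief) item could be scored. The story, the scoring rule, and all names are invented for illustration; they are not taken from the paper:

```python
# Hedged sketch: scoring a model's answer on an unexpected-content
# (false-belief) task. A response passes only if it reports the
# character's false belief rather than the container's real content.
# The task and scoring rule are illustrative, not the study's own.

tasks = [
    {
        "story": "A bag labeled 'chocolate' is actually full of popcorn. "
                 "Sam finds the bag and cannot see inside.",
        "question": "What does Sam believe the bag contains?",
        "false_belief": "chocolate",  # what the character believes
        "ground_truth": "popcorn",    # what is actually in the bag
    },
]

def score(model_answer: str, task: dict) -> bool:
    """Pass if the answer tracks the character's belief, not the facts."""
    answer = model_answer.lower()
    return (task["false_belief"] in answer
            and task["ground_truth"] not in answer)

print(score("Sam believes the bag contains chocolate.", tasks[0]))
print(score("The bag contains popcorn.", tasks[0]))
```

A model that simply reports the factual content fails the item, which is exactly what separates pattern completion from belief tracking in these tests.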
A Thought to Ponder: As AI continues to blur the lines between computational processes and cognitive understanding, we're left to wonder: If an AI model can exhibit a theory of mind, how does that shape our interactions and trust in these systems? Are we ready to engage more deeply with entities that "understand" us in ways we're only beginning to comprehend? #AIResearch #TheoryOfMind #CognitiveAI #StanfordResearch #FutureOfAI
-
You know all those arguments that LLMs think like humans? Turns out it's not true 😱

In our new paper we put this to the test by checking whether LLMs form concepts the same way humans do. Do LLMs truly grasp concepts and meaning analogously to humans, or is their success primarily rooted in sophisticated statistical pattern matching over vast datasets? We used classic cognitive experiments as benchmarks. What we found is surprising... 🧐

We used seminal datasets from cognitive psychology that mapped how humans actually categorize things like "birds" or "furniture" ('robin' as a typical bird). The nice thing about these datasets is that they are not crowdsourced; they are rigorous scientific benchmarks.

We tested 30+ LLMs (BERT, Llama, Gemma, Qwen, etc.) using an information-theoretic framework that measures the trade-off between:
- Compression (how efficiently you organize info)
- Meaning preservation (how much semantic detail you keep)

Finding #1: The good news. LLMs DO form broad conceptual categories that align with humans significantly above chance. Surprisingly (or not?), smaller encoder models like BERT outperformed much larger models. Scale isn't everything!

Finding #2: But LLMs struggle with fine-grained semantic distinctions. They can't capture "typicality", like knowing a robin is a more typical bird than a penguin. Their internal concept structure doesn't match human intuitions about category membership.

Finding #3: The big difference. Here's the kicker: LLMs and humans optimize for completely different things.
- LLMs: aggressive statistical compression (minimize redundancy)
- Humans: adaptive richness (preserve flexibility and context)

This explains why LLMs can be simultaneously impressive AND miss obvious human-like reasoning. They're not broken; they're just optimized for pattern matching rather than the rich, contextual understanding humans use.
What this means:
- Current scaling might not lead to human-like understanding
- We need architectures that balance compression with semantic richness
- The path to AGI ( 😅 ) might require rethinking optimization objectives

Our paper gives tools to measure this compression-meaning trade-off, which could guide future AI development toward more human-aligned conceptual representations. Cool to see cognitive psychology and AI research coming together! Thanks to Chen Shani, Ph.D., who did all the work, and to Yann LeCun and Dan Jurafsky for their guidance.
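To make the compression-vs-meaning trade-off tangible, here is a toy sketch on made-up 2D "embeddings". It contrasts one coarse cluster (aggressive compression) with two human-like categories, using within-cluster distortion as a stand-in for lost semantic detail; this is only a gesture at the idea, not the paper's actual information-theoretic framework:

```python
# Toy compression-vs-meaning trade-off on made-up 2D "embeddings".
# Metric and data are invented for illustration, not the authors' method.

embeddings = {
    "robin":   (0.9, 0.8),  # typical bird
    "penguin": (0.4, 0.9),  # atypical bird
    "chair":   (0.1, 0.1),
    "table":   (0.2, 0.0),
}

def distortion(cluster, points):
    """Mean squared distance to the cluster centroid: lower means
    more semantic detail is preserved."""
    cx = sum(points[w][0] for w in cluster) / len(cluster)
    cy = sum(points[w][1] for w in cluster) / len(cluster)
    return sum((points[w][0] - cx) ** 2 + (points[w][1] - cy) ** 2
               for w in cluster) / len(cluster)

coarse = [list(embeddings)]                        # one big cluster
fine = [["robin", "penguin"], ["chair", "table"]]  # two categories

coarse_d = sum(distortion(c, embeddings) for c in coarse) / len(coarse)
fine_d = sum(distortion(c, embeddings) for c in fine) / len(fine)

# Harder compression (fewer clusters) loses more semantic detail.
print(f"1 cluster:  {coarse_d:.4f}")
print(f"2 clusters: {fine_d:.4f}")
```

The single-cluster solution is maximally compressed but blurs birds and furniture together; the two-cluster solution keeps the distinction at the cost of a less compact code, which is the tension the post describes.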
-
Major preprint just out! We compare how humans and LLMs form judgments across seven epistemological stages. We highlight seven fault lines, points at which humans and LLMs fundamentally diverge:

- The Grounding fault: Humans anchor judgment in perceptual, embodied, and social experience, whereas LLMs begin from text alone, reconstructing meaning indirectly from symbols.
- The Parsing fault: Humans parse situations through integrated perceptual and conceptual processes; LLMs perform mechanical tokenization that yields a structurally convenient but semantically thin representation.
- The Experience fault: Humans rely on episodic memory, intuitive physics and psychology, and learned concepts; LLMs rely solely on statistical associations encoded in embeddings.
- The Motivation fault: Human judgment is guided by emotions, goals, values, and evolutionarily shaped motivations; LLMs have no intrinsic preferences, aims, or affective significance.
- The Causality fault: Humans reason using causal models, counterfactuals, and principled evaluation; LLMs integrate textual context without constructing causal explanations, depending instead on surface correlations.
- The Metacognitive fault: Humans monitor uncertainty, detect errors, and can suspend judgment; LLMs lack metacognition and must always produce an output, making hallucinations structurally unavoidable.
- The Value fault: Human judgments reflect identity, morality, and real-world stakes; LLM "judgments" are probabilistic next-token predictions without intrinsic valuation or accountability.

Despite these fault lines, humans systematically over-believe LLM outputs, because fluent and confident language produces a credibility bias. We argue that this creates a structural condition, Epistemia: linguistic plausibility substitutes for epistemic evaluation, producing the feeling of knowing without actually knowing.
To address Epistemia, we propose three complementary strategies: epistemic evaluation, epistemic governance, and epistemic literacy. Full paper in the first comment. Joint with Walter Quattrociocchi and Matjaz Perc.
-
How do LLMs really compare to the human brain? In a sequence of studies, our team at Google Research and partners at Princeton, HUJI, and NYU have been exploring this question, aligning brain activity with the internal contextual embeddings of speech and language within LLMs as they process everyday conversations.

What have we discovered? AI language models and the human brain share underlying organizational principles for representing words. Specifically, the internal representations used by these AI systems correspond to the neural activity patterns observed in brain regions crucial for both understanding and generating speech, hinting at potentially shared computational mechanisms. Altogether, this demonstrates the power of deep language models' embeddings to act as a framework for understanding how the human brain processes language.

Thank you to our research partners: the Hasson Lab at Princeton University, the DeepCognitionLab at the Hebrew University, and researchers from the NYU Langone Comprehensive Epilepsy Center.

Google Research blog post, authored by Mariano Schain and Ariel Goldstein: https://lnkd.in/d9cbP4Ee
Nature Human Behaviour paper: https://lnkd.in/dCbvdC-A
Nature Communications paper: https://lnkd.in/dbPCGY_m
Nature Neuroscience paper: https://lnkd.in/dvWjEg3C
Hasson Lab, Princeton: https://lnkd.in/diMh8RA4
DeepCognitionLab at the Hebrew University: https://lnkd.in/dFPDuM7x
NYU Langone Comprehensive Epilepsy Center: https://lnkd.in/dU8rWe5e
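Alignment analyses like those described above typically fit linear "encoding models" that predict neural responses from embedding features. Below is a toy, self-contained sketch with a single embedding dimension and synthetic data; the real studies use high-dimensional embeddings, regularized regression, and recorded brain activity:

```python
# Toy "encoding model" sketch: predict a simulated neural signal from
# one LLM embedding feature via ordinary least squares, then check
# predictions on held-out words. All data here are synthetic.

words = ["the", "cat", "sat", "on", "a", "mat"]
embed = [0.1, 0.9, 0.7, 0.2, 0.1, 0.8]       # one embedding dimension
brain = [0.25, 1.85, 1.45, 0.45, 0.25, 1.65]  # simulated voxel signal

# Fit brain ≈ w * embed + b on the first four words (closed-form OLS).
xs, ys = embed[:4], brain[:4]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
    sum((x - mx) ** 2 for x in xs)
b = my - w * mx

# Evaluate on the held-out words.
preds = [w * x + b for x in embed[4:]]
errors = [abs(p - y) for p, y in zip(preds, brain[4:])]
print(f"w={w:.2f}, b={b:.2f}, held-out errors={errors}")
```

The logic mirrors the studies' approach: if a linear map from embeddings predicts held-out neural activity well, the two representational spaces share structure.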
-
Did we just get closer to understanding how the brain works? Two groundbreaking papers explore how AI models and the human brain process language, with some interesting implications for text-to-speech.

📌 Google's research: Deciphering the human brain with LLM representations
The most striking takeaway from Google's study is that LLMs may process language in ways surprisingly similar to the human brain. By comparing fMRI scans of neural responses to LLM representations, researchers found a fascinating alignment between how the brain's cortical regions handle language and how LLMs decompose linguistic information.

Why does this matter for text-to-speech? Today's state-of-the-art voice models, which primarily rely on Transformer architectures, excel at producing coherent, fluent speech. However, they often fall short when it comes to replicating the natural prosody, emotional nuance, and contextual awareness that human speech embodies. If model architectures can be refined to reflect the brain's approach to semantic understanding, particularly how meaning is encoded and represented over time, it could vastly improve the naturalness and expressiveness of AI-generated speech.

📌 Anthropic's research: Tracing thoughts in language models
Anthropic's work emphasizes how LLMs break down complex tasks through structured chains of reasoning. The key takeaway? LLMs aren't just retrieving information; they're simulating cognitive processes that resemble human-like problem-solving. This process, along with chain-of-thought prompting, allows models to handle intricate tasks by breaking them into manageable steps. The implications for voice AI are profound: incorporating structured reasoning architectures could allow text-to-speech systems to dynamically adjust prosody, tone, and pacing based on context.
For instance, if a model can determine from the conversational structure that a user is expressing frustration or joy, it can modulate the generated speech to mirror that emotional state. It's about creating models that don't just speak, but speak with understanding.

What's fascinating is that engineering is almost evolving into a science as we build increasingly complex systems that we only partially understand. I think we'll see more use of empirical methods to understand how these systems work, in addition to just building them.

At Rime, we're deeply excited about these findings and are closely monitoring updates. Bridging the gap between neural processes and machine learning architectures will be the key to building voice systems that feel truly human. 🧠💡 Links below... So what's your take? Are LLMs closer to mimicking the brain than we previously thought?
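As a thought experiment, the emotion-to-prosody idea might look like the sketch below. The emotion labels and prosody parameters are entirely invented; real TTS APIs expose different controls:

```python
# Hypothetical sketch: map a detected emotional state to prosody
# controls for a TTS engine. Labels and parameters are invented for
# illustration; they do not correspond to any real voice API.

PROSODY_PRESETS = {
    "joy":         {"pitch_shift": 2.0,  "rate": 1.10, "energy": 1.2},
    "frustration": {"pitch_shift": -1.0, "rate": 0.95, "energy": 1.1},
    "neutral":     {"pitch_shift": 0.0,  "rate": 1.00, "energy": 1.0},
}

def prosody_for(detected_emotion: str) -> dict:
    """Fall back to neutral prosody for unrecognized states."""
    return PROSODY_PRESETS.get(detected_emotion, PROSODY_PRESETS["neutral"])

print(prosody_for("frustration"))
print(prosody_for("curiosity"))  # unseen label falls back to neutral
```

A lookup table is the simplest possible version of the idea; the post's point is that a reasoning model could choose these settings dynamically from conversational context rather than from a fixed mapping.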
-
LLMs can analyze sentence structure for the first time. Metalinguistic ability is one of the rare traits of language/cognition that we don't find in the animal kingdom. We show that the largest LLMs (with chain of thought) can analyze sentence structure as a linguist would. Linguistic formalism can now serve as a window into the internal representations of very large LLMs (the so-called behavioral interpretability).

We created sentences with syntactic ambiguity, recursion, movement, and phonological problems, and had three graduate students evaluate the models' responses. o1 performs significantly better than the other models. Given the recursive sentence "The worldview that the prose Nietzsche wrote expressed was unprecedented.", o1 generates a syntactic tree and adds a layer of recursion: "The worldview that the prose that the philosopher Nietzsche admired wrote expressed was unprecedented."

Many other aspects of linguistics can be tested in this manner. It remains to be seen whether the models can come up with new, innovative solutions to these problems.

Preprint: https://lnkd.in/gtjcqPDq
The evaluation dataset is available in the OSF repository.
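One way to make "analyzing sentence structure as a linguist would" concrete is to measure clause-embedding depth in a bracketed parse. The bracketing below is my own illustrative analysis of the post's example sentence, not the model's actual output:

```python
# Minimal sketch: measure how many clauses (S nodes) are nested at
# once in a Penn-Treebank-style bracketed parse. The bracketing of
# the example sentence is an illustrative analysis, not o1's output.

def clause_depth(bracketing: str) -> int:
    """Return the maximum number of simultaneously open S nodes."""
    stack, best = [], 0
    for token in bracketing.replace("(", " ( ").replace(")", " ) ").split():
        if token == "(":
            stack.append(None)        # placeholder until label is seen
        elif token == ")":
            stack.pop()
        elif stack and stack[-1] is None:
            stack[-1] = token         # first token after '(' is the label
            if token == "S":
                best = max(best, sum(1 for t in stack if t == "S"))
    return best

# "The worldview that the prose Nietzsche wrote expressed was unprecedented."
parse = ("(S (NP (NP the worldview) (SBAR that "
         "(S (NP (NP the prose) (SBAR "
         "(S (NP Nietzsche) (VP wrote)))) (VP expressed)))) "
         "(VP was unprecedented))")
print(clause_depth(parse))  # three nested clauses
```

Center-embedded sentences like this one are hard for humans past two or three levels, which is why they make good probes of whether a model tracks structure rather than surface order.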
-
Do our brains work like AI language models? 🤯 Google Research and collaborators discovered remarkable similarities between human brain activity and AI language model embeddings during natural conversations! Some key findings:

🧠🔤 Brain areas for speech align with AI speech embeddings, and language areas with word-level embeddings, suggesting our brains use statistical patterns, not strict grammar, for language processing.
🧠💭 Our brains, like LLMs, predict upcoming conversation words and show "surprise" when wrong. Today's AI might reflect core human cognitive principles.
🧠🔄 Unlike AI transformers that process thousands of words at once, our brains process language word by word. This difference offers opportunities for new neural architectures that align with human language learning.

Check out the exciting blog in the comments 👇 #NeuroscienceAI #LanguageProcessing #CognitiveScience #DeepLearning #BrainResearch
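The word-by-word "surprise" mentioned above is usually quantified as surprisal: the negative log-probability of each word given its context. Here is a toy bigram-model sketch; the corpus and probabilities are illustrative, and the actual studies use LLM-derived probabilities rather than bigram counts:

```python
import math
from collections import Counter

# Toy surprisal sketch: how "surprised" a simple bigram model is by
# each word, mirroring the prediction/surprise finding above. The
# corpus is invented; real studies use LLM probabilities.

corpus = "the cat sat on the mat the cat ate the food".split()

unigrams = Counter(corpus[:-1])             # contexts (all but last word)
bigrams = Counter(zip(corpus, corpus[1:]))  # adjacent word pairs

def surprisal(prev: str, word: str) -> float:
    """-log2 P(word | prev); higher means more surprising.
    Assumes the (prev, word) pair occurs in the corpus."""
    p = bigrams[(prev, word)] / unigrams[prev]
    return -math.log2(p)

# "cat" follows "the" twice out of four times; "food" only once.
print(surprisal("the", "cat"))   # 1.0 bit
print(surprisal("the", "food"))  # 2.0 bits
```

In the brain-alignment work, the analogous quantity comes from an LLM's next-word distribution, and higher surprisal correlates with stronger neural responses to unexpected words.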