Sunnyvale, California, United States
4K followers
500+ connections
About
Experience & Education
-
Apple
Licenses & Certifications
Courses
- Business Analytics in Marketing, Finance, and Operations
- Business Intelligence Systems and Big Data Technologies
- Collecting and Analyzing Large Data
- Consumer Behavior
- Customer Analytics
- Data Driven Pricing
- Experiments in Firms
- Fraud Analytics
Projects
-
Research Assistant
-
Exploratory data analysis and modeling of gun transaction trends, exploring predictors of gun sales following a mass shooting.
Other similar profiles
-
Theophilus Siameh
My success in leveraging my diverse technology, business, and communication expertise has spanned two continents, in both private-sector and higher-education settings. I am an expert in financial risk, regression, data mining, machine learning, the Hadoop ecosystem, big data technologies, and predictive modeling. I have empowered organizations to make smarter decisions by effectively leveraging big data and advanced analytics.
2K followers • Plano, TX
Explore more posts
-
Shanshan Zhang
Amazon • 921 followers
Ten Years, Three Lessons

Today marks my 10th year at Amazon. When I started my career, I never imagined staying at the same company for a decade. But Amazon is not just any company - it's a place that constantly reinvents itself, challenges you to think big, and rewards builders.

A friend today reminded me: “You only get so many 10 years in your life.” That struck a chord and prompted me to reflect, not just for myself, but also for the younger version of me and the early-career professionals who’ve reached out for advice. I’ll keep it short and share three lessons that have shaped my journey:

1. Choice is more important than effort. I used to believe that hard work alone guarantees success. Over time, I learned that what you choose (your team, your manager, your priorities) can matter even more. Effort is necessary, but making the right choices gives that effort leverage. As we gain experience, we gain the ability, and the responsibility, to choose better.

2. Invest in relationships. One of the most important reasons I’ve stayed at Amazon for 10 years is the people. I’ve had the privilege of working with brilliant, generous colleagues who’ve taught me more than any textbook ever could. Some of these colleagues have become mentors, collaborators, and even dear friends. Investing in relationships built on trust, respect, and kindness has made all the difference.

3. You can’t have it all, at least not all at once. Every choice involves trade-offs. Especially as a woman, I’ve had to navigate moments when I couldn’t be fully present at home and pursue every professional opportunity. And that’s okay. We are defined not by what we have, but by what we choose.

Looking back, I wish I had taken more risks. I held myself back from opportunities I thought I couldn’t succeed in, without even trying. So for the next 10 years, my goal is simple: be bold, be brave, and choose.
1,021
35 Comments -
Anirban Nandi
Albertsons Companies India • 20K followers
We’re hitting the data ceiling!!

For years, AI improved with one strategy: More data. More parameters. More compute. But scaling laws follow diminishing returns. Each doubling delivers smaller gains than the last.

DeepMind’s Chinchilla showed it clearly: Performance comes from balancing data and model size, not endlessly adding more data. Now estimates suggest high-quality public text data could be exhausted between 2026–2032.

The frontier is shifting: More data → Compute-optimal training → Alignment → Feedback loops → Synthetic refinement

The flywheel isn’t stopping. It’s changing direction. If more data isn’t the answer… Where will the next 10% performance gain come from?

#AI #LLM #GenAI #ScalingLaws #AILeadership
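To make the "balancing data and model size" point concrete, here is a minimal sketch of how a fixed compute budget might be split between parameters and tokens. It relies on two assumptions that do not come from the post: the widely used approximation that training cost is about 6 FLOPs per parameter per token, and the rule of thumb of roughly 20 training tokens per parameter often quoted from the Chinchilla results. The function name and constants are illustrative only.

```python
import math

# Assumption: rule-of-thumb ratio often quoted from the Chinchilla results.
TOKENS_PER_PARAM = 20.0
# Assumption: common approximation, training FLOPs ~= 6 * params * tokens.
FLOPS_PER_PARAM_TOKEN = 6.0


def compute_optimal_split(flops_budget: float) -> tuple[float, float]:
    """Return (n_params, n_tokens) that spend the budget at the assumed ratio."""
    # Solve budget = 6 * N * D with D = 20 * N for N.
    n_params = math.sqrt(flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    for budget in (1e21, 1e23, 1e25):
        n, d = compute_optimal_split(budget)
        print(f"budget={budget:.0e} FLOPs -> params={n:.2e}, tokens={d:.2e}")
```

Under these assumptions both parameters and tokens grow with the square root of compute, so a 100x larger budget calls for about 10x more data. That is one way to see why a finite supply of high-quality text becomes a constraint.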
65
5 Comments -
Vincent Weng
Meta • 6K followers
I’m nearing the 6-month mark as a Data Scientist at Meta. Here are 3 lessons I wish I knew when I first started:

1. The biggest learning curve isn’t technical. It’s product & domain.
↪︎ Most statistical methods and tools already exist. The real challenge is knowing:
- which approach fits the business problem
- what tradeoffs matter
- how the product actually works
Even the best model will fail if it’s solving the wrong problem.

2. Documentation is not “extra” work. It is the work.
↪︎ Only you will understand your analysis best. But if no one else does, it has no impact. Clear storytelling, assumptions, limitations, and takeaways = what actually drives decisions. A great analysis poorly explained is just a personal notebook.

3. Real-world data science = decisions under imperfect assumptions.
↪︎ Business problems are messy. You’ll often:
- not have perfect data
- violate statistical assumptions
- work with noisy signals
But the goal isn’t perfect conditions. It’s moving the business forward with the best possible answer given constraints.

If you want real examples of how these themes play out in practice, the Analytics @ Meta blog is a great resource: https://lnkd.in/gRSnB5hA It’s full of posts written by people doing this work every day.
874
30 Comments -
Amit Gangane
Eureka AI • 1K followers
🚀 Beyond Prediction: Unlocking True Understanding with Causal Inference & ML for Robust Decisions 📊 In the world of Machine Learning, we've become incredibly adept at predicting what will happen. But that's often just half the story. What if we need to know why something happens, or what would happen if we took a specific action? This is where Causal Inference becomes indispensable, moving us beyond mere correlation to true cause-and-effect. While traditional ML excels at finding patterns and making forecasts, it often falls short when answering critical "what if" questions for strategic decision-making. Think about it: - Are our marketing campaigns truly driving sales, or is it just a fleeting trend? - Does a new product feature cause increased engagement, or is there another factor at play? Understanding causation is crucial for making truly robust, confident decisions. It allows us to: 1) Understand the "Why": Move past superficial correlations to identify the real drivers of outcomes. 2) Evaluate Interventions: Predict the impact of actions before we take them (e.g., through techniques like uplift modeling or counterfactual reasoning). 3) Build More Reliable Systems: Design AI solutions that don't just forecast, but inform actionable strategies. By integrating Causal Inference techniques (like A/B testing, instrumental variables, and matching) with the scalability and power of modern Machine Learning, we can unlock deeper insights from complex, high-dimensional data. This bridges the gap between statistical rigor and predictive power, leading to significantly better outcomes in areas like: - Marketing: Optimizing campaign spend based on actual causal impact. - Product Development: Knowing which features truly drive user retention. - Healthcare: Understanding the real effect of treatments or policy changes. 
This isn't just about forecasting the future; it's about building more intelligent systems that help us actively shape it with a clear understanding of intervention effects. What's the most challenging "cause-and-effect" question you've faced in your work? Share your thoughts in the comments! #CausalInference #MachineLearning #DataScience #DecisionMaking #ProductManagement #AI #Analytics #UpliftModeling #TechInsights
7
-
Venkata Naga Sai Kumar Bysani
BlueCross BlueShield of South… • 246K followers
I'm attending NVIDIA GTC 2026. These are the 3 sessions I'm not missing. (And they're completely free)

I went through the entire GTC catalog, so you don’t have to. 500+ sessions. Here are the 3 I'm personally attending as a data professional:

1. Fast Analytics on Billion-Row Datasets
↳ GPU-accelerated analytics on massive datasets
↳ Real-time insights without breaking the budget
↳ This is where enterprise data is heading
→ https://lnkd.in/dGFyH55Z

2. GPU-Powered SQL & Search
↳ How GPUs are transforming SQL and search
↳ Faster querying at scale
↳ If you work with SQL, this is essential
→ https://lnkd.in/dzq4k-Sa

3. Faster Feature Engineering for ML Models
↳ Hands-on Deep Learning Institute session
↳ Feature engineering at 10x speed
↳ Perfect for data scientists building ML pipelines
→ https://lnkd.in/dFCvb2GM

Why these 3? GPU-accelerated data skills are becoming non-negotiable. Companies are moving from traditional processing to GPU-powered workflows. These sessions teach exactly what hiring managers will be looking for in 2026 and beyond.

The best part? You can attend virtually for free. I'll be sharing key takeaways from each session for those who can't attend live.

Which session are you most interested in? 👇

♻️ Repost to help data professionals level up their skills

In partnership with NVIDIA
311
40 Comments -
Mark Berkovics
Horizon's Edge AI • 644 followers
The 2025 State of AI (through Karpathy’s lens) - it’s not “the year of agents.” It’s the decade. Recently, Andrej Karpathy was interviewed on the Dwarkesh podcast (I'll put a link in the comments), and this is what he said on the current state of AI agents: 1) Creating AI agents that work reliably is a project that will take up to 2035 to complete. We don’t yet have the scaffolding for coworker-level agents: durable memory, reliable multimodal perception, continual learning, robust tools/OS control. This is a marathon of engineering, not a quarter’s roadmap. 2) Software progression from 1.0 to 2.0 to 3.0 1.0: humans write code 2.0: data writes weights 3.0: we program via language + LLMs Don’t skip the representation era. Strong “cognitive core” comes first, agents second. 3) We’re building “ghosts,” not animals. Models simulate human behavior from data. No instincts. No body. Expect brilliance in patterning, less so in grounded, embodied common sense (at least for now). 4) Reasoning lives in the context window. Weights are kind of like long-term memory. Context is kind of like working memory where thinking happens. Feed richer context, get sharper reasoning. Retrieval is way better than wishful prompting. 5) Only part of the ‘brain’ exists We have Cortex-like patterning. We don't have Hippocampus-like memory consolidation, emotion/instinct, coordination. That’s why current systems “think,” but don’t truly embed. 6) Building beats hand-wavy prompting Use AI to accelerate, not abdicate. Autocomplete helps but novel system design still needs humans who understand what they’re building. 7) RL is inefficient, process beats outcome-only rewards. Next step: process-based supervision and reflection loops. Teach models to review, self-correct, and generate useful training data. 8) Forgetting can improve generalization. Humans generalize because we forget. Models that memorize everything risk rigidity. Curate, prune, and reinforce abstraction, not trivia. 
9) Bigger isn’t always better. Clean, high-quality data + smart architecture can outperform “just scale it.” A lean, well-trained cognitive core can feel startlingly capable. So, to summarize, here are a few practical takeaways: * Design for context: Retrieval + tool use + structured memory. * Instrument the process: Critique, reflect, and fine-tune steps, not just outputs. * Curate ruthlessly: Data hygiene is a moat. * Ship scaffolding: Memory, tool orchestration, reliability. Think in decades, execute weekly, and compound small real improvements. Let's not forget that we’re mid-build, not at the finish line. The transformer will likely be with us a while, just refined. The winners will be the ones who build patiently, instrument relentlessly, and curate obsessively. Let me know your thoughts in the comment section
1
1 Comment -
Filip Vítek
Global Digital Group • 11K followers
WIKIPEDIA ALSO HIT BY AI. Similarweb published stats on year-over-year traffic change, and they show that Wikipedia, like Google Search, Stack Overflow, and other “seek the answer” platforms, has been seriously hit by LLMs giving direct AI summaries rather than sending individuals to read the original sources. I'm going to attach a small hack: do you want some topic to be represented in LLM training sets? Then create a Wikipedia page about the topic and you are “guaranteed” to swim into the training data.
74
9 Comments -
Spencer Hodapp
Alarm.com • 4K followers
Confession time: I spent 40 minutes debugging a query this morning only to realize I forgot a comma. 🤦‍♂️ We talk a lot about complex window functions and performance tuning, but let’s be real: 90% of the job is just managing your own syntax errors and ensuring your JOIN logic isn't creating a massive Cartesian product. If you’re just starting out: it doesn't get 'easier,' you just get faster at spotting your own typos. What’s the dumbest mistake that’s cost you an hour this week? Drop it below so I feel less alone.
1
1 Comment -
Shinibali Bhattacharyya, PhD
Amazon Web Services (AWS) • 913 followers
Over the last couple of years, as LLMs have become incredible at agentic capabilities and the workforce has started mass adopting agents to do anything and everything, there has been a strong trend of “build end-to-end on your own” in every layer of an org: from product, UX, sales, GTM, to devs & researchers. Leaders at all major corporations are actively incentivizing AI usage to boost every individual’s productivity to unprecedented levels. Part of our workplace benefits now includes free token credits, which often surpass our annual cash bonus.

This post is not meant to criticize solo-preneurship, AI usage, and “build it alone” qualities as evil, harmful, or energy-guzzling debacles. But in a world that is becoming increasingly individualistic and divided, is this the direction we want to head in? Would you push back on incentivizing the race of the individual winner, and champion “we can all win together as a team”? How can we build models and systems that bring people closer, and not reward them for pushing each other farther apart?

At the end of the day, the cherished memories you make from your entrepreneurship journey, or at the workplace, will be the kudos and the helping hand you share with your teammates and other humans. It won't be the AI’s sycophantic validation “You are absolutely right!”. I really hope this post reaches the leaders in charge of decision-making so they rethink the future they want to create within the workforce.

Shout-out to the #SiliconValleyGirl Marina Mogilko for touching upon this topic in her podcast: https://lnkd.in/gqpm3xJx

#TeamCentricAI #TeamCentricAgents #siliconvalley #TechIndustry #Anthropic #OpenAI #Google #Amazon #LinkedIn
20
2 Comments -
Piotr Kalański
TrueBlue Inc. • 3K followers
We cut inherited ML infrastructure costs by 50% — $10k/month, sustained — without retraining a single model or triggering a single performance regression. The mechanism wasn't clever engineering. It was a two-list audit: Keep. Change. Everything resolved to one of those two columns. The keep list did real work. Both inherited models used XGBoost with genuine domain knowledge. The integration contracts, DynamoDB as the online feature store, the Pricing API safety guardrails — all kept, explicitly, with written justification. Resisting the pressure to prove value through change is a decision that requires justification. The keep list is that justification. The change list is where the costs lived. Feature store consolidation with Feast. A five-hop API path simplified to two (and SageMaker Endpoint dropped entirely — we didn't use it). Candidate scoping that cut the batch computation 10x. DuckDB replacing Spark/Glue for a workload that never needed Spark. Decommissioned experimentation infra that had been running with no active experiment for months. AWS account consolidation. The single most important factor: none of this was visible to the previous team. Not because they were careless. Because single ownership of two systems makes the overlap legible in a way distributed ownership never can. Try this: pick a system your team has run for two or more years. Spend one hour building two lists — what you'd keep and what you'd change if you inherited it today with no context on why each choice was made. The gap between those lists is your hidden cost. What did your last infrastructure audit find that the original team couldn't see? Full breakdown on Medium: https://lnkd.in/dQ6rT2DW
9
3 Comments -
Shishir Kumar Prasad
Instacart • 3K followers
Proud to share our latest work on PARSE, Instacart’s self-serve platform for product attribute extraction using multi-modal LLMs (text + image). From identifying “80 sheets” on packaging to reasoning over “3 boxes of 124 tissues,” PARSE brings accuracy, coverage, and speed to our catalog systems at scale.

Key highlights:
- Multi-modal LLMs for robust attribute extraction
- Zero/few-shot setup with confidence scoring
- 70% cost savings on simpler attributes
- Built-in quality review, versioning, and ongoing automation

Huge kudos to the entire catalog team and our cross-functional partners 👏
35
-
Namratha Bhat
Target • 424 followers
Towards the end of 2025, my colleagues and I at Target had the chance to attend a deep-dive on Agentic AI by Abhijith Neerkaje and Ajay Shenoy. Ajay broke down the math behind the concepts, while Abhijith focused on how it all comes together in practice—use cases, different types of agents, and how to think about evaluating them. Spread across two full-day (8-hour) sessions, they covered a lot of ground. Some of the concepts were intense, but the sessions were extremely informative. What really stood out was the patience and clarity with which everything was explained, and how open both of them were to questions throughout. If you’re looking for a solid foundational course in Agentic and Generative AI, this is a great place to start.
35
1 Comment -
Ashish Singh
Icertis • 2K followers
Hiring: GenAI Data Scientists (multiple roles) 5 questions. Answer yes to all 5? DM me. 1. Can you architect a RAG pipeline end-to-end: not just call an API? 2. Have you chosen, fine-tuned, or benchmarked embedding models in production? 3. Do you build eval frameworks or do you just vibe-check your outputs? 4. Has something you built actually served real users at scale? 5. Does the phrase "it works in the notebook" make you uncomfortable? All 5? Send me a 3-line intro: • Who you are • What you shipped • What you want to build next That's it. Just one builder looking for others. #Hiring #GenAIScientist
59
2 Comments -
Shivani A.
Adobe • 1K followers
Bias in Causal Inference | Framing the problem

When you can’t run an A/B test and rely on observational data to understand the impact of initiatives (e.g., promotions or new feature introductions), careful problem framing and awareness of bias are essential. Two common biases in causal inference are selection bias and confounding bias. Here’s a summary of the former, selection bias:

Selection Bias | Conditioning on an Effect (Collider Bias)

Selection bias from conditioning on an effect arises when we restrict the analysis to a variable influenced by treatment. For example, measuring a promotion’s impact on cancellations only among purchasers introduces bias. The promotion increases purchases, and purchase intent affects both purchases and cancellations, leading to misstated impact estimates.

Problem framing depends on the outcome you want to influence. For example, to reduce overall churn, measure the total effect across all exposed users, including non-purchasers, capturing funnel-level influence. For post-purchase interventions, measure the conditional effect among purchasers.

MVP causal representation:
Promotion → Purchase ← Purchase Intent (unobserved)
Purchase → Cancellation

TLDR: Measure total cancellation impact for broad churn prevention strategies OR conditional impact among purchasers for tactical post-purchase interventions.

Selection Bias | Conditioning on a Mediator

Selection bias from conditioning on a mediator occurs when we control for a variable on the causal path from treatment to outcome. For example, notifications influence engagement, which impacts retention. Controlling for engagement removes part of the effect, giving an incomplete estimate.

For attributing retention impact from notifications, focus on the total effect, capturing both direct impact and that flowing through engagement. To understand how much retention is driven by engagement, measure the mediated effect, which helps guide targeted engagement-driving initiatives without conflating total impact.

MVP causal representation:
Notification → Engagement (Mediator) → Retention
Notification → Retention (direct)

TLDR: Measure total impact on retention for measuring the success of feature introduction (e.g., notifications) OR the mediated impact from engagement to retention for targeted engagement strategies.

Note: All opinions are my own and shared for data science community discussion.
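The collider case can be illustrated with a small simulation. This is only a sketch under made-up assumptions: every probability below is hypothetical, purchase intent is drawn uniformly, and the promotion is given no effect on whether a given purchaser cancels. Even so, comparing cancellation rates among purchasers makes the promotion look harmful, because it pulls lower-intent users into the purchaser group.

```python
import random


def purchaser_cancellation_rates(n_users: int = 200_000, seed: int = 0) -> dict:
    """Simulate users and return cancellation rate among purchasers by promo arm."""
    rng = random.Random(seed)
    # promo flag -> [number of purchasers, number of cancellations]
    counts = {0: [0, 0], 1: [0, 0]}
    for _ in range(n_users):
        promo = 1 if rng.random() < 0.5 else 0   # randomized exposure
        intent = rng.random()                    # unobserved purchase intent
        # Hypothetical purchase model: both intent and promotion raise purchases.
        p_buy = 0.1 + 0.6 * intent + 0.3 * promo
        if rng.random() >= p_buy:
            continue                             # non-purchasers drop out here
        # Hypothetical cancellation model: depends on intent only, not on promo.
        p_cancel = 0.5 * (1.0 - intent)
        counts[promo][0] += 1
        if rng.random() < p_cancel:
            counts[promo][1] += 1
    return {arm: cancels / buyers for arm, (buyers, cancels) in counts.items()}


if __name__ == "__main__":
    rates = purchaser_cancellation_rates()
    print(f"cancellation rate among purchasers, no promo: {rates[0]:.3f}")
    print(f"cancellation rate among purchasers, promo:    {rates[1]:.3f}")
```

The gap between the two rates comes entirely from who ends up in the purchaser group, which is why the framing question (total effect across all exposed users versus conditional effect among purchasers) has to be settled before reading the numbers.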
99
-
Sumit Kumar
Meta • 8K followers
I just published Vol. 139 of "Top Information Retrieval Papers of the Week" on Substack. My Substack newsletter features the 7-10 most notable research papers on information retrieval (including recommender systems, search & ranking, etc.) from each week, with a brief summary, and links to the paper/codebase. This week’s newsletter highlights the following research work: 📚 Self-Evolving Search Agents without Training Data, from Meta 📚 Applying Embedding-Based Retrieval to Airbnb Search, from Airbnb 📚 Rethinking Item Identifiers in LLM-Based Recommendation, from Kuaishou 📚 Is Agentic RAG worth it? An experimental comparison of RAG approaches, from Ferrazzi et al. 📚 Retrieval-Free Domain Specialization for Frozen Language Models, from Li et al. 📚 Reinforcement-Based Vector Selection for Scalable Multi-Vector Retrieval, from LG Uplus 📚 Rank-Aware Loss Functions for Autoregressive Document Ranking, from Google DeepMind 📚 Orthogonal Subspace Learning for Joint Search and Recommendation in Large Language Models, from Zhao et al. 📚 Bridging ColBERT and Approximate Nearest Neighbor Search, from Kumar et al. 📚 Learning-Free Binary Embeddings for Efficient LLM-Based Retrieval, from Nanjing University #InformationRetrieval #ResearchPapers #CuratedContent #Newsletter #substack
52
2 Comments -
Jesca Birungi
Freelance • 17K followers
One interesting thing about learning #machinelearning with #R is that there is no single “best” resource. Different resources work for different stages of learning. After sharing an introductory resource earlier, I thought it might also help to share something more hands-on. Hands-On Machine Learning with R: https://lnkd.in/d6zjNWzH This guide focuses on implementing machine learning techniques in R through practical examples, covering areas such as: • Tree-based models • Boosting methods • Clustering techniques • Neural networks • Ensemble models For #statisticians, #biostatisticians, and #dataanalysts working in R, it can be a helpful way to build practical machine learning skills step by step. I’d love to hear from my network: What resources would you recommend for learning machine learning with R? #Biostatistics #MachineLearning #RStats #DataScience #OpenScience #StatisticalLearning
40
4 Comments -
Rakesh Tripathi
Nielsen • 3K followers
In recent conversations, I've noticed two pairs of concepts that seem to cause confusion in the DS community, and I'd like to take a moment to clarify them:

1. The bias of an estimator vs. the bias-variance trade-off.
2. Covariate shift in causal inference vs. feature drift.

The bias of an estimator is a property of a single estimator. An estimator is considered unbiased if, on average, it hits the true value of the parameter it's trying to estimate. In other words, if you were to repeat your experiment many times, the average of your estimates would equal the true value. The bias-variance trade-off, by contrast, describes a fundamental tension in model building when choosing the complexity level: simpler models tend to have higher bias and lower variance, while more complex models tend to have the reverse.

Covariate shift in causal inference and feature drift in machine learning are related concepts, but they refer to different problems and appear in different contexts. Covariate shift describes a situation where the distribution of the input variables (covariates) in the test data is different from the distribution in the training data, while the relationship between the covariates and the outcome variable remains the same. In a causal context, this is a big problem because an effect estimated on one population may not carry over to another, which can invalidate causal conclusions.

Feature drift is a term from machine learning monitoring. It refers to the statistical properties of the input features changing over time in an unexpected way. It is often discussed alongside concept drift, where the relationship between the features and the target variable changes.

Hope this helps readers.
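The "repeat your experiment many times" definition of estimator bias can be checked directly. The sketch below is an illustration chosen here, not something from the post: it draws many small samples from a standard normal distribution (true variance 1.0) and averages two variance estimators from Python's `statistics` module, one dividing by n and one dividing by n - 1. The function name, sample size, and trial count are arbitrary.

```python
import random
import statistics


def average_variance_estimates(sample_size: int = 5,
                               trials: int = 20_000,
                               seed: int = 0) -> tuple[float, float]:
    """Average the divide-by-n and divide-by-(n-1) variance estimates over many samples."""
    rng = random.Random(seed)
    biased_sum = 0.0
    unbiased_sum = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        biased_sum += statistics.pvariance(sample)   # divides by n
        unbiased_sum += statistics.variance(sample)  # divides by n - 1
    return biased_sum / trials, unbiased_sum / trials


if __name__ == "__main__":
    biased, unbiased = average_variance_estimates()
    print(f"mean of divide-by-n estimates:     {biased:.3f}")
    print(f"mean of divide-by-(n-1) estimates: {unbiased:.3f}")
```

With samples of size 5, the divide-by-n estimator averages close to (n - 1)/n = 0.8 of the true variance, while the divide-by-(n - 1) estimator averages close to 1.0. That systematic gap is estimator bias, and it is a separate idea from the model-complexity trade-off.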
43
1 Comment -
Anshul Yadav
DMI Finance Private Limited • 1K followers
🔍 "RAG Won’t Save You — Unless You Understand NLP First." Let’s talk about what’s really behind the hype around Retrieval-Augmented Generation (RAG). Everyone’s building “RAG apps” today — but very few actually understand what makes them work. Here’s the hard truth: RAG isn’t about large language models (LLMs). It’s about retrieval engineering — classic NLP meets modern AI infrastructure. 🧠 The Core Idea RAG = Retrieval + Generation But the “generation” part — the LLM response — is the last step in the pipeline. The real magic lies in what happens before that. If your RAG system isn’t performing well, 9 out of 10 times the issue is in your retrieval pipeline, not your LLM. ⚙️ The Engineering Layer Nobody Talks About To build a solid RAG system, you need to nail each layer: 1. Chunking Strategy: Don’t split text blindly. Use semantic or recursive chunking to preserve context. (Cutting mid-sentence = noisy embeddings = garbage retrieval.) 2. Embedding Models: Choose wisely — sentence-transformers, multilingual, or domain-specific models can drastically affect recall and precision. 3. Similarity Search: Cosine vs. dot-product vs. dense reranking — each has trade-offs in latency and accuracy. 4. Vector Indexing: Use the right vector DB — FAISS, Chroma, Weaviate, or Milvus — and tune your ANN (Approx. Nearest Neighbor) parameters. 5. Context Stitching: Injecting retrieved chunks into prompts isn’t trivial. Context windows, token limits, and ordering logic matter a lot. 🚫 It’s Not “Just Add Context to OpenAI” A well-engineered RAG pipeline handles: - Low-relevance edge cases via reranking - Hallucination reduction via better retrieval control - Component chaining (Retriever → Reranker → Prompt Template → LLM) That’s real AI engineering, not prompt hacking. 🎯 The Takeaway RAG is 80% retrieval and 20% generation. The LLM is the cherry — but the cake is built from solid NLP fundamentals. 
If you can design that foundation, your Agentic AI systems won’t just respond — they’ll reason. #RAG #AIEngineering #AgenticAI #NLP #LLM #VectorDB #GenAI #LangChain #ChromaDB #MachineLearning #MLEngineering
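The similarity-search trade-off in step 3 can be seen with two toy vectors. This is a minimal sketch with made-up two-dimensional "embeddings" and hypothetical chunk names, not output from any real embedding model: dot product rewards vector length as well as direction, while cosine similarity looks at direction only, so the two can rank the same chunks differently.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product normalized by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical embeddings: one chunk points almost exactly along the query
# direction but is short; the other is long but only partly aligned.
QUERY = [1.0, 0.0]
CHUNKS = {
    "aligned_short": [0.9, 0.1],
    "off_axis_long": [3.0, 3.0],
}


def top_chunk(score) -> str:
    """Return the name of the chunk with the highest score against QUERY."""
    return max(CHUNKS, key=lambda name: score(QUERY, CHUNKS[name]))


if __name__ == "__main__":
    print("top chunk by dot product:", top_chunk(dot))
    print("top chunk by cosine:     ", top_chunk(cosine))
```

Here the dot product prefers the long, partly aligned vector and cosine prefers the short, well-aligned one. If the embedding model produces unit-length vectors the two metrics agree, so it is worth checking whether the chosen model normalizes its outputs before picking a metric.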
13
-
Blake Feiza
Apple • 3K followers
🎉 Very excited about this blog post from Ethan, making his debut on the “Feiza Data” blog on Medium! He does a great job taking a tricky concept, Bayesian A/B testing, and explaining it clearly with examples AND all Python code included! If you’re in the data science space or just interested in learning about statistical concepts, I highly recommend giving this post a read! Check it out: 👉 https://lnkd.in/g49pYvE8
19
-
Jin Liu
Nordstrom • 2K followers
Excited to share that my Temporal webinar — "From Chat to Docs with Agentic AI" — is now available! The problem we were solving: support channels are full of tribal knowledge — corrections, workarounds, code snippets — but none of it makes it back into the docs. What that means is your documentation slowly drifts from reality. So we built agentic workflows with Temporal and LLMs that close that loop automatically. The system monitors chat threads, extracts structured insights, and opens PRs to update documentation — no manual curation needed. I cover the architecture, the AI extraction pipeline, and how Temporal's durable execution keeps these long-running processes reliable. Big shoutout to Shubham Shukla for being an amazing partner on this work, and to Julie Creamer for the support and leadership that made this possible at Nordstrom. Thanks to Ethan Ruhe and Hannah Short at Temporal Technologies for the opportunity and for helping put this together. Also — I'll be at REPLAY May 5–7 at Moscone South, San Francisco. Excited to learn from other great teams in the Temporal community. See you there! YouTube: https://lnkd.in/ga73Y45A Webinar page: https://lnkd.in/gRTzwMC4 #AI #AgenticAI #Temporal #DurableExecution #LLM #SoftwareEngineering
32
2 Comments