Kolkata, West Bengal, India
9K followers
500+ connections
About
With over 7 years of professional expertise, I am a seasoned Data…
Services
Activity
-
Vikash Das posted this: GenAI interviews today feel harder than the actual GenAI job itself. Feel free to disagree, but that's what I have observed recently. #GenAI #LLM #MLOps #AIEngineering #MachineLearning
-
Vikash Das posted this: The biggest AI lie on LinkedIn right now: "AI will replace everyone." It sounds dramatic. It gets clicks. But it misses what is actually happening. Last year in my organization, our manager introduced several AI tools into our team’s workflow. Instead of replacing people, the opposite happened. Developers produced output faster. Analysts processed data quicker. Writers drafted content in minutes instead of hours. No one lost their job. But one thing changed. The people who **learned how to work with AI suddenly became far more valuable**. Here is the truth most posts ignore. AI is not replacing professionals. It is **exposing professionals who refuse to evolve**. The real risk is not AI taking your job. The real risk is someone using AI doing your job better. What do you think is the biggest myth about AI right now? Follow me for practical insights on AI, GenAI, and careers. #ArtificialIntelligence #GenAI #FutureOfWork
-
Vikash Das posted this: Most companies are underestimating a quiet shift happening right now. A junior employee using AI can outperform a senior who doesn't. Last month I spoke with a project manager who hired two analysts. One had 7 years of experience. The other had just graduated. The senior worked the traditional way: manual research, spreadsheets, long reports. The junior used AI for everything. ChatGPT for research summaries. AI tools to analyze data. Automation to build reports. Result: the junior finished tasks in half the time and delivered deeper insights. This is the uncomfortable truth about the AI era. Experience is still valuable. But **AI multiplies speed and leverage**. And someone who knows how to use it can move far faster than someone who refuses to adapt. The real divide in the workplace is no longer junior vs senior. It is **AI users vs non-AI users**. Curious to hear your take. Do you think AI can level the experience gap at work? Follow me for more insights on AI and the future of work. #ArtificialIntelligence #FutureOfWork #AI
-
Vikash Das shared this: Hiring for part-time position: We are building industry-grade GenAI talent, and we are looking for experienced professionals who want to mentor at that level. This is a part-time, contract-based opportunity for experienced professionals who are actively working in the GenAI ecosystem and want to contribute to building industry-ready talent. Before applying, please read the eligibility criteria carefully. We are looking only for high-quality, experienced professionals. Minimum Requirements: • 3 to 5+ years of hands-on experience working with the GenAI tech stack • Strong working experience with at least 90% of the following: [Advanced RAG chatbots, LangChain, LangGraph, LangSmith, Agentic AI systems, CrewAI, n8n, Deployment of GenAI applications, MCP protocol, Developer tools such as Cursor AI, Windsurf, and similar platforms] • Availability to contribute 2 to 3 hours during weekends • Genuine interest in teaching and mentoring. Prior teaching or mentoring experience will be a strong plus • Compensation will be on an hourly basis Please apply and fill out the Google Form only if you meet the above criteria. After reviewing your application, our team will get back to shortlisted candidates. We are building serious, industry-grade talent. If you align with that mission and have the depth of experience required, we would love to hear from you. Apply here -> https://lnkd.in/gtMvmHw9
-
Vikash Das posted this: Aug 2000, Linus Torvalds: “Talk is cheap. Show me the code.” Jan 2026, Kailash Nadh: “Code is cheap. Show me the talk.” 26 years. Same industry. Opposite directions. In 2000, writing code was rare. Execution mattered. In 2026, writing code is abundant. Clarity matters. Today, anyone can generate code. But can you: explain why it exists? Defend what it solves? Align it with real business value? The bar has shifted. Execution is not enough anymore. Thinking out loud clearly is the new seniority signal. Funny how legends agree, just not on when. What era do you think we are really optimizing for right now: builders or explainers?
-
Vikash Das shared this: That's the type of "Good Morning" message that I prefer 😀 Good example set by Keshav Reddy 🎉 Others can access the free playlist here: https://lnkd.in/g-93t5eP Free Roadmap & Resources: https://lnkd.in/gFtRh334 #datascience #mlops #course #jobs
-
Vikash Das shared this: LLMOps is becoming a real role, not just a buzzword. I put together a 100-question LLMOps guide covering: prompt versioning, model routing, cost control, evaluation, observability, safety, and incident handling. These questions reflect what breaks first when LLM systems hit production. Sharing the mini playbook below for anyone building GenAI systems seriously.
-
Vikash Das shared this: Good analytics is not about dashboards. It is about decisions. This 100-question Data Analytics guide focuses on: KPIs, SQL, experimentation, reporting, and stakeholder communication. If you want to stand out in analytics interviews by showing business clarity, not just tool knowledge, this will help. Posting the mini version here.
-
Vikash Das shared this: MLOps interviews are where theory meets reality. I created a 250-question MLOps interview guide focused on real production systems: pipelines, CI/CD, deployment, monitoring, drift, Kubernetes, and failure handling. These are the kinds of questions that reveal whether someone has actually thought about operating ML systems at scale. Sharing the mini guide below for ML and MLOps engineers.
Experience & Education
-
Aspire General Insurance
Explore more posts
-
Shubham Rupareliya
KRISHNA TEX • 627 followers
🚀 Starting My MSc in Data Science – Day 6 Statistics Every Data Scientist Must Know 📐📊 Today’s realization: Data Science isn’t magic — it’s math + logic + context 🧠📈 Data Science is built on statistics, not just machine learning. Without statistical understanding, models become black boxes. 🔹 Key statistical concepts every Data Scientist should know: 🔸 Descriptive Statistics Mean, Median, Mode Variance & Standard Deviation → Helps summarize and understand data 🔸 Probability Basics Random variables Probability distributions → Foundation for predictions & uncertainty 🔸 Data Distributions Normal distribution Skewness & Kurtosis → Understanding data shape matters before modeling 🔸 Correlation vs Causation Correlation ≠ Cause → One of the most common real-world mistakes 🔸 Hypothesis Testing Null & Alternative hypothesis p-value → Used to validate assumptions with data 🔸 Confidence Intervals Measure uncertainty in estimates → Helps make data-driven decisions with confidence 💡 Key Student Insight: Machine Learning explains what happens. Statistics explains why it happens. Strong statistical foundations = better models & better decisions. 📌 Sharing daily learnings from my MSc in Data Science journey. Let’s master the fundamentals together 🚀 👉 Which statistics topic do you find most challenging? #DataScience #StatisticsForDataScience #MScDataScience #MachineLearning #Probability #Analytics #LearningInPublic #CareerInTech #LinkedInDaily #Day6
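To make the descriptive statistics and confidence interval items above concrete, here is a minimal sketch using only Python's standard library. The sample values are made up for illustration, and the interval uses a normal approximation, which is an assumption; for a sample this small a t-based interval would be somewhat wider.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample: daily order counts over ten days (made-up numbers).
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

# Descriptive statistics: summarize the center and spread of the data.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Approximate 95% confidence interval for the mean (normal approximation).
z = NormalDist().inv_cdf(0.975)          # two-sided 95% critical value
margin = z * stdev / len(sample) ** 0.5  # z times the standard error
ci = (mean - margin, mean + margin)

print(f"mean={mean}, median={median}, stdev={stdev:.2f}")
print(f"approx. 95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval quantifies how uncertain the estimated mean is; a wider interval means the sample says less about the true average.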
7
-
Amit Kumar
Bosscoder Academy • 3K followers
Why Handling Missing Values Is More Important Than Choosing a Model. Hello, I’m Amit, currently working as a Data Analyst and aspiring to grow as a Data Scientist. Today, I want to share a practical lesson I’ve learned while working with data and building ML projects: before choosing a model, check your missing values. Early in my learning phase, I was more focused on algorithms: which model to use, how to tune it, how to improve accuracy. But over time, especially through hands-on data work, I realized something important: missing data can silently impact your results more than the model itself. Null values can introduce bias. Incorrect imputation can distort distributions. Dropping rows blindly can remove meaningful patterns. Now, before building any model, I make it a habit to: analyze the percentage of missing values; understand why the data is missing; choose an appropriate handling strategy instead of defaulting to mean/median. Model selection is important. But data preparation determines how trustworthy your results truly are. #DataAnalytics #MachineLearning #DataScience #DataPreparation #LearningJourney
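The first habit above, analyzing the percentage of missing values, can be sketched in a few lines of plain Python. The records and column names below are hypothetical and `None` is assumed to mark a missing value; this is an illustration, not the author's own workflow.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "city": "Pune"},
    {"age": None, "income": 61000, "city": "Delhi"},
    {"age": 29, "income": None, "city": None},
    {"age": 41, "income": None, "city": "Kolkata"},
]

def missing_percentages(rows):
    """Return {column: percent of rows where the value is None}."""
    columns = rows[0].keys()
    total = len(rows)
    return {
        col: 100.0 * sum(1 for r in rows if r.get(col) is None) / total
        for col in columns
    }

report = missing_percentages(rows)
for col, pct in report.items():
    print(f"{col}: {pct:.0f}% missing")
```

A column with a high share of missing values usually deserves a closer look at why it is missing before any imputation strategy is chosen.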
13
-
Nitin kohli
LinkedIn • 5K followers
Reshared from Rachit Jindal (Data Scientist & Business Analyst | Python, SQL, Power BI, ML & AI | Helping organizations leverage data for strategic decisions): Everyone talks about Data Science like it's just: ➡️ Data Prep ➡️ Modeling But the reality? It’s a whole ecosystem. From collecting and labeling data… to cleaning, exploring, and engineering features… to building scalable pipelines and deploying models… to monitoring performance, ensuring privacy, and maintaining systems… 📊 The actual work is far beyond just “training a model.” Real Data Science = Problem-solving + Engineering + Continuous improvement If you’re starting out, don’t just focus on models. Focus on understanding the entire pipeline; that’s what makes you valuable. #DataScience #MachineLearning #MLOps #AI #TechCareers
191
-
Sarah Rodrigues
Ford Motor Company • 756 followers
🔍 Active Review: ETL Pipeline in Practice. Nothing better than a hands-on active review! Instead of just revisiting concepts, I applied them in practice: - Data cleaning and validation - Handling missing and inconsistent values - Binary and one-hot encoding for categorical variables - Feature scaling for numerical columns - Preparing a clean dataset ready for ML models. What stood out for me is that knowing is not enough: you need to review, reshape your ideas, and explore new perspectives to gain deeper insights. Everyone says you need to code EVERY DAY, and it’s true: like learning a new language, you need daily immersion. Going step by step through an ML pipeline reinforces understanding and uncovers improvements you might otherwise miss. By the way, it's so quick! Why not? Next steps: applying Logistic Regression and Random Forest to evaluate performance and interpretability 🚀 📊 Data preprocessing is not just a step; it’s the foundation. #DataScience #MachineLearning #ETL #ActiveLearning #ChurnPrediction #Python #DataPreparation #ContinuousLearning - Check the ETL pipeline on my GitHub: https://lnkd.in/dxNqtWtD See ya! (:
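Two of the preprocessing steps listed above, one-hot encoding and feature scaling, can be sketched without any libraries. The column values are invented for illustration, and min-max scaling is just one common choice of scaler (an assumption; the post does not say which one was used).

```python
def one_hot(values):
    """One-hot encode a list of category labels.

    Returns (encoded_rows, categories), with categories in sorted order.
    """
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

def min_max_scale(values):
    """Rescale numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical churn-style columns (made-up values).
plans = ["basic", "premium", "basic", "standard"]
monthly_charges = [20.0, 80.0, 35.0, 50.0]

encoded, categories = one_hot(plans)
scaled = min_max_scale(monthly_charges)

print(categories)  # column order of the one-hot vectors
print(encoded)
print(scaled)
```

In a real pipeline the category list and the min/max would be learned on the training split only and then reused on new data, so that both splits are encoded consistently.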
8
-
Saurabh Tiwari
Shemfort Scholastic School • 4K followers
🚀 Day 17 (22/11): Subqueries in SELECT & FROM — Seeing Data Inside Data! Today in my 21-Day SQL Challenge with Indian Data Club and DPDzero, I learned one of the most powerful concepts in SQL: 👉 Subqueries inside SELECT and FROM clauses These allow us to calculate something inside another calculation — like mini-tables inside a big table. This is used a lot in real analytics when we compare services, departments, products, or performance against overall averages. 🧩 Daily Challenge Task: Create a report that shows for every service: Service name Total patients admitted Difference between its total admissions and the overall average admissions Rank indicator Above Average Average Below Average And sort by highest admitted first This challenge made me think like a real data analyst — how to compare one service with the whole hospital. 📘 What I Learned Today ✅ How subqueries inside FROM create a “mini-table” ✅ How subqueries inside SELECT help compare values ✅ Why derived tables are powerful in reporting ✅ How to calculate averages and compare each service against it ✅ How to turn calculations into meaningful labels This exercise was super helpful in understanding real-world reporting logic — where we compare performance against overall benchmarks. A big thanks to Indian Data Club and DPDzero for building such a smart learning path. Feeling more confident day by day! 💪 #SQL #DataAnalytics #SQLWithIDC #Subqueries #SQLLearning #IndianDataClub #DPDzero
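The daily challenge described above can be sketched with Python's built-in `sqlite3` module. The table name `admissions`, its columns, and the sample rows are assumptions made for illustration, since the challenge's actual schema is not shown in the post. The query uses a derived table in `FROM` for per-service totals and a scalar subquery in `SELECT` for the overall average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; the real challenge dataset is not shown.
conn.execute("CREATE TABLE admissions (service TEXT, patients_admitted INTEGER)")
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?)",
    [("Cardiology", 120), ("Cardiology", 80), ("Neurology", 90),
     ("Neurology", 60), ("Orthopedics", 50), ("Orthopedics", 50)],
)

REPORT_SQL = """
SELECT
    service,
    total_admitted,
    diff_from_avg,
    CASE
        WHEN diff_from_avg > 0 THEN 'Above Average'
        WHEN diff_from_avg < 0 THEN 'Below Average'
        ELSE 'Average'
    END AS rank_indicator
FROM (
    SELECT
        s.service,
        s.total_admitted,
        -- Scalar subquery in SELECT: the average of the per-service totals.
        s.total_admitted - (
            SELECT AVG(t.total_admitted)
            FROM (
                SELECT SUM(patients_admitted) AS total_admitted
                FROM admissions
                GROUP BY service
            ) AS t
        ) AS diff_from_avg
    -- Subquery in FROM: a "mini-table" of totals per service.
    FROM (
        SELECT service, SUM(patients_admitted) AS total_admitted
        FROM admissions
        GROUP BY service
    ) AS s
) AS report
ORDER BY total_admitted DESC
"""

report = conn.execute(REPORT_SQL).fetchall()
for row in report:
    print(row)
```

Wrapping the calculation in an outer query keeps the `CASE` labels readable, because `diff_from_avg` is computed once and then reused by name.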
2
-
Daniyal Qasim
SquareTrade • 4K followers
People often look down on Excel as if it is “basic.” It’s fair to say that I have seen it do extraordinary things. Of course, grow your skills. Learn SQL. Pick up Python. Know when to reach for specialized software. But remember, tools are just tools. For example, I have worked with teams who: - Built reporting systems that ran like clockwork. - Solved problems no one thought could be solved. - Created models that shaped business decisions. And yes, all of it happened in Excel. Excel isn’t outdated; it is underestimated. You can only make an impact through the problems you solve in the business, instead of how good you are at using a software.
-
Vikky Singh
ApolloPharmacy • 3K followers
Transition matrix to find errors in an agent: When you have an agent with many steps (e.g., 'Plan', 'Search', 'Code', 'Finalize'), it's hard to know which step is failing most. A Transition Failure Matrix helps you find the hotspots. How it works:
1. Define states: List all the possible steps or 'states' your agent can be in.
2. Create a matrix: Create a grid where the rows are the 'From' state and the columns are the 'To' state.
3. Count failures: For each failure, identify the last successful transition that happened before the error, and increment the count in the corresponding grid cell.
#AI #DataScience
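The three steps above can be sketched in a few lines of Python. The trace format is an assumption: each failed run is represented as the ordered list of states it reached before erroring, so its last two entries form the last successful transition. The example traces are made up.

```python
from collections import Counter

# Step 1: define the states the agent can be in.
STATES = ["Plan", "Search", "Code", "Finalize"]

# Hypothetical failed runs: the states each run reached before the error.
failed_traces = [
    ["Plan", "Search", "Code"],
    ["Plan", "Search", "Code"],
    ["Plan", "Search"],
    ["Plan", "Code", "Finalize"],
]

def transition_failure_matrix(traces, states):
    """Count, per (from, to) pair, how many failures followed that transition."""
    counts = Counter()
    for trace in traces:
        if len(trace) >= 2:  # need at least one completed transition
            counts[(trace[-2], trace[-1])] += 1
    # Steps 2 and 3: rows are 'From' states, columns are 'To' states.
    return [[counts[(src, dst)] for dst in states] for src in states]

matrix = transition_failure_matrix(failed_traces, STATES)

print("From\\To".ljust(10) + "".join(s.ljust(10) for s in STATES))
for src, row in zip(STATES, matrix):
    print(src.ljust(10) + "".join(str(n).ljust(10) for n in row))
```

The largest cells point at the transitions after which failures cluster; here that is Search → Code, which would be the first place to inspect logs.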
5
-
Akshay Jaryal
Wipro • 780 followers
Agentic RAG: The Next Evolution of Retrieval-Augmented Generation Traditional RAG = retrieve → generate. But Agentic RAG goes beyond—it's retrieve → reason → act → refine. 🔍 1. Adaptive Retrieval The model doesn’t just fetch top-k chunks. It plans retrieval, checks gaps, rewrites queries, and fetches again until context is complete. 🧠 2. Tool-Augmented Reasoning The LLM acts like an agent: selects tools (search, SQL, vector DB) decomposes tasks validates outputs self-corrects before responding. 📚 3. Dynamic Knowledge Routing Content is pulled from multiple sources: structured DBs, APIs, vector stores, web search, etc. The LLM decides where to look based on the question. 🔄 4. Iterative Refinement Loop Agent checks hallucination risks → re-queries → verifies facts → produces grounded results. ⚡ 5. Real-World Application Perfect for: Compliance Q&A Procurement copilots Contract analysis Enterprise search Technical troubleshooting 💡 In short: Agentic RAG is RAG with a brain + autonomy. More context → fewer hallucinations → production-grade answers. #AI #RAG #AgenticRAG #GenAI #DataScience #LLM
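The adaptive retrieval loop described above (retrieve, check gaps, rewrite the query, fetch again) can be illustrated with a deliberately tiny toy. Everything here is an assumption for illustration: there is no LLM or vector store, retrieval is keyword overlap over three made-up documents, and "rewriting the query" just means querying for the terms still missing. In a real agent an LLM would make those decisions.

```python
# Hypothetical in-memory knowledge base.
DOCS = {
    "policy": "Refunds are allowed within 30 days of purchase.",
    "shipping": "Standard shipping takes 5 business days.",
    "warranty": "Warranty covers manufacturing defects for one year.",
}

def tokenize(text):
    """Lowercase and split into a set of words, dropping full stops."""
    return set(text.lower().replace(".", "").split())

def retrieve(query_terms, docs):
    """Toy retriever: ids of docs sharing at least one term with the query."""
    return [doc_id for doc_id, text in docs.items() if tokenize(text) & query_terms]

def agentic_retrieve(required_terms, docs, max_rounds=3):
    """Retrieve, check which required terms are still uncovered, re-query."""
    context = []
    query = {required_terms[0]}  # start with a narrow first query
    for round_no in range(1, max_rounds + 1):
        for doc_id in retrieve(query, docs):
            if doc_id not in context:
                context.append(doc_id)
        covered = set()
        for doc_id in context:
            covered |= tokenize(docs[doc_id])
        gaps = [t for t in required_terms if t not in covered]
        if not gaps:
            return context, round_no  # context is complete
        query = set(gaps)  # "rewrite" the query to target the gaps
    return context, max_rounds  # gave up with partial context

context, rounds = agentic_retrieve(["refunds", "shipping"], DOCS)
print(context, rounds)
```

The `max_rounds` cap matters in practice: without it, a question whose answer is not in any source would keep the loop re-querying forever.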
11
-
Arshiya Kishore
DecisionTree Analytics &… • 2K followers
📊 Let’s talk about Data Analytics for a second. Because honestly… it’s crazy how one tiny pattern in a dataset can completely flip the narrative. You open a spreadsheet thinking, “These look like normal numbers…” And suddenly you spot something that makes you go, “Wait… this actually means something.” That’s the real power of analytics. It’s not about tools or dashboards, it’s about how you think. It’s connecting dots. It’s asking “Why?” one more time than everyone else. It’s seeing the story behind the numbers. And trust me, you don’t need to know a lot to get started. You just need curiosity. Because once you start asking the right questions… the data starts revealing the right answers. Data doesn’t shout the truth. It whispers, and analysts are the ones who learn to listen. What’s the most interesting insight you've ever found in a dataset? #DataAnalytics #DataScience #Insights #TechTrends #DecisionMaking #BusinessIntelligence
12
-
Purushottam Sharma
EXL • 12K followers
The biggest mistake leaders make while managing data science teams. They measure output instead of impact. Number of models built. Number of dashboards delivered. Number of experiments run. None of these guarantee business value. High-performing data science teams are clear on: • Which decision the work supports • Who acts on the output • What changes if the result is wrong Without this clarity, even strong teams end up optimizing metrics that don’t matter. Over time, this creates frustration—on both sides: • Leaders feel AI isn’t delivering value • Teams feel their work isn’t respected The best leaders I’ve worked with do one thing differently: They connect data science work directly to decisions, accountability, and outcomes. That’s when teams move from “building models” to shaping the business. #Leadership #DataScience
24
-
Sughosh Dixit
Oracle • 1K followers
📅 Day 25 of my 30 Day Data Science Challenge: Pairing Complementary Segments

Chanakya says, "समानता दृश्यते तुलनेन" (Similarity is seen through comparison). Complementary segments need aligned configurations!

The Problem: Your system has paired segments:
- Premium ↔ Standard
- Verified ↔ Unverified
- Enterprise ↔ Consumer

Challenge: Ensure these pairs have:
✅ Same parameters defined
✅ Mutually exclusive coverage
✅ No contradictions

Bipartite Graph Matching:
```
Premium ←→ Standard
Verified ←→ Unverified
Enterprise ←→ Consumer
```
Perfect matching = every segment has exactly one pair!

Floor Fill-In Theorem: Missing values in B can use A's values as floor:
```
B[param] = B[param] ?? A[param]
```

Practical Tools:
- `validate_and_pair_segments()` → Check consistency
- `translate_bindings()` → Map parameters between pairs
- `fill_missing_params()` → Floor fill-in for gaps

Bottom line: Pair segments mathematically. Equivalence relations ensure consistency! 🔗

🔵 Full guide with bipartite matching and fill-in proofs 👇 🔗 https://lnkd.in/dmXVKp9C

#DataScience #SetTheory #EquivalenceRelations #ConfigurationManagement #LearningBySharing #30DayChallenge #SughoshWrites
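The fill-in rule `B[param] = B[param] ?? A[param]` translates naturally to Python dictionaries. The sketch below reuses the name `fill_missing_params` from the post, but the body is a guess at what such a helper might do, not the author's implementation; the segment configurations are made up. "Missing" is taken to mean absent or `None`, mirroring how `??` treats null values while keeping falsy ones such as `0`.

```python
def fill_missing_params(b, a):
    """Return a copy of b where absent or None params take a's value.

    Hypothetical sketch of the floor fill-in rule; legitimate falsy
    values such as 0 in b are kept, as with the ?? operator.
    """
    filled = dict(b)
    for key, value in a.items():
        if filled.get(key) is None:
            filled[key] = value
    return filled

# Made-up paired segment configurations.
premium = {"rate_limit": 1000, "timeout_s": 30, "retries": 5}
standard = {"rate_limit": 100, "timeout_s": None}

filled = fill_missing_params(standard, premium)

# After fill-in, both segments define the same set of parameters.
same_params = set(filled) == set(premium)
print(filled, same_params)
```

Returning a copy rather than mutating `standard` keeps the original configuration available for auditing which values were inherited.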
8
-
Khushboo Alvi
BharatPe • 32K followers
“From M.Tech at IIT Delhi to failing a DSA round- here’s what they don’t tell you about breaking into Data Science.” I thought I had everything it takes. But when I sat for a Data Science interview at a top MNC, I stumbled in the DSA round. It wasn’t because I didn’t try. It was because I wasn’t preparing the right way. We often think building cool models or understanding LLMs is enough but interviews test something deeper. That rejection hurt. But it also gave me direction. I rebuilt my prep, focusing not just on skills, but on structure. Mock interviews. Real projects. Feedback from mentors already working in the roles I aspired to. And that’s what made all the difference. Today, as a Senior Data Scientist - AI, I work on solving real business challenges using NLP and Generative AI but I haven’t forgotten the lessons I learned on the way. If you’re in Data Analytics or Engineering and want to transition into DS, then you have to check out Bosscoder Academy. You can check them out here - https://bit.ly/3TOtHp1 They offer - ✅ 1:1 mentorship from industry professionals ✅ Real-world projects across ML, GenAI, SQL ✅ Full support for interviews, resumes, and strategy #data #datascientist #sponsoredpost #machinelearning #sql
45
13 Comments -
Glorious Mumo
Tenacious Data Demystifiers • 3K followers
Don’t become a Data Scientist if ; 1️⃣ You don’t like upskilling. Because in this field, what you know today might be outdated tomorrow. Continuous learning isn’t optional; it’s survival. 2️⃣ You don’t enjoy coding, mathematics, or statistics. These are the language, foundation, and heartbeat of Data Science. You don’t have to be a genius, but you must be curious enough to get your hands dirty. 3️⃣ You don’t like working with huge amounts of data. Data Science is all about uncovering patterns and insights from massive datasets. If large data intimidates you, this path will feel like a mountain climb. Data Science is fascinating, but it’s not for everyone. If you love solving problems, learning new tools, and asking “why?” behind numbers, then it might just be for you. Remember: Passion + Consistency = Progress. Everything else can be learned. Glorious Mumo, The Data Demystifier.
32
18 Comments -
Dinesh Kumar
Itvedant Education Pvt. Ltd. • 3K followers
🚀 Non-Tech Background? You Can Still Build a Career in Data Science & AI “But I’m from a B.Com background… Can I still get into Data Science?” Yes, you absolutely can. Let me share a quick story 👇 🎯 Meet Anjali — A B.Com Graduate Turned Data Analyst In 2020, Anjali was working in accounts. She had zero experience in Python, SQL, or Machine Learning. But she was curious, and that was enough to start. She gave herself 6 months. ✅ Learned Excel + SQL for data handling ✅ Picked up Python and pandas for analysis ✅ Built dashboards using Power BI ✅ Took online courses in statistics & machine learning ✅ Created a GitHub portfolio with real-world projects Fast forward to today — she’s working as a Data Analyst at a global tech company, helping marketing teams make data-driven decisions. 📢 You Don’t Need a Tech Degree to Enter AI What you do need: 🔷Curiosity to learn 🔷Discipline to stay consistent 🔷Willingness to practice, build, and apply AI isn’t just for coders — it’s for problem solvers. Whether you're from finance, HR, sales, or operations, you bring domain knowledge that companies value — pair that with data skills and you're unstoppable. 🤖 AI is Transforming Every Job — Learn It or Lag Behind This is not the future — it's happening now. If you're still on the fence, this is your wake-up call. 🌐 Start with Excel → SQL → Python → Visualization → ML 💼 Build your portfolio 📚 Take one small step daily The best time to start? Yesterday. The second-best time? Today. #DataScience #AI #CareerChange #NonTechToTech #Python #SQL #CareerInAI #LearnDataScience #Upskill #RealStory #LinkedInCareer #MachineLearning
6
-
Albert Edwards
Vitality • 11K followers
Data Science is like dating It’s all about relationships ⬇️ It’s easy to focus on how exciting the project sounds. - The model. - The tech. - The impact. But none of that matters if it never gets deployed. To build anything meaningful, you need a stakeholder who: - Cares enough to push it forward - Is senior enough to influence decisions - Can be flexible If they don’t care, it stalls. If they have no influence, it dies quietly. A “high-impact” project with weak backing is just a proof of concept. When choosing a project, don’t just assess the problem. Assess the person backing it. That’s what determines whether it actually goes live. If you’re building a career in Data Science, don’t just optimise for interesting projects. Optimise for the right relationships. I write about non-obvious career lessons like this in my free newsletter: https://lnkd.in/eTAmRsDu
6
2 Comments -
Favour Ibude
Allianz • 30K followers
🚩My Life Working 9–5 as a Data Scientist Is No Joke Have you ever finished work, collapsed on the bed, and thought; “Learning today? Forget it.” That was me. 📍But if I didn’t create a learning routine, I’d be stuck in the same role, watching everyone else grow. 📍So I built a rule I now live by: 2 hours of learning daily. Not easy. Not every single day. But needed. → Break it up. 30 mins in the morning, 1 hour after work, 30 mins before bed. → Passive counts. Podcasts on commutes, YouTube while cooking. Learning doesn’t always need a desk. → Stick to a slot. Morning owl? Night owl? Doesn’t matter. Pick your lane, lock it in. → Switch styles. Some days theory, other days coding challenges. Keeps boredom away. → Start small. 10 pages. 1 problem. 1 video. That’s how habits grow. → Track it. Nothing motivates like flipping through old notes and seeing how far you’ve come. → Be relevant. Learn what feeds your current project. Don’t hoard random skills. ❓What’s your 2-hour rule? #dataliving #favouribude
823
143 Comments -
Deepak Kumar
One97 Communications Limited • 647 followers
After understanding Decision Trees, the natural next step is Random Forest, along with related concepts such as Ensemble Techniques, Bagging, Boosting, Bootstrap sampling, Out-of-Bag (OOB) validation, and the step-by-step working of Random Forest Regression.

### 🔹 Ensemble Technique
* **Classification** → Combine predictions from multiple models using **majority voting**
* **Regression** → Combine predictions using **mean or median** (commonly decision trees as base learners)

---

### 🔹 Bagging (Bootstrap Aggregating)
* Models are trained **in parallel**
* Each model is trained on a **bootstrap sample (sampling with replacement)**
* Final prediction = **majority vote (classification)** / **mean or median (regression)**

✅ **Random Forest is based on Bagging**

---

### 🔹 Boosting
* Models are trained **sequentially**
* Each new model focuses more on **previously mispredicted samples**
* **AdaBoost & Gradient Boosting** follow this idea

🚀 **XGBoost is an optimized version of Gradient Boosting** (regularization + performance optimizations)

### 🔹 Bootstrap
* Sampling **with replacement** to create multiple training datasets

### 🔹 Out-of-Bag (OOB)
* Data points **not selected** in a bootstrap sample for a tree
* Used as **internal validation data**

### 🔹 Random Forest
* **Self-validating model** using OOB error
* Reduces **overfitting** by combining:
  * Data randomness (bootstrap)
  * Feature randomness (feature sub-sampling)

---

## 🔹 Random Forest Regression – Complete Steps (Training → Prediction)

### **1️⃣ Input Data**
* Original dataset with features and target variable

### **2️⃣ Bootstrap Sampling**
* Create **N bootstrap datasets** using sampling with replacement
* Each bootstrap dataset trains **one decision tree**
* Number of trees = **n_estimators**

---

### **3️⃣ Feature Sub-sampling**
* Consider only a random subset of features, for example **max_features = 2**

### **4️⃣ Best Split Selection (for each tree)**
For each selected feature:
* Sort feature values
* Identify **candidate split points (midpoints)**
* Perform **left & right split**
* Calculate variance of left and right nodes
* Compute **weighted variance**
* Calculate **variance reduction**
* Select the **feature & split with maximum variance reduction**

### **5️⃣ Tree Growth**
* Repeat splitting until a stopping condition is met:
  * Maximum depth reached OR
  * Minimum samples per node reached

### **6️⃣ OOB Error Estimation**
* For each tree, predict using its **OOB samples**
* Aggregate OOB predictions
* Compute **OOB error** as model performance estimate

### **7️⃣ Final Prediction**
* **Classification** → majority vote from all trees
* **Regression** → average of predictions from all trees

🙏 **Special thanks to my mentor Monal S. and Krish Naik for guidance in correcting concepts and improving clarity.**
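Two of the steps above, bootstrap sampling with its out-of-bag set and best-split selection by variance reduction, can be sketched with the standard library alone. This is a single-feature illustration on made-up data, not a full Random Forest; function names and the fixed random seed are choices made for the example.

```python
import random
from statistics import pvariance

def bootstrap_indices(n, rng):
    """Sample n row indices with replacement; the rest are out-of-bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    oob = sorted(set(range(n)) - set(in_bag))
    return in_bag, oob

def best_split(xs, ys):
    """Return (threshold, variance_reduction) for one numeric feature."""
    pairs = sorted(zip(xs, ys))          # sort by feature value
    n = len(pairs)
    parent_var = pvariance([y for _, y in pairs])
    best = (None, 0.0)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                      # no midpoint between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2  # candidate midpoint
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * pvariance(left)
                    + len(right) * pvariance(right)) / n
        reduction = parent_var - weighted
        if reduction > best[1]:
            best = (threshold, reduction)
    return best

# Made-up data with an obvious jump between x = 3 and x = 10.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 5.5, 20.0, 21.0, 19.5]

threshold, reduction = best_split(xs, ys)
in_bag, oob = bootstrap_indices(len(xs), random.Random(0))
print(threshold, round(reduction, 2), in_bag, oob)
```

A full forest would repeat `best_split` over a random subset of features at each node, grow one tree per bootstrap sample, and score each tree on its own `oob` rows to estimate the OOB error.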
1
-
Udit Soni
ICICI Bank • 6K followers
🎬 Sentiment Analysis: Then vs. Now 👉 Input: “The movie was not bad at all.” 👉 Goal: Predict Sentiment → Positive 🔹 How Traditional NLP Worked 1️⃣ Feature Extraction → Used Bag of Words / TF-IDF. ⚠️ Issue: Words treated independently. “bad” = negative, even in “not bad.” 2️⃣ Machine Learning Models → Logistic Regression, SVM on handcrafted features. ✅ Simple, but only as smart as the features we designed. 3️⃣ Deep Learning (RNNs, LSTMs) → Added sequential context. ⚠️ Limitation: Struggled with long sentences, sarcasm, nuance. Result: Often failed to capture meaning in real context. 🔹 How LLMs Changed the Game 1️⃣ Tokenization → Break text into subwords (e.g., “unbelievable” → “un”, “believe”, “able”). 2️⃣ Embeddings → Map tokens into vectors that capture semantic relationships. 3️⃣ Transformers → Process the entire sequence in parallel using self-attention. 4️⃣ Attention Mechanism → Learns which words matter. ✔️ Example: understands “not” flips the meaning of “bad.” 5️⃣ Scaling Up → Trained on massive datasets with billions of parameters. LLMs (GPT, LLaMA, PaLM) become general-purpose foundation models. ✅ Result: LLM correctly predicts sentiment = Positive because it captures context + meaning. 📌 Takeaway Traditional NLP = brittle, task-specific. LLMs = contextual, scalable, general-purpose. They’re now the backbone of summarization, translation, coding, Q&A, and more. 💡 Terms like tokenization, embeddings, attention, transformers, fine-tuning, zero/few-shot will be unpacked in upcoming posts. Stay tuned! Follow Udit Soni for more! #GenAI #LLM #NLP #ArtificialIntelligence
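The bag-of-words issue described above is easy to reproduce. The sketch below uses a tiny made-up sentiment lexicon and scores each word independently, which gets "not bad" wrong; a second function adds a hand-written negation rule purely for contrast. That rule is an illustrative assumption, not how transformers work: attention learns such interactions from data rather than from hard-coded rules.

```python
# Tiny made-up sentiment lexicon: word -> polarity.
LEXICON = {"bad": -1, "terrible": -1, "good": 1, "great": 1}
NEGATIONS = {"not", "never", "no"}

def tokens_of(text):
    """Lowercase and split on whitespace, dropping full stops."""
    return text.lower().replace(".", "").split()

def bag_of_words_score(text):
    """Score each word independently, ignoring order and context."""
    return sum(LEXICON.get(t, 0) for t in tokens_of(text))

def negation_aware_score(text):
    """Flip the polarity of the next sentiment word after a negation."""
    score, negate = 0, False
    for t in tokens_of(text):
        if t in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(t, 0)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return score

sentence = "The movie was not bad at all."
bow = bag_of_words_score(sentence)
ctx = negation_aware_score(sentence)
print(bow, ctx)
```

Hand-written rules like this break quickly on sarcasm or long-range dependencies, which is the gap that learned contextual representations are meant to close.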
14