Pane Abstraction in the Text Editor (Rust) That I Am Building [Phase II]
This was a really interesting problem to solve, especially dynamic resizing with mouse dragging. And folks, data structures and algorithms are important and are actually used in real-world projects (a binary tree in this case; for more detail, check out the article).
I have documented the process here (Medium article): https://lnkd.in/gh-T5gnN
If you want to go straight to the code (PR): https://lnkd.in/gc5PhEZ4
This is also my first article, so your time, thoughts, and feedback are very much appreciated. Thank you!
#Rust #SystemsProgramming #TextEditor #BuildInPublic
Rust Text Editor Pane Abstraction with Dynamic Resizing
More Relevant Posts
-
Graph RAG Explained: How Knowledge Graphs Make RAG Smarter Graph RAG is the next step after traditional RAG. Instead of only retrieving similar text chunks, it uses knowledge graphs to understand entities, relationships, and connected context. This guide expl Read more → https://lnkd.in/drUH_9Us #TheCampusCoders #Tech #Developers #WebDev
-
Why do your retrieved chunks come back at wildly different sizes when you set chunk_size to 512?
RAG fundamentals: You set chunk_size=512 and overlap=50 in the recursive character splitter and assumed the output would be evenly sized. You ran your indexing pipeline and moved on. You probably never histogrammed the actual sizes.
The recursive character splitter is a chunking strategy that tries delimiters in order (paragraph break, then line break, then space) and splits at the first one that keeps chunks under your size limit. The "size limit" is the only guarantee. No minimum size. No balance constraint.
Histogram your indexed chunks and you will see a long tail of tiny chunks (50 to 100 characters) wedged between big ones (450 to 512 characters). Tiny chunks come from places with frequent paragraph breaks, like a table of contents or a bulleted list. Big chunks come from continuous narrative paragraphs the splitter could not break cleanly.
Why does it matter? A 50-character chunk is too narrow to carry context. A 510-character chunk packs too many ideas to embed cleanly. Tiny chunks dilute top-k with noise. Bloated chunks force the LLM to filter the right answer out from two unrelated facts.
Two fixes: Enforce a min-size threshold and merge consecutive sub-threshold chunks into one. Or switch to semantic chunking. The first is a one-line change. The second is structural.
When did you last histogram your chunk sizes, and what was the standard deviation as a fraction of the mean?
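A minimal sketch of the first fix and of the closing question, assuming chunks are plain strings measured in characters. The names `merge_small_chunks` and `size_dispersion`, the `min_size=200` default, and the newline separator are illustrative assumptions, not part of any splitter library.

```python
from statistics import mean, pstdev


def merge_small_chunks(chunks, min_size=200, max_size=512, sep="\n"):
    """Glue runs of consecutive sub-threshold chunks together.

    Hypothetical helper: min_size and sep are assumptions for illustration.
    A merged chunk never grows past max_size.
    """
    merged = []
    buffer = ""  # current run of small chunks being glued together
    for chunk in chunks:
        if len(chunk) < min_size:
            candidate = f"{buffer}{sep}{chunk}" if buffer else chunk
            if len(candidate) <= max_size:
                buffer = candidate
                continue
        # Either the chunk is big enough on its own, or gluing it on
        # would exceed max_size: flush the run collected so far.
        if buffer:
            merged.append(buffer)
            buffer = ""
        if len(chunk) < min_size:
            buffer = chunk  # start a new run with this small chunk
        else:
            merged.append(chunk)
    if buffer:
        merged.append(buffer)
    return merged


def size_dispersion(chunks):
    """Standard deviation of chunk sizes as a fraction of the mean."""
    sizes = [len(c) for c in chunks]
    return pstdev(sizes) / mean(sizes)
```

Comparing `size_dispersion(chunks)` before and after merging gives a quick read on whether the long tail of tiny chunks actually shrank. A small chunk stranded between two large ones stays small in this sketch, since it only merges runs of consecutive small chunks.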
-
Just finished writing a complete guide to RAG optimization: 25 techniques, split across two parts. Not theory. Practical techniques with code and trade-offs.
━━━━━━━━━━━━━━━
📘 Part 1: Indexing
━━━━━━━━━━━━━━━
Everything that happens before a user sends a query. Because most RAG failures start here.
🔹 Fixed-length & overlapping chunking
🔹 Recursive & sliding window chunking
🔹 Semantic chunking: split on meaning, not character count
🔹 Hierarchical (small-to-big) chunking
🔹 Metadata-aware chunking: unlocks filtering later
🔹 LLM-based & agentic chunking
🔹 Late chunking & Matryoshka embeddings
━━━━━━━━━━━━━━━
📗 Part 2: Retrieval & Reranking (live now)
━━━━━━━━━━━━━━━
Everything that happens after: finding the right chunks and refining them before the LLM sees them.
🔹 Query rewriting & HyDE
🔹 Multi-Query RAG & Pseudo-Relevance Feedback
🔹 Self-query with metadata filtering
🔹 Hybrid retrieval: dense + BM25
🔹 Graph RAG: for multi-hop, relationship-heavy questions
🔹 Contextual retrieval & Multi-hop RAG
🔹 Cross-encoder reranking & ColBERT
🔹 Recency reranking & prompt engineering for RAG
━━━━━━━━━━━━━━━
Both parts include a full reference table so you can quickly find the right technique for your specific failure mode. If you're building RAG in production, or about to, this is for you.
🔗 Link in the comments.
-
Most document parsers see this 👇 on the left. A flat stream of text. Tables collapsed into noise. Charts erased. Images orphaned from their captions. That's what happens when you treat documents as text objects. Parsimmon sees what's on the right. Every document is a multidimensional visual object. Text. Images. Tables. Charts. Page context. Source evidence. Each layer extracted by a specialist model: tables reconstructed cell-by-cell, charts reverse-engineered from pixels back to data, images described with full caption context. The result downstream: → Cleaner chunks → Better embeddings → More accurate indexing → More reliable retrieval If your RAG pipeline is hallucinating numbers or losing the chart that proved the point, the problem probably isn't your model. It's the parser upstream of it. We're building Parsimmon to fix that. Parsimmon Parsimmon.io
-
Here's an update (https://lnkd.in/gvDg8TCi) on an article I wrote a couple of weeks ago, looking at how far Claude Code can be pushed on simultaneous retrieval + reasoning tasks. Retrieving related chunks of information, then reasoning and making decisions across them, is arguably an important capability for any coding agent. Here, I show that there is a steep cliff where accuracy rapidly falls off from n = 10 to n = 14 items to retrieve and reason across. I also found that for retrieval-only or pure-reasoning tasks, there is no sharp accuracy fall-off. The cognitive load of having to do both somehow causes reliable failures. Perhaps this is the answer, then: models and harnesses need to get better at detecting when they're at their limits, writing to file, and handing off to a separate instance!
-
I finally sat down and ranked my most-used algorithms, not by complexity or theoretical speed, but by preference and ease of use.
S-Tier (my go-to tools): Binary Search, Dijkstra, Union Find, Sliding Window. Reliable and intuitive.
A-Tier: Segment Trees, Two Pointers, Prefix Sum DP. Powerful, with a slight learning curve.
B & C: Reroot DP, Digit DP, Sparse Tables. Great in specific contexts, but I reach for them less often.
D-Tier: Alien Trick, Mo’s Algorithm. Clever, but rarely my first choice.
This list is deeply personal. What feels elegant to me might feel clunky to you, and that’s the beauty of our craft. The right algorithm isn’t always the fastest on paper; it’s the one you can implement confidently under pressure.
So I’ll ask: What algorithm or data structure lives in your S-Tier, and why? Let’s compare notes.
Want to build a similar list yourself and post it on LinkedIn? Try this out: https://lnkd.in/gNPraM24
Follow Vishu Kalier for more such insights.
#Algorithms #DataStructures #CodingLife #SoftwareEngineering #TechRankings #CompetitiveProgramming
-
Most AI apps fail because the retrieval layer is bad. So I started building my own RAG engine. Introducing **RAG Engine** A lightweight open-source framework for: ✓ document ingestion ✓ embeddings ✓ semantic search ✓ retrieval pipelines ✓ LLM-powered applications Built for developers who want: • simplicity over bloated abstractions • modular architecture • fast experimentation • full control over the pipeline RAG is becoming the backbone of production AI systems, and I wanted to better understand it by building one from scratch. Would appreciate feedback, ideas, or contributions 👇 https://lnkd.in/eZsishNc #RAG #LLM #AIEngineering #OpenSource #Python #MachineLearning #GenAI
-
Burned $40 in a week routing everything through Sonnet. The expensive step wasn't the drafting. It was the scoring. 200 calls a day, all going to a model that didn't need to be that smart. Switched the scorer to Haiku. Same accuracy. $3 a week now. The rule I run on every build: Judgment steps (final output, client-facing copy, decisions someone signs off on) → Sonnet. Pay the tax. Volume steps (classifying, scoring, parsing, routing, deduping) → Haiku. Almost always. Most workflows have 5 to 10 steps. Run them all through the best model and the math gets ugly. Split them by what each step actually needs and the same workflow runs at 10% of the cost without losing anything that matters. Cheap model does the volume. Smart model does the judgment. What's your default model for classification?
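The rule above could be written down as a small routing table. This is only a sketch: the step names are hypothetical, the strings "haiku" and "sonnet" are tier labels rather than real API model identifiers, and falling back to the stronger tier for unlisted steps is an assumption, not something the post states.

```python
# Hypothetical step names, grouped the way the post groups them.
VOLUME_STEPS = {"classify", "score", "parse", "route", "dedupe"}
JUDGMENT_STEPS = {"final_output", "client_copy", "signoff_decision"}


def pick_model_tier(step_kind):
    """Return a tier label for a workflow step.

    Volume steps go to the cheap tier; judgment steps go to the strong
    tier. Unlisted steps default to the strong tier (an assumption).
    """
    if step_kind in VOLUME_STEPS:
        return "haiku"
    return "sonnet"


# Example: a five-step workflow, routed step by step.
workflow = ["parse", "classify", "score", "dedupe", "final_output"]
routing = {step: pick_model_tier(step) for step in workflow}
```

Here only the last step lands on the stronger tier, which is the split the post describes.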
-
🚀 LeetCode Day Problem Solving 🚀 Day-65
📌 Problem: Given an array nums[]
🎯 Define the rotation function:
👉 Rotate the array clockwise k times → arrₖ
👉 Compute: F(k) = 0*arrₖ[0] + 1*arrₖ[1] + ... + (n-1)*arrₖ[n-1]
🎯 Goal: Find the maximum value among all F(k)
🧠 Example: Input: nums = [4,3,2,6] ✅ Output: 26
💡 Key Insight (Most Important 🔥):
✔ Brute-force rotation = ❌ O(n²) (too slow)
👉 Instead, use the relation between consecutive rotations
⚡ Core Formula: F(k) = F(k-1) + sum(nums) - n*nums[n-k]
🔥 Meaning:
✔ We can compute the next rotation from the previous one
✔ No need to recompute from scratch
⚡ Approach:
1️⃣ Compute sum (the total sum of the array) and F(0)
2️⃣ Iterate k = 1 → n-1: use the formula to compute F(k)
3️⃣ Track the maximum value
⚡ Why does it work?
✔ Each rotation moves every element one index up, which adds sum(nums) to the total
✔ The last element wraps around to index 0, so it loses n*nums[n-k]
👉 That’s why the recurrence works 🚀
📊 Complexity Analysis:
⏱ Time Complexity: O(n)
📦 Space Complexity: O(1)
🧠 What I Learned:
✔ Turning brute force → optimized using math
✔ Rotation problems often have hidden patterns
✔ Importance of deriving recurrence relations
✅ Day 65 Completed 🚀 Leveling up in Array + Mathematical Optimization 💪
#Leetcode #DSA #ProblemSolving #BitManipulation #CodingJourney #InterviewPreparation #Consistency #MilanSahoo 🚀
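The three-step approach above can be sketched as follows. The function name `max_rotate_function` is an illustrative choice, and the rotation is assumed to be clockwise, which is the direction the recurrence and the example output agree with.

```python
def max_rotate_function(nums):
    """Maximum of F(k) over all rotations, in O(n) time and O(1) space."""
    n = len(nums)
    total = sum(nums)
    # F(0): weight each element by its index in the unrotated array.
    current = sum(i * x for i, x in enumerate(nums))
    best = current
    for k in range(1, n):
        # F(k) = F(k-1) + sum(nums) - n * nums[n-k]
        current += total - n * nums[n - k]
        best = max(best, current)
    return best
```

For nums = [4,3,2,6], the values are F(0) = 25, F(1) = 16, F(2) = 23, F(3) = 26, so the maximum is 26, matching the example in the post.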
-
🚀 Blind 75 Journey Update – Phase 3 Completed (Sliding Window)
I’ve successfully completed Phase 2: Two Pointers, and now I’ve worked through one of the most important interview patterns in DSA:
🔹 Phase 3 – Sliding Window (Completed)
Problems covered:
1️⃣1️⃣ Longest Substring Without Repeating Characters (3)
1️⃣2️⃣ Longest Repeating Character Replacement (424)
1️⃣3️⃣ Minimum Window Substring (76)
💡 What I learned in this phase:
This phase strengthened my understanding of one of the most powerful optimization techniques for arrays & strings:
• Dynamic window expansion and contraction
• Frequency tracking using HashMap / arrays
• Maintaining optimal subarray/substring constraints
• Handling real-time updates efficiently
• Converting brute-force O(n²) → optimized O(n) solutions
🧠 Key Insight
Sliding Window is not just a pattern, it's a mindset: instead of recomputing, reuse the previous computation while expanding and shrinking the window intelligently.
It’s heavily used in:
• String processing problems
• Subarray optimization
• Streaming / real-time data problems
🎯 My Focus Moving Forward
I’m continuing the Blind 75 journey in strict chronological order, focusing on:
✔ Pattern understanding before memorization
✔ Writing the approach before coding
✔ Improving time & space optimization intuition
✔ Building strong DSA fundamentals step by step
🚀 Next Phase Coming Up:
🔹 Continuing Blind 75 with the next patterns (Stack / Binary Search / Linked List, depending on order)
Consistency over speed. Depth over shortcuts.
#Blind75 #SlidingWindow #DataStructures #Algorithms #LeetCode #CodingJourney #SoftwareEngineering #ProblemSolving #BuildInPublic #InterviewPreparation
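As an illustration of the expand-and-shrink idea, here is one common way to solve the first problem on the list (Longest Substring Without Repeating Characters). It is a sketch of the general pattern, not necessarily the solution written during this phase; the function and variable names are illustrative.

```python
def length_of_longest_substring(s):
    """Length of the longest substring of s with no repeated character."""
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, shrink the window so it
        # starts just after that earlier occurrence.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```

Each character is visited once and the left edge only ever moves forward, which is where the O(n²) → O(n) improvement comes from.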