San Francisco Bay Area
11K followers
500+ connections
About
Echelon is your AI architect that gets your…
Activity
-
Eddie G. shared this: 3 months ago I posted about ServiceNow being down ~40%. Today NOW is up double digits. And the headline driving the chatter? ServiceNow's CMO is leaving to run business marketing at OpenAI. Read that again. February's story was "AI is going to eat enterprise SaaS." Today's story is "a SaaS exec left for the AI lab and the stock soared." Same company. Opposite narratives. Just 12 weeks apart. Here's what didn't change in between. On his way out, the CMO described AI as collapsing the distance between an idea and a working system. A prompt becomes a prototype. A question becomes an analysis. A rough idea becomes shipped code. That's not a marketing line. It's the only thing that has ever driven platform value: how fast you can turn intent into something running in production. When the stock was down 40%, the customers who kept shipping (clearing backlogs, bringing workflows online, moving faster than their headcount should allow) never stopped getting value. The ones frozen by the ticker did. Markets narrate. Platforms compound. If you're a ServiceNow customer, AE, or CSM trying to make the platform-value case through all this noise, the answer hasn't moved: value follows delivery speed. Everything else is a headline. That gap between idea and production is exactly what we built Echelon to close. Always up to compare notes with anyone navigating it.
-
Eddie G. shared this: A $100K CMDB health scan delivered in 10 minutes by Echelon. Last week, we held our second webinar, this time on using Echelon to run your CMDB. A healthy CMDB is the foundation for any success with AI tooling on ServiceNow, a big talking point across our pipeline & customer base right now as the C-Suite pushes for true AI adoption, not just experimental dollars. Alvis C. gave a great demo on how Echelon can be used to: - Accelerate CSDM Crawl with automated discovery / service mapping debugging - Accelerate CSDM Walk with automated creation of Technology Management Service & Offering. If you want to see the full demo from last week's session, head to the link in the comments for the full webinar & jump to ~7.30 seconds in. We have an ongoing webinar series covering Echelon for Catalog, CMDB, FSO, & more. Comment "ServiceNow" below & we'll add you to our list to notify you of upcoming streams.
-
Eddie G. shared this: My alma mater runs on the same platform I'm now disrupting. I didn't know that when I was a student at University of California, Berkeley. You don't think about platforms when you're 19. You just register for classes, submit IT tickets when the dorm WiFi dies, and move on with your day. Turns out the entire university, 70,000+ students, faculty, and staff, runs on ServiceNow. Facilities, HR, IT, the whole stack. At Knowledge 2026, I sat down with UC Berkeley's ServiceNow team. We talked about how Echelon's autonomous agents can drive more innovation and automation on the platform that quietly powers the place that taught me how to build software in the first place. It's a strange and wonderful kind of full circle. The same university that gave me my CS foundation is now a conversation I get to have as a founder. Grateful to Cal. Excited for what we get to build together. Go Bears. 💙💛
-
Eddie G. shared this: Echelon pitched to 600+ enterprises at the Plug and Play Summit today. Every single organization we spoke to is looking to use AI to do more. That mandate is coming straight from the CIO and the CEO. But here’s what we haven’t heard: stories of IT organizations actually transforming the way they function. The new paradigm. The shift in how IT itself operates. That’s the conversation we came to have. When we sat down with Costco, BNP Paribas, and many other large enterprises and shared how we’ve helped Fortune 500 IT teams fundamentally change the way they operate, including cutting contractor spend by 50%, it resonated. AI shouldn’t just sit on top of IT. It should rewire how IT works. More great conversations today and tomorrow. If you’re at the summit, drop me a message. Would love to meet up. #PnPMaySummit2026 #plugnplay
-
Eddie G. shared this: Today, Echelon graduates from BCV Labs in Palo Alto. For two years, this was our first office. And almost every "first" that matters happened inside these walls. First idea. First failure. First time starting over. First version of Echelon. First customer pitch. First signed deal. First hire. First all-nighter that turned into a product. First time realizing we might actually be building something real. Next week, we move into our own office in San Francisco. But this place will always be the first one. To Aaref Hilaly, Rak Garg, True Sala, William Eleazer and the rest of the Bain Capital Ventures (BCV) team, thank you for the curvy monitors, the topo chicos, the conversations, the patience, the belief, and the surprise cake today. You didn't just give us a roof. You gave us a home to grow. We wouldn't be here without you. Onward.
-
Eddie G. shared this: Thank you, honest strangers of Reddit. I came back from Knowledge 2026 equal parts energized and overwhelmed. There's something uniquely powerful about anonymous practitioners telling each other the truth with nothing to gain. No marketing budget can buy this. No campaign can fake it. "The most impressive SN development agent on the market. Better than anything at SN." "We've been using this for a year and its the best we've found so far." "Tons of buzz about them at K26... I just get a genuine vibe from them, like it's not a ton of marketing polish on their stuff. I'm rooting for them." These are unprompted Reddit comments. No marketing, no incentive, no ask. Just customers and practitioners talking to each other. This is the moment. And we are too small for it. So we're going heads down and we're hiring. Across literally every function: - Recruiting (yes, building the whole in-house team) - AI / Member of Technical Staff - Account Executives - Business Development Reps If you've ever wanted to join a company at the exact moment the curve goes vertical, this is it. We're replacing the $500B IT services labor layer, starting with ServiceNow. DMs open. Tag someone. I read everything.
-
Eddie G. shared this: Come and work with Jennifer! Reshared post: 🎉 I’M HIRING! 🎉 Looking for a strong candidate interested in optimizing and automating IS Business Office administration activities and Asset Management functions in Hershey, PA. Come join my team! Apply within.
-
Eddie G. shared this: Three days of Knowledge 2026. 146 demos. One topic kept eating the conversation: CMDB. Not in a "yeah we should clean it up someday" way. In a "this is the single biggest thing blocking our AI roadmap" way. A few things I keep hearing: 1. Enterprises have paid hundreds of thousands for assessments that produced findings, not outcomes. 2. The half-life of a clean CMDB record is measured in weeks. Maintenance is the actual problem, not the initial population. 3. Every shiny AIOps, SecOps, or agentic capability ServiceNow is shipping assumes a CMDB you can trust. Most of our customers can't. The demo that resonated most at our booth wasn't the flashiest one. It was the CMDB agent quietly doing the boring, continuous work that no human team is staffed to do well. Foundations are unglamorous. They're also the whole game. Sign up for our CMDB webinar next week to go deeper on this and learn how Echelon can help. Link in comments.
-
Eddie G. shared this: ServiceNow AEs are the most underrated people in enterprise software. There. I said it. Yes, they're chasing quota. But the good ones know quota doesn't come until their customers actually win. So they fight for outcomes first and I see them constantly going above and beyond at Knowledge. Three things kept coming up at our booth: 1. TCO. Customers want more value out of the platform they already pay for. AEs are the ones translating that pressure into action. 2. Shelfware. Modules go live, then sit. AEs know if customers don't adopt what they have, they won't buy more. That's why so many brought Echelon in. 3. CMDB. Still the universal headache. Still the unlock for everything downstream. A huge chunk of our booth traffic came from AEs personally walking customers over and saying "you need to see this." So we're returning the favor. We're launching a monthly newsletter for ServiceNow AEs. Real case studies on how Echelon agents are clearing backlogs, driving adoption, and unlocking shelfware for your accounts. Want in? Comment below or DM me and I'll add you to the list.
-
Eddie G. liked this: Knowledge 2026 is a wrap. The keynotes and demos are always exciting; it's always inspiring to see the innovation our product teams are bringing forward, and how closely it aligns to what businesses actually need. In 2023 it was GenAI. In 2024 it was Now Assist. In 2025 it was AI Agents. In 2026 we are bringing Autonomous Workforce. Some of the announcements that I am excited about: 🤖 Autonomous Workforce: AI specialists that complete work on your behalf 🧠 ServiceNow Otto: a unified AI experience for every user across the enterprise ⚙️ Context Engine: connecting relationships, policies, and operational signals for deterministic business context A huge shoutout to the events team: year after year you deliver something truly world-class. It doesn't go unnoticed. And to my team: thank you. Weeks of preparation, some of the most attended sessions on the floor, and standing all day sharing with customers what you've built and support every single day. That kind of dedication is what makes Customer Zero real. I couldn't be more proud. But the thing that stays with me most? The conversations. Customers sharing their real challenges. Us sharing our real lessons. What worked, what failed, and what it actually took to scale. No polish, no theater, just honest talk from people in the trenches. That's what Customer Zero means to me. Not a badge. A responsibility to show up with the truth. The vision is clear. The platform is ready. Now we execute. Want to learn how we achieved 90% IT Self-Service, or what worked, failed, or scaled? DM me. #Knowledge2026 #ServiceNow #CustomerZero #AgenticAI #AutonomousWorkforce #NowOnNow #ServiceNowOtto
-
Eddie G. reacted on this: I started my ServiceNow journey back in 2019, and being here at Knowledge 2026 truly felt like living a dream. I’m especially grateful to the mentors, teammates, and leaders who believed in me and gave me the opportunity to be part of this experience and represent Echelon AI. Proud to showcase what our team can do and excited for everything that’s coming next. What impressed me the most was not only the technology but the people behind it. The size of the ServiceNow ecosystem is incredible. I always knew it was big, but experiencing it in person showed me how massive, connected, and innovative this community really is. From reconnecting with colleagues and meeting talented professionals from all around the world, to seeing how AI Agents and automation are transforming the future of enterprise workflows, every conversation reminded me why I’m passionate about this space. The energy, collaboration, and drive to create real disruption and innovation are on another level. Knowledge was not just an event; it was motivation, inspiration, and confirmation that we are building the future now. YEEEIIII !! #Knowledge2026 #AIAgents #Innovation #ServiceNowCommunity
-
Eddie G. liked this: Very insightful! Reshared post: Everlane sold to Shein for $100 million this weekend. The last round, five years ago, valued the company at $550 million. Common stockholders are getting nothing. Just last month, Allbirds filed to drop the shoes and reorganize as a GPU-as-a-Service company called NewBird AI. Two of the most-cited DTC brands of the 2010s stopped being DTC brands within one month. The Gen 1 DTC era is closed. Everlane to Shein. Allbirds to GPUs. Every DTC generation builds on something temporary and treats it like a permanent moat. AI is about to take the rent up again. 1️⃣ Gen 1 (2010-2018) treated "digital-native" as the moat. The moat was cheap Meta attention. Allbirds went from $4.1B to $39M in 50 months. Casper got taken private. Glossier sold a majority to Bain. Everlane just cleared at less than the last round. 2️⃣ Gen 2 (2018-2025) corrected with sharper execution. Liquid Death, AG1, Vuori, Hims & Hers, Olipop. Tighter LTV math. CMOs who understand P&L. The dependency on Meta and KOL arbitrage did not move. 3️⃣ Gen 3 (2025 and beyond): AI just flattened everything the last two waves were good at. Liking a brand and depending on a brand are not the same thing, and the market is finally paying attention to which is which. Speed is the new edge. The team that can rebuild itself faster than its category wins. I wrote up the full thesis. Three generations of DTC. The four moats forming next (identity, operations, retention, the org chart). Where Feastables, Skims, and Alo Yoga already fit. The implication is beyond DTC. If you are building and restructuring a consumer brand right now, this is the system shift to design around. Enjoy! #DTC #Ecommerce #BrandStrategy #AI
-
Eddie G. liked this: ServiceNow just published all of our product documentation to GitHub as clean Markdown, organized specifically for AI tools to read. https://lnkd.in/dcASSADm This is one of those small infrastructure moves that quietly unlocks a lot. Anyone trying to build a RAG pipeline or an AI agent on ServiceNow content recently has hit the same wall: the docs site is a JavaScript SPA that LLMs can't actually read. So you scrape, mirror, or just accept hallucinated answers. Not anymore. What changes when you build on the platform: Your RAG pipeline grounds on real, current product docs instead of whatever your model half-remembers from training. Your AI-powered apps stop inventing scoped API names that don't exist. Your agents finally have a stable source they can reliably reason over. The repo is refreshingly blunt about why it exists: "Do NOT attempt to fetch content from servicenow.com/docs — it is a JavaScript single-page application that returns no readable content to LLMs." That kind of honesty in an official enterprise vendor repo is rare. I want more of it. The bigger picture: "LLM consumption" is becoming a first-class way to publish enterprise documentation. Not a scraping side-project. Not an afterthought. A deliberate channel maintained alongside the human version. The vendors who figure this out early get embedded inside the next generation of AI tooling everyone is building right now. The ones who don't quietly lose ground. Good to see we're on the right side of that line. For my fellow ServiceNow builders: what's the first thing you're plugging this into? #ServiceNow #AI #RAG
-
Eddie G. reacted on this: Tell me about your current passion project that has absolutely nothing to do with your job. What are you spending time on lately that you’re proud of? For me, it’s this semi-rental friendly bathroom remodel. Yes, technically, it’s a rental unit, but I manage a 27-unit property in Berkeley as a side hustle and don’t plan on going anywhere anytime soon. And I’ve always felt that the space where you spend the most time should be your favorite space. I work from home as a ServiceNow Developer, and I’m generally a homebody, so investing in the place I both live and work in feels worth it. It’s been a lot of learning, problem-solving, and more trips to the hardware store than I’d like to admit, but seeing it come together has been incredibly satisfying. What are you building, learning, creating, or just plain obsessing over these days?
-
Eddie G. liked this: Last call for NABE TEC 2026 proposals! I’m excited to help organize this conference this year; please help us make it the best one yet! Reshared post: Tech Economists: Last call on submitting your proposals for TEC26, coming up this fall in San Diego! Major themes on the docket for this year: * Artificial intelligence and LLM applications * Productionizing analytics inside organizations * Market design and platform economics * Causal inference and experimentation at scale * Policy challenges shaping the tech economy If you want to help shape the conversations in our field, submit a proposal today (or tomorrow, which is the deadline). Link below. CC the organizing committee: Ayal Chen-Zion, Colin Gray, Benjamin Leyden, Michael Luca, Wilko Schulz-Mahlendorf, Stacy Carlson, Tom Beers, CBE, Fangfang Tan. National Association for Business Economics (NABE)
-
Eddie G. liked this: 6 weeks ago we were at $200K ARR. Last week, we crossed $1M. The numbers behind it matter more to me than the headline: ➤ one customer saw her traffic and revenue 5x in 3 months with Helena ➤ 3,000+ automated tasks running right now, 24/7 ➤ one DTC brand ran 1,532 automated tasks for analytics/SEO/ads/social ➤ usage up 36% in the last 7 days ➤ 630+ commits in the last 6 weeks To every customer who took a bet on us early and told someone else: you made this real. Now onwards to the next milestone!!
-
Eddie G. liked this: Technical founders must earn the right to sell a "platform", and should say no to some initial inbound prospects to remain focused. I sat down with Saanya Ojha to discuss what she sees as a growth investor when evaluating founders' go-to-market acumen. These and more nuggets in this short clip.
Experience & Education
-
Echelon AI
Recommendations received
2 people have recommended Eddie
Explore more posts
-
Ashok Kumar Singh
MCAL Global • 11K followers
The GPU budget isn't what's killing your AI ROI. It's the silent compute burn happening before the model even touches the data. 🛑 Everyone is obsessing over the cost of model training, but here is the unpopular engineering truth: for most enterprise AI projects, data preprocessing consumes significantly more compute than actual model training. Heavy workloads like deduplication, normalization, and label cleaning are massive cost centers. Yet, while engineering teams meticulously track GPU hours, almost nobody instruments their data pipelines. You cannot optimize what you do not measure. If you lack granular visibility into your preprocessing compute, you are flying completely blind on a cost center that can easily exceed your entire training budget. Stop bleeding compute in the dark. Instrument your data pipeline. Take control of your AI unit economics today at https://zurl.co/bieZo. 👇 #DataEngineering #FinOps #EnterpriseAI
6
-
Ben Jackson
VenorTech Limited • 11K followers
AI Tech Stack Funding this week: $5.6bn Eleven companies that raised this week in the AI infrastructure / tool stack space: CoreWeave – Post-IPO Equity – $2bn: A cloud-based AI infrastructure company offering GPU cloud services to support large-scale AI and machine learning workloads. StepFun – Series B – $718.1m: An artificial intelligence company focused on advancing toward artificial general intelligence (AGI). AstraSync AI – Seed – Undisclosed: Know Your Agent platform providing identity, trust, and verification for the agentic economy. Databricks – Debt Financing – $1.8bn: A data and AI platform unifying data engineering, analytics, and machine learning on a lakehouse architecture. LiveKit – Series C – $100m: A cloud platform enabling developers to build, deploy, and manage real-time communication and AI-driven applications. AppliedAI – Series B – Undisclosed: Provides AI-augmented automation for mission-critical business workflows. Upscale AI – Series A – $200m: Redefines networking with purpose-built solutions for high-performance compute environments. Baseten – Venture – $300m: An AI infrastructure platform integrating machine learning into production business systems and workflows. humans& – Seed – $480m: A frontier AI research and product lab focused on human-centred AI systems and interaction models. Thunder Compute – Seed – $4.5m: One-click GPU instances designed to deliver compute at significantly lower cost. NENNA.AI – Seed – Undisclosed: An AI enablement platform focused on the secure application of AI technologies. #AI #ArtificialIntelligence #AITechStack #AIFunding #AIEcosystem #AIInfrastructure #MachineLearning #ML #Startups #TechFunding #VentureCapital
6
-
Chenxi Wang, Ph.D.
Rain Capital • 29K followers
Lovable hit $100 million ARR 8 months from launch. Anysphere's Cursor took 12 months to hit $100 million. Before that, it took Wiz 18 months to reach $100M, Deel 20 months, and Ramp 24 months. All these companies exhibited #escape #velocity. If you look at how they conducted business, these companies focused on #distribution early on in the company's journey. "The game of #startups is if the startup can gain #distribution faster than the incumbent gain #innovation", as Alex Rampell wrote in his 2015 blog post “Distribution vs Innovation”. As Lovable's stats demonstrate, the escape velocity is becoming larger and more aggressive, particularly with #AI. Will we see companies reaching $100M ARR in 6 months? Will we see solopreneur companies achieving the same escape velocity? The question for other #startups, those that are funded but not achieving quite the same escape velocity: what are you doing to steer yourself onto the escape velocity path? As Brian Balfour wrote in this excellent blog, "The Big Squeeze: Why Escape Velocity Is More Important Than Ever": #Distribution is the seed that unlocks harvest (I recommend every startup founder read this blog, link in the comments) - Who can you integrate with? - Which channels can you attain distribution faster than others? - Who should you partner with to give you that market #uplift? #Distribution is indeed more important than ever. If you are a startup and you are not asking these questions from day 1, you will be left behind.
62
13 Comments -
Luis Saffie
Heirly • 1K followers
My notes from the AI Powered Engineering Productivity talk at Snowflake Toronto 👇 📈 The Real Productivity Gains (Not What You Think) 50% increase in code shipped per engineer. But here's what's actually driving it: → AI writes documentation (and keeps it current) → AI generates better test coverage → AI debugs production logs Result: More stable systems, less firefighting. 2️⃣ Two Use Cases Worth Using: 1. LLM integrated into Slack for support questions 2. LLM pointed at production logs for debugging Both reduced time teams spend on repetitive work. 📊 The Only Metric That Matters: Forget story points. Rather, measure time to productize. How fast can you go from idea → customer using it? That's it. 🤝 How to Interview Engineers in 2025 Stop asking leetcode puzzles. Give them AI tools, ask them to solve a real problem and have them explain their solution. You'll learn how they think, not what they memorized. 🛠️ On "Will AI Replace Engineers?" Steven Woods (Snowflake CTO) nailed it: "We don't hire engineers to write code. We hire intelligent problem solvers who understand the business and happen to be technical and relentless. Those people aren't replaceable." If you're just coding, you're at risk. If you're a problem solver, you're not. Overall a really great talk. Thanks to Qaiser Habib and the Snowflake team for putting it all together. Thanks to the speakers: John Abd-El-Malek, Josh Zana And amazing panel: Mohit Gupta, Leho Nigul and Steven G. Woods P.S. Feels good to be back in the arena. See you around!
41
3 Comments -
Chuan Qiu
Velda • 2K followers
xAI is sharing spare GPU capacity with Anthropic. Their cluster utilization was reportedly running at 11%. The industry average isn't much better — 40%. This isn't an xAI problem. It's a structural one. Most clusters are reserved by a single tenant. But compute demand from any one team is inherently uneven: quiet during planning cycles, maxed out before a model deadline. The gap between those two states gets eaten by idle GPUs no one is using, while teams without reservations queue for hours to get a single node. Velda is making compute fluid. If you reserve GPUs with us, your unused capacity doesn't go to waste: it converts automatically into spot credits you can draw on later. You will have prioritized access to the spot pool, which comes from unused compute of reservations that would be otherwise wasted. Velda allows you to run GPU jobs with a simple command prefix; Preempted jobs retry automatically from last check-point. Getting cluster access should feel like using your workstation, no manifest, no extra setup. And we're not going to mark up the hardware. You pay what the GPU provider charges, no platform tax on top. If you have an existing provider relationship, we'll work with them directly. The capacity is there. It just needs to move.
23
6 Comments -
Jing Xie
Clarion Intelligence Systems • 13K followers
𝗕𝗥𝗘𝗔𝗞𝗜𝗡𝗚: OpenAI just embraced open source with gpt-oss-120b for HPC and gpt-oss-20b for consumer machines. It's great to see this type of response from OpenAI. I speak with many F500 AI leaders and they are extremely wary of: - Relying on single closed-source vendors with third-party risk - The risks of not having data portability from day one - Privacy and regulation that will inevitably catch up The reality with closed models is that even if your data stays in your databases, it's getting passed through an exchange loop where the vendor can see, access, and understand everything before passing it back to you. Enabling enterprises to run local models returns control to users and enterprise AI teams. It's a really good sign toward a multi-model, multi-context world.
26
1 Comment -
Bar Haim
IBM • 14K followers
2–4x faster LLM output. Same quality. Lower cost. Meet Speculative Decoding 🤯 Traditional text generation—where Large Language Models (LLMs) predict one token at a time—is a major bottleneck for real-time performance. But there’s a smart solution that can drastically speed things up without sacrificing output quality: Speculative Decoding. Think of it as “draft and verify” for language generation: 🔹 A smaller, faster draft model quickly speculates what the next few tokens might be. 🔹 Meanwhile, your larger target model checks these guesses in parallel, verifying multiple tokens in a single forward pass. This parallelism is the game-changer. Instead of waiting for one token at a time, the system can validate several in one shot—massively accelerating output. To keep quality high, a clever step called rejection sampling compares token probabilities from both models: If the big model agrees (or is more confident), the draft token is accepted ✅ If not, the system rolls back and lets the large model generate the correct token 🔁 💡 The result? Same high-quality output, generated up to 2–4x faster. Why it matters: ✔️ Speed: Drastically faster inference—real-time responses are now within reach ✔️ Efficiency: Better GPU utilization by offloading to smaller models ✔️ Cost Savings: Fewer compute cycles, lower bills ✔️ Quality: No compromise—final output matches the large model’s gold standard Speculative Decoding is a practical, production-ready optimization. If you’re deploying LLMs at scale—or plan to—it’s a technique worth exploring. 🔍 Are you already using it? What’s your biggest challenge in LLM inference today? #LLM #AI #SpeculativeDecoding #MachineLearning #NLP #GenerativeAI #InferenceOptimization #DeepLearning #TechInnovation #ArtificialIntelligence
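The "draft and verify" loop described in the post can be sketched in a few lines of Python. This is a toy illustration only, not any production implementation: the two "models" are hypothetical fixed probability tables (`draft_probs`, `target_probs`) over a four-token vocabulary, and the verification loop runs sequentially, whereas a real system would score all drafted positions in a single forward pass of the target model. The acceptance rule used here is the commonly described one (accept a drafted token with probability min(1, p/q), otherwise resample from the normalized positive part of p - q), which is slightly more precise than the post's "agrees or is more confident" summary.

```python
import random

VOCAB = 4  # hypothetical tiny vocabulary: token ids 0..3


def draft_probs(context):
    """Toy stand-in for the small draft model (fixed distribution, an assumption)."""
    return [0.4, 0.3, 0.2, 0.1]


def target_probs(context):
    """Toy stand-in for the large target model (fixed distribution, an assumption)."""
    return [0.25, 0.25, 0.25, 0.25]


def sample(weights, rng):
    """Draw one token id in proportion to the given non-negative weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def speculative_step(context, k, rng):
    """Draft k tokens with the small model, then verify them against the target.

    Returns the tokens to append to the context: between 1 and k + 1 of them.
    """
    # 1. Draft phase: the small model proposes k tokens autoregressively.
    ctx = list(context)
    drafted, draft_dists = [], []
    for _ in range(k):
        q = draft_probs(ctx)
        tok = sample(q, rng)
        drafted.append(tok)
        draft_dists.append(q)
        ctx.append(tok)

    # 2. Verify phase: accept or reject each drafted token in order.
    ctx = list(context)
    accepted = []
    for tok, q in zip(drafted, draft_dists):
        p = target_probs(ctx)
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)  # target is at least as confident: keep it
            ctx.append(tok)
        else:
            # Rejected: resample from the residual max(0, p - q) and stop,
            # discarding the remaining drafted tokens.
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            accepted.append(sample(residual, rng))
            return accepted

    # 3. Every draft was accepted: take one extra token from the target.
    accepted.append(sample(target_probs(ctx), rng))
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    print(speculative_step([], k=3, rng=rng))
```

With this acceptance rule, each emitted token follows the target model's distribution, which is the sense in which output quality is preserved. The speedup comes from how many drafted tokens are accepted per target-model pass, so it depends on how closely the draft model tracks the target.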
1
-
Jacob Warren
Rig AI • 7K followers
You don't need a separate reference model for RL. We built an RL pipeline built on a multi-turn-focused modification of Self-Distillation Policy Optimization (SDPO) that uses the same model in two roles: student and teacher. The difference? Context window. → The student generates actions with no knowledge of the outcome. The teacher re-evaluates the same tokens with hindsight: the observation from each action, terminal test output, and a successful rollout from the same prompt. The per-token advantage is just the logit gap between these two views. → This gives you dense, per-token credit assignment instead of a single scalar reward for the whole trajectory. Think tokens get 0.25× weight (loosely guided, not tightly constrained). Action tokens get full gradient. System/user/tool result tokens get nothing. → The "teacher" is not a frozen checkpoint or a separate reward model. It's your current policy, conditioned on richer information. This is well-defined under MoE because same parameters with different routing is normal operating behavior. → We use a tiered reward system with hard gates only for truly unrecoverable failures: sandbox escapes, test corruption, harness crashes, timeouts, infinite loops. Everything else like wrong directory, failed commands, incorrect approaches are learnable signal, not wasted compute. The key insight: most of the information you need for credit assignment is already inside the model. You just need to ask it the right question with the right context. On LiveCodeBench v6 SDPO reached 48.8% vs GRPO's 41.2% beating Claude Sonnet 4 with an 8B model! That's bonkers.
5
-
Ka Po Ng
Alberta Investment Management… • 742 followers
The New Interface Isn’t Visual — It’s Conversational On July 17, OpenAI didn’t just ship another feature. They dropped a signal: ChatGPT Agents. Not GPT-5 (yet), but arguably just as disruptive. At first? Clunky. Slow. Skeptics were loud. But step back, and you’ll see the bigger play: 👉 Replacing GUIs with AI-driven APIs. 👉 Turning agents into the new entry point to the internet. ChatGPT now handles 2.5 billion prompts/day — that’s already ~18% of Google’s annual search volume. And this is just the beginning. But it’s not just about language anymore. The real differentiator? → Agent architecture: sandboxing, execution, tools, planning, feedback loops. I outlined the current technical approaches powering general-purpose AI agents today. Full Details here: 🔗 https://lnkd.in/dCpUHBFe Missed something? Call it out 👇
5
2 Comments -
Sharan Shekar
T. Rowe Price • 208 followers
🚀 Stop treating AI as just a "faster autocomplete." Welcome to the era of **Agent-Native Engineering**. A new operating model is completely restructuring how software teams work. Instead of just helping you type, AI agents are now acting as actual contributors—taking on scoped tasks, opening pull requests, and iterating on feedback asynchronously. Here is what the new engineering workflow looks like: 🔹 **Simple tasks** ➡️ One-shotted by agents from a clear prompt. 🔹 **Manageable tasks** ➡️ Delegated to background agents with built-in review cycles. 🔹 **Complex tasks** ➡️ Kept with engineers using synchronous coding agents. The result? Engineers spend less time writing boilerplate code and more time doing what matters most: scoping problems, reviewing work, and shaping the product. 💡 Check out this great read from Cofounder on the future of software development: https://lnkd.in/dJX5phfz #SoftwareEngineering #ArtificialIntelligence #AIAgents #FutureOfWork #AgentNative #TechTrends #SoftwareDevelopment
4
-
Dr. Habib Shaikh, PhD (AI)
Northern Trust • 22K followers
Just getting started with LLMs? Don’t let the jargon slow you down. This 5-part LLM Glossary is your shortcut to mastering the fundamentals👇 1️⃣ Model Types • Foundation Model: Pretrained on large-scale, diverse datasets to learn general capabilities. • Instruction-Tuned Model: Further trained to follow user instructions precisely. • Multi-modal Model: Handles multiple input/output formats like text, images, and audio. • Reasoning Model: Optimized for logic, problem-solving, and step-by-step thinking. • Small Language Model (SLM): Lightweight model for fast, efficient, on-device tasks. 2️⃣ Training & Fine-Tuning • Pretraining: Initial learning phase using massive, diverse data sources. • RLHF (Reinforcement Learning from Human Feedback): Aligns model behavior with human preferences. • DPO (Direct Preference Optimization): Trains directly on preference pairs, with no separate reward model. • Synthetic Data: AI-generated data used to supplement real datasets. • Fine-Tuning: Targeted retraining on specific domains or tasks. • LoRA / QLoRA: Techniques for efficient low-resource fine-tuning. • Guardrails: Rules or filters applied to enforce safety, ethics, and compliance. 3️⃣ Prompt Engineering • System/User Prompts: Define model behavior and user context in a session. • Chain-of-Thought (CoT): Prompts that guide the model to reason step-by-step. • Few-Shot / Zero-Shot Learning: Demonstrating tasks with few or no examples. • Prompt Tuning: Optimizing prompts via training for specific outcomes. • Context Window: The total number of tokens the model can "remember" per interaction. 4️⃣ Inference & Generation • Temperature: Adjusts output randomness; lower values are more deterministic. • Max Tokens: Limits the number of tokens in a generated response. • Seed: Controls reproducibility of outputs. • Latency: Time taken by the model to respond. • Hallucination: When the model generates factually incorrect but plausible-sounding information. 
5️⃣ Retrieval-Augmented Generation (RAG) • Retrieval: Fetching relevant data from external sources before generation. • Semantic Search: Search based on meaning, not just keywords. • Embeddings: Vector representations of text used for similarity matching. • Chunking: Dividing documents into smaller segments for better retrieval. • Vector Databases (VectorDBs): Store embeddings for fast and accurate search. • Reranking: Rescoring retrieved documents to prioritize relevance. • Indexing: Structuring data for efficient retrieval during generation. 🌟 Follow the AIKaDoctor (Free AI & Data Science Resources) channel on WhatsApp: https://lnkd.in/dCTCEKKc 📌 Follow Habib Shaikh for more such content.
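Two of the glossary entries above, Temperature and Embeddings/Semantic Search, are easier to see in a few lines of code. The sketch below is only illustrative: the function names and the hand-written 2-dimensional "embeddings" are hypothetical, and a real system would get its vectors from an embedding model and a vector database.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_similarity(a, b):
    """Similarity of two embedding vectors based on the angle between them."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, top_k=2):
    """Return the texts of the top_k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.5)  # more deterministic
hot = softmax_with_temperature(logits, 2.0)   # more random

# Toy chunks with made-up 2-d embeddings.
chunks = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
hits = retrieve([1.0, 0.0], chunks, top_k=2)
```

The low-temperature distribution puts more mass on the top logit than the high-temperature one, and the retrieval step ranks chunks by meaning (vector direction) rather than by keyword overlap.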
232
36 Comments -
Arpit Tandon
CGI • 3K followers
This week, Claude’s coworker plugins (legal, product, research, etc.) shook SaaS stocks — with some calling it a “SaaSocalypse.” What’s really happening is simpler. They’re codified expertise — structured prompt libraries + agents delivering professional-grade work, without heavy SaaS UI. SaaS isn’t dead. But defensibility is shifting — toward domain depth, proprietary data, and distribution. #AI #SaaS #ClaudeCode #ClaudeCowork #SaaSocalypse
22
-
Andrew Ng
DeepLearning.AI • 3M followers
Introducing "Building with Llama 4." This short course is created with Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced a Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here: https://lnkd.in/gXKeipht
2,830
111 Comments -
Rhonda Coleman Albazie
PRIVILEGE HEALTH ™️ -… • 396 followers
Modeling approach (the part that beats “one-model” platforms) I run an ensemble (multiple models, one combined forecast) because housing is regime-based: Model A — Fundamentals forecaster (best day-to-day accuracy) • Gradient-boosted trees on engineered features (inventory momentum, price cuts, affordability shocks, etc.) Model B — Time-series trend + seasonality (keeps you from overreacting) • A forecasting model that understands seasonality and persistence (monthly/weekly cadence) Model C — Turning-point / regime detector (the “decline early warning”) • Change-point / hidden-regime model that spots structural shifts before prices fully reflect them Ensemble layer • Combines A/B/C with weights that shift by market regime Output = forecast + probability bands + “risk lights.”
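The ensemble layer described above, where the A/B/C weights shift by market regime, can be sketched as a simple weighted average. Everything here is a hypothetical illustration: the regime names, the weight values, and the function `ensemble_forecast` are assumptions, not the author's actual configuration.

```python
# Hypothetical weights per market regime; each row sums to 1.
REGIME_WEIGHTS = {
    "stable": {"fundamentals": 0.6, "trend": 0.3, "regime": 0.1},
    "turning": {"fundamentals": 0.3, "trend": 0.2, "regime": 0.5},
}


def ensemble_forecast(forecasts, regime):
    """Combine the three model forecasts using the weights for the given regime."""
    weights = REGIME_WEIGHTS[regime]
    return sum(weights[name] * forecasts[name] for name in weights)


# Toy forecasts of price change in percent from models A, B and C.
forecasts = {"fundamentals": 2.0, "trend": 1.0, "regime": -3.0}
stable_view = ensemble_forecast(forecasts, "stable")
turning_view = ensemble_forecast(forecasts, "turning")
```

With the same three inputs, the combined forecast leans on the fundamentals model in a stable market and on the turning-point detector once a regime shift is suspected.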
-
Vijay Chattha
11K followers
The hits keep coming at VSC 💡 Inception Labs – Stanford professor and team raise $50M and build a 10x faster and 10x cheaper AI coding model with the performance of Gemini Flash / Haiku. AI veterans Andrew Ng and Andrej Karpathy are among the investors, alongside Nvidia, Menlo, Microsoft, Mayfield, Databricks, and Snowflake. 🧠 Sema4.ai – launches its enterprise AI platform designed to streamline complex data and document workflows 🔒 Confident Security – debuts OpenPCC, an open-source standard that encrypts data fed into AI tools 📈 Noetica – offers perspective on tightening private credit terms and market risk 💊 Foresite Capital – Biotech 2050 podcast to discuss AI and investment strategy ☀️ Sesame Solar – powers long-endurance drones and off-grid operations with its solar hydrogen nanogrid cc: Stefano Ermon, Daniel Wertman, Paul Codding, Ram Venkatesh, Jonathan Mortensen, PhD, Jim Tananbaum, Lauren Flanagan
51
1 Comment -
Pinaki Laskar
FishEyeBox AI • 33K followers
𝗔𝗿𝗲 𝗠𝗼𝗱𝗲𝗹 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗣𝗿𝗼𝘁𝗼𝗰𝗼𝗹 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗳𝗿𝗼𝗺 𝗔𝗣𝗜𝘀? You may think APIs are enough for AI agents. Then you look into 𝗠𝗼𝗱𝗲𝗹 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗣𝗿𝗼𝘁𝗼𝗰𝗼𝗹 (𝗠𝗖𝗣) and realize that APIs are powerful, but they weren’t built for how LLMs operate. • 𝘼𝙋𝙄𝙨 𝙖𝙧𝙚 𝙜𝙚𝙣𝙚𝙧𝙖𝙡-𝙥𝙪𝙧𝙥𝙤𝙨𝙚: great for structured system interactions, but they rely on pre-defined endpoints, and agents need custom adapters to use them. • 𝙈𝘾𝙋 𝙞𝙨 𝙥𝙪𝙧𝙥𝙤𝙨𝙚-𝙗𝙪𝙞𝙡𝙩: AI agents can ask what tools or data are available at runtime, and everything is exposed in a consistent, machine-readable format. One interface, multiple services, no redeploys. Many MCP servers are just wrapping traditional APIs, but with a layer that actually speaks agent. Over the past few months, a range of protocols have emerged to standardize how AI agents communicate, collaborate, and function. As systems grow more autonomous, having defined protocols is no longer optional; it's essential. Here are 10 of the most prominent AI agent protocols shaping the future of intelligent systems: ➤ ACP (IBM) – Defines standard interfaces for agent interaction and lifecycle management. ➤ AGP (Industry) – Enables message transformation and access control between agents and external systems. ➤ A2A (Google) – Powers structured communication in multi-agent environments like Gemini and Project Astra. ➤ MCP (Anthropic) – A unified model context protocol for memory injection and tool usage in LLMs. ➤ TAP (LangChain) – A standardized JSON schema for tool abstraction and execution. ➤ OAP (Community) – An open protocol for framework interoperability across agent runtimes. ➤ RDF-Agent (Semantic Web) – Leverages SPARQL and linked data for semantic agent communication. ➤ AgentOS (Proprietary) – Provides a runtime stack for long-lived, enterprise-grade agents. ➤ TDF (Stanford) – Task schema for modular prompt planning and goal coordination. ➤ FCP (OpenAI) – A standardized way to invoke LLM functions with schema enforcement. 
Each protocol addresses a unique layer in the agentic AI stack — from communication to memory, task coordination to function invocation. 𝗜𝘁’𝘀 𝗻𝗼𝘁 𝗔𝗣𝗜𝘀 𝘃𝘀 𝗠𝗖𝗣 - 𝗶𝘁’𝘀 𝗮𝗯𝗼𝘂𝘁 𝗺𝗮𝗸𝗶𝗻𝗴 𝗔𝗣𝗜𝘀 𝘂𝘀𝗮𝗯𝗹𝗲 𝗮𝘁 𝗔𝗜 𝘀𝗰𝗮𝗹𝗲. #LLM #MCP #APIs #AIagents
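The "ask what tools are available at runtime" idea can be sketched with the JSON-RPC 2.0 message shapes MCP uses for `tools/list` and `tools/call`. This is a toy under stated assumptions: `ToyServer`, `make_request`, and the `get_weather` tool are hypothetical, the server lives in memory instead of behind a real transport, and the initialization handshake a real MCP session starts with is omitted.

```python
def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request dict of the kind an MCP client sends."""
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


class ToyServer:
    """Hypothetical in-memory stand-in for an MCP server wrapping one API."""

    def __init__(self):
        self._tools = {
            "get_weather": {
                "description": "Return a weather summary for a city.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                "handler": lambda args: f"Sunny in {args['city']}",
            }
        }

    def handle(self, request):
        if request["method"] == "tools/list":
            # Discovery: describe every tool in a machine-readable format.
            tools = [
                {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
                for name, t in self._tools.items()
            ]
            result = {"tools": tools}
        elif request["method"] == "tools/call":
            tool = self._tools[request["params"]["name"]]
            text = tool["handler"](request["params"]["arguments"])
            result = {"content": [{"type": "text", "text": text}]}
        else:
            error = {"code": -32601, "message": "Method not found"}
            return {"jsonrpc": "2.0", "id": request["id"], "error": error}
        return {"jsonrpc": "2.0", "id": request["id"], "result": result}


server = ToyServer()
listing = server.handle(make_request(1, "tools/list"))
names = [tool["name"] for tool in listing["result"]["tools"]]
call = server.handle(
    make_request(2, "tools/call", {"name": names[0], "arguments": {"city": "Oslo"}})
)
```

The client never hard-codes the tool name: it discovers `get_weather` and its input schema from the listing, then calls it, which is the difference from a pre-defined endpoint that needs a custom adapter.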
3
-
Jan Beitner, PhD
Inflexion • 3K followers
#AI coding #agents are changing how I evaluate vendors. A strong argument for a product is that it can be configured purely through code and #API / #MCP. Not because it has its own AI features - but because it lets your AI agent do the work. Lightdash is a great example. It is not the most mature BI tool, but probably one of the most AI-ready - because they made all data and configuration accessible via API and gave AI agents the context to act on it, so Claude Code can create dashboards end to end by modelling data in the data warehouse and then building the visualizations. If your product is not API-first, AI cannot bring in outside context or act autonomously within it. That is becoming a real competitive disadvantage. https://lnkd.in/e6Gp3MGj
53
14 Comments -
Bunty Shah
MSCI Inc. • 4K followers
[AI paper] GRPO has a hidden flaw in Multi-Reward settings. It’s time to decouple your normalization. 📉 As AI Architects, we are moving from single-objective RL (just "get the answer right") to multi-objective RL (accuracy + format + length + safety). We typically sum these rewards and throw them into GRPO. A new paper from NVIDIA, "GDPO: Group reward-Decoupled Normalization Policy Optimization," demonstrates that this naive summation causes Reward Signal Collapse. The Architectural Failure Mode: If you sum disparate rewards (e.g., a binary Format reward and a scalar Accuracy reward) and then normalize, different raw reward combinations can map to identical advantage values. Example: A group of rollouts with summed rewards (0, 2) might yield the same normalized advantages as a group with (0, 1) due to group statistics, effectively deleting the gradient signal for the second objective. The Solution: GDPO The fix is architectural: Decoupled Normalization. Instead of normalizing the sum, GDPO normalizes each reward objective independently within the group before aggregation. Impact: This restores the resolution of the training signal. On Tool Calling and Math Reasoning tasks (DeepSeek-R1/Qwen), GDPO converges where GRPO fails or plateaus. If you are training agents that must balance strict formatting (JSON) with reasoning quality, this is the correct objective function to use. 👇 Link to the paper in the comments. #AIArchitecture #RLHF #NVIDIA #GRPO #DeepLearning #LLM #Alignment #Research
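The collapse described above can be reproduced with two rollouts and two reward objectives. The sketch below is a simplified illustration, not the paper's full method: the function names are hypothetical, it uses population standard deviation with a small epsilon, and it leaves out any further normalization or weighting a real implementation might apply after aggregation.

```python
import statistics


def normalize(values, eps=1e-8):
    """Group-normalize: subtract the group mean, divide by the group std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / (std + eps) for v in values]


def summed_then_normalized(group):
    """GRPO-style: sum the objectives per rollout, then normalize the sums."""
    return normalize([sum(rewards) for rewards in group])


def normalized_then_summed(group):
    """Decoupled: normalize each objective within the group, then sum."""
    per_objective = [normalize(list(column)) for column in zip(*group)]
    return [sum(vals) for vals in zip(*per_objective)]


# Each tuple is (format_reward, accuracy_reward) for one rollout.
group_a = [(0.0, 0.0), (1.0, 0.0)]  # summed rewards: 0 and 1
group_b = [(0.0, 0.0), (1.0, 1.0)]  # summed rewards: 0 and 2

coupled_a = summed_then_normalized(group_a)
coupled_b = summed_then_normalized(group_b)
decoupled_a = normalized_then_summed(group_a)
decoupled_b = normalized_then_summed(group_b)
```

After sum-then-normalize, both groups land on roughly (-1, +1), so the second rollout in `group_b` gets no extra credit for also satisfying the second objective. With per-objective normalization, `group_b` ends up at roughly (-2, +2) and the two cases stay distinguishable.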
21
1 Comment