I Know We're in an AI Bubble Because Nobody Wants Me 😭 https://lnkd.in/gd_mjmFJ
There's a whole (lost) generation of engineers who learned that constraints are where creativity lives, that understanding the full stack creates value, that optimization is a social act embedded in teams. And now they're watching capital flow past them toward pure hardware accumulation, as if buying more GPUs is a substitute for knowing what you're doing with them.
You are the signal in the noise, Pete; efficiency always wins out longer term. It’s human nature to ask us to “do more with less,” but the current approach feels like “do more” at all costs.
Great post, Pete Warden — your point about #GPU spending turning into a signaling game really resonates. Efficiency will matter again, though NVIDIA’s work on #EfficientAI (e.g., Song Han) shows they’re preparing for that shift.

One observation from the field: the level of intelligence people now expect from AI/LLM systems is rising far faster than edge compute can evolve. VLMs, long-context LLMs, and VLA-based agents simply exceed what CPU/NPU-class devices — even clustered ones — can handle. In practice, GPU clouds are just massively distributed compute nodes, and telecom vendors are moving the same way. Nokia and Ericsson, for example, are starting to treat base stations as “AI compute nodes” interconnected by ultra-low-latency fabrics. That shifts the efficiency frontier from individual chips to cluster-to-cluster communication.

GPUs surpassed CPUs because internal data movement was more efficient; the next decade will be shaped by #NVLink, #InfiniBand (Mellanox Technologies), CXL, photonics, and AI-native base-station fabrics. Efficiency will win — but the efficiency that wins will be networked compute efficiency, not chip-level performance.
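The claim that the bottleneck has moved from chips to interconnects can be made concrete with a rough roofline-style estimate. A minimal sketch, where every number (per-GPU throughput, network bandwidth, model and batch size) is an illustrative assumption rather than a measurement:

```python
# Back-of-envelope: does inter-node communication, rather than chip FLOPs,
# bound a distributed training step? All numbers below are assumptions.

FLOPS_PER_GPU = 1e15    # ~1 PFLOP/s usable matmul throughput per GPU (assumed)
NET_BW_BYTES = 50e9     # ~50 GB/s effective per-GPU network bandwidth (assumed)

PARAMS = 70e9           # 70B-parameter model (assumed)
TOKENS_PER_STEP = 4e6   # global batch size in tokens (assumed)
N_GPUS = 1024

# Compute: ~6 FLOPs per parameter per token for forward + backward pass.
compute_s = 6 * PARAMS * TOKENS_PER_STEP / (N_GPUS * FLOPS_PER_GPU)

# Communication: a ring all-reduce of fp16 gradients moves roughly
# 2 bytes/param in each of two phases, so ~4 bytes/param over the network.
comm_s = 2 * 2 * PARAMS / NET_BW_BYTES

print(f"compute per step: {compute_s:.2f} s")
print(f"gradient all-reduce: {comm_s:.2f} s")
```

Under these assumed numbers the all-reduce takes several times longer than the math itself, which is exactly why the efficiency frontier shifts to the fabric connecting the chips rather than the chips alone.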
Compression work goes in cycles of ups and downs. Sooner or later, when the outsized gains are no longer available, people will care about cost and efficiency. That said, there was (and still is) a huge community around on-device Llama, local LLMs, and compression; we are still very actively exploring the boundaries of compression. +1 to the observation about the prevailing (mis)conception that cost == moat. Cerebras, Clarifai, Flower Labs, and others are breaking this view. Keep up the good work, Pete.
One of the most interesting posts and perspectives on the state of AI as we move into 2026, amid fears of an AI bubble. Go!!! #EdgeAI
The mega-cap clouds need good edge partners to continue their run 😉
1. I am sure companies that run their own LLM inference would be happy to pay to optimize cost. Integration might be a blocker, though.
2. Optimization often needs to make assumptions about the process it optimizes, and it's possible some people think that making too many such assumptions at this stage is premature, and that in six months those assumptions may break.
Not sure who you hang out with these days, but basically all the ML infra folks I know (especially OGs like yourself) are compensated at the top of the market.
Maybe because you’re too early…? Bubbles have become the most likely way for investors to make a quick fortune, if they enter early and start pumping. And building up generational wealth has gone out of fashion. Your time will come after the bubble pops. The growing over-reliance on LLMs for coding will make sure of that.