What happens when the big AI companies need to charge $2000 per month for their model? What happens when the open source model I can run locally on $4000 hardware is 50% as good as the big AI companies' models? What happens when both these things are already true and become apparent this year?
The real question is what happens when the economics matter. If a model costs $2000 per month, then it needs to reliably offer more than that in value. If you can run models locally that are half as good, then half as good needs to be good enough. The realer question is: what's going to happen if we rely on cheap models so much that all alternatives go out of business? The answer is nothing good. If we give companies an effective monopoly on intelligence, then don't be surprised if they start treating the rest of us as moochers.
$4,000 hardware? That's a max of about 96 GB of VRAM, and the models that are closest to the big AI companies' models are 400B parameters or larger. I'm running on $20,000 of hardware and I still can't fit good quants of the best models without spilling over into system RAM.
About a year ago, the cost to run DeepSeek locally on GPUs was about $55k for the cheapest hardware I could find: mostly video cards totaling 800 GB of VRAM, plus a motherboard and case to hold them. Apparently you can get it to limp along, a bit slow, on a desktop system with 4 terabytes of interleaved RAM modules, which you can get for about $8k. Consumer-grade video cards with 128 GB of VRAM could cut the cost down significantly, but these don't exist outside of the black market, where people are modifying consumer video cards with more RAM.
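The hardware figures in these two comments follow from simple back-of-envelope arithmetic. A rough sketch, assuming weights dominate memory use and lumping KV cache and activations into a flat 20% margin (a rule of thumb, not an exact model):

```python
# Back-of-envelope VRAM estimate for hosting a quantized LLM locally.
# Assumption: weight storage dominates; KV cache and activation overhead
# are approximated by a flat 20% margin.

def vram_gb(params_b: float, bits_per_weight: float, overhead: float = 0.2) -> float:
    """Estimated GB of memory to hold a params_b-billion-parameter model
    at the given quantization, plus a fixed overhead fraction."""
    weight_gb = params_b * bits_per_weight / 8  # 1B params at 8-bit = 1 GB
    return weight_gb * (1 + overhead)

print(round(vram_gb(400, 4)))  # a 400B model at 4-bit: ~240 GB, far past a 96 GB rig
print(round(vram_gb(70, 4)))   # a 70B model at 4-bit: ~42 GB, fits in 96 GB easily
```

This is why the commenters land where they do: even aggressive 4-bit quants of 400B-class models need hundreds of gigabytes, while the models that do fit on a $4,000 box are a size class smaller.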
The bottleneck is compute, hence hyperscalers popping up like a troop of moles, and mega-scale data centres from companies such as xAI and OpenAI. However, there are diminishing returns even with near-infinite compute, so the key to sustainability is in the algorithms. The Chinese were perceptive about this constraint, so you have the likes of DeepSeek optimising for efficiency on lower compute. My prediction is that large AI companies will be forced to pivot towards algorithm optimisation, so LLMs will organically get cheaper. They simply have to for AI to become part of the fabric of society, which it is not currently.
John Crickett - It will happen, but by that time, Agentic AI will have achieved greater maturity and wider adoption. Without a local orchestrator approach, Agentic SDLC won’t achieve the maturity required for real-world or enterprise scenarios. While tools like Codex, Copilot, or Claude Code are great for prototyping or a few iterations, managing the asymmetric change paths in software becomes far more challenging without a genuine agentic setup. So, the local plus smart model approach will always win, even though a skilled AI professional at $2000 per month is several times more cost-effective than a pair of hands doing the same work.
On-prem AI. I am working on concurrent drivers for LLMs, similar to database drivers: tokens in, tokens out. Between the efficiency gains and the on-prem security, I believe my API <-> LLM integration is a winning architecture. Ultra-low or even on-box network latency for LLM inference, mainlined into the API. No tokens. Adaptive throttling and capacity management. I'm 100% betting the farm on AI companies raising prices.
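The "database driver for LLMs" idea can be sketched as a pooled, concurrency-limited client sitting in front of an on-prem inference server. A minimal sketch under assumed names (`LLMDriver`, `max_inflight`, and the placeholder `_infer` are all illustrative, not the commenter's actual design):

```python
# Sketch: a concurrency-limited LLM "driver", analogous to a DB connection
# pool. A semaphore caps in-flight requests, which is the hook where
# adaptive throttling and capacity management would live.
import asyncio

class LLMDriver:
    def __init__(self, max_inflight: int = 8):
        # Semaphore caps concurrent inference calls, like a pool size.
        self._slots = asyncio.Semaphore(max_inflight)

    async def complete(self, prompt: str) -> str:
        async with self._slots:          # throttling point: waits for a free slot
            return await self._infer(prompt)

    async def _infer(self, prompt: str) -> str:
        # Placeholder for the on-box inference call (tokens in, tokens out).
        await asyncio.sleep(0.01)
        return f"echo:{prompt}"

async def main():
    driver = LLMDriver(max_inflight=4)
    # Eight requests, but at most four run against the backend at once.
    results = await asyncio.gather(*(driver.complete(f"q{i}") for i in range(8)))
    print(results[0])

asyncio.run(main())
```

The design choice here mirrors database drivers: callers see a simple request/response surface, while backpressure is enforced centrally instead of in each API handler.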
$2000/mo and $4000 hardware are mutually exclusive; they cannot both be true. It's like asking: what happens when a cloud provider charges you $2000/mo for a server you can buy outright for $4000? Impossible.
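The impossibility claim is just break-even arithmetic on the thread's own numbers (ignoring power, maintenance, and whether the local model is actually good enough):

```python
# Break-even for the buy-vs-rent comparison in the thread:
# months until a one-time $4,000 rig costs less than $2,000/month.
hardware_cost = 4_000   # one-time local rig (figure from the thread)
subscription = 2_000    # per-month hosted model (figure from the thread)

breakeven_months = hardware_cost / subscription
print(breakeven_months)  # 2.0 -- the rig pays for itself in two months
```

At a two-month payback, almost no rational buyer rents, which is why the commenter treats the two prices as unable to coexist.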
You do realize advancements in hardware development are only making LLMs cheaper, right?
This framing is problematic. Would you invest capital, deploy infrastructure, or ship systems to customers based on something that's "50% as good"? There is no such thing as 50% correct in real engineering. In production:

- It either works or it doesn't
- It's either safe or it isn't
- It's either correct or it causes damage

So, "50% as good" doesn't mean cheaper intelligence to me. It means cheaper mistakes at scale. What that framing really admits is uncertainty, but without the discipline to measure, surface, or constrain it. And uncertainty that's hidden behind confidence is far more dangerous than expensive compute. If we normalize "good enough," we normalize failure modes we won't be able to unwind later. As engineers, we should be raising the bar for correctness, not redefining failure as acceptable because it's affordable. #LoopEiOS
> What happens when the big AI companies need to charge $2000 per month for their model?

Same thing that happened when any vendor has ever charged a lot of money for services or seats. Ask anyone who's ever had ERPs, legal knowledge bases, anything from Oracle, LMS software, enterprise-grade cloud services, and many cybersecurity software licenses.

> What happens when the open source model I can run locally on $4000 hardware is 50% as good as the big AI companies' models?

Buy an $8,000 hardware rig…

> What happens when both these things are already true and become apparent this year?

Those who still see tangible value after cost is calculated will swallow the cost of doing business, just as we've always done since time immemorial. I thought this was obvious.