I think Anthropic and OpenAI have found product-market fit

924

simonw•about 20 hours ago•1039 comments•Article on simonwillison.net


Discussion (1039 Comments)•Original on Hacker News

trjordan•about 19 hours ago

They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.

This means we're going to need $1t+ per year in spending on tokens. 200m knowledge workers in the world, 30m developers. We're talking about a world where you need 5% of every knowledge worker's salary to go into tokens. 20% if you're a developer.
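A rough sketch of the arithmetic behind these figures. The function names are hypothetical, and the inputs are the comment's own ballpark numbers ($5t to $10t over 5 years, $1t per year, 200m knowledge workers, a 5% salary share). The implied average salary is a derived illustration, not a figure from the thread.

```python
def annual_payback(total_investment: float, years: float) -> float:
    """Revenue needed per year to recover an investment over a fixed horizon."""
    return total_investment / years


def spend_per_worker(annual_spend: float, workers: float) -> float:
    """Token spend each worker must account for to reach the annual total."""
    return annual_spend / workers


def implied_salary(per_worker_spend: float, salary_share: float) -> float:
    """Average salary at which per_worker_spend equals salary_share of pay."""
    return per_worker_spend / salary_share


if __name__ == "__main__":
    # Inputs are the ballpark figures quoted in the comment, not measured data.
    low = annual_payback(5e12, 5)
    high = annual_payback(10e12, 5)
    per_worker = spend_per_worker(1e12, 200e6)
    salary = implied_salary(per_worker, 0.05)
    print(f"Payback needed: ${low:,.0f} to ${high:,.0f} per year")
    print(f"Spend per knowledge worker at $1t/year: ${per_worker:,.0f}")
    print(f"Average salary implied by a 5% share: ${salary:,.0f}")
```

Under these assumptions, the 5% figure only works out if the average knowledge worker earns on the order of $100k per year, so the real share of salary would be higher wherever pay is lower.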

That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

whatshisface•about 18 hours ago

Here are a few thoughts:

- The publicly available information about how inference costs compare to training costs is conflicted. EEs involved in datacenters talk about power usage spikes during training runs as if they were a major factor in the designs, but academic papers discussing cost-optimal scaling confidently treat inference-time compute as a major factor.

- On the side of the balance indicating that training is more compute-intensive after amortization than inference is that Chinese providers, constrained primarily by access to compute, have nearly unlimited token availability at a lower price than US providers (inference), but poorer model capabilities (training). That would make sense only if US providers are inflating inference costs by 20-30x due to amortized training costs that overseas providers were not able to take on (there are other factors too).

- If training >> inference, they're in a prisoner's dilemma that far exceeds the ordinary zero-marginals model of competition between firms (due to its huge discrete stepwise nature). On the other hand, if inference>>training, the high-level analysis popularized by certain thought leaders, that it's like a utility, would be true. You'd tend to count this as a vote for inference>>training, but the CEOs saying it at least have a huge incentive to agree because the alternative, the prisoner's dilemma, would stop investment very fast.

- The only voices in the story I just told you that had anything to do with fact (as opposed to high-level analysis and ivory tower armchair management of a secretive business) were the rumors from facilities engineers. That shows you the state of our understanding...

- If we don't even know the ratio between amortized capital expenses and operational costs, outside investor analysis is impossible. It doesn't matter how finely they divide the accounting buckets for office ferns and indoor ferns if the single biggest part of their business is obscured for trade secret reasons.

materielle•about 18 hours ago

I'm about to leave a shallow comment, but I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop? So the fact that publicly available information is conflicted is probably a sign that at the very least, the numbers aren't amazing.

Yes I know there's no evidence and this is lazy reasoning. But there's probably a bit of truth to this line of thought.

Tuna-Fish•about 17 hours ago

Why on earth would AI labs be bragging about how little the product they sell actually costs them to make? You don't want to do anything that reduces its perceived value to the user, because that might make them less willing to pay for it.

Also, inference costs are bound to go way down with more optimized architectures. GPUs are fundamentally not great at inference. No platform where the weights are streamed from a large pool of memory is. If the models ever quiet down, there will be massive step changes in cost/token, energy/token and tokens/second, as models are etched into silicon à la https://chatjimmy.ai/

whatshisface•about 18 hours ago

Inference has traditionally been far less expensive than training. One public example is the fact that hobbyists can run StableDiffusion ($600k training costs[1]) on their personal computers.

Speaking to your point, inference being dramatically less costly than training would not be seen as a delta from the norm. The model of providing inference for anything near the operational costs (like a utility would) would be the delta from the norm if it were true.

[1] https://x.com/emostaque/status/1563870674111832066

lumost•about 15 hours ago

For equal capability tokens, there has been about a 10x drop in cost every 6 months.

We are still chasing the best because the best is moving rapidly, but it’s a simple thought experiment to work out what the cost to serve an 8B model from 2 years ago is in a world of 2T models.

Note: parameter counts are illustrative. Concretely, qwen3.6 27B delivers opus 4.5 capability at 1/27th the cost on openrouter. Single chip llama3 8b performance can exceed 17k tokens/sec.
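The claimed trend can be written as a simple exponential decay. This is a minimal sketch that takes the comment's "10x every 6 months" at face value; the rate is the commenter's claim rather than a verified measurement, and `relative_cost` is a hypothetical helper name.

```python
def relative_cost(months: float, drop_factor: float = 10.0,
                  period_months: float = 6.0) -> float:
    """Cost of equal-capability tokens relative to today, after `months`.

    Assumes cost falls by `drop_factor` every `period_months` (the
    commenter's claimed rate, used here as an assumption).
    """
    return drop_factor ** (-months / period_months)


if __name__ == "__main__":
    for months in (0, 6, 12, 24):
        print(f"after {months:2d} months: {relative_cost(months):.6f}x today's cost")
```

At that rate, serving the capability of a two-year-old model would cost about one ten-thousandth of what it did at launch, which is the thought experiment the comment describes.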

no-name-here•about 9 hours ago

> I am a bit skeptical of the supposed drop in inference costs. If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop?

Unless to the grandparent commenter’s point they’re using it to obscure their large prisoner’s dilemma (training) cost?

neuronexmachina•about 11 hours ago

> If AI labs saw a lot of potential there, they'd surely be bragging about it non-stop?

Google seems to pretty regularly post about how their TPU and algorithm advancements have been decreasing energy costs for both inference and training.

vlovich123•about 16 hours ago

Small alternative potential future changes that alter this analysis:

* At some point model capability reaches diminishing returns. Then inference >> training in the future but training >> inference now. It’s not a prisoner’s dilemma but a land grab to solidify market position and be one of the 2-3 firms left standing as dominant in the space. The model companies aren’t super sticky yet but they’re working on it.

* Even if training remains >> inference, it’s possible to have multiple price points like they do today. If you need the most capable model you’ll be paying exponentially more per token to supplement the training cost even though the serving cost is marginal, because most people will be satisfied with cheaper / less capable models for most tasks.

I buy that inference is a dropping line item while training is a growing one. There are all sorts of things on the horizon that’ll be orders-of-magnitude improvements, from startups burning models into ASICs to get orders of magnitude more performance to alternate architectures like diffusion transformers that have orders-of-magnitude structural optimizations. It’s inevitable that it’ll come down even further from where we are. It’s possible model training also will go down but I’ve not seen any compelling research suggesting major “easy” reductions here.

janalsncm•about 15 hours ago

The issue is that most tasks do not require frontier-level intelligence, but companies like OAI can really only profit off of the frontier. Capabilities from a year or two ago are so outdated that even OpenAI gives it away for free and there are many other models biting at their heels. In other words they are spending huge amounts of money to cash in on a depreciating asset.

So one possible future is that frontier-level training becomes so expensive and the use cases so sparse that it simply isn’t viable to keep going bigger.

zozbot234•about 3 hours ago

Why would power spikes from training runs imply training>>inference? The cost of a training run scales with energy, whereas power is energy per unit time. All that tells you is that they're speeding up their training run so it will take less time overall (probably chasing some first-mover advantage, where they're out with a given model before their closest competitors), whereas they obviously can't do that for inference (which is a steady flow of requests over time).
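The distinction between power and energy can be made concrete with a small sketch. All numbers below (power draw, duration, electricity price) are hypothetical and chosen only to show that two runs with very different peak power can consume the same energy.

```python
def energy_kwh(power_kw: float, hours: float) -> float:
    """Energy consumed is power multiplied by time."""
    return power_kw * hours


def electricity_cost(power_kw: float, hours: float, price_per_kwh: float) -> float:
    """Electricity bill for a run at constant power draw."""
    return energy_kwh(power_kw, hours) * price_per_kwh


if __name__ == "__main__":
    price = 0.08  # hypothetical $/kWh
    # Same training job, run slowly on few chips or quickly on many.
    slow = electricity_cost(power_kw=10_000, hours=1_000, price_per_kwh=price)
    fast = electricity_cost(power_kw=50_000, hours=200, price_per_kwh=price)
    print(f"slow run: 10 MW for 1000 h -> ${slow:,.0f}")
    print(f"fast run: 50 MW for  200 h -> ${fast:,.0f}")
```

The fast run draws five times the power, which is what a facilities engineer would notice, but the energy bill is the same, so a power spike alone says nothing about total training cost.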

twobitshifter•about 11 hours ago

We have GPU costs, power costs, and how many tokens/s models can generate on those GPUs. It’s possible to figure out the marginal cost based on this. The current estimate is about $0.40 per million tokens for a GPT-4-equivalent model. Sonnet 4 is $15 per million tokens, so they are charging high margins on inference. The issue is how large a margin is needed to recover their costs before the GPUs age out, and how high a margin can be charged before it’s no longer economically viable.

https://www.gpunex.com/blog/ai-inference-economics-2026/
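For reference, here is what those two numbers imply if taken at face value. This is only a sketch: the $0.40 cost estimate and the $15 price are the comment's figures, the cost estimate may not apply to a model of Sonnet's size, and the helper names are hypothetical.

```python
def gross_margin(price: float, cost: float) -> float:
    """Fraction of the sale price left after the marginal cost."""
    return (price - cost) / price


def markup(price: float, cost: float) -> float:
    """How many times the marginal cost the customer pays."""
    return price / cost


if __name__ == "__main__":
    price_per_mtok = 15.00  # $ per million tokens, as quoted in the comment
    cost_per_mtok = 0.40    # $ per million tokens, the comment's cost estimate
    print(f"gross margin: {gross_margin(price_per_mtok, cost_per_mtok):.1%}")
    print(f"markup: {markup(price_per_mtok, cost_per_mtok):.1f}x")
```

A margin this large is exactly why the cost estimate matters: if the true serving cost is several times higher, the picture changes considerably.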

rudedogg•about 10 hours ago

That seems way off to me.

I skimmed the article, but couldn’t spot any details on their estimates. They mention 70b+ params as being large in several places. But we’ve had several 100b+ param models that trail Sonnet.

IX-103•about 10 hours ago

I don't see how it would be possible for inference costs to dominate training costs, even after amortization.

Training involves multiple passes over the entire training dataset, ideally in large batches where you can perform inference on as many samples as possible simultaneously and then perform backpropagation to adjust the model weights (which is about as expensive as inference).

Let's consider the size of the dataset we're dealing with here. The dataset likely consists of practically every piece of digitized text they can get their hands on (including that extracted from audio and video). We know Google has digitized a large portion of the books in existence as part of their "search book contents" feature and we have no reason to believe they're not using it alongside their cache of 90+% of the internet to train their models. We're talking about 100s of millions of books each with an average of 100,000s of tokens. The internet has 10s to 100s of billions of pages on it with who knows how many tokens on average. This is a huge dataset that we've got to go through hundreds of times.

Second, let's consider the effect of batching and how it sets requirements for our hardware. We know that larger batch sizes converge faster, are more stable, and produce better models. So if you want a good model you need large batch sizes. This means that you need machines several orders of magnitude more powerful than you use for inference. From what I've heard Google uses clusters of 100s of their TPUs all located in a single rack for training. These clusters are organized in a customized computing architecture to maximize memory locality between cores (really critical for efficient back-propagation). Further, you can't use reduced precision weights for training like you can for inference, so there are no shortcuts.

Finally, the initial training stage is followed by reinforcement learning stages - this is a key development in how AI models have improved in the past year. This may mean going through a curated set of traces (either synthetic or captured from users) and adjusting the weights based on the experienced outcome.

Overall there's so many orders of magnitude more work and more hardware requirements for training that I find it improbable that inference dominates. The number of "inference" steps in training is freaking ridiculous and includes such factors as the "number of words ever written".

atq2119•about 8 hours ago

It's been a while since I saw a detailed paper on a high end training run, but extrapolating from what I remember, it seems those training runs are in the 10s of trillions of tokens. This already accounts for potentially sampling tokens multiple times during the training run.

That seems like a large number, until you realize that OpenAI claims to have almost a billion weekly users. And OpenRouter shows many models at over a trillion tokens per week.

So in pure token terms, I'd say it is in fact extremely plausible that inference dominates, at least for the popular models.

johnecheck•about 7 hours ago

Not saying you're wrong, but I'll note why inference might dominate despite everything you mentioned.

A given model is trained once but applied N times. A large enough N will dominate training, no matter how complex and costly it was.

But how long is a model useful for? How often will labs need to train new models? Time will tell.
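The "large enough N" argument can be sketched with a common rule of thumb for dense transformers: roughly 6 FLOPs per parameter per training token and roughly 2 FLOPs per parameter per generated token. That approximation is an assumption brought in here, not something stated in the thread, and it ignores hardware utilization, reinforcement learning stages, and precision differences that the comments above argue about. The token counts are the thread's ballpark figures (10s of trillions of training tokens, about a trillion served tokens per week for a popular model).

```python
TRAIN_FLOPS_PER_PARAM_TOKEN = 6.0  # rule-of-thumb assumption (forward + backward)
INFER_FLOPS_PER_PARAM_TOKEN = 2.0  # rule-of-thumb assumption (forward only)


def training_flops(params: float, train_tokens: float) -> float:
    """Approximate compute for one training run."""
    return TRAIN_FLOPS_PER_PARAM_TOKEN * params * train_tokens


def inference_flops(params: float, served_tokens: float) -> float:
    """Approximate compute for serving a number of tokens."""
    return INFER_FLOPS_PER_PARAM_TOKEN * params * served_tokens


def breakeven_served_tokens(train_tokens: float) -> float:
    """Served tokens at which inference compute equals training compute.

    The parameter count cancels out under this approximation.
    """
    return train_tokens * TRAIN_FLOPS_PER_PARAM_TOKEN / INFER_FLOPS_PER_PARAM_TOKEN


def weeks_to_breakeven(train_tokens: float, served_tokens_per_week: float) -> float:
    """Weeks of serving needed before inference compute overtakes training."""
    return breakeven_served_tokens(train_tokens) / served_tokens_per_week


if __name__ == "__main__":
    train_tokens = 10e12     # ballpark from the thread, not a measured value
    served_per_week = 1e12   # ballpark from the thread, not a measured value
    print(f"break-even: {breakeven_served_tokens(train_tokens):.1e} served tokens")
    print(f"weeks to break-even: {weeks_to_breakeven(train_tokens, served_per_week):.0f}")
```

Under these assumptions inference compute overtakes a single training run after roughly 30 weeks of serving, so the answer depends heavily on how long a model stays in service and on how many failed or experimental runs sit behind each released model.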

upbeat_general•about 7 hours ago

This statement has been well known to be incorrect for at least a year.

stevenally•about 12 hours ago

> If we don't even know the ratio between amortized capital expenses and operational costs, outside investor analysis is impossible.

And yet we surely need this data for the IPO? Or are they relying on rule changes on the indexes to force ETFs to buy shares?

somewhereoutth•about 13 hours ago

Yes the huge discrete stepwise training spend is critical.

Maybe investors will realise that "the only winning move is not to play".

And so we are left with (as was) frontier models getting more and more out of date as whoever their post-bankruptcy custodians are try to eke out pennies on the dollar for inference on their decaying property. Perhaps along with local and/or highly specialized models still feeding on the after-glow of the huge amount of training that was (and is no longer) done.

The next AI winter is going to be deep, savage, and long.

galaxyLogic•about 13 hours ago

> frontier models getting more and more out of date

Why are they getting out of date? Is it because we have new content from the internet that the older models did not have? Or are we simply trying to increase the size of the training data? In other words, is it not about being more up-to-date in terms of when the content was created, but about wanting to use bigger training input sets?

FuriouslyAdrift•about 17 hours ago

I work for a tiny little company ($150MM annual rev with 9% net) and we are already looking at dropping $100k on hardware to run local models because, for us, they're "good enough."

Our estimated spend for AIaaS would exceed that cost in less than a year.

In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.
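The buy-versus-rent comparison here is a payback-period calculation. A minimal sketch follows, where the $100k hardware price comes from the comment and the monthly figures are hypothetical placeholders, since the company's actual API spend and running costs are not given.

```python
def payback_months(hardware_cost: float, monthly_api_spend: float,
                   monthly_running_cost: float = 0.0) -> float:
    """Months until owned hardware is cheaper than paying for hosted inference.

    monthly_running_cost covers power, hosting and maintenance for the
    local setup; it defaults to zero for the simplest comparison.
    """
    monthly_savings = monthly_api_spend - monthly_running_cost
    if monthly_savings <= 0:
        raise ValueError("local hardware never pays back at these rates")
    return hardware_cost / monthly_savings


if __name__ == "__main__":
    # $100k is from the comment; $10k/month and $2k/month are hypothetical.
    print(payback_months(100_000, 10_000))
    print(payback_months(100_000, 10_000, monthly_running_cost=2_000))
```

A payback of under a year, as described in the comment, requires hosted spend above roughly $8,300 per month before any running costs are counted.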

simplyluke•about 17 hours ago

Yeah, that's the part that just seems to be wildly under-discussed to me.

If open source models are ~3-6 months behind SOTA, and ~opus4.6 capabilities are good-enough for product market fit, do the frontier labs have half a decade to catch up on their prior burn?

AI cost ballooning faster than companies can afford is becoming a very common topic in my circles right now. The era of "I'll pay infinitely more for marginal gains" is over from what I can tell.

an0malous•about 11 hours ago

> If open source models are ~3-6 months behind SOTA, and ~opus4.6 capabilities are good-enough for product market fit, do the frontier labs have half a decade to catch up on their prior burn?

They know they do not and that’s why they’re all trying to IPO right now, so they can pass the bag to consumer investors

doug_durham•about 17 hours ago

Open source models that you can run locally are much more than 3 to 6 months behind. 6 months ago was the November inflection for Claude. No open source model is as good as Claude Opus 4.6.

svara•about 17 hours ago

There's still a lot of room for the best models to get better at coding.

Your argument rests on the "for marginal gains" part but it's really not clear that the gains are marginal in the foreseeable future.

swalsh•about 15 hours ago

Open source models, especially qwen, are pretty dang good. But it's not opus 4.6; the evals don't tell the full story. I question the assumption that open source models are 3-6 months out.

vessenes•about 10 hours ago

You have to think about why open models are behind. Exfiltration is a big part of it. So you could change the Nash equilibrium by increasing your security, or other multilateral approaches.

w29UiIm2Xz•about 16 hours ago

If only the AI era was born in ZIRP.

EvanAnderson•about 17 hours ago

> ...we are already looking at dropping $100k on hardware to run local models...

Just think how much further that $100K would have gone if the hardware market wasn't so screwed-up.

Anecdote: I priced-out adding 1TB of RAM to a four node cluster a couple months ago. The cluster was purchased in fall of 2024 w/ 4 nodes, each with 256GB RAM. The nodes cost just over $14K apiece back in 2024 (entire box, not just the RAM).

Dell wanted >$90K a couple months ago to add 256GB to each node.

cyberax•about 16 hours ago

> Dell wanted >$90K a couple months ago to add 256GB to each node.

RAM is expensive, but not THAT expensive. I just bought 128GB for about $5k for our build cluster (it's not even for AI, sigh). Even if you need larger-sized DIMM sticks, it's still going to be in the vicinity of ~$15k tops.

rstuart4133•about 15 hours ago

I get the impression the hive mind hasn't come to terms with the point that a model is optimised for certain tasks. It's like having someone ask you "is that a good hammer?". Good for what? There are claw hammers, sledgehammers, ball-peen hammers, club hammers, mallets, .... Yes, in a pinch, they can all bang in nails, but you wouldn't choose a dead blow hammer for that if you had a choice.

Gemini Flash is very good at searches. Just about any low end model can toss out a poem. All the higher end models (open source and otherwise) seem to be able to churn out code that passes tests. The smaller, "less capable" ones are much faster at it, which means that in the hands of a skilled practitioner they are the best choice for that task. But they rapidly fall apart where there isn't a hard source of truth (like a good test suite) to grind against. Because of that you have to use a bigger model for bug finding. In that task the open source models tend to fail on larger code bases, where something like Opus still shines. I gather Mythos is an absolute monster, and unparalleled, and unavailable. I'm sure one of the reasons for that is it's so expensive to run.

Or to put it another way - you don't use a 100 tonne crane to pick up the shopping. And ... the smaller models will happily run on in-house hardware. You may not do it today because of the current DRAM price and because integrated NPUs have only just started shipping, but in 5 years' time models will be running on your phone.

safety1st•about 7 hours ago

Yes 100% this. A lot of people keep talking about how OpenAI and Anthropic will need to raise their prices. What is less discussed is how they CAN'T raise their prices because competition exists, and sure it's not SOTA, but it's literally an order of magnitude cheaper in many cases and the drive to figure out how to make it work well enough is going on right now (and will only intensify when the SOTA models raise their price).

It's a given that the SOTA models need to raise their prices. It's also a given that they can't. The more they raise the more customers will move to their competition.

So what happens next? Well I think it will suck horribly if you can't move off of SOTA sooner or later, because the Big Two are going to lose customers, and therefore have to raise prices on the locked in customers even more than these projections suggest.

Beyond that if you're looking to start a business, figure out how to use cheap models in new scenarios. Build software which does that and license it. This is kind of contrary to the idea that you shouldn't over optimize for deficiencies in the models that will likely go away in the next generation - for instance a lot of problems were solved when context windows got way bigger. So it's a thin line to walk but I think it's there because a lot of orgs are using Claude today for pretty basic tasks.

The dev who's addicted to SOTA models honestly is going to have to settle for less or get totally screwed. Most applications within business from what I see aside from complex research do not require SOTA. They summarize, they classify, they transform, and doing that accurately has been cheap for a while.

MASNeo•about 16 hours ago

On prem AI makes sense for more than just the cost. More control, IP, model improvements you can keep, data privacy to name a few. People will realize that AI is not like compute the moment they get their own knowledge sold back at a premium.

amelius•about 1 hour ago

> People will realize that AI is not like compute the moment they get their own knowledge sold back at a premium.

But what if your competitors sell their knowledge to AI companies?

Then you're still screwed.

fragmede•about 14 hours ago

What are the advantages to on-prem for a company that's already in the cloud and trusts it with their IP? That company can just rent GPU instances from the cloud if they want to train/fine-tune their own models and keep avoiding CapEx.

stopachka•about 15 hours ago

I don't quite understand, what would 100K buy you?

AFAIK you would get about ~5 concurrent users, with a max context window of ~128K tokens on the larger models.

This wouldn't be good enough for coding -- are you guys thinking of using it for something else?

bendews•about 8 hours ago

By my calculations 100k could get you 18 5090s + compute to host them, or 18 96GB Mac minis. You can get a lot of context window and users out of that setup.

cmdrk•about 17 hours ago

Do you think this will be a trend for larger companies as well?

The decadal move to all-cloud-all-the-time killed off in-house hardware teams while the C-suite chased their OpEx dreams.

It would be interesting if we come full circle on this.

fragmede•about 14 hours ago

I doubt it. Companies that have moved to the cloud are already trusting the cloud with their IP. You can rent time on a high end Nvidia system from various clouds. OpEx means there's no write down in three/five years as that system goes out of date so it would only make sense if the performance/$ is there, or the company is highly protective of their IP and doesn't trust the cloud, at which point they're not on the cloud anyway.

fittingopposite•about 6 hours ago

Agree. You have these tipping points when a model is good enough to do some task. Yes, a better model will further improve your capabilities but the unlock is at a certain intelligence level. We see this also with humans. People with very low intelligence can't learn to read. Once you cross a certain threshold of intelligence you can learn to read. More intelligence doesn't really help you in the task of reading. A person with an IQ of 160 is not substantially better in reading than someone with an IQ of 85. If your IQ is 50, you might not be able to learn to read at all.

teaearlgraycold•about 5 hours ago

Have you considered that a smarter person will understand what they have read better?

physicsguy•about 4 hours ago

My much larger company has got people already using various models through Bedrock because the Claude and OpenAI limits are too harsh and it's too expensive.

mv4•about 17 hours ago

I configured a dual DGX Spark cluster, and it's certainly "good enough" for my agentic and coding needs.

datadrivenangel•about 17 hours ago

What models are you using on that? My experiences with Apple hardware have convinced me that it is not really good enough for coding locally.

slashdave•about 10 hours ago

It might be possible that in a few years someone will be able to engineer a reasonably priced machine to run today's frontier models (hint, your price is an order of magnitude off). However, they won't be able to run the frontier models that will exist in a few years.

alex_suzuki•about 17 hours ago

I’m curious: are you spending on beefy developer machines, or some kind of shared local inference server? Would be interested to know more if it’s the latter.

irishcoffee•about 17 hours ago

I am aware of at least a handful of companies doing the latter. I don’t work for them and cannot speak to their setup.

arbuge•about 17 hours ago

> In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.

What makes you so confident about this prediction? Hardware costs haven't exactly been cratering recently.

sofixa•about 15 hours ago

> Hardware costs haven't exactly been cratering recently.

No, but local models have been booming in performance/quality improvements. The RAM shortage won't last forever (more supply will come online if demand doesn't diminish), and then the math would be pretty easy.

try-working•about 12 hours ago

What about using DeepSeek API? Practically free.

amelius•about 2 hours ago

Eh, one question. Where do you intend to buy the hardware if datacenters take over the market?

disiplus•about 16 hours ago

Same, but you need more than 100k of hw to run something like kimi k2.6 for a bigger team. On the other hand there is a ds4 flash that you can run on a macbook with 128gb ram, and that one is perfectly usable for a lot of tasks.

https://github.com/antirez/ds4

nonethewiser•about 17 hours ago

What models? Last I tried different local models there was a pretty big difference from frontier.

AndrewKemendo•about 8 hours ago

That’s exactly where the market is heading and it’s going to have to reckon with this fact

My guess is there’s gonna be some legislation or something “you can’t share anything over this level of complexity” and I think that that’s what a lot of that mythos rattling was all about

wyager•about 12 hours ago

> there will be hardware capable of running frontier models

The current frontier? Sure. The frontier then? No - obviously that frontier is going to keep consuming available datacenter compute capacity, which will be better

ai_fry_ur_brain•about 12 hours ago

You people are delusional. How many times a day am I going to read this fiction of "good enough in a few years for most things".

There are physical limits to how much you can compress data and how much is needed for a capable model. If by hardware capable of running SOTA you mean a 7 figure investment for a company, then sure. But how come these companies didn't do the same thing for cloud? There's been this option for self hosting infrastructure for a decade but companies don't use it, they pay AWS.

awesome_dude•about 17 hours ago

> In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.

I was going to say - the models are just going to keep growing at a pace exceeding the pace of hardware pricing/availability

But then I realised that, far more likely, there will be a plateau reached (again) where nobody is seeing gain, and at that point hardware will catch up

alexpotato•about 17 hours ago

I was in college in the late 1990s/early 2000s and I distinctly remember an econometrics professor state the following:

"As cable TV and Pay Per View came out, there were studies done about how many movies people would watch if given unlimited access to films. The results were bandied about as proof that we should build out all this infrastructure to support this line of business. When the data was further analyzed by statisticians etc, it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible."

I feel like we are in a similar boat here where some people are assuming:

- EVERYONE is going to be using max tokens

- tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc

protocolture•about 13 hours ago

> I feel like we are in a similar boat here where some people are assuming:

> - EVERYONE is going to be using max tokens

> - tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc

I feel like the reverse assumption is being made, that the current model looks like IBM doubling down on Mainframes soon to become cheap enough to deploy everywhere, when the real action is that the costs coming down represents cheaper hardware or more efficient software, and that a big chunk of "cheaper" AI will be eaten by smaller products deployed by individuals. Whatever the Personal Computer of AI looks like is going to be more disruptive than just an API endpoint you can fling tokens at.

We already see this with things like chrome auto installing an LLM.

You can't tell me with complete certainty that there's a moat here for the people spending 1 trillion + on this infra.

> When the data was further analyzed by statisticians etc, it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible.

I also think this applies to people suggesting that companies will sack workers for AI, when the cost of replacing everything someone does in a day is higher in terms of tokens (likely even at a reduced price) than just hiring a bloke.

hintymad•about 11 hours ago

> it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible."

I realized it long ago: one needs output to make meaning. Input can only be the cherry on a cake in one's life. That, actually, makes FIRE or Fat FIRE not so sustainable unless one has other hobbies.

lmm•about 11 hours ago

> it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible.

And what happened? How many hours per day/week are people spending watching now?

mrandish•about 10 hours ago

> they were going to watch films 10-12 hours a day, every day of the week. Impossible.

A lot of these LLM demand scaling scenarios make broad "up and to the right" assumptions about things which in practice have finite limits. Only some percentage of knowledge work benefits from acceleration, optimization or other improvements, and even then the amount of economic gain is capped.

j-bos•about 16 hours ago

But isn't it wonderful that they did?

wizzwizz4•about 15 hours ago

It's vaguely disturbing that people "watch" films 10-12 hours a day. Many of them are using it as a radio, for background noise, without really caring what the program is beyond vague genre, tuning in and out without particular regard to the plot… and yet we have all the cost of transmitting high-resolution video point-to-point.

Surely we could just put better stuff on the radio, and accomplish most of the same goals for a far lower price?

PunchyHamster•about 16 hours ago

> - EVERYONE is going to be using max tokens

Anthropic already hunts down OpenClaw users for using too much on their plan.

I'll give a different example: when LED lights started to become more popular, power usage didn't drop by the amount of power saved.

> - tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc

Well, first, improvements in computing stalled or even rolled back purely because the price of everything compute shot up because of AI, and that will NOT be fixed for a while, ESPECIALLY if AI usage continues to increase.

Second, the token price per model might go down over time, but better models have more expensive tokens, so we quickly get into a spot where:

* the price increase per token might not be worth the marginal improvement the next, better model brings

* more and more models are passing the "good enough for the task" threshold, so for fewer and fewer companies does it make any economic sense to pay for the "best" instead of paying deepseek or some other company to run "previous gen" models

regularfry•about 18 hours ago

The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build. The more of the latter they can take on, the fewer knowledge workers are needed at all. So rather than 5% of every knowledge worker's salary going into tokens, 100% of the knowledge worker's total employment cost goes into tokens and you get a 20x productivity boost as a theoretical minimum across those tasks.

That's the game. There's a view you could take of this that this is just a growing of the pie: with those cost dynamics a lot more "small businesses" get a vast amount of leverage, so the overall economy grows without replacing the knowledge workers. I'm not sure I trust the MBA class to have that view.

seanp2k2•about 18 hours ago

>The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build

I would argue that that's been the case for quite some time before AI. As an example, what innovative amazing world-changing products have Google or Meta launched in the past decade with their very high numbers of very talented and highly-compensated engineers? The issue with most big tech companies is leadership, strategy, and product direction. I'm not saying that they don't make any profits, just that they probably aren't "building [the right thing]".

AI for product development and management would be far more impactful than automating rote coding tasks / building React UIs that mirror API structures IMO.

Figs•about 18 hours ago

> AI for product development and management would be far more impactful than automating rote coding tasks [...]

Yeah, if this stuff actually worked that well already, OpenAI et al. would just run AI CEOs and engineers. Why get some other company to pay you at all when you can automate every other company out of existence and take all the money they make?

The fact of the matter is that while the tech has some uses, it sure as hell isn't a full scale replacement and you almost always actually have to massage the input into LLMs to get anything decent back out in practice. Some CEOs and managers can learn to do this, of course, and some already are... but that quickly turns into a second full time job. A "programmer" is still needed. The job might change from mostly hand-writing C++/JS/Python to prompt engineering + some manual coding to fix all the stupid fuck-ups that the bots can't solve themselves, but you still need someone to actually prompt the bot.

When that changes, it won't just be engineers losing work; there will be no reason to even have a human CEO any more.

aspenmartin•about 18 hours ago

I don't know. If you've ever tried to build something at companies of that scale, you run into incredibly boring problems: "what data table do I need for X", "who is the right person to reach out to for Y", and "they aren't answering me, I guess I'll have to escalate".

I don't think there is any shortage of great ideas at these companies, they are just extremely bloated. And I don't think it's something like indecision or bad PMs; it's "we have a finite amount of time and resources so we need to be conservative but also not too conservative".

If you have AI systems that can simply build out POCs in days, backtest on real data, show reliable results and numbers, you get a suite of product options you were never able to get before. If you have coding agents that can speed up implementation, you can build more stuff and choose the things that stick.

It changes the cost/benefit calculus of the entire business. I think you are exactly right in that: PMs/leadership are by their nature orchestration machines. Other roles are as well, but I think PMs are at a particular advantage here: I would expect it to be quite a while before core product decisions and creativity can be delegated to an AI, but not long before virtually everything they're blocked on (legal approvals, POCs, wireframes, etc.) becomes less and less of a blocker.

regularfry•about 16 hours ago

Yes, that exists at the wider business level. No question. I think what needs to get asked is are we talking about a bottleneck within the business as a whole, or a bottleneck within the scope of the knowledge work in question. Within software delivery there's a very clear shift when it's suddenly trivial to drop a 100kLoC plausible-looking PR into code review within an afternoon. Producing working code with a whole bunch of tests which make a very clear assertion that it does, in fact, work has had (if you're going that way) all the human-scale thinking time taken out of it, down to a rounding error. It still needs to be checked by a human, which was previously assumed to be a comparatively quick task in comparison to producing the thing. At least, it does where I am, and I don't think that's a silly position today at all.

If they can crack that latter review/spec-check/assurance step, checking that what was built was what was demanded of the problem such that we don't have humans in the loop at that step either, then the bottleneck moves again. Then I think it moves to requirements capture and to product development, but that might depend on the industry.

zjaffee•about 2 hours ago

If you really think this, you simply have no theory of mind for this stuff. There are tons of immensely successful products in the ad space that both of those companies have launched. They don't need to innovate in the product or technology space (though doing so certainly makes a big difference in having more placement for ad real estate), but to suggest there have been no real innovations (specifically engineering innovations) related to ad tech would be completely ridiculous. You don't need to change the world to get rich; just look at Wall Street, where major innovations have been made in the pricing models of fixed income securities.

Second to this are countless other areas with a major impact on the companies' bottom line that are entirely engineering driven, especially at Google, given that they are a cloud provider, have meaningfully grown the Workspace business, and launched Waymo in this time.

nilamo•about 16 hours ago

> As an example, what innovative amazing world-changing products have Google or Meta launched in the past decade

Kubernetes was launched 11 years ago, and is huge enough to be included there. The Google Pixel was just under 10 years ago. So... not nothing haha

nostrademons•about 17 hours ago

Google's internally developed and sometimes even launched plenty of innovative new products in the past decade. Stadia, Fuchsia, federated learning, and the whole transformer architecture that underlies this AI boom are good examples.

The problem is they get killed by some other executive who is afraid of their department looking bad by comparison.

I think this is fairly illustrative of the challenges in AI becoming as impactful as the Internet. The bottleneck is not making things. There are plenty of people who are really good at making things and can easily be 10x or 100x as productive as the average corporate worker. Y Combinator was founded on that premise - small teams of founders and early employees could be orders of magnitude more productive than the 1000s of corporate employees at their competitors.

The bottleneck is on bringing your product to market. If your innovative new product is built within a corporate environment, it'll get killed unless the executive you work under can get a promotion out of it, and you'll be denied all sorts of help with approvals, launch process, PR, marketing, branding, etc. If it's a startup, they'll try to shut you out with exclusive distribution deals, legal threats, lobbying efforts to change the legal environment, PR campaigns, FUD, etc.

The Internet was revolutionary because it let millions of people bring products to market without asking permission. Instead of having to bid for retail shelf space among dozens of entrenched competitors that all had sweetheart deals with the retailer, you could just put up a website and sell it to anyone across the globe. Instead of following hundreds of regulations that governed existing commerce, you could just launch something and sort it out later. AI doesn't really have that property - if anything, it makes things more centralized, with more gatekeepers, and so seems more likely to destroy economic value than add to it.

nonethewiser•about 17 hours ago

>I would argue that that's been the case for quite some time before AI.

I would agree but it's really minimized the building. More and more time is being spent on pre-coding work.

beambot•about 17 hours ago

Google & Meta are illustrative of late-stage capitalism -- it's all about distribution, not innovation. Their job is (mostly) to just acquire the products that have passed the gauntlet, then scale up their monetization through their distribution-focused machine. The same dynamic plays out in virtually every industry (not just tech).

You'll find that most internal "innovation" teams are just lip service. In most cases, the "mothership" will be incapable of reproducing true innovation -- from a statistical perspective, culture perspective (mega corps are anti-scrappy; internal politics), and motivation perspective (startups aren't 9-to-5). It's much easier to have big M&A budgets, a VC arm, and some hand-wavy internal innovation group.

Every now and again, you'll get real innovations (Waymo, transistors, GUIs), but even those have a spotty track record of commercialization when created internally.

cogman10•about 18 hours ago

This is the same argument that has been historically made for outsourcing developers. Get 20 more devs for the cost of 1 dev in the US.

I suspect that AI will fail to pan out to the same extent for the same reason why outsourcing hasn't fully panned out (even though every company tries it after getting big enough).

The problems that will come up will be and always have been ongoing maintenance. AI is great at writing new code without a brain behind it, but once you get to the point where you need to refactor code, you start really needing someone with coding experience to guide the AI or veto its mistakes.

I don't think that's really fixable even with a lot better AI. It's not something that ultimately comes out of the likes of github data.

I'm not saying that AI isn't going to make things better, btw, I just don't think we'll see a 20x improvement. Probably more like 1.5 or 2x.

roncesvalles•about 18 hours ago

Outsourcing of knowledge workers didn't work out because at large enough scales, the geographic arbitrage disappeared. Companies mostly always got what they paid for.

The determinant of success was only whether the task needed American-tier labor or could make do with sub-American quality labor.

regularfry•about 15 hours ago

> I suspect that AI will fail to pan out to the same extent for the same reason why outsourcing hasn't fully panned out

My mental model for that is that outsourcing fails where the work is being done organisationally far from the knowledge needed to do it. We know that's true of teams inside organisations, there's been a lot of research on how distance in the organisational tree negatively impacts productivity. Outsourcing is a pathological worst-case of that.

The promise (promise! We're not there yet!) of AI is that I can have a cross-functional team on my laptop. Organisational distance is zero. Where previously the outsourced team has to wait for the time zones to roll round so I can answer their blocking question when I get to my email STRICTLY AFTER I have had my coffee, now it's a prompt in a chat window with a button I can click to make a choice in 5 seconds. Delay is gone, cost of delay is gone.

> The problems that will come up will be and always have been ongoing maintenance. AI is great at writing new code without a brain behind it, but once you get to the point where you need to refactor code, you start really needing someone with coding experience to guide the AI or veto its mistakes.

Oh, absolutely. That's a minefield. Today. It will be, right up until it isn't. There are ways to set up agents and projects right now that make a dramatic difference to how this part of the picture plays out, but those will sink into the harnesses as time goes on.

But also the big problem with maintenance and outsourced teams tends to be the commercial structure around the contract. You get a Build team, who Build the Thing and then: no more features for you, anything you want to add past the original spec costs extra. They hand over to the Run And Maintain team, who get to fix all the bugs that the Build team left but without the knowledge gained from building the thing, but are scaled and located to be absolutely as cheap as the supplier can get away with so probably don't have the skill, inclination, motivation, or permission to take on any restructuring to make the bug fixing easier and they're on the wrong end of the globe so there's a 24-hour latency on any queries. It's a terrible way to set teams up, but it looks good on paper.

Again, that's peculiar to outsourcing and completely goes away if I have the same team that built the thing own the thing long-term. That's true if it's humans or AI!

> I don't think that's really fixable even with a lot better AI. It's not something that ultimately comes out of the likes of github data.

No, it's a harness problem. You need to start from a maintainable point and keep standards in place. It'll take work to get the harnesses there and it's not ubiquitous. You might also need better models, but I've already personally seen big differences in outcomes between projects that took certain steps and others that didn't; it's nothing revolutionary, mostly stuff that works for humans also works for AIs but you need to know to ask for it.

> I'm not saying that AI isn't going to make things better, btw, I just don't think we'll see a 20x improvement. Probably more like 1.5 or 2x.

I think people radically underestimate the cost of delay. I don't know if 20x is realistic for the AI itself, but I think it's not impossible once the inefficiencies of having to go to other humans are factored in.

layer8•about 18 hours ago

Who pays for that value, and from what, if all knowledge workers lose their jobs?

It sounds like the economy would largely reduce to the small minority class of independently wealthy people.

simonw•about 18 hours ago

The more time I spend using agent tools the less I worry about knowledge worker job loss.

It takes a skilled knowledge worker to use these things.

regularfry•about 15 hours ago

See https://news.ycombinator.com/item?id=48300427 for an alternative take. I don't think either direction is inevitable, yet.

To follow on from that comment, if the growth in breadth of capacity of AI leads to a decrease in the risk of running a smaller business, which I don't think is an unreasonable prediction, then it's not inevitable people do lose their jobs. Employers get smaller, higher-leverage, and more plentiful.

whatshisface•about 18 hours ago

There were no knowledge workers in the middle ages.

rvz•about 18 hours ago

> Who pays for that value, and from what, if all knowledge workers lose their jobs?

They do not care unless these companies can get a bailout.

UBI only exists for companies that are too big to fail. Case in point, 2008 and SVB when there was too much money on the line.

One of the AI companies attempted to guarantee themselves a way for the government to bail them out if they were close to defaulting on the debt from the data center build out.

kmac_•about 18 hours ago

Producing a thing has always been cheap since personal computers existed. From mail-order software companies' times to SaaS times, producing a sellable MVP was an initial cost that is relatively small compared to the later cost of expansion and maintenance. Marketing and selling was and still is the hardest part.

roncesvalles•about 18 hours ago

Why do you think of knowledge workers as a fungible commodity?

What makes you think the people who used to build (or would have built) software will switch into the industry of "knowing that the thing was the right thing to build", as opposed to something cooler like surgery, city planning or experimental physics? The roles within a tech company are not the only jobs in the world.

regularfry•about 15 hours ago

> Why do you think of knowledge workers as a fungible commodity?

I don't.

> What makes you think the people who used to build (or would have built) software will switch into the industry of "knowing that the thing was the right thing to build", as opposed to something cooler like surgery, city planning or experimental physics?

Because it's probably already part of the job. It's a change of emphasis, not a change of career. Your boss can already ask you to do it. If you're producing code, you're probably also reviewing code, checking it matches the acceptance criteria, testing it, sanity checking that it was the right code to have been written, today.

OtherShrezzing•about 18 hours ago

> The bottleneck has moved from producing a thing that works to knowing that the thing was the right thing to build

“There’s more capital than good ideas to fund” has been a complaint from the likes of A16z & other VCs for a long time now. It’s why we ended up with stuff like NFTs getting funded.

jimbokun•about 14 hours ago

That’s very unimpressive return on investment compared to what was promised.

radicaldreamer•about 18 hours ago

If knowledge workers get laid off en masse, you can expect political curbs on AI adoption.

spamizbad•about 17 hours ago

I will also tell you, as someone who works at a company that's trying to remain profitable, that token spend has caught the eyes of finance and, much like cloud spend, they've already started applying pressure to control costs. This May my team is projected to use 30% fewer tokens than we did in April; this was intentional. I suspect we'll drop more in June.

Gigachad•about 13 hours ago

I expect in the future, when these AI companies stop subsidizing costs, the idea of spinning up 20 agents to work on some brain fart idea that you throw out after looking closer will come to an end. It'll be seen like assigning developers to work that hasn't been properly planned or reviewed.

bigbluedots•about 10 hours ago

It might be time to start interacting with agents using grug speak only

pas•about 2 hours ago

https://github.com/JuliusBrussee/caveman

fragmede•about 13 hours ago

Can't wait till June, when finance gives the team the choice: everyone gets double tokens if you choose to fire somebody.

spamizbad•about 10 hours ago

Oh we already had that with a RIF earlier in the year.

sberens•about 6 hours ago

> They've got, ballpark, $5t to $10t to make back in the next 5 years

OpenAI's spending commitment is in the ~1T range for the next 5 years, and Anthropic is ~300B.

If they continue to show strong growth, they likely need to be at 100-300B in revenue/yr to support their yearly payments + financing, not 1T.
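
As a rough check on these figures, here is a minimal sketch that simply averages the commitments quoted above over five years. The even spread across years is an assumption made for illustration, and financing costs are ignored.

```python
# Commitments quoted in the comment above, in billions of USD (approximate).
OPENAI_COMMITMENT_B = 1_000    # "~1T"
ANTHROPIC_COMMITMENT_B = 300   # "~300B"
YEARS = 5


def average_annual_commitment(total_b: float, years: int) -> float:
    """Spread a total commitment evenly over the period (a simplifying assumption)."""
    return total_b / years


combined_b = OPENAI_COMMITMENT_B + ANTHROPIC_COMMITMENT_B
per_year_b = average_annual_commitment(combined_b, YEARS)

print(f"Combined: ${combined_b}B over {YEARS} years, about ${per_year_b:.0f}B per year")
```

Under that even-spread assumption, the average lands inside the 100-300B/yr range mentioned above and well below 1T/yr.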

unmole•about 10 hours ago

> They've got, ballpark, $5t to $10t

What are you basing this on? For reference, Anthropic raised ~$70 billion in total and OpenAI ~$190 billion. Why do they need to make 20-40x that?
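
For reference, the "20-40x" multiple can be checked with a short sketch. The amounts raised are the approximate figures quoted in this comment, and the $5t to $10t range is the parent's ballpark; nothing here is independently sourced.

```python
# Approximate totals raised, in billions of USD, as quoted in the comment above.
raised_b = {"Anthropic": 70, "OpenAI": 190}
total_raised_b = sum(raised_b.values())

# The parent comment's ballpark range, converted to billions of USD.
targets_b = {"low": 5_000, "high": 10_000}

# How many times the combined raise each target represents.
multiples = {name: target / total_raised_b for name, target in targets_b.items()}

for name, multiple in multiples.items():
    print(f"{name}: ${targets_b[name]}B is about {multiple:.1f}x the ${total_raised_b}B raised")
```

The result is roughly 19x to 38x, which is where the rounded 20-40x comes from.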

oblio•about 4 hours ago

All the planned infrastructure commitments. At least for OpenAI I think they're supposed to spend $300+bn in the next few years.

unmole•about 1 hour ago

I still don't understand why that means they need to make 5-10 trillion over the next 5 years.

bradleyjg•about 12 hours ago

> That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing.

We all have our own observations and mine don’t significantly diverge. But that’s bottom up. At this point shouldn’t we be seeing it top down?

If we are beyond potential and into significant productivity gains, why isn’t that showing up for the customers?

Why didn’t delta airlines get significantly more operationally efficient in the last 3 months due to the introduction of better software?

This is a genuine question, I am seeing a disconnect.

narnarpapadaddy•about 12 hours ago

Anecdotally, my take on this is that the biggest value lever is strategy and alignment, not implementation. The typical company is dozens of little vectors pointed in different directions, and they cancel each other out. Scaling up the magnitude of each is still net zero.

I was recently consulting at an org where two separate engineering teams were all in on two different, incompatible deployment platforms and using AI to accelerate adoption of each.

Management was mystified why their engineering leads kept telling them they couldn’t deploy a complete implementation of their solution.

simonw•about 12 hours ago

> Why didn’t delta airlines get significantly more operationally efficient in the last 3 months due to the introduction of better software?

The coding agents got good in November. Most individual engineers didn't fully clock this until January/February. This means that companies didn't really figure it out until March/April.

Assuming companies like Delta have adopted coding agents (which would be pretty fast) it still takes months from adopting a new tool to the code results of that tool rolling out to production.

I expect (and would hope) Delta's software development culture is very conservative. Since nobody can confidently tell Delta "here are proven practices for using this tech to produce high quality, more secure code" yet it would be surprising if they were blasting full-steam ahead.

I expect that even companies that got on board with coding agents in January will only just be starting to ship user-facing features that benefited from those new tools. Shipping software takes a long time, no matter how much faster the "typing the code in" bit gets!

smrtinsert•9 minutes ago

Let's skip to the part where they put the taxpayer on the hook for a bailout as an industry since they integrated everywhere with big promises

jgbuddy•about 19 hours ago

You are making the assumption that the models are only used / paid for by 2.5% of the population (your knowledge workers value). There will be new value created by these models which people are happy to pay for which simply did not exist at all before. It is also naive to say that the hyperscalers are going to be expecting a return on this in 5 years, it will be entirely propped up by investments / IPOs as has been the case with any tech company for decades now to reach scale. The hyperscalers are currently spending ~650b combined annually, which they have the cash for and can sell in future compute instantly.

specproc•about 18 hours ago

I'm sorry, what the feck does "value creation" mean here? I live in a place where people are so, insanely squeezed from every angle. Wages are stagnant, prices rocketing. Where is the money to pay for this value going to come from?

No one I know feels richer than they did a decade back. I've not been able to meaningfully put up my prices for a decade. People are tired and stressed and scared, particularly scared of a technology everyone keeps telling them will make them redundant.

There is no rising tide lifting all boats, just most of us drowning whilst a few whizz past in their yachts.

I honestly hope these guys faceplant ASAP. Couldn't happen to a nicer bunch of people.

dirck-norman•about 18 hours ago

Feelings aren’t fact. A lot of data shows the doomerism is not reflected in the actual numbers and much of it has to do with rapid inflation and continued vibes.

Consumption has risen, inflation adjusted wages have risen for blue collar and white collar alike. Most social mobility has been the middle class moving into the upper middle class, not moving to the lower class.

The main thing holding people back is the housing crisis. This is orthogonal to the value creation of businesses.

Value creation is growth. If it didn’t exist the S&P would still be 42.55$.

WarmWash•about 17 hours ago

Sounds like internet sentiment and not research data.

It's kind of become socially taboo to not be suffering "in this economy", but on paper it's hard to see weakness anywhere that there isn't always weakness. As long as the 65-95% are doing well, there isn't going to be a collapse.

jgbuddy•about 18 hours ago

A literal example is that I can use AI to file my taxes instead of spending a weekend and hundreds of dollars to have an accountant do it for me. It costs me like $5. That $245 delta is the value of that output to me, as long as I am confident it is correct.

deaton•about 18 hours ago

That's the thing; the "increase in productivity" isn't being felt by the general public, the end user. If your "increase in productivity" just means more money being shifted around at the corporate level then it is meaningless.

mrandish•about 17 hours ago

> There will be new value created by these models which people are happy to pay for which simply did not exist at all before.

True, but I think the GP's point was that what consumers will pay won't be nearly as profitable as what enterprises will pay to increase the output of their developers and knowledge workers. ChatGPT is currently the overwhelming leader in consumer AI usage but only ~5% pay $20/mo.

As a recently retired serial tech founder, I'm now one of those consumers. I use AI webchat daily for general search, Q&A and even to write little automation scripts for myself, yet I haven't paid anyone anything for AI yet. Even after being heavily restricted and performance nerfed to hell in recent months, free webchat AI is still fine for everything I do, and I'm not remotely price sensitive.

Even as AI compute costs fall over time, I doubt serving ads against AI webchat to consumers will generate the kind of high-margin, sustainable growth VCs get excited about. It's so undifferentiated I bounce around between all four leading providers because there's virtually no moat locking casual consumers to any chatbot beyond a single question thread. I guess if it had a nearly infinite context window seamlessly integrated across all sessions, that might be somewhat sticky for some consumers but it could also get creepy for some others - and it would devour gobs of the scarcest resource in AI. Beyond Maslow's Hierarchy of Needs, the mobile phone is the largest revenue, long-term mass consumer product ever but I just got a new flagship phone from a top-tier provider for $30/mo over 3 yrs. IMHO, even an all-you-can-eat, infinite context window, next-gen Mythos couldn't reach and sustain mobile phone levels of global consumer adoption at ~$20/mo. Unlike professional developers and knowledge workers, consumers don't have any "job to be done" big enough for an LLM to command that much of their zero-sum discretionary spend.

jgbuddy•about 17 hours ago

100%, a driving factor will likely be how good we can make models that are so small they use almost no compute. Until then it is a race for adoption and moat-building (or screwing people over?) once you have users

fragmede•about 13 hours ago

What are the non-tech people in your life using AI for? $20/month, next to Starbucks and avocado toast, is discretionary. Maybe the novelty will wear off and non-tech consumers will leave it in droves, but everyone declared they'd leave YouTube if they started playing ads, but YouTube doesn't seem to have noticed.

Planktonne•about 18 hours ago

> There will be new value created by these models which people are happy to pay for which simply did not exist at all before

What sort of new value, and why will people pay for it from someone else rather than prompting for it themselves?

PunchyHamster•about 16 hours ago

But will they pay big actors running top-end models for that? You don't need the latest OpenAI or Anthropic model to go through your mail, get a summary of some products from the web, or do your to-do list.

AI might very well be used by a noticeable % of the population daily, but that doesn't mean they will be paying a trillion dollars to the leading US AI companies.

hgoel•about 9 hours ago

The fuck is going on with HN that a comment making up completely fantastical numbers is on top?

orphea•about 4 hours ago

Perhaps it's not only CEOs who are delusional.

davnicwil•about 12 hours ago

> Most people I know cite +20%-40% velocity

Seems roughly right, that does seem to be about the boost in the most well-suited cases where you essentially know exactly how to solve the problem, the problem won't change much, and it's truly a matter of just churning out the implementation.

In that case precisely prompting, doing the review & nudge loop, can be a pretty nice (nice, still not game changing) speed boost over literally typing out the code to match the design in your head.

The less optimistic view though is that most things you build aren't like that. Even if they seem like it first. These things get booked as a nice speed boost, but you'll only find out much later they weren't.

A confounding factor is that many people not in the detail of building software do seem to think most or all things are like that, even before AI-assisted coding. Not much need to say more - see the entire history of the 'agile' movement for evidence of this.

And because most things aren't like that, I actually struggle to see fundamentally how more than 20-40% will ever be achieved (short of the ever-present deus ex machina of AGI argument), simply because the generation is already really good for these types of things. So since things like this aren't going to increase in overall proportion of things to be done, I don't see where the overall extra gains come from by models improving at this point.

TimTheTinker•about 18 hours ago

I thought Anthropic and OpenAI's combined CapEx has been <100B?

source: https://isaiprofitable.com/

kilroy123•about 18 hours ago

That site needs Apple on the list. ;-)

Danox•about 17 hours ago

Why? All their money is going to Apple Silicon and the five ecosystems. So far in Apple's entire history, the largest acquisition has only been $3 billion; OpenAI is currently getting nothing, and they gave Google a measly $1 billion refund per year for the use of Gemini.

If John Ternus wants to spend some money, spend it on bringing memory in house. Apple has the money and the engineering talent to do so and to have it fabbed onshore in partnership with TSMC.

Do it, Apple, because you have to, not because you want to. The Chinese will probably be taking over the memory industry worldwide by taking advantage of the greed of the three memory companies and their AI overlords.

deaton•about 18 hours ago

Maybe so far, but they've committed to well over a trillion in future capex.

topaz0•about 15 hours ago

And there's the capex that their revenues will pay for indirectly, like in the case of Oracle.

TimTheTinker•about 15 hours ago

Here's the question - does that future spending already appear on partners' balance sheets?

onlyrealcuzzo•about 19 hours ago

> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens.

They are assuming ~10% global GDP growth instead of ~3%. You probably don't need the same %s if the pie grows a ton.

I'm highly skeptical we get that growth, but if you aren't, it makes it easier to digest.

freakynit•about 18 hours ago

I mean, this AI-productivity case backfires when we talk about GDP.

The more AI causes productivity increases, the fewer workers will be needed. This will heat up the job market even more and bring salaries down.

Net effect of this productivity increase: less consumption by the masses, even though you may be producing more goods, and much more efficiently.

A third effect also comes into play: once all this starts to happen, common people, who are generally living paycheck to paycheck, will start to hesitate to make any long-term investment, housing included. That will indirectly end up impacting the financial and banking sector, which will then impact existing savings, bond yields and retirement funds, and the recession-like cycle starts.

This productivity increase only makes sense if it is capped at a very small number... like 20% max. Beyond that, who will these companies even be selling to?

Am I overthinking all this?

simonw•about 18 hours ago

> The more AI causes productivity increases, the less and less number of workers will be needed.

That only holds if companies have a fixed need for "productivity" which is met by their current employees, such that their employees becoming more productive means they need less of them.

Every company I've ever worked for has wanted to achieve way more than they are able to get done with current resources.

But generally yes, the biggest open question about all of this is how the impact will play out on the economy, job opportunities etc. I've not seen anyone come close to a confident prediction about how this will play out.

seanp2k2•about 18 hours ago

>The more AI causes productivity increases, the less and less number of workers will be needed. This will heat up the job market even more and bring salaries down.

>Net effect of this productivity increase: less consumption by the masses, even though you may be producing more good and much more efficiently.

Big tech companies can't even create login flows and account recovery flows that work for everyone yet. There are countless stories of folks losing access to business Instagram accounts that get hacked, Google support from a human to fix a problem that is outside of their help articles is non-existent, etc etc. There's still so much "low-hanging fruit" IMO that isn't particularly fun or exciting to fix, but ask your average non-tech friend or family member what they think of the Facebook + Instagram security settings pages / sites / desktop-only settings.

Who is going to pay for all of these subscriptions that will power this GDP increase when average purchasing power of those outside of the top ~10% of earners is decreasing YoY? We're headed toward food and water shortages next to sprawling datacenters, not shared societal prosperity and a healthy middle class.

arjie•about 17 hours ago

First of all, common people are not living paycheck to paycheck in the sense that they're at risk of not having money[0]. This is corporate content marketing that has entered the collective memory of people, not anything close to reality.

Secondarily, reducing the cost of making a thing doesn't always mean you get less of a thing. For me, certainly, what happened is that I write way more software than I originally did. When we built compilers, the amount of human engineering effort required to do things plunged, but the amount of software engineering jobs didn't go down.

This is as bad as models will ever be. That part is true. And it's entirely possible we go foom. But it's also possible we don't, and then it depends on where the asymptote lands.

0: https://www.slowboring.com/p/this-economic-myth-needs-to-go-...

20k•about 15 hours ago

>Am I overthinking all this?

Nope, if AI were to realise the hype, you have to take into account macroeconomics. Usually this isn't a problem for most businesses

>The more AI causes productivity increases, the less and less number of workers will be needed. This will heat up the job market even more and bring salaries down.

People also underestimate that the reason why companies are so excited about AI isn't to increase productivity, it's to fire workers and crack down on worker rights. They won't lay people off because AI means they don't need as many people to get the job done, they'll fire everyone while doing a much shittier job, because they hate having to abide by workers' rights and pay people

onlyrealcuzzo•about 16 hours ago

> The more AI causes productivity increases, the less and less number of workers will be needed.

Why does this have to be the case with AI but it didn't have to be (and wasn't) the case with the steam engine, electricity, the automobile, or the computer & internet?

Certainly, AI could be different.

It's curious to me why the vast majority of people on here think it must be different.

samrus•about 10 hours ago

> The more AI causes productivity increases, the less and less number of workers will be needed

This might not necessarily be true. Increased efficiency creates induced demand to the point where more workers are needed. Because the new capabilities unlock more value to extract and the economy rushes in to get it. The steam engine is a huge example of this

I don't exactly know what new value genAI will unlock but I think it's more likely than not

seanp2k2•about 18 hours ago

And yet the job everyone loves to hate, the humble "burger flipper", continues to resist automation yet command minimum wage labor rates. This future of either being a CEO of a company consisting primarily of AI agents building some monthly subscription-based solution to some trivial digital chores OR manual labor that isn't [yet] fiscally viable to automate seems quite bleak. We'd also need a ton of robot technicians and manufacturing that the US has neither the educational and training institutions to support nor the will of the population to fill. Given the ongoing war on immigration, visas, and foreign-made hardware, if this continues, good luck.

stared•about 18 hours ago

This would be a Bladerunner future Pope Leo XIV warned against (https://news.ycombinator.com/item?id=48265206), though in different words.

dcre•about 16 hours ago

1. Global IT spend is $6T per year

2. Where does this $5T number come from? If they make $4T in revenue over the next 5 years instead, what happens?

jvanderbot•about 17 hours ago

Hey, I wrote this down one time. I estimated way higher yearly revenue required, to be adversarial. And you can keep the "cost per unit AI work" a parameter and play with the results.

But the point is that if people are willing to delegate part of their salary (e.g., buy consumer products), vs requiring employers to pay for the tokens, then it's quite possibly a net win. Something like "I pay a largeish fee every month to make my own job much easier", similarly to how we buy a car to make commuting easier.

https://jodavaho.io/posts/ai-jobpocolypse.html

vayup•about 6 hours ago

> They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.

Depreciation and write-offs are about accounting models. Hardware will still be running after five years and still be making money. They may not be as efficient as the new hardware, but they will still be making real money even though they are valued at $0 in the books.

oblio•about 4 hours ago

GPUs are driven really hard plus they use up a ton of energy and water, they cost a ton to run.

kdheiwns•about 2 hours ago

We're going to reach a point where these companies stop asking for money and start mandating it. They've got a vice grip around the nuts of many governments and loads of companies have gone all in on investing in these slop heaps.

At some point, companies are going to start removing basic features. Governments and essential services are going to make people go through chatbots to get basic service. They're going to require AI to validate stuff that's already automated and working fine. Google search? That'll be all AI (and I guess they're already rolling it out). Dentist appointment? Going to need to do it through some AI app that requires an account and tokens "for a better patient experience". Verifying your ID when buying alcohol? Going to need AI to scan it and take 90 seconds to determine whether it's real. And it'll say you're a 7-year-old farm worker in rural Botswana, so you can't get alcohol. And they're going to milk money at every level of this.

__alexs•about 5 hours ago

I don't think the unit economics are too terrible. Expensive, but not impossible.

200m knowledge workers in US and EU. Total salary around $15T/year.

$1T/year in token spending is about $5k/year per person. A big number, but not totally mad. That's the low end for office space per person for example. Probably close to the existing SaaS spend per person for a lot of roles.

We are still early in the deployment cycle for these tools so I would expect them to get better and also cheaper too.
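A quick check of the arithmetic above, as a minimal sketch. The inputs ($1T/year in token spend, 200m workers, $15T/year in total salary) are the figures quoted in this comment, not independently verified; the function and variable names are made up for illustration.

```python
def token_spend_per_worker(total_spend: float, workers: float) -> float:
    """Annual token spend per worker, in dollars."""
    return total_spend / workers


def spend_share_of_salary(total_spend: float, total_salary: float) -> float:
    """Token spend as a fraction of the total salary bill."""
    return total_spend / total_salary


# Figures quoted in the comment (assumptions, not verified data).
TOKEN_SPEND = 1e12     # $1T/year in token spending
WORKERS = 200e6        # 200m knowledge workers
TOTAL_SALARY = 15e12   # $15T/year total salary

per_worker = token_spend_per_worker(TOKEN_SPEND, WORKERS)
share = spend_share_of_salary(TOKEN_SPEND, TOTAL_SALARY)

print(f"${per_worker:,.0f} per worker per year")  # $5,000 per worker per year
print(f"{share:.1%} of total salary")             # 6.7% of total salary
```

Under these inputs the $5k/year figure holds, and it works out to a bit under 7% of the quoted salary total.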

QuiEgo•about 8 hours ago

At some point, if we reach stability on the models, we'll start getting silicon optimized for individual models. They are optimizing for time to market, not efficiency right now. I don't know how much it will move the needle on the cost math, but at this scale any improvement has a crazy multiplier.

cm2187•about 4 hours ago

But that means going back to "80% profit margin" Jensen and further digging your capex hole. The benefits would have to pay not only for current capex but also past capex.

But by then, I will be able to go one line down in my dropdown menu to switch to a newer LLM provider who doesn't have to amortize those past capex.

tedggh•about 17 hours ago

Also, not all developers work on software products. The vast majority of developers work supporting software solutions as part of a much bigger business model, such as infrastructure, industry, healthcare and services. Many of these are complex organizations. So, unless you get to turn every employee into a 10x employee, the 10X coder alone won’t necessarily make a 10X productivity contribution. What’s likely going to happen is the 10X coder will start to slow down or add more (unnecessary) complexity to avoid having to sit and wait on overhead while other areas of the business, which are not easily automated away to AI, catch up. As a developer I can finish my project in June instead of December, but what if the customer is still not ready for integration until December? What do I do?

raxxorraxor•about 4 hours ago

Plus, at some point fewer tokens get sold because local models are being optimised and can work with protected information. For enterprises that want an AI with a knowledge base of internal documents, this becomes more interesting by the day.

bwhiting2356•about 7 hours ago

> 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

When you break it down like that it seems reasonable. I'm spending about $5k/mo on tokens, seems more and more normal.

thesparks•about 16 hours ago

Those are rookie numbers. We are going to blow past $1t per year in spending in no time. As a developer for 29 years, I couldn't go back to coding by hand. For better or worse, AI will be woven into the fabric of life in no time.

manquer•about 12 hours ago

One factor to consider: the base will not remain the same over the next 5 years.

Every generation of developer tooling that increases absolute code throughput creates a new class of developers (and users).

This has always been the case, from the first compilers through the eras of frameworks to today, and the skill level needed to be a developer has dropped. In the mid/late 80s only Master's / Doctorate-level Comp Sci professionals could write any applications. It dropped to undergrads and Information Technology engineers, with comp sci theory becoming mostly optional, and dropped further to anyone college-educated with some training. It had been trending lower still with no/low-code tools like Retool before 2022, and that was before agentic codegen services such as v0, Replit and so on.

The next generation of developers will not produce applications and architecture as previous generations did, just as most of us here don't produce the level of quality that pg did when building this platform[1], but as long as the user can find value it doesn't matter, as countless enterprise applications of middling quality already prove today.

All this to say: the thesis for these businesses is that the 200M/30M numbers will not remain the same. Will they change by enough, at a fast enough pace, to justify the capex? I don't think so either. However, the web 1.0 and then 2.0, SaaS and mobile revolutions were pretty quick to bring in new classes of users and developers, so it's not completely unrealistic.

[1] While HN is a heavy outlier with its custom Lisp implementation, there are any number of examples from previous eras that are more moderate in their choices but written with solid architecture, at skill levels that would be hard to find among today's generation of founders.

qaq•about 15 hours ago

Anthropic raised less than 100B up to now and as of March has 30B ARR. Why does it have to make back 2.5T to 5T ?

mv4•about 17 hours ago

If people figure out how to run agents on-prem (already becoming feasible for both agentic tasks and coding on consumer hardware like Mac Studio 128GB+ or DGX Spark with some models) these companies will be in deep trouble.

Privacy is also a huge issue.

jkelleyrtp•about 17 hours ago

I agree in principle with the math. But I believe that in reality if revenues don't show up quickly, then lenders will just restructure the debt and defer the payback period. Similar to SF commercial real-estate; many buildings should've come due during the depressed covid market, but lenders (banks) were willing to delay payment until the market picked up again.

The scale of these investments put the lenders at substantial risk, so the lenders will do anything to make it work. If the current lenders will be damaged by extended payback periods, they can simply sell the debt to someone else who won't be.

fulafel•about 7 hours ago

The most often cited figure for knowledge workers seems to be 1B, an order of magnitude difference to your assumption.

Also, according to https://isaiprofitable.com/ total industry spend is also an order of magnitude less than what your assumption is.

So in your model 0.2% of knowledge worker salaries instead of 5%, IF all the AI players win the investing gamble and do in fact make back their money.

motoxpro•about 15 hours ago

So you've got that market. Let's call it the demand BY knowledge workers to do the work. You've also got:

2. The companies themselves buying tokens for operations to make the work more efficient, e.g. a Salesforce agent, a Microsoft Office agent or some random SaaS inventory agent. (And if you say those will go away (which I don't believe), it's even more bullish. The tokens just go to someone vibe coding XYZ, which is EVEN MORE than if you were to buy SaaS, because it's SaaS product x companies that built it instead of just one.)

3. The companies SELLING tokens. This is also new markets like schools and small business (e.g. the local gas station buying an inventory tool)

4. The consumers "buying" (I put that in quotes because it can be subsidised by the company) through chatgpt, strava, instagram/netflix recommendation, etc.

Local models still take compute, and while it may be cheaper, it is the same argument of on prem vs cloud. No one operates on prem unless you HAVE to for regulatory. Margins will come down and you just spin up a GCP/OpenAI/Anthropic agent.

It may be "cheaper" but rationally it's better to pay someone to manage it. That's why Hetzner only had $367M in revenue (a lot, but tiny compared to managed services)

kopirgan•about 9 hours ago

Depreciation starts on day 1 and IMHO they most likely don't have 5 years. They dodged the deepseek bullet but who knows what is out there that will make all of this investment essentially worthless?

hintymad•about 14 hours ago

> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

Just realized something: if one worries about losing jobs to AI, token's high unit cost is good news. To say the least, high cost would delay the displacement, if any, right?

In the meantime, someone shared the below on X. I guess the moral of the story is that "good enough" does not just displace software engineers, but also models.

   > I Went From $3,000/Month on Claude to $5/Week on DeepSeek

   > And honestly? 80% of my work is identical.

   > For the past two months, I was burning $3-5K monthly on Claude Code. Every idea from design to development to testing - full end-to-end automation, even simulating users to test my products and provide feedback.

   > Extremely token-intensive. But Claude's caching sucked, making it insanely expensive.

   > Then I discovered DeepSeek V4.

keeda•about 16 hours ago

Putting some more numbers out there (some of the links are broken, but numbers look about right):

https://github.com/danielmiessler/Substrate/blob/main/Data/K...

Knowledge worker compensation is 35 - 50 trillion a year globally (6 - 12T in the US alone.) That's a huge TAM. It's still close but 5T over 5 years seems doable.

>... unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

The way we make ICs 10x productive is not just making each of them individually more productive, but by removing the coordination overhead of large organizations, because overhead scales super-linearly with the size of the org. And orgs will shrink automatically as AI-assisted ICs take ownership of larger and larger scopes of work, leaving much more budget for tokens.

I went into this in a bit more detail along with some made-up numbers here: https://news.ycombinator.com/item?id=48040999

3abiton•about 7 hours ago

Not to mention the competition: chinese open-weight models and open-source harnesses. Qwen3.6-(27B and 35B) have proven to be worthy and capable of running locally. I am confident more SMEs would look into this as a solution given the ballooning costs of API usage. You get a decent setup with an RTX 6000 Pro.

karlkloss•about 7 hours ago

"5% of every knowledge workers salary to go into tokens. 20% if you're a developer"

Not unreasonable. I'm a hardware developer, and my employer spends ~10% of my salary on software tools. Add hardware tools and their maintenance and it's more like 30%.

jmyeet•about 18 hours ago

YEPPP... and I'm kind of shocked at how many people can't do simple math.

Let's put it in context. Google's annual revenue seems to be north of $400B. So if OpenAI suddenly had Google's revenue, it would still be insufficient to recover their investment.

and it's a ticking time bomb because $1T in servers, CPUs, GPUs and memory is going to be worth $200B in 5 years. You can say they can keep using what they've got. Sure. But they're also not going to stop spending on new hardware. And the competitor that comes along in 5 years and spends $1T doing the exact same thing is going to have a huge advantage.

OpenAI at this point reminds me very much of the Russ Henneman pre-money hype cycle.
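To put a rate on the "$1T worth $200B in 5 years" claim above, here is a rough sketch of what it implies per year. The start and end values are the comment's own figures; the two depreciation shapes (straight-line and constant percentage decline) are assumptions chosen for illustration, not anything the comment specifies.

```python
def straight_line_loss_per_year(start: float, end: float, years: int) -> float:
    """Dollars of value lost each year if the decline is linear."""
    return (start - end) / years


def implied_annual_decline(start: float, end: float, years: int) -> float:
    """Constant yearly fractional decline that takes start to end."""
    return 1 - (end / start) ** (1 / years)


# Figures from the comment (assumptions, not verified data).
START_VALUE = 1e12   # $1T of hardware today
END_VALUE = 200e9    # $200B in 5 years
YEARS = 5

linear = straight_line_loss_per_year(START_VALUE, END_VALUE, YEARS)
decline = implied_annual_decline(START_VALUE, END_VALUE, YEARS)

print(f"${linear / 1e9:,.0f}B lost per year (straight-line)")  # $160B lost per year (straight-line)
print(f"{decline:.1%} decline per year (constant rate)")       # 27.5% decline per year (constant rate)
```

Either way, under these figures the hardware sheds on the order of $160B of value a year, before counting any new purchases.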

mfuzzey•about 17 hours ago

It's actually worse than that. It's not just financial depreciation or that the existing hardware becomes obsolete due to being less powerful than new hardware but also that hardware being run all the time at high load actually has a limited lifetime of a few years so it will physically break...

jmyeet•about 17 hours ago

I agree but it's even worse than that.

Data centers come down to performance-per-Watt. Electricity accounts for 20-30% of a data center's operating cost [1]. I don't know the exact breakdown but the GPU part of that is probably the majority given how power hungry GPUs are. The B200 is upwards of 1200 Watts [2]. The B200 is rated at ~4.5PFLOPS of dense FP8. So you're getting 3.75TFLOPS/W. We don't know what the next generation will look like. The H200 (Hopper architecture card that preceded the B200) had ~4PFLOPS apparently but also lower power consumption. Obviously this changes depending on whether you're looking at dense or sparse and FP8 vs INT8 vs INT4 vs FP4, etc so we're just using FP8 as a yardstick.

Imagine a fictional B200 successor, the T200, that has 8PFLOPS of dense FP8 at 1000 Watts. A DC built on that, where the T200 will likely cost about what the B200 does now, gets roughly double the PPW, so the same size DC with the same electricity load does the work of about 2 of your old DCs for the same operating cost. That's a big deal when you've laid out a trillion dollars.

[1]: https://iaeimagazine.org/electrical-fundamentals/how-much-el...

[2]: https://www.trgdatacenters.com/resource/h200-power-consumpti...
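The performance-per-watt comparison above can be worked through as a small sketch. The B200 numbers are the ones quoted in this comment (not checked against a datasheet), and the "T200" is the comment's fictional successor; the helper name is hypothetical. Note that 4.5 PFLOPS at 1200 W comes to 3.75 TFLOPS per watt.

```python
def tflops_per_watt(pflops: float, watts: float) -> float:
    """Performance per watt in TFLOPS/W (1 PFLOPS = 1000 TFLOPS)."""
    return pflops * 1000 / watts


# B200 figures as quoted in the comment; the T200 is a made-up successor.
b200 = tflops_per_watt(pflops=4.5, watts=1200)
t200 = tflops_per_watt(pflops=8.0, watts=1000)

print(f"B200: {b200:.2f} TFLOPS/W")   # B200: 3.75 TFLOPS/W
print(f"T200: {t200:.2f} TFLOPS/W")   # T200: 8.00 TFLOPS/W
print(f"ratio: {t200 / b200:.2f}x")   # ratio: 2.13x
```

With these inputs the hypothetical successor delivers a little over twice the compute for the same electricity load.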

mountainriver•about 18 hours ago

How could extremely capable artificial brains ever pay for themselves?

WarmWash•about 17 hours ago

Prices are not going to stay where they are.

You have either never seen a tech cycle, or need to be reminded of that. The pressure to buy more expensive plans is already starting to form.

hansmayer•about 18 hours ago

This should be the top comment. Also, I think it's not that many people, including our Simon here, are not good at math. It's more like, some of them seem to be incentivised to not be cough, cough, "good at math". How else will the hype sell?

simonw•about 18 hours ago

I thought my post was pretty free of hype. I said that this new revenue "Maybe even enough to start covering their costs!"

Imustaskforhelp•about 18 hours ago

At a certain point, I genuinely feel like the best way this hype is being sold is by making people genuinely believe in it.

and in that sense, if Anthropic and OpenAI are able to create the projection that they can be profitable despite finances seeming bubbly at best, I think that what happens is that these companies spew so much content that people like Simon get into it too.

There is a deeper problem of people falling into AI psychosis too, in general, I am not sure if Simon has fallen into it or not

I think that the greatest point which can be made here is to not offload your thinking to others and to think about the situation yourself. Sounds familiar (looks like we are all off-loading our thinking itself to machines)

Side-note: As humans, we have a tendency to quickly judge or make quick decisions which stems from our times foraging and scavenging in jungles.

Another Side-note: at a certain point, I am unsure of how much to think about AI or not, certainly discussions about it that were happening 2 years ago weren't helpful in contexts that they are used now (well not in any way or form that a person discussing and getting into the weeds of AI 2 years ago is better than a person just getting into it say 2-3 months ago)

With the industry (moving so fast) [but that doesn't mean that you can't catch up with it, I feel like the fast word has made people think that they are falling behind which is imo wrong i suppose]*, It is basically unsure to me of any FOMO or anything if you aren't using AI already, I find this notion naive.

People might be making strong opinions (AI psychosis) and skills on the tools available at the moment the same done 2 years ago. We don't quite know about the tech as these are still black-boxes and how they progress and what these "AI skills" might survive or not in future. Heck, we aren't even sure if these tools might survive or not or wouldn't be made magnitudes more expensive simply to break even as they are given to us for the first time at percentages of the price.

I don't know if I should form (strong) opinions yet and also a question of its worth so much thinking efforts in the first place, probably just gonna do my own thing (the way I want to) which includes learning C at the moment. because learning is fun.

panarky•about 13 hours ago

> 5% of every knowledge workers salary to go into tokens

In general, I don't think you can reason from the existence of potentially stranded investments back to revenue projections.

And when you frame this as percentage of salaries, that's a sneaky implication that this is only about reducing salaries and headcount, and not about adding capability, or doing things you couldn't do before, or making fewer mistakes, or capturing more revenue, or expanding margins, or competing more effectively.

That said, 5% of knowledge worker comp actually seems very low to me, given the capabilities, and considering the percentage of "knowledge work" that is absolute bullshit.

Two weeks ago I received an email from my HOA saying I'd been billed for a service I never asked for. So I replied to the email saying they'd made a mistake. There are now more than 30 messages in the thread, involving at least 8 "knowledge workers" at the property management company all passing the buck, and the problem is no closer to resolution.

An agent could wipe out all 8 of those bullshit jobs and solve my simple problem in five minutes instead of two weeks. Think of how many hundreds of thousands people are doing this nonsense just in the property management industry alone.

5% is nothing.

bg24•about 13 hours ago

Is it possible that you are narrowly sizing the opportunity? While PMF does not always mean that early pioneers will be the leaders, I think the market itself goes beyond knowledge workers and developers. Agents, robots, drones etc will all use LLM or some world model.

I am rather more concerned about competition from CHINA. With how Huawei (2000 -> 2020) crushed every other telecom company and went from nobody to the most revered leader in 20 years, and with the depth of leadership in manufacturing and work culture, if China surpasses USA in AI, all US companies lose.

datsci_est_2015•about 16 hours ago

I could see such productivity gains being possible, if only because the current tooling around LLMs is terrible. The fact that we have 30 blog pieces per day making the front page of Hacker News about someone’s convoluted system to guide LLM output to something reasonable is absurd. There needs to be standardization in tooling, and it needs to be open source. Then, and only then IMO, will we see huge productivity gains.

But, at that point I think the big players’ moats will have dried up. Local models will probably be sufficient for 99% of daily office worker tasks.

So I disagree with TFA’s premise. I think this fear is probably shared amongst the LLM giants, and they’re still hoping that neural network transformers are somehow the path to AGI (probably not, imo).

golly_ned•about 17 hours ago

This is why 'agents' are the solution for these companies. Token spending goes through the roof. As long as a human is in the loop needing to read or review at human speed, that's a ceiling on how many tokens per user they can generate.

gorgoiler•about 16 hours ago

What value do the big model makers provide other than having a head start on gathering up humanity’s IP to train their proprietary models?

What’s their moat? Is it hoping for regulatory capture where scraping is made illegal the day after they finally finish scraping all human language?

It’s like OpenAI dammed the Colorado, and Anthropic dammed the Hudson, and now they’re both trying to sell us bottled water subscriptions at $100 a month. I don’t know how well the dam part of the analogy holds up, but the water part feels strong. Compiling models based on humanity’s written output feels like something no corporation should own.

BadBadJellyBean•about 16 hours ago

This assumes that we won't need new hardware in ~2 years. I find that unlikely. So they have to make back what they got up until now PLUS the running upgrade/development costs. So what will it be in 5 years? $20t? $30t? It's all getting a bit outlandish.

What I'm often hearing though is the equivalent of "gg ez" when I bring that up. I don't understand how this will at any point blitz scale to profitability. As far as I know they don't have positive cash flow, no one has a moat and I don't think they will push out engineers.

AdamN•about 4 hours ago

It's worth noting that if each developer is 20% more productive with AI (let's take that as a premise and not dispute it), then it makes sense to go even further and reduce human headcount by more since the communication overhead of having 25% fewer developers is in and of itself a force multiplier.

tldr; 10 developers with 20% more 'productivity' can be replaced by 7.5 ideal developers and more like 6 or 7 developers due to the benefits of simply requiring less organizational communication.

I still think the ideal team size is unchanged however and that's 7-10 people. Note that teams aren't necessarily the same as direct reports. A CEO for instance has a certain number of reports and a leadership 'team' but they're not a team in the traditional sense since they are more about making good decisions and collaborating on specific things but mostly about leading their own orgs that have vastly different skillsets from each other.

red75prime•about 15 hours ago

> 200m knowledge workers in the world, 30m developers

Your scope is too narrow. The companies target more than white-collar jobs. And $1t is around 0.5% of the world economy.

yalogin•about 17 hours ago

To get that revenue and adoption they have to vastly increase their infrastructure spending. If they are currently losing in even the 200/month plans how is it sustainable?

quality_life•about 8 hours ago

Also, with announcements of replacing developers with AI and consequent job losses, who is going to use the tokens? AI using its own tokens to produce code?

ai_fry_ur_brain•about 12 hours ago

Also hardware will be obsolete or dead in 5 years, and warranties are 3 years from Nvidia. Ask crypto miners how these kinds of hardware economics work. Numbers have to keep going up all around. It's a fundamentally broken business model unless prices increase 10x

jstummbillig•about 17 hours ago

> 200m knowledge workers in the world, 30m developers. We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

This is where the napkin math is breaking down in a big way. There is absolutely no reason to assume this will only impact "knowledge workers". Farmers use computers. Farmers will use AI.

vablings•about 17 hours ago

AI for what? None of the AI a farmer could or would use would be any more meaningful than light chatbot usage or already existing computer vision/gps

red75prime•about 15 hours ago

And around 400k H-2A workers. Humanoid robots... Who works on them I wonder.

quantumleaper•about 17 hours ago

The kind of farm that would use AI is already 99% machinery and automation.

Salgat•about 13 hours ago

My hope is that hardware improvements (better node densities every 2-3 years, better designs, etc) will pick up the majority of the savings for these companies in the future, assuming LLM performance starts to taper off with diminishing returns.

jimbokun•about 15 hours ago

That’s on the order of 1% to 2% of global GDP per year just to pay for their hardware commitments.

zaphirplane•about 10 hours ago

> 200m knowledge workers in the world, 30m developers

1 in 6 knowledge workers is a developer! Surely that’s too high, though it explains the job market

sowbug•about 18 hours ago

There is also the EV (expected value) of developing AGI. Even if you personally believe the probability is low within the lifetime of either of these companies, the value would still be extraordinarily high, enough to forgive a $5T or so miscalculation here or there.

jbreckmckye•about 18 hours ago

I don't think AGI was ever a serious endeavour, just something the labs talked up to grab attention.

I am willing to bet a Twix we'll look back on that stuff in 2 years with a lot of embarrassment

sowbug•about 18 hours ago

The high-risk side of that bet would need to win more like a lifetime supply of Twix. But in a post-scarcity nirvana, everyone already has that. So sure, you're on at even money. See you in two years.

dgellow•about 7 hours ago

Only if running AGI makes economic sense. We actually have no idea if that’s the case. We don’t even have a definition for AGI

gz5•about 10 hours ago

there are many paths towards ROI and ruin. but towards ROI:

+ LLM-powered robotics, autonomous, IoT, smart manufacturing

+ LLM-powered biotech, healthcare, genetic engineering, medicine

+ Recursive model improvement

+ Multiply the # of devs (software truly eats world)

+ Exponential increases in model performance / cost decrease (algorithms, power, infra, chips, architectures, etc.)

recroad•about 14 hours ago

I just don’t understand how people are getting negative value out of AI or even only 20% productivity boost. I can only conclude that people don’t know how to use agents.

dgellow•about 7 hours ago

I mean, it doesn’t really matter whether it’s caused by people failing to use the agents well or not. You cannot assume everybody will use the technology in the best way possible

oblio•about 14 hours ago

Are you mostly creating new things or integrating with complex, undocumented, untestable systems?

recroad•about 12 hours ago

Mostly brownfield systems in Java, Elixir and TS. I use OpenSpec in explore mode and point the agent to all the different repositories (when not working in a monorepo) to identify changes. Once done, I switch to propose mode and spend at least 15 minutes there iterating over the plan until I'm satisfied with the TDD approach (agents need tests to verify their work). Then apply and review. This also auto-generates docs etc.

browningstreet•about 18 hours ago

Somehow Uber and WeWork survived the same kind of grand projections that they never met.

121789•about 18 hours ago

uber sure....but how did wework survive? they are a smoldering husk of a failed company looted by its founder

hamdingers•about 18 hours ago

I'm sitting in one right now and don't see any smoldering...

naravara•about 18 hours ago

The company’s gone but the assets just got sold to other commercial real estate firms.

Uber was basically only ever software to help people use their own cars so a very small part of their valuation was physical stuff to upkeep, it was just deals and obligations they had.

Not sure how it shakes out for Anthropic and OpenAI. There’s a lot of physical capacity that needs to be built out and can depreciate. But there’s also a lot of network effects and dependencies being built in with enterprise users.

I don’t know how swappable the tooling is either. I think over the long term the UI, model training and documentation, and infrastructure are going to end up being run by different parties and I’m not sure which leg of that chain ends up in a position to skim most of the profit off. My guess is that Apple and Google end up raking in all the money since they control the OS and app stores while the rest of the stack gets driven down to being generic commodities. At least where mass market consumer adoption is concerned.

tapoxi•about 18 hours ago

I don't think Uber was doing $1 trillion in infrastructure spend.

windexh8er•about 18 hours ago

The difference is that they had room to charge more of their customers and pay less to their workers. The AI industry doesn't have both sides to play at this point. Training and inference are getting more expensive and if you take on the high prices now you're just floating yourself further downstream from profitability long term (which does not look viable for any of them currently).

paxys•about 18 hours ago

WeWork absolutely did not survive

PunchyHamster•about 16 hours ago

uber doesn't own trillion in cars

xoac•about 18 hours ago

somehow the invisible hand of the market is also blind af

ArcHound•about 18 hours ago

Makes sense if you think about it: if all photons pass through you (invisible) then you can't capture them to get info (blind).

hansmayer•about 18 hours ago

Funny you should mention Uber. What was it their COO said recently about the AI costs?

simonw•about 18 hours ago

I quoted exactly what they said in my piece, under the heading "The AI-failure stories around this are pretty thin": https://simonwillison.net/2026/May/27/product-market-fit/#th...

> But then you sometimes go and talk to your senior engineering leaders and you’re saying, OK, how many projects that were on the cutting room floor got moved above the line because of the productivity gains because 25% of our code commits were via Claude Code last quarter?

> That link is not there yet, right? I think maybe implicitly there’s more that is getting shipped. But it’s very hard to draw a line between one of those stats and, OK, now we’re actually producing like 25% more useful consumer features, right? And that line is hard to draw.

That's pretty weak sauce. I don't think that justifies the headlines that came out of it, personally.

allthetime•about 16 hours ago

lol I’m spending max $50/month right now on a couple light subscriptions and my velocity is insane right now (full stack mobile app development) I’m leaning into it hard while these cheap plans still exist and building out a big platform that I can easily generate new apps from. Hoping by the time the rug pulls I can just go back to hand cobbling these apps together from the modules I’ve pumped out and never even consider giving these companies a massive portion of my monthly income

root-parent•about 15 hours ago

Author seems strangely unwilling to distinguish usage from profitable product market fit. And from his own numbers:

Anthropic Max: $100/month

OpenAI Pro: $100/month

Total paid: $200/month

API equivalent usage: $2,180.16 in 30 days

So he paid only 9.17% of the API-priced value, a 90.83% discount, or about $10.90 of API-priced usage for every $1 paid...

That proves heavy usage but not sustainable unit economics.

Anthropic reported numbers point the same way:

Q2 revenue: $10.9B

Adjusted operating profit: $559M

Margin: 5.1%

SpaceX compute: $1.25B/month = $3.75B/quarter

So one compute supplier alone equals 34.4% of quarterly revenue and 6.7x quarterly adjusted operating profit.

It's difficult for the blogger to understand something when their incentives depend on not understanding it...
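The ratios quoted in the comment above can be reproduced with a few lines. This sketch uses only the figures given in the comment and takes them at face value; the variable names are made up for illustration.

```python
# Reproduce the arithmetic from the comment, using only its own figures.
paid_per_month = 100 + 100   # Anthropic Max + OpenAI Pro, USD
api_equivalent = 2180.16     # API-priced usage over 30 days, USD

paid_share = paid_per_month / api_equivalent        # fraction of API value paid
discount = 1 - paid_share                           # effective discount
usage_per_dollar = api_equivalent / paid_per_month  # API value per $1 paid

q_revenue = 10.9e9                 # quarterly revenue, USD
q_adj_operating_profit = 559e6     # adjusted operating profit, USD
compute_per_quarter = 1.25e9 * 3   # one supplier at $1.25B/month

margin = q_adj_operating_profit / q_revenue
compute_share_of_revenue = compute_per_quarter / q_revenue
compute_vs_profit = compute_per_quarter / q_adj_operating_profit

print(f"paid share: {paid_share:.2%}, discount: {discount:.2%}")
print(f"API usage per $1 paid: ${usage_per_dollar:.2f}")
print(f"margin: {margin:.1%}")
print(f"compute vs revenue: {compute_share_of_revenue:.1%}")
print(f"compute vs profit: {compute_vs_profit:.1f}x")
```

The percentages match the comment; whether adjusted operating profit is the right denominator is a separate question.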

simonw•about 15 hours ago

My point with the $2,180.16 thing is that the price for consumers like myself is heavily discounted... but the price for enterprise companies is not discounted.

My usage is therefore a useful indicator of quite how much those enterprise companies may be spending on tokens, given the new pricing scheme.

If enterprise companies were still getting the same discounts that I get myself I would not have written this article.

(I had to dig into your margin figure - looks like you calculated 5.1% as 559000000 / 10900000000 * 100 but that $559M "adjusted operating profit" figure includes training costs, where usually when we talk about margin on inference we're not including those since those costs are fixed, margin calculations make more sense against the variable costs of serving a token.)

what•about 12 hours ago

When you have to train a new model every few months to stay competitive, discounting that cost is rather dubious.

mirekrusin•about 17 hours ago

Now try to take back llms from developers and see what happens.

bigfishrunning•about 17 hours ago

If, by some miracle, all LLMs ceased working right this second, any developer who would no longer be productive should not have been a developer in the first place.

dgellow•about 7 hours ago

You don’t need a miracle, if Anthropic API is down due to technical issues you don’t have software development anymore. It’s insane how much we are delegating to 3rd parties. It’s not like having cloudflare down where your users cannot access your services. The AI tools used to investigate prod issues stop working, developers stop working. The AI support system that allowed the company to get rid of their support team stops working. In addition to all the issues that causes to customer facing products based on AI. The sales team cannot work anymore.

It’s like the industry is willingly introducing a common external risk to everything

mirekrusin•about 16 hours ago

True, but they will not want to work for you anymore, they'll want to work for company that provides it.

Gigachad•about 13 hours ago

Limiting token quotas would be fine. Encourage developers to use efficient models, plan the work first, and to not burn thousands of GPU hours on waste.

It's much like when developers would waste tons of money on AWS spinning up massive test VMs and leaving them running without care. Until the finance people cracked down on it.

npn•about 17 hours ago

we all know it is an impossible goal to meet. surely AI will be even more useful in the future, but as long as china exists and continues to undercut the price, the goal will never be met.

> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.

with that much money, the companies can easily buy their own hardware and host free public models, no need for those expensive subscriptions.

ar_lan•about 18 hours ago

> unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

Simple - you make them work 2x, 5x, or 10x more hours.

OtomotO•about 18 hours ago

There are not enough hours to do that

solenoid0937•about 18 hours ago

> 20% if you're a developer. That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

Of course it will. The value of an employee is a multiple of what they get paid.

If you pay an employee $500k and they make $2M for your company (like Meta), then of course a 20% increase for the salary is justified if the velocity is increased 20% as well.

lunar_mycroft•about 18 hours ago

The difference between what the employer makes per employee and what they spend in compensation doesn't matter. If the increase in productivity isn't greater than the increase in cost, there isn't a reason to pay for AI over hiring more developers.

Imagine an employer with 10 employees paying $500k per employee and making $2M per employee in revenue (to use your numbers). They could hire two more employees and spend an extra $1M (+20%), but make an extra $4M in revenue (+20%). Alternatively, they could buy all ten employees a $100k AI subscription, for a total of $1M extra spending (+20%) but an extra $4M in revenue (+20%). You'll notice both scenarios are identical, so an employer optimizing for profit would have no reason to prefer one over the other.
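The two scenarios above can be checked with a short sketch. It uses the comment's numbers and its simplifying assumption that revenue scales linearly with headcount or with the productivity uplift; the function and parameter names are hypothetical.

```python
# Compare "hire two more people" with "buy everyone an AI subscription",
# using the figures from the comment. All amounts are whole dollars.


def profit(employees: int, revenue_per_employee: int, salary: int,
           tool_cost_per_employee: int = 0, uplift_pct: int = 0) -> int:
    """Revenue minus compensation and tool costs.

    uplift_pct is the percentage revenue increase attributed to the tool.
    """
    revenue = employees * revenue_per_employee * (100 + uplift_pct) // 100
    cost = employees * (salary + tool_cost_per_employee)
    return revenue - cost


baseline = profit(10, 2_000_000, 500_000)
hire_two = profit(12, 2_000_000, 500_000)
buy_ai = profit(10, 2_000_000, 500_000,
                tool_cost_per_employee=100_000, uplift_pct=20)

print(f"baseline: ${baseline:,}")
print(f"hire two: ${hire_two:,}")
print(f"buy AI:   ${buy_ai:,}")
```

Both options land on the same profit under these assumptions, so any preference has to come from factors outside the model, such as hiring overhead or how reliable the uplift is.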

chasd00•about 17 hours ago

There’s a lot of relationship and culture management overhead involved when adding 2 more people to a 10 person company. I think any business leader would take the productivity speed up from buying a tool over hiring more people and integrating personalities/habits/viewpoints into an existing established culture any day of the week.

nl•about 12 hours ago

> They've got, ballpark, $5t to $10t to make back in the next 5 years, or the hardware buildouts will start getting written down.

I find it disappointing that a completely wrong statement like this ends up the top comment on HN.

It is wrong in the math, in the logic about public markets, and in its understanding of accounting.

> $5t to $10t to make back in the next 5 years

I don't know where this number comes from, but it has gone unchallenged.

OpenAI and Anthropic combined have raised around $100B. This is an investment, so it isn't something they have to "pay back" from earnings - instead investors expect to make that back from the share price being higher than what they paid for it.

> or the hardware buildouts will start getting written down.

The hardware buildouts get written down anyway!! That is a good thing for investors because as the value gets written down they can book a tax loss. And it turns out that the generally agreed depreciation schedule for GPUs (used to be 3 years, now 5 years at places like Coreweave) is still too conservative, since GPU rental prices for 5 year old chips are higher now than when they were new (!!)
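To illustrate how the 3-year and 5-year schedules differ, here is a minimal sketch using straight-line depreciation. The thread does not say which depreciation method these companies use, and the $30,000 purchase price and zero salvage value are hypothetical.

```python
# Straight-line depreciation: the asset loses an equal amount of book value
# each year until it reaches zero at the end of the schedule.
# ASSUMPTIONS: straight-line method, zero salvage value, hypothetical price.


def straight_line_book_values(cost: float, years: int) -> list[float]:
    """Book value at the end of year 0, 1, ..., years."""
    annual_writedown = cost / years
    return [cost - annual_writedown * year for year in range(years + 1)]


GPU_COST = 30_000.0  # hypothetical purchase price in USD

three_year = straight_line_book_values(GPU_COST, 3)
five_year = straight_line_book_values(GPU_COST, 5)

print("3-year schedule:", three_year)
print("5-year schedule:", five_year)
```

Under either schedule the book value reaches zero by design, which is the sense in which the hardware gets written down regardless of how the business performs.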

All of this makes the rest of the math in the comment incorrect by at least an order of magnitude and under some scenarios possibly 2 orders of magnitude!

That's not a small error!

overgard•about 15 hours ago

One thing I genuinely don't understand is these companies are constantly taking in incredibly large amounts of investments, so presumably they're giving up large chunks of equity or these are loans that need to be paid back or they're committing to spending obligations they're very unlikely to be able to meet.

So besides the insane hardware buildouts you're correctly mentioning, I don't understand how anyone that invests in these companies is supposed to make their money back in any sort of reasonable timeframe?

The cynical part of me is looking at what happened to the NASDAQ rules recently where essentially index funds are going to be forced to buy SpaceX shares much earlier than they previously would have (ie, before the price has a chance to reach its real valuation). Which, um, I'm guessing these stocks are going to drop pretty hard when people start looking at the financials of these companies.

My suspicion is that the point of these IPOs is essentially to dump the bill on the unwilling public by forcing various institutions to buy it (ie, your 401k or pension is buying this shit), and maybe their investors can squeeze some money out of this before the stocks reach an equilibrium that's probably like 1/10th of what they're "valued" at.

logtempo•about 18 hours ago

> +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

Except that if your company goes 20% faster than the other companies, you win market share. But then everyone will use the same tools and companies will be at even speed, but the tool will stay.

Now...if the market is saturated, it's useless to try to do things faster. Cheaper yes, but not faster.

archagon•about 17 hours ago

Pretty much all major tech companies today are horribly bloated and mostly metastasizing instead of innovating. I'm not sure how 20% increased productivity will help in any way with that. If anything, it might accelerate enshittification and turn potential customers off even more.

Wowfunhappy•about 14 hours ago

...does anyone have a guess as to the total amount of money spent on software developer salaries each year? What percentage of that would the AI companies need to capture to be profitable?

(I'm not trying to imply that LLMs can replace software engineers, it's just an interesting comparison. If nothing else, I suspect that if the cost of development goes down, demand for custom software will go up.)

ciconia•about 17 hours ago

> make developers 2x, 5x, 10x as productive on stuff that matters

What does this even mean? Is this about speed of development? Is this about headcount? LoC? How are coding agents contributing to productivity in places like GitHub, Shopify or Meta? I mean companies that already have an established product. I really wanna understand this because I'm not seeing that GitHub's product suddenly became so much better than it was 2 years ago, so where's all that productivity going?

zamalek•about 17 hours ago

The productivity is going into perverse incentives[1], e.g. we have improved (by which I mean "increased") token use. More PRs every day. More lines of code. All things we knew were shit-brained metrics a decade ago (obviously except token use).

We've also increased how much our coworkers need to read, or deal with. You can get an AI to make any point you want, so you can ignore the 5 humans raising alarms due to the 1 clanker you made say what you want to hear.

All numbers going up.

There are obviously people producing additional true value with it, probably, but that's almost certainly scarce.

[1]: https://en.wikipedia.org/wiki/Perverse_incentive

flexagoon•about 17 hours ago

Productivity is measured in the number of AI-generated Twitter posts developers can make about their AI-generated startups

deaton•about 18 hours ago

Bigger than that, they have to contend with open weight local inference. Open weight models right now haven't caught up to the frontier models of right now, but they're as good as the frontier models of not too long ago. If open weight models reach a certain point, then frontier model providers are going to struggle to make anything selling tokens, because eventually people will realize they don't need Mythos for everything.

richardw•about 14 hours ago

I assume the bet is that as you swap humans for machines, this pays for itself. Swap entire devs and teams and frankly, managers, and you make up a lot of 5%’s fast.

If it works. And I’m not sure who is going to buy the stuff the machines produce, but shrug. Presumably some bots click ads for NFT’s that other bots generate.

amelius•about 17 hours ago

At least they're not going to make us watch ads.

aprdm•about 18 hours ago

"Next 5y" doesn't apply to AI factories

jatora•about 10 hours ago

imo if your developers arent at least 2x as productive, then something is being done wrong on the employees part and/or the organization's. cli tools are ridiculously powerful provided you were an actual developer before using AI.

Maybe it's just me being (trigger warning from me providing an honest self assessment) very intelligent + a generalist, but i went from only full stack webdev and .NET to being able to implement an end-to-end LLM training pipeline (data curation, tokenizer, pretrain, sft, DPO - using ~$100 in cloud compute to train a class-competitive 1B STEM model)...and a full economic financial modeling and quant analysis application that pulls up-to-date economic, news, and stock data from the entire world and uses Dagster to orchestrate technical indicators and fundamentals and signals... and i did these things for learning and for fun. i built my own sublime text and obsidian replacement. i built my own reddit/twitter/hackernews/substack/news aggregator. i built countless other useful tools and utilities for me personally, and for work I build more that empowers multiple departments.

Ive built 2 browser games, one already released to great reviews and 100k+ hours played. Ive built a tool on top of claude code that does ~60% of my job. Ive run data analysis on company financials for forecasting that have been refined and are producing very accurate predictions. Ive built competitive analysis tools and trackers.

All of this in 3 years. The projects are all clean, documented, with great code practices and modularity. A purist would surely consider some of the code slop. But it all works completely and fills real needs.

This is a huge shift. Anyone not realizing it yet is just simply behind the curve. I would not have accomplished 1/10 of this without AI coding. I went from copying code into and out of browser chats for 2 years before getting on the CLI train, and it is absolutely ridiculous the ROI you get from subscriptions to Claude or Codex.

pryce•about 14 hours ago

I understand some startup deciding to take a punt on "this will all work out financially if our new product demonstrably boosts productivity of large sectors of the economy by a breathtaking factor that has very rarely happened before in history: 2x." Sometimes a plucky group of people takes a risk and it pays off. If it doesn't work, the company fails.

What I do not understand is large sectors of the economy all simultaneously taking this punt, with the necessary productivity boost being, as you say, far more like 2x, 5x, 10x.

rsalus•about 10 hours ago

they need to make 5t-10t back, but not necessarily through selling tokens. as we can see, the frontier labs are making vertically integrated products. their revenue is no longer strictly tied to inference.

PunchyHamster•about 17 hours ago

That's assuming that once they start squeezing, people won't just go to deepseek or other cheaper competition

> That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

And most research shows people far over-estimating their own gains. Once companies start counting the actual (and not just reported) gains, the AI budgets will be more limited as people realize it's a useful and versatile addition but not a replacement for most types of work

> We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.

Upswing of the hype cycle while growth of the tech itself is flattening, both coz of the tech's innate issues (which might or might not be solved, but some papers claim they are unsolvable with the current approach) and just the fact that the spike in growth caused such a high economic cost that it put the brakes on itself.

Gigachad•about 13 hours ago

There's a lot of workslop pumping the numbers. People can generate a 300 page PDF in a tiny fraction of the time it would have taken, but now the report is full of mistakes and fluff, and the stuff that would have been learned and caught in the process of making the report is now not happening.

PunchyHamster•20 minutes ago

and the recipient pulls that into an LLM and generates a summary.

It's lossy compression for thoughts at this point

jauntywundrkind•about 9 hours ago

Given what costs are and availability of parts, that 5 year write down is not in practice going to be the case. Maybe tax wise perhaps but especially for big fancy expensive multi million dollar 100-500kW racks these things are going to stick around for a while, I think.

EGreg•about 19 hours ago

Here is a serious question.. Can we sell into the hype cycle and on the way down with this: https://safebots.ai/costs.html

adithyassekhar•about 19 hours ago

I asked claude to generate a frontend and it made the same template. Same sans serif and serif fonts together. Same colors. Same typography. Same layout and animations even. It’s wild how similar it is. No, not similar, it’s the same damn thing.

dd8601fn•about 18 hours ago

I’ve seen the same dashboard for a dozen custom web applications now, including a couple I had it make for me.

It really does have a particular lane for each chore, and it’s reproducible.

jeffreygoesto•about 18 hours ago

It produces the "most average" web design unless you really prompt your way out, doesn't it? If you don't care enough to prompt, Claude does not care to be individual.

cortesoft•about 15 hours ago

I don’t think these numbers are accurate? It seems to ignore the fact that the models have cache for ongoing sessions, which means you (normally) aren’t actually sending all those tokens on every request… you only need to if you go too long between requests.

superxpro12•about 17 hours ago

It's going to be a typical saturation curve. A lot of upfront tokens spent on things that have stockpiled over the years, and then the derivative on token spend trends to zero as the users run out of immediate things to try. Sure there will be ongoing maintenance and experiments, but it won't be nearly as large as the initial inrush.

notepad0x90•about 11 hours ago

consider cloud spending vs on-prem before the great cloud migrations. people are spending a lot more for cloud services now.

I hear conflicting things about finances, some have a different opinion, that it won't be written down so long as more funding comes in and revenue keeps increasing. it isn't like how you take mortgage or business loan, it isn't even a loan it's an investment funded by loans. So long as the investment is still promising, what are they going to do? destroy its value by calling in trillion dollar loans?

cyanydeez•about 13 hours ago

if you ignore all catastrophic mistakes, these numbers are true

YetAnotherNick•about 19 hours ago

> $5t to $10t to make back in the next 5 years

Wait what? They spent 2 orders of magnitude less on hardware.

trjordan•about 19 hours ago

From the verge: https://archive.is/kU4Zg

> Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029, which is close to $2 trillion per year by the end of the period. In order to achieve “historic returns,” the providers would need to earn nearly $8.2 trillion in the same period.

YetAnotherNick•about 19 hours ago

Those numbers don't even track within the same sentence. If it is $2T/year by the end of 2029, it would be something < $6T cumulative in 3 years.
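The bound being argued here can be sketched as follows. The 3-year window is the commenter's reading of the forecast; the quote itself only says "through 2029", so the window length is treated as a parameter, and the $0.5T starting revenue in the ramp example is a hypothetical value.

```python
# If yearly revenue never exceeds its final-year level, the cumulative total
# is at most peak * years. A linear ramp gives a lower, more specific figure.
# ASSUMPTIONS: window length and starting revenue are illustrative choices.


def cumulative_upper_bound(peak_per_year_t: float, years: int) -> float:
    """Largest possible total if no year exceeds the final-year revenue."""
    return peak_per_year_t * years


def linear_ramp_total(start_t: float, peak_t: float, years: int) -> float:
    """Total revenue when yearly revenue rises evenly from start to peak."""
    if years == 1:
        return peak_t
    step = (peak_t - start_t) / (years - 1)
    return sum(start_t + step * i for i in range(years))


PEAK_T = 2.0  # "close to $2 trillion per year by the end of the period"

for years in (3, 4):
    bound = cumulative_upper_bound(PEAK_T, years)
    ramp = linear_ramp_total(0.5, PEAK_T, years)
    print(f"{years} years: upper bound ${bound:.2f}T, linear ramp ${ramp:.2f}T")
```

With a 3-year window the quoted $7T cumulative figure exceeds the upper bound; with a 4-year window it fits under the bound but still sits above the linear ramp used here.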

b0r3dthisD4y•about 18 hours ago

The numbers are made up political correctness anyway.

Everyone's agency is 100% captured by belief in Wall Street. Too few <50 have any meaningful labor skills to blink.

We'll continue to have consent manufactured via media platforms and in 3 years no one will bat an eye at these companies being worth $12 trillion as Altman and Musk climb two ladders holding a "mission accomplished" banner.

mannanj•about 17 hours ago

One quick question. Did tax payer money fund these data centers? If so, how does that money translate to their profit and a return for the people whose work paid for the resources?

Or did we just get scammed?

HDThoreaun•about 19 hours ago

Source on 200 million knowledge workers worldwide? My understanding is that it's just above 1 billion. I don't think a billion subscriptions at $1000/yr is out of the question but it might take a decade to get rolling

swatcoder•about 18 hours ago

You're suggesting that 1 in 8 people worldwide, including every one from infants and the elderly, are knowledge workers. Are you sure that's what you mean?

I'm not even sure that 1 in 8 people I know would qualify as a knowledge worker, let alone a knowledge worker that might profoundly benefit from on-the-horizon AI. And I'm in a highly skewed population.

WarmWash•about 17 hours ago

I think the underestimation is how many people want a personal knowledge worker in their pocket, and are willing to pay ~$65/mo for it.

HDThoreaun•about 18 hours ago

Well around 40% of people work. I dont think its crazy to say around a third of jobs are knowledge jobs, but what do I know
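The back-of-envelope estimate in this exchange can be written out as a sketch. The 40% and one-third shares are the commenter's own guesses; the 8 billion world population is a round-number assumption consistent with the "1 in 8" reply upthread.

```python
# Back-of-envelope estimate of the number of knowledge workers worldwide,
# and the revenue implied by a billion subscriptions at $1000/yr.
WORLD_POPULATION = 8_000_000_000  # ASSUMPTION: round figure
EMPLOYED_SHARE = 0.40             # "around 40% of people work"
KNOWLEDGE_SHARE = 1 / 3           # "around a third of jobs are knowledge jobs"

knowledge_workers = WORLD_POPULATION * EMPLOYED_SHARE * KNOWLEDGE_SHARE

SUBSCRIPTIONS = 1_000_000_000     # "a billion subscriptions"
PRICE_PER_YEAR = 1_000            # "$1000/yr"
annual_revenue = SUBSCRIPTIONS * PRICE_PER_YEAR

print(f"estimated knowledge workers: {knowledge_workers / 1e9:.2f} billion")
print(f"implied subscription revenue: ${annual_revenue / 1e12:.0f}T per year")
```

The result is only as good as the two shares, which is what the replies are disputing.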

hibgymnb•about 16 hours ago

A billion subs at 1k a year????

I see a lot of out of touch takes here but this might take the cake

rootusrootus•about 18 hours ago

A billion? Really? At 200M you’re already including a lot of people that stretch the definition of knowledge worker.

HDThoreaun•about 18 hours ago

> At 200M you’re already including a lot of people that stretch the definition of knowledge worker.

How do you know this? Im certainly open to recalibrating my numbers which is why I asked for the source

naravara•about 18 hours ago

A lot of those ‘edge cases’ in the definition of “knowledge worker” are probably the stuff that’s most likely to have significant parts of the work augmented or replaced by AI agents. Like, call-centers are almost certainly going to get turned over in a big way. It’s not like the median tier-1 support operator just reading off a script is much better than an LLM anyway.

esseph•about 18 hours ago

Yeah, just looked into this. Knowledge workers is a big group and probably much larger than you think it is.

Basically if you're not doing manual labor, it's probably knowledge work.

Roughly 1/3rd of the working population.

Some data tucked in here: https://gist.github.com/danielmiessler/2dc039762a202b083753b...

AndrewKemendo•about 8 hours ago

You're severely underestimating the idea that people are just not going to use developers for certain things in the future

For example I don’t anticipate somebody making a living off of making websites ever again

Somebody with absolutely no technical experience who needs a website for their business can now make one with almost no money whatsoever.

That’s good enough for their business, and the code can be totally shit and it does not matter because it’s meeting their business objectives. I am seeing this in the wild and I’m paying money to companies that have these types of websites, and it doesn’t matter: I don’t need the website to work perfectly on all my devices. All I need to be able to do is pay them through the website, which is what they need me to do, and our transaction is done.

Don’t forget ultimately the people who pay technologists right now are primarily advertisers

work on hard problems is going to continue to be some tiny fraction percentage of the software engineering discipline

just expect a total bloodbath because the goal isn’t developer productivity the goal is that “I don’t need to pay somebody $200,000 a year to build a website authoring tool like WordPress.”

simonw•about 8 hours ago

Why would a small business use a coding agent to build a custom website when they could use something like Squarespace or Shopify with prebuilt templates that mean they have to know even less than if they were to use some kind of chat UI?

AndrewKemendo•about 7 hours ago

Cause it’s still easier and cheaper apparently

This is the most recent example I found last week for a local barber:

https://news.ycombinator.com/item?id=48166050

They seem to be using Manus: https://manus.im/

And my other assumption is that it immediately integrates with IG/Facebook which is where they do a lot of their marketing

I see no reason that trend is going to slow, especially if you can go to meta to manage your entire business marketing.

Regular people running businesses just want fast, cheap and good enough.

BoorishBears•about 14 hours ago

> +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.

I'm increasingly realizing this math is wrong, because LLM use is really sticky.

If Anthropic 100x'd prices tomorrow for their best model, so some companies offered 50% salary to keep 100% of your AI usage:

a) There are programmers who would take this deal. They've gotten to the point of doing what feels like even less than 50% of the work, developers were already pretty well paid, so they'll take it.

b) There are companies that'd offer this deal. Even if the only people who are taking this deal are not the best engineers, and the AI output is not the greatest, I think the last 6 or so years have seen a lot of companies realize capitalism is not as competitive as it seems.

They're not worried about putting out a worse product because... frankly, what else are you going to do? CF lay a bunch of people off, support gets awful: well you're probably not building a new Cloudflare in the next few years.

In the meantime the AI will get incrementally better, their market share will grow, and you won't be able to compete without taking the same faustian bargain.

Maybe I was just naive but it's making me realize how much we take for granted in the world. Both the quality and relative value of things don't have to go up over time. Quality can go down while prices go up, and nothing will really stop it. Competition should stop it, but competition is really slow and can be interfered with. And as prices go up competition gets really hard.

TacticalCoder•about 16 hours ago

> We're not there yet.

And that's not considering that capitalism is going to do what it does best: if they really found a way to be profitable, competitors are going to fight them on pricing. The margins of Anthropic, OpenAI, Google, etcetera are their competitors' opportunities.

It's not as if there weren't chinese models nearly SOTA. Don't know where the french (Mistral) are but they may try to get in the game if there's a way to be profitable (not that France or the EU for that matter are relevant in anything tech or had any tech company besides ASML and SAP in the Top 100 but who knows).

cryo32•about 18 hours ago

This is never going to materialise. It’s dead in under 2 years.

The market is shrinking and saturated already and it’s not because of AI gains but geopolitical instability and supply chain issues, some of which are caused by AI spending and stupid ass PE firms refocusing on AI supply chains.

Only our pensions and futures burning.

aspenmartin•about 18 hours ago

What do you mean by the market is shrinking?

cryo32•about 17 hours ago

Literally revenue is collapsing in most sectors. Technology purchasing is declining. Service models are failing to turn a reasonable ROI.

People stopped buying shit.

packetlost•about 17 hours ago

It's consolidating into fewer, higher value assets. Over 40% of the S&P500 is in companies that are heavily (potentially over) invested in AI.

noddingham•about 15 hours ago

I feel like there's a bit of AI psychosis in this particular post.

>"These are tools which burn vastly more tokens, but are also quickly becoming daily drivers for the work carried out by extremely well-compensated professionals."

>"Somehow this fragment turned into headlines like Uber’s COO says it’s getting harder to justify the money spent on AI tokenmaxxing, because the market for stories about AI failures remains enormous."

Yes, it's just the yearning for AI failures. It couldn't possibly be runaway costs, record revenues, and massive layoffs. It couldn't possibly be that these tools are lighting dollars on fire by people already paid significantly well and not producing any increase in "value" for it (I recognize that output is 100x but outcomes are flat by all measures).

[1] https://cmr.berkeley.edu/2025/10/seven-myths-about-ai-and-pr... [2] https://futuretech.mit.edu/publication/crashing-waves-vs-ris...

MichaelDickens•about 11 hours ago

Being pedantic, but I don't want to lose the meaning of the term: "AI psychosis" doesn't refer to someone who thinks AI is really good. It refers to someone who develops symptoms of psychosis from talking to an LLM, e.g. believing they have developed a new Grand Unified Theory of physics.

mold_aid•about 1 hour ago

I don't know, "workaday professionals will find $200/month a particularly good deal, such that there will be widespread adoption" sounds either credulous enough to support the diagnosis or dishonest enough to dismiss. I am a "knowledge worker" who is doin' fine, has a lot of templated written work/report writing, and there is no way in hell I am justifying that kind of spending to my boss or my family.

kazga•30 minutes ago

"I firmly believe this technology will create business value" is so obviously and categorically different from "Humanity has birthed a silicon god that I have also developed romantic feelings for" that I'm not sure if your comment is even trying to be in good faith

spacechild1•about 2 hours ago

Please let's not dilute the meaning of 'AI psychosis'. It is a real phenomenon that involves actual psychosis.

simonw•about 15 hours ago

What's the psychosis?

yokoprime•about 14 hours ago

Sometimes it feels like there's this opposite AI psychosis, where anything AI is bad and boils the ocean, takes our jobs and makes RAM expensive. It's a component in the current economy, but things like tariffs, closing the Strait of Hormuz etc. are equally bad for the economy. Anyway, I just find it strange to be so militantly anti a certain tech.

dmix•about 14 hours ago

That’s the modern internet. What sells is the most overdramatic doom and gloom take possible.

girvo•about 1 hour ago

> takes our jobs and makes RAM expensive

I mean it is doing both of those, so that's fair to be honest.

lelanthran•about 4 hours ago

Hi Simon; while it's true that the token providers have found that their product is 90% useful to devs and 10% useful to everyone else, this is something they found out in the first quarter of 2025 anyway.

It's not exactly news, is what I'm saying. And even with the PMF they found, the product is still only a commodity, i.e. `tokens`, which is what every other provider on the planet is also providing.

All their other products boil down to "harnesses", which does not look viable as a product in the sense of PMF - you cannot sell it, you cannot lock it to your own subscription, API, etc. so you can't use it to generate revenue any more than the free harnesses do.

PMF has a specific meaning, and "code harness" or "coding model" does not satisfy the commonly accepted meaning. Maybe Mythos (or similar) will.

brazukadev•about 13 hours ago

> because the market for stories about AI failures remains enormous

How enormous? 1 trillion dollars, 2, 10 trillion enormous?

shimman•about 13 hours ago

That leaders are completely fine with impoverishing vast swaths of American workers because of "progress?"

simonw•about 13 hours ago

If they are then yes, that's psychotic. Not sure how it's relevant to my article about Anthropic and OpenAI's enterprise pricing though.

aspenmartin•about 12 hours ago

I’m so curious: let’s say you’re the president or the CEO of a major tech corporation building frontier AI systems.

What are your directives?

mock-possum•about 7 hours ago

You can’t just redefine ‘AI Psychosis’ to mean ‘somebody whose opinion about ai I disagree with’

aerhardt•about 18 hours ago

I find this analysis confusing. PMF for coding was likely reached some time last year. Profitability, which is different, we don’t know. The article kind of confuses both without making a strong economic case or using numbers in a compelling way. I don’t understand what the Uber case has to do with this either. The Uber COO clearly said that at least in terms of ROI he’s not seeing the results either.

My take is the product has been very useful for coding (PMF) for months. But it’s certainly not useful at any cost…

aspenmartin•about 17 hours ago

What I also find confusing though is that folks seem to ignore trajectory which is maybe the biggest lede to bury. As Simon says, we have had "good enough" coding agents for 6 months, that is a blink of an eye, and at my company my job has now completely changed. It's almost like a dream.

And that's just one inflection point. We've had several and there are many more on the horizon. So while I could be convinced that ROI is maybe not even positive today despite the ridiculous enterprise spend, it's perfectly rational to pave the way today for what's coming over the next few months let alone years down the line.

plaidfuji•about 11 hours ago

There may be additional major leaps forward, and there may not. I kind of struggle to imagine what the next step actually is. Certainly there will be improvements in performance (speed) and cost. But at a point you reach a barrier where the limiting factor is the specificity of the human prompt and our ability to manage all the code we’re generating.

Somewhat oversimplifying; writing software and building apps was a bottleneck - now it is not. What is the next bottleneck that LLMs can solve? Is there one? And is there enough publicly available data to solve it repeatably at scale? Or did we just automate stack overflow searches and now we’re stuck again?

Or is the endgame of this innovation cycle the complete removal of interaction with machines through code? Will we simply interact with machine coworkers purely through natural language? Can an LLM make PowerPoint slides and run a meeting? So far not seeing much progress on that.

vidarh•5 minutes ago

I am currently eating lunch. Meanwhile Claude is triaging and writing reproducers for 70+ tickets nobody has had time to look at. Next it will attempt to fix them. I have not read the tickets. I will not look at the code until there are review-ready PRs and a code review bot has done the first pass.

In other words, most of the prompting will also go away.

pas•about 2 hours ago

Based on how much money is chasing returns, and how steep the slope is, it's almost certain that we are still not at the end of this sigmoid cycle.

Sure, it might start to slow down, but even then we will likely see a doubling in the next 10-15 years.

https://substackcdn.com/image/fetch/$s_!_ZW2!,f_auto,q_auto:...

dcre•about 9 hours ago

Judging from the fact that the Opus 4.5 inflection point was not really anticipated, and we still don’t really know what threshold was crossed that suddenly made agentic coding accessible to so many more people, I think it’s safe to say we don’t know what the thresholds will be until they’re crossed. The fact that we don’t know exactly what they’ll be isn’t a good reason to think there won’t be any more.

Oras•about 4 hours ago

yeah but if you have to pay $2k to $3k per month, would you still use it?

sixhobbits•about 18 hours ago

Pmf is this weirdly defined thing where "if you're not sure you have it then you don't".

I think it was clearly useful for months to people who had tried it and taken the time to understand it, but now that knowledge has spread to the point where wallet holders are convinced it's not just a passing fad or hype, so now pmf can be "claimed".

I agree it's weird to say "those people have pmf" though, usually it's something you define for yourself

timmg•about 13 hours ago

> Pmf is this weirdly defined thing where "if you're not sure you have it then you don't".

I'm not sure if this runs counter to your point or not, but: I don't see any future where LLMs aren't a core part of Software Engineering. The horse is out of the barn. There is no going back.

theschmed•about 13 hours ago

Yeah but the product is not “LLM” it’s “proprietary frontier model LLM paid by the token”.

And I don’t even necessarily disagree with OP! It’s more like the competition is shifting so quickly that your competitors could undercut your PMF in a blink of an eye.

airstrike•about 13 hours ago

True but that is maybe 5% of what is being promised by the average booster

repeekad•about 15 hours ago

> clearly useful for people who took the time to understand it

people -> programmers, I haven’t met a non-developer who reports getting more time out of current AI platforms than they put in. If anything I’ve anecdotally heard the opposite, introducing AI at work creates so much slop (output) it takes more time to process it all without a tangible bump in overall productivity

AndrewKemendo•about 8 hours ago

I have at least a half dozen examples of people not hiring people or buying other tools/subscriptions because they built their own with Claude

squeegmeister•about 16 hours ago

The article also treats the word "good" as load-bearing in a way that should have you questioning their analysis:

"I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done."

aspenmartin•about 12 hours ago

Yet it’s backed up by adoption across the industry

nozzlegear•about 9 hours ago

MongoDB was once backed up by adoption across the industry. Or for a more recent example, blockchain took off like wildfire across the industry before ultimately fizzling out in all but the most niche applications.

Not saying this trend will do the same, just that the industry adopting something doesn't guarantee its success.

0xDEAFBEAD•about 5 hours ago

And what happens when open models catch up in 6 months or so?

grttq•about 14 hours ago

Correct, the cost is part of the economics.

That's why most here shouldn't engage in the discussion - they parrot on about benefits without identifying and articulating the costs and, moreover, how it affects the firm's financial position.

righthand•about 18 hours ago

It’s not supposed to be logical, it’s an LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry. Read any/all of the other posts and you won’t find much skepticism but you will find a lot of shilling how great it all is.

aerhardt•about 17 hours ago

I like his other posts. He's bullish on AI, which is fine. I'd like to read a mix of bearish and bullish level-headed takes from people who are subject matter experts. His technical credentials are well past discussion - I love Django, and he comes across as a pretty upbeat but level-headed guy. Certainly beats radical takes in either direction from people who have no clue what they're talking about. It's just this article that I find rather confusing.

simonw•about 17 hours ago

The thing that matters most to me is if reading what I wrote teaches you some new things and gives you something useful to think about.

If I make an argument and you disagree that's fine with me, provided I didn't use misinformation or sloppy thinking in making that argument.

simonw•about 18 hours ago

308 posts on AI ethics: https://simonwillison.net/tags/ai-ethics/

52 on AI misuse: https://simonwillison.net/tags/ai-misuse/

149 on the unsolved challenge of prompt injection: https://simonwillison.net/tags/prompt-injection/

40 on slop: https://simonwillison.net/tags/slop/

If you want an "LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry" there are plenty out there. I'm not one of them.

saulpw•about 16 hours ago

People are confusing "excitement" with "evangelism". Your blog is definitely on the pro-AI side of things, but as you say, it's not one-sided or uncritical.

alexchamberlain•about 17 hours ago

I think you should highlight your exemplary pre-AI writing too.

csomar•about 17 hours ago

All of these are about AI misuse, not skepticism of AI. By skepticism I mean doubting whether AI actually delivers on its promises which, based on this last post, sounds like something you think we're already past.

Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products.

binary0010•about 19 hours ago

So how do openai and anthropic plan to keep customers when GLM-5.1 is just as good and open source and a lot cheaper?

I don't see the business model working. My closest friend actually does automation software for large companies.

He does not use Claude or openai at all. He primarily uses gpt 120b on cerebras and glm-5.1 for heavy thinking work. And some other small models for various tasks. All open source.

And these systems are extremely useful for the businesses and are able to run fully automated pipelines that are very stable and fast.

We discuss this a lot, and we both think any business doing heavy agentic work on Claude and openai just isn't aware of exactly how good and cheap open source has gotten in the last year.

So... once the legacy businesses and developers catch up, won't Claude and openai be unable to recoup their costs?

mesmertech•about 19 hours ago

For coding you always want to go with the best model in the category, not something that would be the best model if we went 1 year back which GLM 5.1 is, and I'm saying that as a big fan of GLM cause I run a translation site where GLM is good enough for the price.

Most of the money right now is in coding. OpenAI and Anthropic just have to be 6 months ahead of SOTA open source models and they'll capture most of the enterprise and dev market.

binary0010•about 19 hours ago

Yes I'm an engineer (20 years most in games/graphics industry) and only use it for code. I've been using glm 5.1 this week a lot. I went in expecting another "decent" but not really "up to standard" open source model.

I highly doubt I'll ever use Claude again.

I think you are wrong about Claude being any significant level better

cassianoleal•about 18 hours ago

I've been mostly coding with GLM-5.1 as well and I agree with you. DeepSeek V4 Flash is another very good surprise. Incredibly cheap, fast and effective.

aspenmartin•about 12 hours ago

Well I think there are a multitude of harder measurements that would disagree with you, but ultimately there is absolutely a use case for cheaper open models (or even cheaper tiers of proprietary models) and in fact the unsolved optimization everyone is trying to get to is how much spend to use for a given task. But there will always be a market, especially in enterprise, for the best performance there is to offer

RevEng•about 9 hours ago

I strongly disagree. I'm an engineer - I'm all about the fastest, cheapest thing that meets the requirements. I don't need Opus 4.7, even for my complex programming tasks. It costs over 10x other models available that still give good enough answers. Those smaller models are also a lot faster to output tokens, which saves me time.

Once the model gets good enough, the returns on bigger models diminish quickly. I don't want to spend 10x the money and wait 5x the time to get answers that are equivalent.

yokoprime•about 6 hours ago

Same here, i can't say i've seen any difference in 4.6 vs 4.7 other than price

odie5533•about 17 hours ago

If I generate code with Claude, ChatGPT, and GLM 5.1, I can't say which model is which reliably. I exclusively use Claude more out of superstition than reason.

kgwgk•about 19 hours ago

For coding like for everything else in life cost is a factor.

mesmertech•about 18 hours ago

Cost for the value delivered. Like if you offered the current SOTA open source models at $0.1/M, I still think I'd be using Opus or 5.5 at $30/M. Or say GPT 5 which was released Aug 25, I don't think I'd use it for coding for even $0.1. I'd def find other uses for it (translations, agentic workflows, prompt guards etc), but for coding I don't think I'd ever completely switch to a SOTA open model.

Unless ofc there was an actual speed difference: the only reason I'd be willing to go with a model a couple of percent worse than the current best is if the speed was at least 5x higher. Looking forward to kimi k2.6 offered publicly by Cerebras.

eikenberry•about 17 hours ago

> For coding you always want to go with the best model in the category [..]

And this is why many companies go out of business. You always want the best bang for your buck, sometimes this is the "best model" and sometimes it is not.

lunar_mycroft•about 14 hours ago

> For coding you always want to go with the best model in the category

This is transparently false, because the best "model" is still competent human developers. They're just more expensive. If you're willing to use current LLMs at all, it means you're willing to sacrifice quality for a better price, and your disagreement with the comment you were replying to is entirely about what the optimum tradeoff is.

noname120•about 5 hours ago

It was true 6 months ago, not anymore. Frontier models now outperform developers on many tasks, be it on quality/readability/maintainability, and let’s not talk about speed…

aspenmartin•about 12 hours ago

Well it may be false that you always want the best model, but the point is performance of you+<agent> is far more cost effective than you+someone else

Perz1val•about 4 hours ago

And you propose the same companies that have been cost cutting and avoiding buying you a chair forever won't start objecting to a $200/dev/month subscription? The finance department won't have a say?

solomatov•about 13 hours ago

>For coding you always want to go with the best model in the category, not something that would be the best model if we went 1 year back which GLM 5.1 is, and I'm saying that as a big fan of GLM cause I run a translation site where GLM is good enough for the price.

Currently, the difference is substantial, but what happens if capabilities saturate?

aspenmartin•about 12 hours ago

Then the house of cards comes crumbling down, but there is so much evidence to point to this not happening that it requires a bit of a theory for how that may happen

danny_codes•about 7 hours ago

Why? If it's good enough, it's good enough. Though I read the code that gets vibed so maybe my use-case is different.

yokoprime•about 6 hours ago

It's driven a lot by the harness too. If you're using claude code, you're actively being pushed towards newer models, even though older ones work perfectly fine for your use cases

Andrex•about 18 hours ago

> For coding you always want to go with the best model in the category

Will this always be true? There will never be an event horizon/point of diminishing returns where something not-bleeding-edge is "good enough" for 51%+ of users?

mesmertech•about 15 hours ago

As long as closed source is 6 months ahead in terms of current difference. Although this is hard to figure out using simple percent-based coding benchmarks, you def. notice it when you're actually trying to do a long task. Even simple things like UI "taste" are enough for me to use opus instead of 5.5, even though 5.5 is strictly better for anything that doesn't have a UI, i.e. backend, scripts, making agent workflows etc.

blackjack_•about 18 hours ago

This is a silly take. There is a line of "good enough" for most coding (most CRUD apps and APIs are nothing special), and once we are past that, nobody will care about having the "newest, best" model except extreme outliers. And this base "good enough" model will become an ultra cheap commodity as we already see with GLM, deepseek, etc.

mesmertech•about 15 hours ago

As long as closed models are 6 months ahead I won't be switching from them to open source models that would have been SOTA 6 months ago. Maybe it's just a different calculation if you're in a job, but as an indiehacker I'll take any edge I can get.

Ofc again, I can be convinced to switch if there's a clear speed difference, like 5x+ for an open source SOTA model, even if it only matches the SOTA of 6 months ago.

dogleash•about 18 hours ago

> For XXX you always want to go with XXX, not XXX

Oh, hey, I recognize you. Thank you for the very forward and thorough orbital sander recommendation at Home Depot. That's exactly what I wanted to deal with on my holiday weekend. You just know so much about this and the rest of us are simple passersby.

mesmertech•about 14 hours ago

Yep sorry, was just pulling it out of my rear, not like it's a market trend that nearly every enterprise uses Anthropic or OpenAI models for coding or that Anthropic has had such ridiculous growth that they're 10x-ing year over year.

EGreg•about 19 hours ago

Most work is not coding.

And also, people have it wrong… their models are not the main problem anymore. It’s the RAG

tomrod•about 17 hours ago

Would love to hear more about your thought about the RAG.

obsidianbases1•about 19 hours ago

Depending on RAG is a workflow problem, not an AI problem

peder•about 18 hours ago

> I don't see the business model working.

Same. It's a nightmare from a Porter's Five Forces perspective.

There will be a ton of businesses competing in this space, and there will be something of a moat due to how capital intensive the business can be, but there will still basically be infinite competitors.

Great for consumers.

ex-aws-dude•about 17 hours ago

Well in reality AWS will just host one of them and most companies will use that

Like how snapchat kind of fell off because the feature could just be a subset of instagram

It seems like it would just become a commodity like EC2

bakies•about 11 hours ago

Snapchat is huge and growing

smokel•about 19 hours ago

For coding assistance, I have tried OpenCode with several large open models through OpenRouter. All were fairly bad compared to Claude Opus. Could you provide some hints on how I should be holding these open models so that I might get more value out of them?

I agree with the common trope that open models lag behind by about a year, but something magical happened just around a year ago when the state of the art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.

Note, my application is coding assistance. Open models can be great for other purposes.

tariky•about 18 hours ago

I tried almost all OS models on opencode, and none of them is on the level of opus 4.7.

In my latest experiment I used opus for the implementation plan, then used cursor composer 2.5 for execution.

I must say that combo is really good. The main drawback of claude code is that it is super slow. So when paired with composer, which is super fast, it flies.

cainxinth•about 17 hours ago

No one is claiming that OS is as good. They are saying it isn't that far behind SOTA commercial products. So why pay exorbitantly just to get something only a few percent better than the free option?

But there have been very good open source office apps for decades and few enterprises use them, so perhaps this is just the nature of B2B purchasing committees and 'nobody getting fired for buying IBM.'

slopinthebag•about 17 hours ago

Do more planning yourself, be smart about the context, break down tasks into smaller components, give it more guidance. You can't just lazily prompt it to complete large features autonomously and expect good results.

amilios•about 17 hours ago

But if the closed-source models can do this without the additional effort, that's a significant gap, no?

eikenberry•about 17 hours ago

+1 .. just wanted to reiterate that this is the answer. The open models work great if you just do a little more of the design/architectural work up front and organize your work appropriately.

aniceperson•about 16 hours ago

a good harness is supposed to do what you are describing. sonnet on pi.dev is pretty terrible but fast. Claude Code has ridiculous amounts of prompt engineering at system prompt level and sub-session spawning combined with low temperature, to provide the predictable results people like. CC screws up and you never see it, because the harness auto-corrects, while on OSS you see everything, and it does not come with that level of monitoring by default.

doug_durham•about 17 hours ago

GLM-5.1 isn't just as good. It is no match for Opus running in Claude Code. Please try it yourself. Open source models are about a year behind at least.

jeremyjh•about 13 hours ago

This is profoundly misinformed. I use all three of those models regularly and the difference is just not that big anymore. GLM 5.1 is at least as good as Opus 4.5 - when it’s my dime it’s the primary model I use and switch to GPT 5.5 for planning and review but it’s also very capable at those things. If I had to pay API rates for everything there is no question I would only use GLM 5.1 (and Minimax for exploration tasks).

At work I mostly use Claude Code and a bit of Codex; personal projects are OpenCode and honestly I prefer it.

listless•about 9 hours ago

I would agree here. And in my experience Qwen 27B and Deepseek v4 are also extremely good.

None of them are quite opus, but they are damned close and a no brainer if you care at all about cost.

clhodapp•about 11 hours ago

In the second half of last year, I found that agentic coding with proprietary models (≈ vibe coding) reached the point where it actually speeds up my ability to deliver useful code at work. Before that, AI-based autocomplete definitely helped, but (despite the claims of the people selling AI coding tools) letting an agent author more than a file or so at a time (often a function or so at a time) required a very intricate plan or it would create a mess. Creating that plan or cleaning up the mess would take longer than just doing everything myself.

For me, it feels like widely available open models have recently crossed that same canyon. Are they as good as e.g. late-model Claude Opus? I don't think so. But they have absolutely gotten past the point where they are beneficial. This means that, for me, they are about six months behind.

jeremyjh•about 11 hours ago

Exactly this. GLM 5.1 is the first open model that I thought "actually worked" for agentic coding, which puts it in the same tier as Opus 4.5 - which was where I flipped.

osti•about 16 hours ago

For coding I wouldn't say a year, last year this time claude or gpt definitely weren't able to do what GLM is able to do today, but easily 6 months I'd say.

Not sure about other domains though.

_pdp_•about 14 hours ago

A year behind is still very very very good at this price. ;)

RevEng•about 10 hours ago

I use composer-2 daily for complex programming tasks. It's a fine tuned Kimi 2.5 - nothing groundbreaking. I've even had reasonable success using Qwen 3.5 on my desktop GPU. Opus might be better, but it's certainly not necessary to get good results.

ruben81ad•about 5 hours ago

I have 26 years of experience. I code using GLM-5.1. From time to time I switch to Codex / Claude, and honestly I don't understand why people use Claude or Codex. With the right prompting, GLM is awesome.

locusofself•about 15 hours ago

Don't you need to spend 5-10 thousand USD to run these models that are "as good" as frontier models from 6-12 months ago? I haven't seen a convincing breakdown for ROI of running your own coding models. Especially against a $20 or even $200 plan
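A rough way to frame that comparison is a break-even count in months. This is only a sketch using the figures quoted above ($5-10k of hardware, $20 or $200 plans); the function name `breakeven_months` is hypothetical, and it assumes the hardware price is the whole cost, ignoring electricity, depreciation, and any difference in model quality.

```python
def breakeven_months(hardware_cost: float, monthly_plan: float) -> float:
    """Months of subscription fees that add up to the upfront hardware cost."""
    return hardware_cost / monthly_plan


# Figures taken from the comment: $5k-$10k hardware vs $20 or $200 per month.
for hardware in (5_000, 10_000):
    for plan in (20, 200):
        months = breakeven_months(hardware, plan)
        print(f"${hardware:,} hardware vs ${plan}/month plan: {months:.0f} months")
```

Under these simplified assumptions, even the cheapest hardware figure takes about two years to match the most expensive plan.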

IshKebab•about 14 hours ago

I assume you can run them in the cloud. $5-10k doesn't sound like remotely enough to run a not-shit model locally based on my experience.

e2e4•about 16 hours ago

Agree. Also reasonix with deepseek is super cheap and quality is only slightly worse (in my experience)

mock-possum•about 7 hours ago

Do you have a good source to refer to, to map out migration from Claude code to a cheap setup using small open source models like you’re describing? I’d certainly like to experience how good they’ve gotten.

IAmGraydon•about 16 hours ago

The only way I see it working out for them is if some legislation is passed that eliminates the competition by making it illegal to run local models. They could claim that the models are dangerous and could be weaponized without oversight, or something along those lines.

csomar•about 17 hours ago

They are both (and also spacex) sprinting for IPOs. They know that the opportunity window is closing fast and that advancement in model quality has largely plateaued in the last year. Take as much investor money as you can get away with for now.

prepend•about 19 hours ago

> $2,180.16 worth of tokens for $200

“Tokens” don’t have an intrinsic cost or value. Saying that I used $2,180.16 worth of tokens is like relying on the salesperson to convince me I’m getting a billion dollars worth of pots and pans for $19.99.

I think it’s funny how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.

simonw•about 19 hours ago

I'm not sure what you're pushing back against here.

I spent $200. If I had been paying API pricing it would have been $2,180.16. The article is about how enterprise customers get charged API pricing, which means if I had been employed by one of those companies I would have cost them $2,180.16.

What am I missing?
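For concreteness, the comparison above is simple arithmetic on two numbers. A minimal sketch using only the figures from this comment; the function name `subscription_discount` is hypothetical, made up for illustration.

```python
def subscription_discount(api_cost: float, subscription_cost: float) -> dict:
    """Compare what the same usage costs at API rates vs a flat plan."""
    return {
        "multiple": api_cost / subscription_cost,  # how many times the plan price
        "saved": api_cost - subscription_cost,     # difference in list price
    }


# Figures from the comment: $2,180.16 at API pricing vs a $200 subscription.
result = subscription_discount(api_cost=2180.16, subscription_cost=200.00)
print(f"{result['multiple']:.1f}x the subscription price")
print(f"${result['saved']:,.2f} difference")
```

This only measures the gap between two list prices. It does not say what the tokens are worth, which is the separate question raised in this thread.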

eqvinox•about 18 hours ago

Just because API pricing would've been $2180.16 doesn't mean that's the value of those tokens. For starters, you personally probably wouldn't have paid that. But also, sales price isn't value. This is like saying, oh, I saw this bar of gold somewhere for $10000 but got it here for $1000! So I got $10000 worth of gold for $1000! - no, the value of that gold is determined by its weight, which wasn't even mentioned.

We have no market convergence on tokens yet (and it'll differ between LLMs), so it's impossible to say what value you got for your $200.

aspenmartin•about 17 hours ago

He's saying he's getting a great deal...a token from Opus on Claude code is the same as a token from Opus on the API. I remain as confused as Simon. He's not talking about "here's the ROI I got from my $100 subscription" it's "here's how much I saved from getting the monthly subscription instead of sending things through an API".

jonnat•about 15 hours ago

No, value is determined by what participants in a market are willing to pay for something. The only reason you are able to say that the value of gold is determined by its weight is that gold is a commodity and no matter what you paid for it you'll find others willing to pay market price.

Simon is saying that companies are (today) willing to pay API prices for tokens which is as good as any determination of value.

yokoprime•about 14 hours ago

Is this some anti-FIAT take? The value the author got is not value as in intrinsic value, it simply means value as in better deal than the alternative. This is often called "value" and you will see this used when products are sold in "value packs" etc.

remus•about 17 hours ago

> Just because API pricing would've been $2180.16 doesn't mean that's the value of those tokens.

You seem to be suggesting the price of tokens is entirely disconnected from the cost of providing the service? I don't see much basis for that assumption.

prepend•37 minutes ago

Let’s say McDonald’s charges $2000 for a Big Mac. If they offer a deal and sell it to you for $200, did you save $1800?

Maybe if you spend $2000 on a Big Mac. But it’s unlikely you would buy such a burger.

What is a hamburger worth? Don’t look to McDonald’s to set the value.

recursive•about 16 hours ago

I'm willing to charge you $100k for those same tokens.

Does that mean you'll be saving $99k?

It sounds an awful lot like the mark-up to mark-down scheme where the price stays the same.

OrangeDelonge•about 19 hours ago

Large enterprises make deals and won’t be paying $2,180.16 either. Just like with AWS.

simonw•about 19 hours ago

That doesn't seem to be the case. From what I've seen enterprise deals get API pricing now. Have you seen evidence that's not true?

waisbrot•about 19 hours ago

And "large" just means that AWS will assign an account manager to talk with you. I was at a start-up who spent $300k/year on AWS and that was enough to get special attention and discounts. Enterprise pricing is confusing.

apsurd•about 19 hours ago

The point is that those are real prices real people are paying for real API usage. it's not made up.

your point is large players won't pay those prices at massive volume. ok

yokoprime•about 14 hours ago

They pay sticker price. There may be exceptions for very very large companies like Amazon or Microsoft which have their own deals where they rent out compute in return for usage.

Anon1096•about 18 hours ago

Claude is so in demand at the moment that there aren't really volume discounts. Anthropic sets the terms and you either accept them or get lost; they have that much of a lead (mindshare/desirability-wise).

altruios•about 19 hours ago

> If I had been paying API pricing it would have been $2,180.16

The point being made above is that API pricing is calculated... somehow... seemingly arbitrarily. Possibly untethered from the infrastructure costs entirely, which would be the basis of any 'value'; however, that assumes the labor theory of value, which isn't accurate either. So how do you accurately price these tokens at all (other than through price discovery, which is slow, messy and fuzzy)?

NitpickLawyer•about 19 hours ago

> So how do you accurately price these tokens at all

Like anything else in the economy: at the point where enough customers can pay you, and not enough will go to the cheaper competition.

lbreakjai•about 13 hours ago

I don't really understand the holdup about that specific line? Anthropic publishes the cost per million of tokens, what's arbitrary about looking at your consumption, and doing the math?

Or are you saying that Anthropic is determining that cost per million arbitrarily?

If so, it'd be like asking to explain why things fall on the ground other than through gravity. Companies pick the price they think the market will bear, and adjust based on new information.

pembrook•about 19 hours ago

API pricing drops DRAMATICALLY in enterprise agreements.

As with pretty much anything priced on volume/usage.

Enterprise deals are negotiated ad hoc; the listed pricing is simply a jumping-off point for the final negotiated discount.

If you’re going to give 20,000 employees Claude Code you are not going to be spending $1B per year on Anthropic tokens as if you gave everyone an individual API key. Just as Anthropic isn’t paying AWS SES $10,000,000 to send 1 email update to their massive user base when the next Claude version drops.
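As a rough check on the scale of that hypothetical, here is a minimal sketch of what $1B per year across 20,000 seats works out to per seat. For comparison only, it also scales the single $2,180.16 monthly figure quoted earlier in the thread to the same headcount; treating one heavy user's bill as typical is an assumption, not something the thread establishes. The function name `per_seat_spend` is made up for illustration.

```python
def per_seat_spend(total_annual_usd: float, seats: int) -> tuple[float, float]:
    """Return (annual, monthly) spend per seat in USD."""
    annual = total_annual_usd / seats
    return annual, annual / 12


# Figures from the comment above: $1B per year, 20,000 employees.
annual, monthly = per_seat_spend(1_000_000_000, 20_000)
print(f"per seat: ${annual:,.0f}/year, ${monthly:,.2f}/month")

# Assumption: every seat uses as much as the one $2,180.16/month data point.
fleet_at_list_price = 2180.16 * 12 * 20_000
print(f"20,000 seats at that usage: ${fleet_at_list_price:,.0f}/year")
```

Under that assumption the list-price total lands at roughly half of the $1B figure, so the two numbers are at least the same order of magnitude.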

taude•about 18 hours ago

This isn't true at the moment, though. So far there hasn't been the negotiating power. What happens is you end up capping usage for employees at a fixed amount. I think eventually, prices will come down and there will be discounts, but for enterprise accounts at least of our size (<5000), we're paying almost 100% retail, which kind of sucks, because it's expensive, and pretty easy to burn $50 to $100+ in a day, if you're not careful. In fact we got pushed off the former plan to the token-utility one at the last contract negotiation.

Going to be interesting to determine the metrics we give to engineers for deciding whether the spend on this is worth it. Measuring PRs, lines of code committed, commits fully generated by agentic workflows, etc.

yokoprime•about 14 hours ago

As someone who has seen the enterprise deals, they are not subsidised in any way, shape or form. Anthropic may waive the seat fee, maybe get lower "expected" consumption. This changes what the company pays up front, but token pricing is fixed.

simonw•about 19 hours ago

> API pricing drops DRAMATICALLY in enterprise agreements

Do you have any numbers or reports to back that up?

lrae•about 16 hours ago

> Just as Anthropic isn’t paying AWS SES $10,000,000 to send 1 email update

How much do you think emails cost? That number is just so far off.

But besides that, running SES is also quite a bit cheaper than SOTA AI models with high demand and (comparatively) no competition. And there's quite a bit more pressure to make money (soon).

xnorswap•about 18 hours ago

Have you or I misunderstood the "teams" plan?

edit: I missed the "enterprise" feature matrix with the usual audit/compliance stuff to force the biggest enterprise customers onto enterprise plans. Otherwise the "teams" plan is much better value for any business.

orig-continued:

https://claude.com/pricing/team

Teams premium is "Everything in standard, plus more usage*"

And from my experience, it's a very generous usage, I've only hit the limits once or twice, and both times required multi-boxing agents.

I could single-window agentic development all day on opus-4.7 auto-mode without hitting limits.

If you're a business using Claude, then that seems like the right plan; the enterprise/API plan seems more suited to where your product is built on top of the agents themselves, so seats/limits aren't really meaningful?

nr378•about 18 hours ago

Claude Teams and Claude Enterprise are 2 distinct plans. Simon is right that Enterprise seats have no included usage (and so all usage is charged at API billing rates), whereas Teams seats do.

troyastorino•about 19 hours ago

Tokens do have a clearly calculable intrinsic cost. There's the marginal cost of production (i.e. the inference cost) and the amortized R&D cost that goes into the model producing them.

Yes, value is hard to calculate, but luckily market pricing mechanisms exist exactly for this purpose. There isn't a better number to use than what people are willing to pay for them.

So he's saying that on an enterprise plan, he'd be spending $2,180.16. He's not paying that much, but enterprises are.

prepend•32 minutes ago

There’s a cost, certainly. I expect Anthropic knows. But we don’t know what that is.

m3kw9•about 9 hours ago

Enterprises also use $200 plans, they are not that stupid. They add a spillover API key for overage.

simonw•about 9 hours ago

OpenAI and Anthropic won't let them any more. Once you get above a certain size (I believe 150 seats) they push you onto the enterprise plans.

I don't believe you have the option to keep with the $200/month flat rate subscriptions any more. I'd be happy to be convinced otherwise.

(I dug into this a bit more and couldn't find anything in their consumer terms that says "you cannot use this personal account if your company has more than X people", so I imagine the pressure is more that your big company's purchasing department really doesn't like managing hundreds of individual subscriptions as opposed to a single, stable, predictable negotiated contract.)

rldjbpin•about 3 hours ago

> “Tokens” don’t have an intrinsic cost or value.

i am pretty sure these services know what it truly costs them to serve you tokens, maybe not in realtime but at least periodically.

however, what they charge us is a constant exercise in price discovery. i agree with this sentiment in the sense that we don't have a stable sense of the cost. all of these comparisons are good for the moment, or at most the near future.

i believe that even the "all you can eat" approach with the max plans, regardless of their crazy pricing, is not sustainable with only the power users. if most of us get this kind of value through our plans, surely it does not incentivise the service providers to continue pushing it. maybe they can regardless just to gain market share, but not forever.

woah•about 16 hours ago

From my back-of-the-envelope analysis for my own projects, paying per token on OpenRouter is competitive with, if not cheaper than, running the same open weight model on a rented GPU. Per-token pricing is in the same ballpark (although more expensive) for closed frontier models and open weight models (cents to dollars per million). To me this says that the pricing is somewhat grounded in reality.

rohansood15•about 11 hours ago

Are you comparing single-user requests or multiple concurrent requests when you say comparable to rented GPU? Most of the cost efficiencies kick in with concurrent/batch requests. A single H100 node can provide like 5k input + 2k output tok/s on a model like Qwen 3.6 35B-A3B with 30+ concurrent requests.
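To make the concurrency point concrete, here is a minimal sketch that turns a sustained throughput figure into a cost per million tokens. The 5k input + 2k output tok/s numbers are the ones quoted above; the hourly rental rate is a purely hypothetical placeholder (the thread does not give one), and lumping input and output tokens together is a simplification.

```python
def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens processed in one hour at a sustained rate."""
    return tokens_per_second * 3600


def cost_per_million(hourly_rate_usd: float, tokens_per_second: float) -> float:
    """Rental cost in USD per one million tokens at full utilization."""
    return hourly_rate_usd / (tokens_per_hour(tokens_per_second) / 1_000_000)


# From the comment above: ~5k input + ~2k output tok/s with 30+ concurrent requests.
throughput = 5_000 + 2_000

# Hypothetical rental price, for illustration only; substitute a real quote.
ASSUMED_HOURLY_RATE_USD = 10.0

cost = cost_per_million(ASSUMED_HOURLY_RATE_USD, throughput)
print(f"{tokens_per_hour(throughput):,.0f} tokens/hour, ${cost:.4f} per million tokens")
```

The same function shows why single-user comparisons look so much worse: if one request stream only sustains a small fraction of that throughput, the cost per million tokens rises by the same factor.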

john_strinlai•about 19 hours ago

a little critical thinking led me to read that sentence as $2180 worth of tokens [at current api pricing]

seattle_spring•about 8 hours ago

Reminds me of a car dealership talking about how good of a deal their extended maintenance plan and warranties are.

dnnddidiej•about 17 hours ago

His point is more he was surprised enterprises weren't getting the discount. And so indeed maybe it is not a giant ponzi after all! (Could be a bubble)

FergusArgyll•about 19 hours ago

I think it's funnier that you can believe some things have an intrinsic cost and others don't

jfrbfbreudh•about 18 hours ago

Lol. They obviously have intrinsic cost, the floor being the cost of electricity. It’s hilarious how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.

hintymad•about 18 hours ago

The real timing is that we don't have strong enough new business needs for now and we have accumulated enough tech assets, so our work has been increasingly incremental. That means we can build reliable features on top of a vast amount of past work - where AI really shines. So, with or without AI, companies would hire fewer software engineers: if the majority of our work is incremental (add a feature here, fix a bug there, tweak a configuration, etc.), then we wouldn't need as many software engineers anyway. AI just accelerated that squeeze.

In contrast, imagine if we had the same AI 20 years or so ago. Could AI really write Jersey? I guess not as people were still trying to understand JAX-RS. Could AI really answer all the questions about React? I guess not as React was just invented. Would we use 10x fewer people to build out infra on the public cloud or the entire so-called Big Data platforms? I guess not, as they were still rapidly evolving and we'd need so many engineers to explore so many different possibilities. Could we use AI to build our ML ecosystem with 10X fewer people? I highly doubt it. Heck, 20 years ago R was all the rage and Python's ecosystem was not mature at all. Oh, and mobile computing, could AI lead to 10X fewer people to build all the mobile apps and the underlying infra?

aniceperson•about 16 hours ago

> Could AI really answer all the questions about React?

Yes, due to ICL

> Would we use 10x fewer people to build out infra on the public cloud or the entire so-called Big Data platforms?

No, cannot solve core problems, makes a mess at scale

You are right about the incremental work. But most of the work is historically incremental imo, only a few positions are R&D.

no-name-here•about 9 hours ago

ICL = In-Context Learning where LLMs learn to perform new tasks by reading examples directly within the prompt, rather than requiring permanent model retraining.

overgard•about 15 hours ago

Anthropic isn't actually profitable from what I'm reading; a discount briefly pushed them into the black. This guy makes the case well: https://www.wheresyoured.at/anthropics-profitability-swindle...

I'm skeptical that their current price raise is sufficient, and I'm also skeptical that most users/businesses will accept more significant price raises that will be needed. Especially for individual users, $200 a month is already incredibly expensive, I really don't think most people are going to be willing to pay more like $1000 a month.

ralphael•28 minutes ago

this - I'm wondering why this isn't getting picked up more in the analysis? It's an accounting trick, which is not sustainable. Time will tell.

jwpapi•about 12 hours ago

I have to give them kudos. This whole thing is the greatest swindle of all time.

AI has some use cases, but not at its current price. I’ve been on AI since GPT-2 with a lot of heavy users. Every user has the same story: curiosity, surprise, hype, hate, realization. Enterprise is usually a bit behind and is right now at the hype stage; that’s where they sold all the deals and do the IPO.

It’s really a VC masterclass.

Don’t get me wrong, there are useful cases of AI, but not the way they want it to be. Quite similar to Blockchain. The idea of decentralized money has a right to exist. 99% of other coins do not.

AI is a faster, but still less accurate search engine. AI is great at finding bugs, it’s great at rubber duck debugging.

The reason I call it a swindle is because, along with the marketing, it gives tons of people in the world the impression they can now build their own startup, game, infra etc. without the need to learn it themselves. This leads to millions of abandoned and low-quality projects and products, because the vast majority have never built the mental model necessary to solve the problem thoroughly. In the end they’ve wasted months and money (but burnt tokens). This is what I call a swindle.

All the early adopters I know have now drastically wound down their usage, not because of money, but because there is no new use case. If you want to explore a new project you can get onboarded quickly, learn a lot, and then switch to documentation and live testing. For me usage is the lowest it has been in the last 2 years.

I would not let AI touch my code. I have anxiety around it, because it will gripple back up. I will let it read my code and let me know what I did wrong so I’m sharpening myself.

100s of companies, including open source solutions, can offer that for me.

All my non-tech friends are now in the hype cycle and share their hype and foreseeable frustration with me.

I have to say I’m in a way impressed by how AI has been rigorously vc-utilized (consciously or unconsciously) to generate these vast companies with the whole world watching.

aspenmartin•about 12 hours ago

Trying to parse the raw claims here maybe you can help me out

- it’s a swindle because ROI of tokens for coding models is not positive? As in it doesn’t bring enough value to charge like the $100/mo?

- enterprise customers are too dumb to see this

- IPO to max out the CEO profits for what is ultimately blockchain vaporware

Am I getting that right? Or am I putting words in your mouth?

> it gives tons of people in the world the impression, they can now build their own startup, game, infra etc without the need to learn it themselves.

I can’t speak for people’s beliefs and motivations, but this seems to be strawmanning, no? AI is a powerful tool to force multiply people. You can’t just prompt “build me an enterprise SaaS app worth $1B” or “build me GTA6 and don’t hallucinate” but is that your impression of what’s happening? Dario and Sam are saying “if you buy our coding agent subscription you can build a game with zero skill and one shot and then be rich”?

If you don’t find value in AI agents I can see reasons why that could easily be true today. Also if it just gives you the heebie-jeebies. But to say it’s a swindle on par with the blockchain, I think that contradicts an enormous amount of signals and also the actual dialogue (not just headline sound bites) around what these systems are capable of today and what we expect them to do, say, at the end of the year.

jwpapi•about 11 hours ago

You kind of got it right, but the biggest losers of them all are the investors, especially the index investors. They don’t even decide what they invest in, but the savings that go into funds need to be invested in these stocks.

It’s quite an elaborate swindle obviously. But you generate hype by underselling your core product, you claim way more usability than there is. Users will experience usability initially. Everything multiplies with each other and then you put it on the market. Everybody involved makes money and you’ve successfully extracted money from everyone who’s invested in NASDAQ index funds at the very least.

> Dario and Sam are saying “if you buy our coding agent subscription you can build a game with zero skill and one shot and then be rich”?

That’s Anthropic’s marketing, yes.

Also their offering is not unique enough to justify a 1 trillion valuation. The first companies are already rowing back. It’s a very specific time window that they are about to hit now with their IPOs.

The companies that have signed these enterprise deals haven’t done an ROI analysis. They had FOMO.

brazukadev•about 12 hours ago

> Dario and Sam are saying “if you buy our coding agent subscription you can build a game with zero skill and one shot and then be rich”?

They are saying profitable companies should replace the engineers that built their systems with a subscription (while they are hiring).

sothatsit•about 7 hours ago

I don’t remember ever hearing Dario or Sam recommend replacing people. Rather they say that smaller groups of people can do more work, so hiring will slow because small teams can do more.

The only times when people talk about actual full replacement of people is always when they are talking about some “future AGI” that is far more capable than the tools we have today.

RevEng•about 9 hours ago

I usually agree with Simon but I think he is overlooking an important factor.

There is a lot of AI usage happening not because it shows benefits, but because the business has mandated its ubiquitous use. Companies having dashboards for token usage and rewarding people for using more tokens is a real thing. I just spoke with someone today who works at Microsoft and they are required to use AI for all of their work - they have to make a special request with justification if they decide not to use AI for even a single PR. This kind of demand isn't driven by value from either the company itself or from its workers; it is the kind of artificial demand you get from make-work projects to keep people employed during hard times.

We have to wait for the hype to settle down and people start making business decisions based on results before we can really value these AI products.

ggttk•about 9 hours ago

Correct, the question is one of real vs fake demand curves.

no-name-here•about 9 hours ago

I think token leaderboards are idiotic, however… many companies are requiring AI use, with by far the most likely reason being that they truly believe AI-assisted development gets (or will after the time needed to get employees acclimated) better returns.

antman•about 19 hours ago

The costs are exorbitant and most software is not produced by companies with such a huge moat. Anthropic made a profit through their recent bait and switch pricing. There are zero useful insights online to indicate whether this might die due to commoditisation with good enough open models or fail the race to get more people subsidising unsustainable growth with other people’s money. Who knows? In any case they don’t seem to be able to drop usage costs, so the business model seems based on wishes.

j_w•about 18 hours ago

Continuing with your skepticism:

> Stories are circulating of companies surprised at how expensive their LLM bills are becoming from usage by their staff

> Enterprise customers are now paying API prices

How long before enterprise customers start to question the bill? Anthropic goes from not making money to doing pricing shakeup, and now they are making money and the biggest spenders are shocked at prices.

Seems like things are still very uncertain.

danny_codes•about 7 hours ago

And switching cost is basically zero, so there's not much friction. In my harness it's literally just saying model=X

brokencode•about 19 hours ago

Usage costs will come down with better hardware. Hardware is improving rapidly each generation.

eikenberry•about 17 hours ago

Costs will plummet as better hardware becomes available and is priced reasonably so that people can more easily run their own open models locally. But that won't help Anthropic/OpenAI make more money, quite the opposite.

simonw•about 19 hours ago

That trend held true for the past three years, but it doesn't feel as safe to me now.

But memory costs are going way up. And both OpenAI and Anthropic bumped up the price of their frontier models in April.

brokencode•about 17 hours ago

Yeah, it’s called supply and demand. Demand for memory went way up suddenly. Now supply is going up rapidly as companies try to cash in on that demand.

Supply will eventually catch up with demand. Then the prices will come back down.

StrauXX•about 18 hours ago

Algorithms are also improving. I believe it's very unlikely for these two improvements together to not result in one to two orders of magnitude cheaper cost per "intelligence". Of course, that might just make use cases that are too expensive today viable and thereby increase usage further.

overgard•about 13 hours ago

A lot of the new hardware requires retrofitting existing datacenters for appropriate cooling, or is waiting to be installed because the new datacenters haven't been built yet. By the time they're installed it's likely a lot of Blackwell GPUs are going to be very out of date. Newer hardware is turning into huge capex bills along with the corresponding depreciation costs. Basically, it's not the same as plugging a new GPU into your desktop, the upfront investment is extremely expensive and all the numbers I'm seeing suggest that the newer GPUs are costing more to run, not less.

brokencode•about 12 hours ago

Sure, with all the component shortages it’s not surprising the current GPUs are coming at a massive premium.

Eventually either the supply will go up or companies will start buying fewer overpriced GPUs.

Either way, the price per token will come down as hardware improves and supply and demand reach equilibrium.

grttq•about 13 hours ago

Technological obsolescence is a bitch eh.

bambax•about 17 hours ago

> That’s $2,180.16 worth of tokens for $200

So the author claims he's getting $2000 per month worth of frontier AI free of charge. Ok. If he's been doing that for 6 months that's $12k. What has this produced concretely? For $12k you can find a used car in decent condition. Heck for $1200 (his actual out-of-pocket spend) you get a brand new ebike! (on which you could put a pelican and make a photo of both if that's your fancy). But here it's unclear what has come of it.

simonw•about 17 hours ago

I've written a great deal of code - code that would have taken me years of work to produce without LLMs.

(It's mostly open source, you're welcome to dig around in https://github.com/simonw and https://github.com/datasette if you like.)

My time as an experienced software engineer is worth a lot of money - a whole lot more than $12,000 for the past six months.

bambax•about 16 hours ago

> code that would have taken me years of work to produce without LLMs

As you might suspect, this is what I have an issue with. Without LLMs, isn't it possible or even likely that that code wouldn't have been written at all, and wouldn't have been missed? If LLMs are mostly used to produce throwaway prototypes then it's a stretch to say that's money well spent.

If indeed it let you advance your main product much faster then sure it's a different story. You're the judge of that. It's hard to see the impact from the consumer side; everything is still broken and no extraordinary app seems to be emerging. Maybe it's just a question of time. We'll see.

simonw•about 16 hours ago

I've thought about this a lot. I am very confident that the way I use LLMs is both accelerating progress on my core projects (here's a substantial, reviewed PR I landed just yesterday https://github.com/simonw/datasette/pull/2741) and helping me create plenty of projects that otherwise would not have existed.

odyssey7•about 14 hours ago

I’m watching to see what happens to big enterprise software contracts. Why pay some vendor $800k annually for something a couple mid-level devs can replace (and tailor closely to your needs) by leveraging AI?

Open source software changed the world. AI that will cheaply write whatever you want in a few days will also change the world.

nevertoolate•about 16 hours ago

> My time as an experienced software engineer is worth a lot of money - a whole lot more than $12,000 for the past six months

From this I assume you think that what the llm has generated is as valuable as your own work generally is. How do you even calculate this?

ex-aws-dude•about 17 hours ago

And what was your return on investment?

simonw•about 17 hours ago

As I commented elsewhere, I'm still bad at making money from my open source work: https://news.ycombinator.com/item?id=48296794#48298909

(I have a feeling if I could say "and I closed $2m in sales with the software I wrote!" people would find a way to say that didn't mean anything anyway, because how can I prove I wouldn't have made those sales writing it by hand?)

aspenmartin•about 17 hours ago

I would be very curious what kind of answer would satisfy you here. Simon isn't building a product, where $200 is a line item on a balance sheet. If he tells you what sort of analyses or time savings $200/mo on coding agents have enabled him, do you honestly think that would satisfy you?

realo•about 19 hours ago

$200 per month per seat is nothing.

A single 3D CAD license pack for the guys in our R&D group costs multiple thousands of dollars per seat, per month.

It's about time software seats get some love too.

smokel•about 19 hours ago

AutoCAD is $175 per user per month [1].

[1] https://www.autodesk.com/products/autocad/buy

bigbuppo•about 18 hours ago

AutoCAD is still the budget-friendly CAD program it has always been. You don't build big boats in AutoCAD.

rrr_oh_man•about 18 hours ago

Winch Design [0], which have built some of the world's largest superyachts [1], seem to be using AutoCAD. [2] Afaik it's also the same with Lürssen (but don't quote me on that).

[0] https://winchdesign.com/

[1] https://www.superyachts.com/directory/1516/winch-design/flee...

[2] https://www.autodesk.com/design-make/articles/naval-architec...

so_it_be•about 18 hours ago

Except LLMs, even with vision, are still useless at AutoCAD, let alone Revit (please don't quote SCAD LLMs at me, useless). Knowledge-based approaches still win.

I might agree "AutoCAD" is the current level LLMs are at, but wait until your design departments discover "Revit"; it's another ballpark (in wasted costs, engineers on site still get "clashes").

Revit costs are high, and the end results are marginally better - but local LLM tokens are cheaper 24/7 at "AutoCAD" level - "Revit" level tokens will make Uber's CTO/COO weep harder than they already do. While producing results no better than "Revit" does (engineers still face "clashes").

Our_Benefactors•about 17 hours ago

As someone completely outside the 3D design world who always thought of AutoCAD as the gold standard - really? What program would be used instead? Please enlighten me.

Hasz•about 18 hours ago

Cadence and Ansys have entered the chat. A bunch of other highly-specialized engineering software has entered the chat. Licenses are on the order of 10-100k/seat.

For a pretty funny comment about pricing:

https://www.reddit.com/r/chipdesign/comments/1ajrli2/cadence...

analog_daddy•about 1 hour ago

Glad to run into this after some time!

I guess we are welcoming the software people to the world of expensive tools. Just sad that the FOSS alternatives of these tools are not as powerful whereas software industry still has FOSS tools to fall back on.

chatmasta•about 19 hours ago

Yeah, it’s nothing, and it’s also not the cost that enterprises are paying. As the article states, the price is $20 per seat per month, PLUS per-token API usage. Enterprises are paying consumption billing, not fixed rate oversubscribed “all you can eat per seat.”

avree•about 18 hours ago

CATIA licenses which are the most expensive I've seen are roughly $600/month per user. Where are you seeing "thousands of dollars per seat"?

mountainriver•about 17 hours ago

CATIA with plugins can go up to 100k a year. That’s what we currently pay

avree•about 14 hours ago

Wouldn't each plugin be a different piece of 'software'?

Ardren•about 16 hours ago

100k per seat? That's crazy. How do you even hire or train employees with software that expensive?

AlotOfReading•about 18 hours ago

CFD might reasonably be considered part of CAD and something like Ansys costs about as much as CATIA. Still only doubles it though.

dnnddidiej•about 17 hours ago

Sure. Is CAD going to be used by every working human?

benhurmarcel•about 15 hours ago

Now add up the engineer’s salary and you’ll find that software seats already cost more than those R&D ones.

krupan•about 18 hours ago

But when previously your software developer tools were free, that's a huge increase

esafak•about 19 hours ago

How many guys is that? Every single white collar worker is in the AI ICP (customer profile).

edit: typo

smt88•about 18 hours ago

white collar*, not color

What does ICP mean?

simonw•about 18 hours ago

Insane Clown Posse, though given the context here probably Ideal Customer Profile.

darth_avocado•about 19 hours ago

How is the lack of bad news declaring a victory for AI? I am yet to see any company concretely publish analysis about the ROI from AI. Most companies as far as I know are still treating AI investment as sunk cost with no expectation of returns at the moment. We could very well see a world where companies heavily scale back investment.

dev_l1x_be•about 2 hours ago

I think the third company (likely Google) is going to make LLMs financially feasible with:

- dedicated hardware (https://cloud.google.com/tpu)

- optimized models (https://research.google/blog/turboquant-redefining-ai-effici...)

sourcecodeplz•about 19 hours ago

With deepseek and xiaomi mimo models slashing their prices 99%, I don't see a great future for OpenAI / Anthropic with regard to their 1T valuations. Maybe 1T valuation will be the whole market, West + East.

jillesvangurp•about 15 hours ago

Most of the corporate world in the EU or North America will be hesitant to rely on Chinese AI providers. There are some very real blockers for that for things like data security, compliance, etc. And recent geopolitics don't help.

Legalities aside, you need to look not at the model quality but at the infrastructure needed to scale these models from tens (now) to hundreds (soon) of millions of users. Only a handful of companies actually have the resources and funding to do that. That's what these huge valuations are based on. These companies are gearing up to scale to these levels. That's why they are spending on data centers. Whoever has access to those data centers gets to tap into the revenue stream of people using models running on those.

The market for frontier models is roughly split between OpenAI, Anthropic, and Google. And then you have companies like X/SpaceX, Amazon, and Microsoft being more successful with their infrastructure than their AI products and companies like Apple, Meta that have the money and the aspiration but are so far not really managing to be very successful with their AI strategies.

Deepseek is just very poorly positioned to capture a lot of the enterprise revenue in the EU or North America. But they might become very dominant outside the US/EU. And of course China itself is going to be a huge market and equally unlikely to want to depend on US-owned AI companies.

sourcecodeplz•about 14 hours ago

Deepseek and all the other Chinese models have open-weights. You can host them yourself, no need to send data to China or rely on them.

tredre3•about 13 hours ago

There is still a risk of supply-chain attack. People give LLMs direct access to their entire infrastructure via tools, and never check the code produced. It's not difficult to steer an LLM during training so that they'd output malware only when prompted a certain way, and that wouldn't come up during the initial evaluation.

Personally I see no difference between China and America in terms of risks of them embedding "backdoors" so to speak, but I disagree when people claim that open-weight models are obviously safe just because they can be run locally.

dannyw•about 11 hours ago

It is not a trivial challenge setting up model serving infra for ~1T or larger models, especially in a high reliability environment (e.g. your team is using it for work, or you're using it to power production apps). Sure, there are third party providers, although the quality and reliability of their inference varies.

lbreakjai•about 12 hours ago

Run it on Bedrock. If you're already on AWS, procurement doesn't even need to be involved.

slopinthebag•about 13 hours ago

Run Deepseek on Deepinfra then? Or Fireworks if US-based is important. None of these are real issues outside maybe convincing your legal team to do a bit of homework.

jillesvangurp•about 6 hours ago

I don't think you are appreciating the physical constraints here. Deepseek doesn't really have the hardware in the US or EU to do anything at scale.

Sure, you can self-host a non-frontier OSS model yourself, including Deepseek. And no doubt some people will pay one of the companies I mentioned to rent the infrastructure to do exactly that. Much of the rest of the world will be paying directly for direct access to the frontier models.

As for the legal/compliance stuff, I recommend you don't take any big decisions on that front without consulting lawyers. My understanding of that is that most serious companies in the EU have to take these topics pretty seriously. I'm sure in the US, hosting all your data and secrets in Chinese data centers isn't a whole lot less controversial.

The Chinese could of course choose to try to match the current levels of investment Google, OpenAI, Anthropic, etc. are putting into local infrastructure. But as far as I know they aren't and there are probably a few political blockers for that.

Without infrastructure, their role is being a niche player in these markets. It doesn't really matter how good they are if they can't scale to most of the market.

skeledrew•about 19 hours ago

They'll still have their dedicated enterprise customers. I think the Chinese providers will pull more of the single users who're paying their own way than those backed by company budget. And it's a pretty good split as the demand becomes better distributed, resulting in better service (I'll never forget just how bad access to Claude became until they got access to Colossus) and less potential for lock-in (we really don't want there to be a duopoly, etc. on good AI).

kkarpkkarp•about 7 hours ago

> I currently subscribe to the $100/month Max plan from Anthropic and the $100/month Pro plan from OpenAI. If you are a heavy user of coding agents these plans are a fantastic deal.

Guys, what - in your opinion - does "heavy user" mean? I thought I was a heavy user (I am using AI to code every day, 8hr a day + side projects) but the 20 USD/month Cursor plan is always enough. What should I be doing to extend my license to a higher level?

jeswin•about 7 hours ago

I'm out of Codex Pro credits (200 plan) in 4-5 days, and then I switch to Claude (another large plan) mostly, or just go out more and get Vitamin D. :)

I spend most of my time designing and tweaking test suites, and improving test performance. These commits are almost entirely Codex: https://github.com/tsoniclang/tsonic/commits/main/ - but it's possible only because there's a very large test suite attached to it.

All of that is very token intensive. If OpenAI gave me 3x my limits, I'd find ways to eat it up in a week.

What do these tokens give me? Well, I think in a week or two, I hope to port the TypeScript-Go compiler back into TypeScript, but compiled to native code. It's probably not particularly useful for most ppl, but it's a hobby project that I've spent the last 7 months on.

whazor•about 7 hours ago

My agents are easily busy 30 min to an hour independently: implementing their plan, building and running tests, verifying deployment, linting. So I switch between multiple agents. Each agent has its own branch and worktree.

mock-possum•about 7 hours ago

How many sessions are you running simultaneously, how careful are you being about token consumption and context window management, and what level of reasoning are you using (with which models?)

8hrs a day doesn’t really mean anything without a lot of additional qualifiers.

fwiw lately I’ve been straddling 2 or 3 claude codes and one Claude cowork, primarily on 4.7 with high effort - the company’s paying for it, so I’m doing my best to burn as many tokens as I have the mental capacity to manage. At that rate, the 100 account is completely necessary, I was blowing through my 4-hour limits consistently before requisitioning an upgrade.

CachedaCodes•about 19 hours ago

AI has become indispensable but maybe not at all costs. My company just had a company-wide meeting to talk about how they're restricting who can use which models and instructing us to "be more responsible with company's tokens". And it's not a small company by any means.

gcr•about 1 hour ago

For an alternate albeit somewhat contrarian view, also see Ed Zitron’s piece that adds context to Anthropic’s profitability: https://www.wheresyoured.at/anthropics-profitability-swindle...

TL;DR Ed argues that the deal between Anthropic and xAI could have been negotiated in such a way as to make Anthropic only appear profitable during its “ramp-up” period in June, which incidentally is also the month that Anthropic is making tons of other pricing changes.

crakhamster01•about 6 hours ago

> As further evidence that enterprise agents represent product-market fit for these companies, consider their open job listings.

PMF is one interpretation, but it could also be read as desperation.

In my opinion, we've been at PMF for quite a while now. The November inflection point that's often referenced definitely changed how we interface with models, but as far as coding goes, I feel like Cursor had proven itself useful for at least a year prior to that.

The demand has always been there, the outstanding question is still - how do you build a business on top of these products? None of the frontier models have emerged as uniquely capable, but open weight models are now catching up in capability as well. The explosion in go-to-market roles feels more like an attempt to lock customers into contracts so that they don't consider alternatives.

I assume the hope is that during this 12-month contract they will develop real integrations, something deeper than just a CLI harness. If you've ever worked in procurement or dev tooling at a reasonably sized company, you'll know that this is exactly what teams try to avoid.

It's anyone's guess what will happen this time, but I'm excited to see how the IPOs go.

cj•about 18 hours ago

> Coding agents really did change everything. These are tools which burn vastly more tokens

The assumption here is that this is a positive thing.

But this very well could end up being a major negative long term by increasing the cost per user, reducing margins.

More usage = more cost = less profit.

It's not obvious that more usage is good. It's only good if revenue per user increases more than cost does. I'm skeptical about that.

simonw•about 17 hours ago

> It's only good if revenue per user increases more than cost does.

That's why it's so important for these labs that they're selling API tokens for more than the compute+energy costs needed to generate them.

Every indicator I've seen is that they do have a positive margin on that. If they don't, they're screwed.

grttq•about 14 hours ago

No this is not what matters.

The customers of these tokens need to see returns on their projects that exceed the cost of financing.

Laying people off only goes so far.

If enough said firms don’t see enough value given the price of frontiers they will cancel and consume open source. This is the risk the frontier labs are exposed to.

mattas•about 17 hours ago

What's an example of an indicator? Genuinely curious!

simonw•about 17 hours ago

Insider tips from Google and AWS telling me that they run inference at a profit (though that was over a year ago now).

Dario telling Dwarkesh three months ago that they have a margin on inference: https://www.dwarkesh.com/p/dario-amodei-2?timestamp=3528.0

rldjbpin•about 3 hours ago

money talks, and being a broky from outside the valley/west coast, the "product-market fit" here are the neighbours of the service providers.

the economics simply don't work unless you make six figures, at least to just give it a go blindly. the providers are also still figuring out what they can get away by charging, and they are getting a similar treatment from those under the stack.

the caps and limits are not very transparent, and it is quite difficult to know what is "enough". the current rate does not stay the same and the contract is changed way too often to dedicate for the long term. regardless, the subsidized rates should not be sustainable forever. make hay while the sun shines i suppose.

dep_b•about 5 hours ago

It’s funny how a US$30 ReSharper license that would give me a 20% performance boost in programming was out of the question, while me burning US$3000 per month on tokens for a 40% boost in performance never was questioned

mrpopo•about 3 hours ago

I feel this so much. What was all that fuss about needing 3 levels of approval for any software license? Or having to provide a compare chart against open-source alternatives?

The money would be so much better spent that way as well, supporting individual programmers.

harrouet•about 6 hours ago

Product-market fit, but what about customer retention?

It is quite trivial to switch from one model to another. Likewise, in a few years we'll have affordable laptops to run today's frontier models.

What's their plan to let us keep subscribing?

simonw•about 3 hours ago

Right now the main plan for that appears to be having those enterprise accounts commit for a year at a time.

mchusma•about 5 hours ago

If we define product market fit as profitable with a trillion dollar valuation, I think the term has lost its helpfulness.

I do agree with the author that these companies seem much stronger financially recently though.

throwatdem12311•about 1 hour ago

If they’ve spent all this cash, and all we have to show for all this AI baloney are crappy coding agents that constantly make the same mistakes no matter how much you try to guide them.

All the slop content, all the bots, all the misinformation and fake AI images and videos.

All of the social and economic disruption from datacenter buildouts.

The massive nosedive in reliability on the world’s software infrastructure.

After all of that and all we get is a code bot so a few incompetent loser devs can bloviate about not writing their own code and brag about never reading it.

Burn it all to the gd ground. Destroy this new Tower of Babel.

mbesto•about 18 hours ago

> but I suspect there’s a more important factor here: I think they’ve finally found product-market fit

Ahhh the classic startup term whose definition is nebulous. But also, since when does any definition of product/market fit mean a product is profitable? And profitable in what sense? Unit economics? Overall company?

simonw•about 18 hours ago

Oh I'm absolutely taking advantage of the fact that "product-market fit" has a bit of a nebulous meaning here.

It's a great hook to build an article around. My core point is more that April 2026 was the point when Anthropic and OpenAI finally appeared to have figured out a credible business model.

mbesto•about 15 hours ago

> My core point is more that April 2026 was the point when Anthropic and OpenAI finally appeared to have figured out a credible business model.

How so? What's specifically changed? We still don't know what their unit economics are and everything you've documented is basically speculation at this point.

simonw•about 15 hours ago

> What's specifically changed?

1. Both Anthropic and OpenAI significantly increased the prices of their latest models. They're clearly not trying to offer the lowest-price-possible to drum up demand any more.

2. Both Anthropic and OpenAI no longer let enterprise companies buy discounted almost-all-you-can-eat subscriptions. Those big enterprises are now paying full API prices.

3. According to reasonably well-sourced leaks, Anthropic may be about to have their first profitable quarter.

And I didn't even say "profitable", I said "credible business model". I think getting companies to spend hundreds of dollars per month per seat, WITHOUT crazy subscription discounts, is a credible business model.

iancmceachern•about 7 hours ago

It reminds me of this Steve Jobs Clip:

https://youtu.be/0lvMgMrNDlg?si=QkkOnngYTjaSPlIy

He said, so many years ago, that there will come a time when computing power is so prevalent that we will stop using the person to make the computer's job easier and start using the computer to make the human's job of interfacing with it easier.

But in this context, it would mean the other side of increased productivity is decreased time to do the same work. These are the same thing.

smokel•about 19 hours ago

Does this analysis factor in potential caching of tokens on the server side? It seems that if they organize things well (as a model provider), they can save quite a lot on that. Looking at my Cursor statistics makes it clear that the token calculations are not at all trivial.

simonw•about 19 hours ago

I believe the ccusage tool I used takes cached token pricing into account.

osigurdson•about 19 hours ago

Realistically, OpenAI found product market fit with the OpenAI API playground in 2021. People were using that as ChatGPT at the time.

mesmertech•about 19 hours ago

If nothing else this blog did give me the idea that I should split my $200 claude max plan into two $100 CC max and $100 codex plan, esp because Claude is now offering 1.5x weekly limits so the 5x usage is now more like 7.5x usage.

Havoc•about 18 hours ago

>I should split my $200 claude max plan into two $100 CC max and $100 codex plan

You may want to get one of them to check the math on that :p

asim•about 18 hours ago

Love how everyone boasted about replacing all the software with ChatGPT and then we end up with coding agents meaning the software engineers are STILL important. The sell is the development tool. It's classic cloud. Where did all the ops people go? Many got subsumed by the cloud companies YET every company still has DevOps people to manage cloud infrastructure. The layer of abstraction went up but we still need the people to write the glue code and understand the business. OK great there's a new cash printer in the room. There's a new tool. Let's just start to ground the tooling in its new found gravity, profitability and IPO market dynamics... Reality has set in. The hype cycle is about to explode... Do you remember ride hailing and just how much cash was burned on credits pre Uber IPO? Then remember the IPO itself? These companies are not the new Google. They are a layer on top. Google was still the most efficient cash printing machine in history beyond the US government and might still be. Will be interesting to see what the trillion dollar IPOs turn into. I'm going to say we see those prices get cut to a third in less than 5 years and scale back up over the next 15-20 years.

thewebguyd•about 18 hours ago

> The sell is the development tool.

I've been calling that out for a couple years now. LLMs' best and most viable use case is still just as a dev tool. Even for non-programming tasks, I still get better results from the LLM if I instruct it to write code to do the task...look at Claude Cowork for example, it's everything I used to do with python myself. It's not really a novel capability, it's just using python & bash for automations that any sysadmin has been doing for decades. Yeah, that's valuable for a non-technical audience but is it $1T valuable? I don't think so.

When has an IDE or other dev tool ever commanded a $1T valuation?

These things get lost in discussions because people conflate "overvalued" with "not useful." LLMs are useful, particularly as a dev tool, but Anthropic & OpenAI are definitely way overvalued.

protocolture•about 13 hours ago

I feel like we should just ignore all LLM news until after they IPO, lots of positive sentiment astroturfing bots.

tornikeo•about 6 hours ago

And I think that's amazing. I'd like to keep using the subsidized coding tools, especially Codex, since I've given up on Claude. Hopefully the PMF allows the subsidy to continue. Would hate to have to move to the next coding harness again.

mark_l_watson•about 10 hours ago

Well, good for them that they are charging enterprises API rates. Why in the world not do something similar for consumers? Use for free a few times a day, have a $5 plan for light use, and perhaps $10 to $15 for heavier use. If 90% of consumers pay nothing then ‘drop them’ except for letting them have an account and a few queries a day.

It is easy for me to change providers. Right now I use the open source Claude Code harness with two paid API vendors for DeepSeek v4 (flash and Pro). I like seeing how much each session costs.

mtrifonov•about 18 hours ago

They certainly have, but it relies entirely on the assistant frame, which is a problem in and of itself for the trillion-dollar economics.

Anthropic and OpenAI have shown people want a tool for task offloading, driving predictable token consumption and justifying the math, so long as users stay in that dynamic.

However, knowledge workers using these tools daily are getting exhausted with them. Outputs come out polished but hollow. Talking to a frictionless, frame-completing model all day drains you.

If user behavior drifts away from assistant usage because of that, per-token math implodes. The valuations we're hearing about all the time rely on usage compounding daily. The fatigue is a timer running against that compound.

Anthropic's Constitution is the closest hedge out there, I think. Installing an identity structure into the model through training. But it's still assistant-first, so the fix there is only partial.

I've spent the last year running a product that flips the architecture so identity is primary and the assistant role is secondary. Same frontier models, completely different conversational quality. The fatigue property doesn't really show up.

Whichever labs figure out how to install real identity natively in the weights are going to be the ones with PMF in the next phase.

jvaqueiro•about 10 hours ago

I'm surprised by how good coding agents are. I think PMF was reached last year or the year before (at least for coding).

With current limits my 100/mo codex subscription is more than enough for the work I do.

However, I do worry about when the current subsidies are going to end. I can see myself paying up to 300/mo, but more than that will be prohibitive.

What's the long term plan here? Are OpenAI's and Anthropic's costs expected to increase/decrease?

firesteelrain•about 18 hours ago

Anyone actually making money paying all of these monthly fees? Or just hobbyists? I have yet to see any real ROI posted anywhere.

lugu•about 12 hours ago

Ask yourself: what is the ROI I provide? It isn't trivial most of the time.

firesteelrain•about 12 hours ago

We measure ROI of employees annually via ranking systems, who gets bonuses, promotions etc

rvz•about 17 hours ago

This is the same question I asked about people running OpenClaw. You don't hear about it anymore.

Other than the hosting providers, I am also yet to see anyone directly making money from their OpenClaw agent.

brazukadev•about 11 hours ago

Ironically, we had more new interesting things launched daily 5 years ago before AI.

smallerfish•about 19 hours ago

I think the reasons for them going with API pricing will become abundantly clear when the S-1s become available. If they don't have a story covering how they can get revenue closer to expenses, then they're relying on the market to believe the pixie dust version of their profitability story, which I think people increasingly don't.

attentive•about 6 hours ago

> and both companies now have measures to lock their enterprise customers (who tend to sign year-long deals) at those API prices

why would enterprises do that if they can just use bedrock or vertex?

aenis•about 14 hours ago

The end game here is going back from a model where a bunch of product and tech management people sit in the U.S. or Europe, and try to manage thousands of mediocre talent sitting somewhere far away. The new model is you give those coding tools to good engineers colocated with your product people, and you ship good stuff much faster. If you can achieve such a setup, the token costs can be $50k per seat per month and you still run circles around the legacy IT models in terms of efficiency. Giving everyone the API keys and not changing the way products are managed is not going to work.

overgard•about 13 hours ago

Good lord, what company would want to spend 600k per employee just to go maybe 20% faster (what the studies seem to show is a realistic estimate for productivity gains)?

I'm building a product right now with some AI coding (despite my negative sentiment about AI in general they are useful). I am both the product person and the engineer, and I'm pretty decent at using it, so according to the hype I should be seeing like a 10x speedup. I am not seeing that. It's definitely faster, but there are also days where I'm stuck cleaning up things after going too fast for too long, or periods where I need to put the software in front of people to get real feedback, or even periods where I just need to use it extensively myself to find the pain points and bugs. I just don't see this "running circles" once you get past an MVP and you actually need to build something secure and not embarrassingly broken.
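
To make the arithmetic explicit, here is a minimal back-of-envelope sketch in Python. The $50k per seat per month figure comes from the parent comment and the 20% speedup from this one; the function name and the framing (token spend is justified only if the extra output is worth at least the spend) are illustrative assumptions, not anyone's actual accounting model.

```python
def breakeven_baseline_value(annual_token_spend: float, speedup: float) -> float:
    """Baseline annual value an engineer must already produce so that the
    extra output from `speedup` (0.20 means 20% faster) covers the token spend."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return annual_token_spend / speedup


# $50k per seat per month (parent comment) is $600k per seat per year.
annual_spend = 50_000 * 12
required_value = breakeven_baseline_value(annual_spend, 0.20)
print(f"Annual token spend: ${annual_spend:,.0f}")
print(f"Baseline value needed at a 20% speedup: ${required_value:,.0f}")
```

Under those assumptions the engineer would already need to be producing about $3M of value per year for a 20% gain to pay for $600k of tokens.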

grttq•about 13 hours ago

To me the question is, can the frontier labs make the variance of output lower + make the output of higher quality to justify their prices?

If not, lower-priced Chinese offerings will be better as they're cheaper per token - giving you more attempts to offset the variance.

My feeling on the former is no... I believe they tried really hard but they've settled on pure marketing now to attempt to fight off the chinese with perceived superiority in quality.

showurwerk•about 10 hours ago

Not sure I can agree with this. Sure, we have enterprise paying, but are we actually getting anything new and interesting made from LLMs compared to if we just didn't have them at all?

Maybe acceleration in smaller teams. We still seem to be in the era of the early internet, where exactly what LLMs change hasn't emerged yet.

Szpadel•about 17 hours ago

> but as far as I can tell those credit costs are an exact match for the API token costs listed for those models.

it is only true for USD. for example if you pay in euro, this is actually more expensive. kind of makes no sense, because it translates to $1 = €1

rubiquity•about 18 hours ago

I think it's fair to say they had achieved product-market fit when their revenues were growing deep triple digits month over month. What we're seeing now is that perhaps they have achieved profitability or at the least a more sustainable balance sheet.

wg0•about 10 hours ago

This requires hard evidence which isn't available. Circumstantial evidence is all otherwise.

Bloggers are having AI psychosis too.

simonw•about 10 hours ago

I thought the links I provided were pretty solid. A lot more solid than you'll see in most commentary about this stuff!

I agree with this person, let's use AI psychosis for when using an LLM gives someone psychosis, not for when we think, what, that a blogger made some poor assumptions?

https://news.ycombinator.com/item?id=48296794#48303200

Gravityloss•about 16 hours ago

I see two really good ideas for monetizing the free tier for consumers.

Firstly, if the user is asking for things where AI can link to products or services to buy, there's a very good relevancy, much higher than in other types of ads.

Secondly, since the AI often takes time to compute answers to user's questions, they could be shown ads while waiting. People could perhaps be less annoyed by this than some other commercials since they know the break has to be there anyway.

(First idea is something I came up when asking Claude to compare some products, or ask for help in lawn care. Second idea was by a colleague.)

spprashant•about 19 hours ago

So it largely sounds like many more people will be able to write software - and will use AI to do it. Existing software engineers will continue to automate their tasks away like they always did, but perhaps at a faster rate.

The impact of AI in other fields seems to be muted.

simonw•about 19 hours ago

I think it is applicable to a much wider range of knowledge work, but it's also harder to apply there.

Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.

Spotting errors in a research report or legal brief is a whole lot harder!

But... non-software professionals spend a huge amount of their time on tasks that can be safely automated - reformatting documents, extracting numbers from PDFs, all kinds of flavor of data entry.

Learning how to use a tool like Claude Cowork can take a big dent out of those.

slopinthebag•about 17 hours ago

> Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.

Do we not care about code quality, maintainability, performance, extensibility, or understandability anymore? Honest question, not a gotcha, it's just previously getting software to pass all the tests was a small part of what we would consider "working" or perhaps "good" software. Maybe that's different now with LLMs, idk. Maybe we need automated checks for these things as well, like not compiling until the code quality is good enough to let the agent finish its loop.

tracerbulletx•about 11 hours ago

What code quality even means is different now, but also LLMs are capable of producing better quality code at scale in my company's experience. We are able to in fact sort of propagate best practices and structure via the LLM to all of the teams even when they're working under time pressure.

simonw•about 17 hours ago

> Do we not care about code quality, maintainability, performance, extensibility, or understandability anymore?

Yes, we should care. I've been writing a whole book about that: https://simonwillison.net/guides/agentic-engineering-pattern...

pianopatrick•about 18 hours ago

If the AI can write code for robots the impact in other fields may be pretty large. Seems to me a lot of jobs can be automated with software and robots combined. The limit in the past was writing the software to get the robots to work. But if AI can remove that limit...

NortySpock•about 18 hours ago

"[would have spent] $1,199 with Anthropic, $980 with OpenAI"

How many tokens is that, input/output-wise?

(a) I'm curious if you feel like you got $2000 worth of value out of them in the last month?

(b) I'm also curious if you would have gotten similar quality out of a slightly lower-cost provider of an open-weight model? (e.g. Kimi K2.6 and DeepSeek v4 Pro) and what the spend would have been for that.

I myself have managed to spend not quite $4 on OpenRouter and have felt it was very worth it; I just have much smaller, or more targeted requests I guess. (Lately, adding features to a static site generator in Python, or setting up log forwarding via a docker compose file)

simonw•about 18 hours ago

Claude Code:

  Input tokens:           52,545,485
  Output tokens:           5,767,253
  Cache create tokens:     5,112,029
  Cache read tokens:   1,475,069,465
  Total tokens:        1,538,494,232
  Total cost:              $1,199.79

OpenAI Codex:

  Input tokens:           52,598,013
  Output tokens:           4,681,867
  Reasoning output:        2,091,063
  Cached input tokens: 1,153,844,864
  Total tokens:        1,211,124,744
  Total cost:                $980.37
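
As a rough cross-check on the two summaries above, here is a minimal Python sketch that derives the blended cost per million tokens and the share of tokens that were cache reads. The token counts and costs are copied from the output above; the `summarize` helper and its field names are illustrative and not part of ccusage.

```python
def summarize(total_tokens: int, cached_read_tokens: int, total_cost_usd: float) -> dict:
    """Blended USD per 1M tokens and the fraction of tokens that were cache reads."""
    return {
        "usd_per_million": total_cost_usd / (total_tokens / 1_000_000),
        "cache_read_share": cached_read_tokens / total_tokens,
    }


# Figures copied from the Claude Code and OpenAI Codex summaries above.
claude = summarize(1_538_494_232, 1_475_069_465, 1199.79)
codex = summarize(1_211_124_744, 1_153_844_864, 980.37)

for name, stats in (("Claude Code", claude), ("OpenAI Codex", codex)):
    print(
        f"{name}: ${stats['usd_per_million']:.2f} per 1M tokens, "
        f"{stats['cache_read_share']:.1%} cache reads"
    )
```

Both tools come out under a dollar per million tokens on a blended basis, with roughly 95% of all tokens being cache reads, which is why the totals are so much lower than the raw token counts might suggest.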

I'm confident I got value out of OpenAI - I've been mainly on Codex for the last few weeks.

Not so sure I got that value from Claude, just because I've been using it a lot less and somehow the price came to about the same as OpenAI.

Given the code I've been able to build in the past month I genuinely do think I got value for the API price version, and (don't tell OpenAI or Anthropic) I think I'd have paid full price.

I've not spent nearly enough time with GLM-5.1 and co to compare, but I do know that the prompts I'm using with the agents are not prompts I would have expected to work just three months ago.

NortySpock•about 18 hours ago

Cool! Thanks for the details, and your blog posts are usually interesting food for thought, so thank you for them too!

krupan•about 18 hours ago

Are you saying that the software you wrote using those tools generated enough revenue to cover the $2000?

simonw•about 17 hours ago

Not yet, but that's because it was almost all open source and I'm really bad at generating revenue from that.

When I account for the amount of time it saved me there's no question $2,000 was worth it.

regularfry•about 18 hours ago

If it were me I'd be asking "How long would it have taken me to do that, and what's the rate I'd have been charging for the work I would have been doing otherwise?"

Personally, I've probably spent $60 or so on OpenRouter in the last month or so and got a working project out of it that it would probably have taken me a fortnight to knock together (which is inevitably an under-estimate because it covered things I'd have to learn but K2.5/6 already knew). There's an orders-of-magnitude gap there.
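
The question above can be sketched as a small Python calculation. The $60 spend and the fortnight estimate come from this comment; the 10 working days of 8 hours and the $100 hourly rate are hypothetical placeholders, not figures from the thread, so substitute your own.

```python
def value_of_time_saved(hours_saved: float, hourly_rate: float) -> float:
    """Value of the work you could have billed in the time the tool saved."""
    return hours_saved * hourly_rate


spend_usd = 60.0        # OpenRouter spend mentioned in the comment
hours_saved = 10 * 8    # a fortnight, assumed to be 10 working days of 8 hours
hourly_rate = 100.0     # hypothetical rate, not from the thread

saved_usd = value_of_time_saved(hours_saved, hourly_rate)
ratio = saved_usd / spend_usd
print(f"Time saved is worth ${saved_usd:,.0f} against ${spend_usd:,.0f} of tokens ({ratio:.0f}x)")
```

With these placeholder inputs the ratio lands above 100x, consistent with the "orders-of-magnitude gap" described here; a lower rate or a smaller time saving shrinks it proportionally.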

vishalrad•about 16 hours ago

This is great analysis but my first reaction was - is this trolling? The fact that we have to think about whether a $1TN company has achieved product-market fit articulates both 1/ just how high the valuations are and 2/ how hard it is to pin down EXACTLY what PMF is. As a pre-revenue startup, I am laser focused on PMF and frankly if this is the bar, no one will achieve it. But OTOH it's heartwarming that people are willing to value you at $1TN before you reach that. Guess everything is in the eye of the beholder?

simonw•about 16 hours ago

The title was indeed meant to throw subtle shade at the idea that supposedly $1T companies were only just now starting to figure out product-market fit.

Zafira•about 11 hours ago

The amount of caveats and vibes that this article is powered by could run New York City for a decade.

jimnotgym•about 7 hours ago

They have the whole software industry addicted to AI. They will be pushing the price continually up, and like addicts they will find a way to pay

Zizdefense•about 16 hours ago

My Costs without CC Max+Caching over past 2 months: $112K.

Ran `ccusage` on my Claude Code logs.

- Total tokens: 22.2B

Without current Claude deals, my personal cost would have been *~$112,000*.
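As a rough check on the two figures quoted above (22.2B tokens and ~$112,000), here is a minimal Python sketch of the blended API-equivalent rate they imply. The function name is made up for illustration, and the result is only an average: it says nothing about the split between input, output, and cached tokens, which the comment does not give.

```python
def blended_price_per_million(total_cost_usd: float, total_tokens: float) -> float:
    """Average dollars per one million tokens, ignoring the token-type mix."""
    return total_cost_usd / (total_tokens / 1_000_000)


# Figures taken from the comment: ~$112,000 for 22.2 billion tokens.
rate = blended_price_per_million(112_000, 22.2e9)
print(f"Blended API-equivalent rate: ${rate:.2f} per million tokens")
```

On those inputs the average comes out at roughly five dollars per million tokens.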

jcmfernandes•about 14 hours ago

These folks have no lasting moat and they know it. We are still close to the November 2025 inflection point, so they had a clear advantage during these past few months. That will soon fade as open-weights models become really good, which is arguably already the case with DeepSeek V4.

dnnddidiej•about 17 hours ago

Is PMF enough? It is such a dynamic self-disrupting wave that it is like predicting physical chaos. These aren't early Googles in a blue ocean. Maybe a blue ocean full of pirates and dragons!

This isn't me being a doomer; I just don't know. Can we look at Q2 profits and draw hockey sticks yet?

Remember people are boasting how much their expenses are. That is where we are in the bubble/new paradigm.

cmiles8•about 16 hours ago

Article doesn’t mention on-prem and on device models. Almost guaranteed that there are a range of killer enhancements on these fronts waiting to drop until IPOs get closer to inflict maximum chaos on the valuation games.

While the big guys will argue they’re worth trillions expect others to drop chaos booms showing their NPV may be effectively zero.

chipsrafferty•about 10 hours ago

I think my company is going to cancel our subscription once it realizes we're spending more money in a week than a developer gets paid in a month.

Havoc•about 18 hours ago

What baffles me is the range of estimates.

Operating profit is both post depreciation and fees paid to third parties for hire. So aside from shenanigans like RSUs and financing interest that's already somewhat close to actual economics.

Meanwhile we've got commenters here talking of 5-10 trillion with a T revenue shortfall.

Those are very different takes on reality

pzo•about 18 hours ago

> If you are a heavy user of coding agents these plans are a fantastic deal. I just ran the ccusage tool on my laptop to get an estimate of how much I would have spent if I were to pay for API tokens in the past 30 days and got

You think this is a fantastic deal only because they use the same kind of trick where they inflate the price and tell you something is supposed to cost $1000 but they have a promo today for $100.

I was there too and paying for a while. A few weeks ago I tried DeepSeek V4 Pro - expected it was going to be shit but it's actually pretty good.

The deal is I pay ~$1 daily for DSV4-pro for ~100M tokens of API usage. And they're probably not going broke because >90% of those tokens in practice are cache reads and they're very well optimized for that.
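The cache-read point can be made concrete with a small weighted-average sketch in Python. Only the ~$1/day, ~100M tokens, and >90% cache-read figures come from the comment. The two per-million prices below are hypothetical placeholders chosen for illustration, not any provider's published rates.

```python
def implied_price_per_million(daily_spend_usd: float, daily_tokens: float) -> float:
    """Average price per million tokens implied by a daily bill."""
    return daily_spend_usd / (daily_tokens / 1_000_000)


def blended_price_per_million(cache_hit_fraction: float,
                              cached_price: float,
                              uncached_price: float) -> float:
    """Weighted average of cached and uncached per-million-token prices."""
    return (cache_hit_fraction * cached_price
            + (1 - cache_hit_fraction) * uncached_price)


# From the comment: about $1 per day for about 100M tokens.
implied = implied_price_per_million(1.0, 100_000_000)

# HYPOTHETICAL prices (USD per million tokens), for illustration only.
HYPOTHETICAL_CACHED_PRICE = 0.005
HYPOTHETICAL_UNCACHED_PRICE = 0.055
blended = blended_price_per_million(
    0.90, HYPOTHETICAL_CACHED_PRICE, HYPOTHETICAL_UNCACHED_PRICE
)

print(f"Implied by the bill: ${implied:.4f} per million tokens")
print(f"Blended at 90% cache hits: ${blended:.4f} per million tokens")
```

The takeaway is structural rather than numeric: when most tokens are cache reads, the blended rate is dominated by the cached price, so a very low daily bill is compatible with a much higher uncached price.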

sourcecodeplz•about 17 hours ago

Yep, exactly this. And I have so much less anxiety that I have to use my 5-hour/weekly usage or I lose it... with deepseek api the credits never expire, I can use them when I want, how much I want and the prices are ridiculously low for the quality/intelligence/performance.

conradkay•about 16 hours ago

GPT 5.5 is maybe 4x the size of v4 pro, hard to compare price since cache hit is basically free with Deepseek, but 40x cheaper (with their 75% off) seems about right.

So ballpark same price per parameter as Simon.

x187463•about 19 hours ago

I wonder how a focus on per-token API profits will impact the incentives to improve token efficiency and drive down costs through optimized compute. I suppose as long as a few leading labs are competing, we'll see progress in this regard, but it's certainly less in their interest than it is with a flat subscription pricing model.

skeledrew•about 4 hours ago

There's also increasing pressure from China: note the massive permanent price reductions DeepSeek and Xiaomi announced in the past few days, with the possibility of more around the corner. And there's also the constant release of increasingly capable open weight models.

try-working•about 12 hours ago

Product market fit for the models, sure. But their cost structure is going to kill them as token costs drop this year, following DeepSeek and Xiaomi and soon more providers.

Hasz•about 18 hours ago

Mentioned in the article, but it cracks me up that both openai and anthropic are utilizing fairly traditional enterprise GTM plans segmented by verticals.

So many startups trying to automate sales, but somehow the two biggest frontier labs have decided that the best GTM strategy is firmly human-in-the-loop.

vfalbor2•about 7 hours ago

He is comparing token counts across different LLMs, and this is inaccurate because you cannot compare them if you don't know how each one tokenizes words; one tokenizer could be cheaper with more tokens than another because of its features. I don't agree with Simon's argument.

_verandaguy•about 18 hours ago

With respect to Simon, whose writing I've usually agreed with in the past and whose insights I've liked: this is a bad take that overlooks the extent to which corporations are imposing the use of AI on employees, and in particular ICs, who make up a majority of the AI-using workforce by headcount.

Many of us are openly having our performance reviews tied to AI use, especially at larger enterprises, whether that's measured by sheer token count or just "how many of your tasks are you using AI for these days" (combined with the implication that question carries at many orgs which are heavily invested in AI).

simonw•about 17 hours ago

Are you saying that Anthropic's huge leaps in revenue are caused by stupid company policies and token leaderboards, and the moment companies stop imposing AI on their employees revenue will drop to a point where Anthropic are unlikely to be profitable?

I don't think that's the case. I think the token leaderboard thing (which is clearly ridiculous) affects a tiny portion of companies and is already going out of fashion.

_verandaguy•about 17 hours ago

I'm saying that the truth lies somewhere in between, and that Anthropic's current revenue is being, in part, propped up artificially.

We're also in a place where a lot of the usage guidance around these tools is still nascent. People are cowboying a lot of stuff, even as larger companies start to organize AI policy/safety/responsible use working groups to try and set policy around the shortfalls of the technology.

IMO: if this technology persists, and if we figure out a way to use it in a broadly safe way, the value proposition will probably trend down rather than up, at least on the code generation front.

As a research tool, it shows some promise, though I still find the ethics of the technology disgusting.

zuzululu•about 19 hours ago

Great article. I know this upsets a lot of people who are used to thinking Anthropic/OpenAI are just lighting cash on fire, but they've cornered the market on enterprise who cannot walk away from these $200/month plans.

However, the valuations are still far, far away from actual sanity.

binary0010•about 19 hours ago

Have you tried the large open source code models?

I use glm-5.1 and occasionally DeepSeek V4.

They are as good or better than Claude's latest models.

And significantly cheaper. I've converted 3 of my engineer friends as well. All three have dropped their $200 month plans they had with anthropic.

We've all been a bit shocked at just how good these models are now.

If you have tried GLM (I specifically find it shockingly good for code), did you find it not competitive with Claude, and why?

BeetleB•about 19 hours ago

I use GLM-5.1.

It's good enough for personal stuff. It doesn't compare to the latest Opus I use at work. You can certainly argue I don't need Opus for work, but there is clearly a difference.

Also, at least with z.ai, GLM-5.1 is s l o w! After using Claude at work, I get really impatient with GLM-5.1 at home. When doing "true" vibe coding (i.e. not really examining the code), Opus is a ton faster (easily 5x).

But yeah, I'm not willing to personally pay for the frontier models. I won't even renew my annual Z.ai plan - it's become too expensive.

binary0010•about 18 hours ago

Hmm, I use opencode subscription, and glm seems just as fast from the tests I've tried to compare between the two. Tbh it mostly took Claude longer (mostly significantly longer) for the same tests.

Also, and I know you may not want to answer. But could you give me an idea of the type of thing you found glm to be worse with?

I think I've been fairly unbiased in testing a bunch of different development tasks. But am curious if maybe it performs well for some stuff and not others. So if you could share what you feel it's worse at.

Also, are you an experienced developer or less experienced?

cassianoleal•about 18 hours ago

I'll repeat something I wrote on an entirely separate HN submission.

When DeepSeek V4 Pro came out, I had been mostly coding with GLM-5.1 on a Z.ai coding plan.

I had a large analysis task on a relatively complex codebase. I decided to try the models out.

GLM-5.1 did acceptably but got a few things wrong (easily corrected) and took quite a while to get there.

Opus 4.6 burnt through the US$10 budget I had given it in about 10-15 min, without ever returning from the first prompt.

DeepSeek V4 returned a full analysis within 2-3 min, and I carried on all the way to implementing the feature I was after. Total cost less than US$1.00.

I now mostly alternate between GLM-5.1 and DeepSeek V4 Flash, with an occasional dip into V4 Pro for more complex analyses.

dominotw•about 19 hours ago

task i am working on right now at work is comparing two versions of apis and documenting differences in their outputs. i suspect a vast majority of work at enterprise is of similar complexity.

right now everyone is using latest and greatest to do dumb stuff like that. that would change fast if companies start caring about costs.

therealdrag0•about 18 hours ago

What is the best IDE UI to use them? I don’t like CLIs.

szatkus•about 14 hours ago

Cline is pretty good if you use VSCode. It's one of a few AI agent plugins that works in the left sidebar.

binary0010•about 16 hours ago

Personally I'm happy with opencode right now

thewebguyd•about 18 hours ago

> enterprise who cannot walk away from these $200/month plans

Any org with more than 150 users isn't on a $200/month plan; it's forced into API pricing + $20/month/user.

For individuals and orgs small enough to get to use the subscription plans, that's all well and good until usage limits keep going down, or cost goes up. If you compare the usage you get on $200/month maxed out vs. what that would cost at API pricing, the $200/month plan is an absolute steal. I doubt it will last long.

bigbuppo•about 18 hours ago

Not to mention the API plans are also still in their "lose money, just get the suckers hooked like addicts" phase. Once the reality-based pricing comes into play, it's a coin flip of whether the bulk of the companies fail, or they get to live off government subsidies for a few decades.

On the plus side, I'm happy I'll have a nice hay barn when the local half-built AI data center is abandoned.

simonw•about 18 hours ago

I believe that API pricing runs at a healthy margin, at least compared to the server and energy costs used to serve the tokens.

Recent conversation here on that topic: https://news.ycombinator.com/item?id=47062534#47063134

smallerfish•about 19 hours ago

> enterprise who cannot walk away from these $200/month plans

But that's the point of the article. Enterprise plans are starting to get API pricing, not the subsidized subscription pricing.

cedws•about 11 hours ago

My employer went from the Max plans to Enterprise this month which was utterly stupid. We went from paying $200 per head to like $500 for some people, even more for others. For the same product. Oh well, guess we’re doing our part to prop up this bubble.

vb-8448•about 17 hours ago

> That’s $2,180.16 worth of tokens for $200—not bad at all!

Just imagine how funny it will be if it comes out that big labs were doing some fancy maths to count the 2k$/month in their forecasts ...

jwpapi•about 10 hours ago

Unfortunately Simon drank the AI Kool-Aid.

I know everything you’ve done for the tech community, but I ask you to take some time off and reflect on this article. It’s not on par with your usual level, but the tendency has been visible over the last couple of articles.

simonw•about 10 hours ago

Genuine question: what's wrong with it?

I thought this was one of my best pieces of writing this year.

(In case you missed it, the title was meant as a subtle burn on those two companies - it's pretty absurd for them to only just be finding product-market fit when they're already valued at over a trillion dollars.)

jwpapi•about 3 hours ago

I see you trying really hard to make things right and you are everywhere in the comments. I feel a bit bad about formulating my comments a bit polemically.

Also my highest respect for responding so calmly without lowering your debating level to mine. I'll try my best to explain what I think is wrong.

To start I didn’t interpret the article as a burn.

I think it’s interesting to explain what’s wrong with it, because it seems similar to what’s wrong with AI. The issues are subtle.

- Anthropic doesn’t have a profitable quarter, it’s financial engineering (https://www.wheresyoured.at/anthropics-profitability-swindle...)

- The first argument about your subscription price doesn’t have anything to do with the overall claim of the article. If anything, it would be a weak argument for the opposite. Subsidizing prices signals a lack of PMF.

- That you hire sales people after you’ve had billions in funding is nothing surprising and doesn’t indicate PMF either way.

- AI Implementations are fresh and of course AI Failures are thin, but so are AI successes. I haven’t seen any companies creating billions of shareholder value because they’ve massively invested in AI and their competitors didn’t. You can really only look at these things in 5 to 10 years, and it is multi-faceted, including cultural acceptance.

- That they need to buy more compute to satisfy the requests is probably the strongest argument in the article, but it isn’t conclusive. The product is being sold heavily subsidized and in a hype cycle. And again both OpenAI and Anthropic have to show growth in order to justify the IPO.

- Regarding the part about revenue I refer to the linked article above, as it explains it very well.

The conclusion is reasonable given the arguments, but not the title.

However it is missing all the real discussion points that are actually in observation at the moment.

Local models as alternatives, IPO financial engineering, how AI implementation actually will perform over years... Let’s all not forget crypto. It was full of "use cases" just a few years ago. I like the idea of crypto (btc, eth) and I’m still invested, but 99.99% of coins have died on promises.

So this is not a piece of critical thinking, but this reads like a twitter thread to sell me a course :/

sandeepkd•about 16 hours ago

If we take out the circular interests and investments here then there is no way that this is a feasible business in current state.

CuriouslyC•about 18 hours ago

Companies are kool-aid drinking now due to hype, but given how much they're spending, if they don't see REAL, BIG wins from it soon, they're going to scale it back quickly and switch to Chinese models. Claude isn't worth the API cost for a lot of development work, and once companies have had time to collect and crunch data they'll see this.

grttq•about 17 hours ago

Swear people like you were hyping the frontier labs so hard not long ago.

Funny to see the change of tone - a lesson for people not to get too ahead of themselves.

CuriouslyC•about 16 hours ago

There's no change in tone, I'm still very bullish on the tech, Claude in particular just isn't worth the API price, which I've always felt was too damn high. I have paid for Gemini 2.5 Pro, Deepseek 3.2/4 and GLM 5 tokens happily though.

grttq•about 14 hours ago

Lmao you won’t admit it will you?

You financially benefit from stuff like agents. Of course you will be the last to admit publicly when things aren’t quite heading in the right direction. The gap between hype and reality is ever increasing.

Legend2440•about 19 hours ago

>Somehow this fragment turned into headlines like Uber’s COO says it’s getting harder to justify the money spent on AI tokenmaxxing, because the market for stories about AI failures remains enormous.

I notice this all over the place. Many people hate AI and want it to fail, and they're willing to invent misinformation if it supports that idea.

hansmayer•about 18 hours ago

Well, it is big news when the COO of Uber says it, no? Not quite some small consultancy shop here.

Legend2440•about 18 hours ago

But the COO did not say that. The headline was deliberately misrepresenting what he said.

uncivilized•about 17 hours ago

The article was posted on HN and discussed a day or two ago.

https://news.ycombinator.com/item?id=48268871

hansmayer•about 17 hours ago

No, he said exactly that, if you remove the corporate sanitised language designed to not offend the Uber CTO.

jsemrau•about 8 hours ago

Anthropic for sure. It's a useful professional product that I find many use-cases for in my professional and private life. OpenAI not so much.

CAX•about 7 hours ago

$50/month tops is like 60B

airstrike•about 19 hours ago

Who's to say those enterprises won't churn after XYZ comes out with a decent enough model that costs 10x less to use?

There's a whole bag of clever tricks you can play to juice short term results leading to an IPO that may not work longer term.

I'll believe they've found product-market fit when they have a product. Right now they're selling the infrastructure, in a highly subsidized and undifferentiated way (at least over a sufficiently long period of time of, say, a couple of years).

atleastoptimal•about 17 hours ago

I think this was obvious since the birth of ChatGPT

Intelligence is a universal good, it can apply to anything, and no, "human intelligence" is not the only form that is useful nor special. There are limitations to AI but also huge advantages, and it's obvious that the advantages are worth paying for, given their revenue.

signalbright•about 10 hours ago

That might be an understatement

dude250711•about 19 hours ago

> Anthropic are strongly rumored to be about to have their first profitable quarter.

Is that quarter the same as any other quarter in terms of infrastructure costs (e.g. are there any temporary discounts happening coincidentally)?

MadxX79•about 19 hours ago

Didn't xAI basically donate the compute for that quarter so Anthropic could get to say they turned a profit?

simonw•about 18 hours ago

The SpaceX S-1 says they're charging Anthropic $1.25b a month.

travelalberta•about 18 hours ago

It also states that the first few months (this current quarter where Anthropic are reporting profit) are discounted.

travelalberta•about 18 hours ago

Hey man, that discounted rate on Colossus 1 inference is purely coincidental...

raincole•about 11 hours ago

The article is at least one year too late. Claude Code has been the product-market fit. It's so obvious retrospectively that questioning it at this point would be very silly.

The problem is not whether they have PMF (they do) but how they're going to compete against on-prem and Chinese providers.

Having PMF != printing money forever.

The author claims:

> That’s $2,180.16 worth of tokens for $200

No matter what it means, rebuild the same thing you built with these $2,000 tokens with DeepSeek Pro V4 and let's see if Claude has a chance to survive.

DeathArrow•about 7 hours ago

I think we are overly fixating on Anthropic and OpenAI.

Since there are lots of models that are competitive and have a much better pricing, both OpenAI and Anthropic seem inefficient. I don't get why someone would want to buy shares after IPO apart from fomo and artificially built enthusiasm.

Anthropic and OpenAI may well be the Altavista and the Yahoo of the AI age.

aagha•about 15 hours ago

Ed Zitron begs to differ

Maojer•about 7 hours ago

Man what a disaster article

vb-8448•about 17 hours ago

I'm a huge fan of agent coding but kinda dislike this "llm evangelism".

There are still several open points (e.g. code churn, maintainability, subtle bugs a human would never introduce) that everyone with minimal programming knowledge who has seriously used an LLM agent knows about, but somehow none of these "big influencers" ever mention them (or they just say "it's your fault").

MaxPock•about 11 hours ago

The writer is on to something. Musk saw the writing on the wall and bought Cursor. AI money is in coding plans.

mschuller•about 18 hours ago

yep, and the issue is, they took investment

vonneumannstan•about 15 hours ago

HN is the least agi-pilled place on the internet

reducesuffering•about 15 hours ago

HN has been on every wrong side of AGI predictions since the founding article on OpenAI...

marcosdumay•about 13 hours ago

You mean that HN has been predicting AGI will happen, or you mean the Singularity happened earlier today and I missed that humanity has already been killed?

willsmith72•about 8 hours ago

can we stop throwing around the term "profitable"? didn't they say "operating profit"? so so different to actually making a profit

emsign•about 4 hours ago

They've finally forced the system to fake profits. Attacking personal computing, sucking the rest of the industry dry, yeah eventually you can make people pay for your shit. I still pray these vultures will implode without too-big-to-fail safety nets. And big daddy Trump can't help them anymore by fixing the system for them, no matter how many billions they are paying into his personal pockets.

The future is small models, nobody really needs big compute in the long run, that's why big tech is going for our personal hardware. So we won't become their competitor in their rent only economy. True competition is eliminated, natural evolution is being fixated by the government. This is not going to end well for the USA.

einpoklum•about 3 hours ago

Sounds like mostly-baseless praise for the LLM behemoths / capital investment black holes.

> I currently subscribe to the $100/month Max plan from Anthropic and the $100/month Pro plan from OpenAI

... which already indicates a bias.

> If you are a heavy user of coding agents these plans are a fantastic deal... that’s $2,180.16 worth of tokens for $200—not bad at all!

Thank you for the sales pitch. Perhaps go buy a car and tell us how [insert your manufacturer] has "found product-market fit"?

> Anthropic are strongly rumored to be about to have their first profitable quarter.

First, the strong rumor is a claim by Anthropic itself. But even assuming that's true - it's an "operating profit", i.e. disregarding the massive capital expenses for years, and it may also disregard ongoing capital expenses, if those happen not to be taken this particular quarter.

> 1 Trillion .. companies spending $200+/month/user will get you there a whole lot faster

First note the use of the first person plural to talk about Anthropic and OpenAI.

But that aside - most companies aren't paying $200 USD/user-month. But even if they were - if we take the 30 Million SW developers mentioned in trjordan's comment as subscribers, that's 2400/user-year * 30 Million = 72 Billion USD / year. And this is already rather optimistic, but - want to double that number of subscribers? Fine, make it 150 Billion / year. Still not there with a rosy outlook and assuming the hype and enthusiasm continue for many years.

And those rosy estimates are more likely than not unrealistic. I am reminded of this review of some empirical research regarding the benefit of LLM/AI use:

https://cmr.berkeley.edu/2025/10/seven-myths-about-ai-and-pr...
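The subscription arithmetic in this comment can be reproduced with a short Python sketch. The $200 per user-month price, the 30 million developers, and the $1 trillion per year target all come from the thread. The function names are made up for illustration, and the model ignores API revenue, churn, and pricing tiers.

```python
def annual_subscription_revenue(price_per_user_month: int, subscribers: int) -> int:
    """Yearly revenue if every subscriber pays the same monthly price."""
    return price_per_user_month * 12 * subscribers


def subscribers_needed(target_annual_usd: int, price_per_user_month: int) -> float:
    """Subscribers required to reach a yearly revenue target at a flat price."""
    return target_annual_usd / (price_per_user_month * 12)


base = annual_subscription_revenue(200, 30_000_000)     # 30M developers
doubled = annual_subscription_revenue(200, 60_000_000)  # twice as many subscribers
needed = subscribers_needed(1_000_000_000_000, 200)     # $1T per year target

print(f"30M subscribers: ${base / 1e9:.0f}B per year")
print(f"60M subscribers: ${doubled / 1e9:.0f}B per year")
print(f"Subscribers needed for $1T per year: {needed / 1e6:.0f}M")
```

Doubling gives $144B, which the comment rounds up to 150 Billion. Under the same flat-price assumption, a $1T per year target would need more than 400 million subscribers.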

claysmithr•about 11 hours ago

I’m happy with spending zero dollars on ai lol..

stego-tech•about 18 hours ago

The big assumption with all of these sorts of analyses is that things will continue as they are for the foreseeable future.

In hype-driven markets, you cannot be certain of that.

Let's take a view that the author is right: coding agents and their associated harnesses were the inflection point for some degree of profitability and widespread consumption, and that these tools are now yet another SaaS subscription or API bucket expense to bake into every single developer (or developer-adjacent) in the organization alongside your collab suite, HR seat, CRM seat, design seat, etc. To be fair I honestly think that's a safe assumption to make for highly technical firms whose image is derived from remaining on the cutting edge of things.

That begs the following questions, which we won't know until IPOs start happening:

* Are subscriptions profitable, or just API consumption?

* What's the run rate when we just consider subscription-based usage like Claude Code and Codex? What about API calls?

* Is there any profitable pathway forward at which enterprises can get unlimited usage but at fixed rates via subscription?

* What does customer churn look like for subscription users versus API users?

We also have a number of questions for customers that I suspect we'll start seeing receipts for in the coming months, at least from the early adopters:

* What was the net gain (loss) from leveraging coding agents?

* What's the cost of a developer with or without access to a coding agent + harness? Is it cheaper to hire an outsourced worker with a coding agent subscription, or a domestic worker without one?

* At what point does further AI spend result in diminishing returns, i.e. where's the 'sweet spot' for spend?

* Did AI boost actual revenue and outcomes, or did it just gamify KPIs?

* What roles or work did AI actually replace, versus merely displace during the hype cycle?

Not to mention the questions regarding the technology itself:

* Will we develop the means to run foundational/frontier models at edge using less resources through some existing (e.g. distillation) or new technology, thus cutting off the profit centers of these firms?

* When the market mismatch between supply and demand is resolved, won't it be more affordable for consumers and companies to operate their own AI infrastructure rather than support further centralized buildouts?

* Will coding agents improve to the point of being able to bootstrap and self-orchestrate on edge/consumer hardware without substantial technical expertise, or at least improve to the point that traditional IT teams can securely operate them internally without an expensive subscription or API token bucket?

All of these will influence the long tail of this bubble, because it is a bubble at this point. Even if these companies are indeed profitable thanks to the coding agent inflection point, there's still so many unanswered questions about utility beyond coding that it's impossible to extrapolate a future. If coding agents are indeed the extent of utility for profitability, then there's no possible way these entities will recoup the investment already sunk into their infrastructure buildouts. Even if more profitable uses are discovered, does this offset or replace the firms disappearing due to AI speculation and their associated contributions to the economy as a whole (RE: the consumer compute industry at present, higher energy costs due to datacenter builds, opportunity cost from harms to local infrastructure from haphazard builds, etc)? Should these firms indeed be runaway successes and immensely profitable to the point of paying off their investors and growing the larger economy, does this end up stifling innovation in a world where most new ideas are fed into LLMs for R&D that are then controlled by only a handful of companies and immensely wealthy people, via systems that are easily surveilled and stolen from without recourse?

So many, many questions yet to be answered. Betting the farm because of coding agents is one hell of a gamble.

epolanski•about 16 hours ago

Wait till people figure out they can swap Claude code for DS4 Pro and spend a fraction in API billing (actually, significantly less than $100) while barely noticing a difference.

hansmayer•about 18 hours ago

> Anthropic are strongly rumored to be about to have their first profitable quarter

No, it's more like their own leak to WSJ and according to Ed Zitron -> seems to be heavily engineered via non-GAAP practices such as counting potential, but not realised revenue as actual revenue - the stuff for which I would be arrested if I did it at my company.

Also it appears according to Ed's analysis - strangely they seem to be projecting only that one quarter as profitable - potentially to calm the investors ahead of the IPO. Investor fraud anyone?

cootsnuck•about 17 hours ago

Also it was but a few months ago that their CFO said, in a court filing, that Anthropic's revenue across the entire lifetime of the company "exceeds $5 billion". Pretty strange.

https://www.reuters.com/commentary/breakingviews/anthropic-g...

jonas21•about 17 hours ago

How is it strange? The "exceeds $5B" quote was from December 2025. Anthropic has seen tremendous growth since then, ever since Claude Code with Opus 4.5 got really good at coding.

If you've ever been at a startup, this is exactly what it looks like when you go from not having product-market fit to having it (though with a few extra zeros on the end compared to most).

hansmayer•about 17 hours ago

Ah yes, December 2025...such a long, long time ago...

peteforde•about 15 hours ago

Ed is a smart guy, but you or anyone basing your opinion on what one eloquent journalist says is ultimately a risky bet, no matter how much his reporting hits your particular dopamine receptors.

Please don't forget that Ed's entire brand identity is now 1:1 with exposing "AI" as a giant, unmitigated failure.

That's a very specific flow chart to hook your caboose to when none of this is even remotely close to endgame.

HerbManic•about 14 hours ago

Pretty much. Ed does a lot of great work in digging through all this stuff but his conclusions always feel far too doomer oriented. OpenAI should have closed 5 times by now if you have been following his assertions from the start.

Big parts of what he says will turn out to be true once the rubble settles, but it will not be anywhere near what he is predicting. How that will shape out may not be great for the average person: what money shuffling tricks will be used? But it won't be a total wreck.

peteforde•about 13 hours ago

> It won't be a total wreck.

Honestly, I think it's very short-sighted to assume that all of this will be seen as any kind of wreck in the long term.

Normies are still catching up and reacting to chat-based LLMs.

HN types are further ahead of the curve, but still catching up and reacting to agentic coding and design workflows.

What often gets completely ignored is that entirely new modalities for how the underlying tech can be applied will continue to be demonstrated, and those will once again cause new ripples of excitement and disgust.

There are companies building world models and systems for protein discovery. Comparatively speaking, these approaches are barely in the zeitgeist today.

Deciding that we already have the data points we need to extrapolate how all of this plays out is like someone in 1974 deciding that microprocessors are just for accounting and inventory. Don't be that someone.

yogthos•about 14 hours ago

We don't have to take Ed's word for it. Anybody who's capable of doing grade school math can see that the numbers simply don't work. These companies are literally spending orders of magnitude more money than they're actually bringing in. Cursor, who've been renting Claude, estimated just recently that a $200-per-month Claude Code subscription could use up to $2,000 in compute. https://www.forbes.com/sites/annatong/2026/03/05/cursor-goes...

simonw•about 14 hours ago

Interesting story. Here's what it says:

> According to a person familiar with the company’s internal analysis, Cursor estimated last year that a $200-per-month Claude Code subscription could use up to $2,000 in compute, suggesting significant subsidization by Anthropic. Today, that subsidization appears to be even more aggressive, with that $200 plan able to consume about $5,000 in compute, according to a different person who has seen analyses on the company’s compute spend patterns.

The load-bearing detail here is whether that means $2,000 of internal server+electricity costs, or $2,000 if they were to charge at their API pricing instead of the subscription cost.

The latter is how I understand these things to work right now. If it's the former then yeah, Anthropic are losing a TON of money on those subscriptions.
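To make the two readings concrete, here is a rough sketch. The $200 price and the $2,000 / $5,000 compute figures come from the Forbes quote; `ASSUMED_API_MARGIN` is purely a hypothetical placeholder, since the real markup of API pricing over serving cost is not public.

```python
# Figures taken from the Forbes quote above.
SUBSCRIPTION_PRICE = 200.0               # USD per month
COMPUTE_ESTIMATES = [2_000.0, 5_000.0]   # USD per month, "up to" figures

# HYPOTHETICAL assumption: share of API list price that is gross margin.
# The real number is unknown; 0.6 is only a placeholder for illustration.
ASSUMED_API_MARGIN = 0.6


def monthly_loss(price: float, compute: float, api_margin: float) -> float:
    """Loss per subscriber-month for a given reading of the compute figure.

    api_margin = 0.0 treats `compute` as raw serving cost (first reading).
    A positive api_margin treats `compute` as API list price and discounts
    it down to an implied serving cost (second reading).
    """
    serving_cost = compute * (1.0 - api_margin)
    return serving_cost - price


for compute in COMPUTE_ESTIMATES:
    at_cost = monthly_loss(SUBSCRIPTION_PRICE, compute, 0.0)
    at_api_price = monthly_loss(SUBSCRIPTION_PRICE, compute, ASSUMED_API_MARGIN)
    print(f"compute=${compute:,.0f}: "
          f"loss ${at_cost:,.0f} if it is raw cost, "
          f"${at_api_price:,.0f} if it is API list price")
```

Under the placeholder margin the heaviest users still cost more than they pay, but the gap is several times smaller than the headline number suggests. Change `ASSUMED_API_MARGIN` and the conclusion moves with it, which is why the distinction matters.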

surgical_fire•about 17 hours ago

Also, if I understand correctly, they are rumored to have a profitable EBITDA.

It's a funny metric considering Depreciation is a huge cost for them.

"We are profitable when we don't count our expenses"

skybrian•about 17 hours ago

There's a good reason to look at it separately: if inference is profitable then they make money (or at least lose less money) when they get more customers, because any fixed costs are spread across more usage.
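A minimal sketch of that argument, with entirely made-up numbers: if each customer pays more than it costs to serve them, every extra customer chips away at the fixed costs (training, for example), and there is some customer count at which the business breaks even.

```python
def profit(customers: int, price: float, marginal_cost: float,
           fixed_cost: float) -> float:
    """Total profit: per-customer margin times customers, minus fixed costs."""
    return customers * (price - marginal_cost) - fixed_cost


def break_even_customers(price: float, marginal_cost: float,
                         fixed_cost: float) -> float:
    """Customers needed to cover fixed costs; infinite if margin is not positive."""
    margin = price - marginal_cost
    if margin <= 0:
        return float("inf")
    return fixed_cost / margin


# HYPOTHETICAL inputs, chosen only to illustrate the shape of the argument.
PRICE = 100.0           # USD per customer per month
MARGINAL_COST = 60.0    # USD of inference per customer per month
FIXED_COST = 1e9        # USD per month of training and other fixed spend

print(break_even_customers(PRICE, MARGINAL_COST, FIXED_COST))
print(profit(10_000_000, PRICE, MARGINAL_COST, FIXED_COST))
```

The sketch only holds if the per-customer margin is actually positive, and it says nothing about whether the break-even customer count fits inside the real market.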

dminik•about 14 hours ago

Assuming that there are infinite suckers with cash to spend. It's entirely possible (if unlikely) that the market is not big enough to cover the training costs. Especially for multiple companies all burning insane amounts of money on the regular.

surgical_fire•about 17 hours ago

Depreciation is part of the cost of inference. Inference happens in GPUs that have a relatively short lifespan.

Those GPUs are very expensive.

Inference is expensive because a GPU can only process a certain amount of requests in a given timeframe. Remember that Anthropic is constrained in compute.

If they are constrained, it means that those GPUs are not idle. If they have more customers, they will need more GPUs.

If they have to play silly games using EBITDA to be "profitable", then it means that they need to ramp up prices a lot more than they already did.

Which is why in these discussions I always say that inference is also extremely expensive. Too many people like to pretend without any evidence that inference is cheap.
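A toy calculation of the EBITDA point. Every number here is invented for illustration (GPU price, lifespan, revenue, and operating cost are all assumptions, not data from any lab); the only claim is about how the two metrics can diverge.

```python
from dataclasses import dataclass


@dataclass
class GpuEconomics:
    # All values are HYPOTHETICAL placeholders, not real figures.
    gpu_price: float         # USD purchase price per GPU
    lifespan_years: float    # useful life before replacement
    revenue_per_year: float  # USD of inference revenue per GPU per year
    opex_per_year: float     # USD of power, hosting, staff per GPU per year

    def ebitda(self) -> float:
        """Earnings before depreciation: revenue minus operating costs."""
        return self.revenue_per_year - self.opex_per_year

    def depreciation(self) -> float:
        """Straight-line depreciation per year over the GPU's lifespan."""
        return self.gpu_price / self.lifespan_years

    def operating_profit(self) -> float:
        """EBITDA minus depreciation (interest, tax, amortization ignored)."""
        return self.ebitda() - self.depreciation()


gpu = GpuEconomics(gpu_price=30_000, lifespan_years=3,
                   revenue_per_year=12_000, opex_per_year=4_000)
print("EBITDA per GPU-year:          ", gpu.ebitda())
print("Depreciation per GPU-year:    ", gpu.depreciation())
print("Operating profit per GPU-year:", gpu.operating_profit())
```

With these placeholder inputs the GPU looks profitable on an EBITDA basis and unprofitable once its own wear is counted. A shorter `lifespan_years` makes the gap worse, and a longer one can close it.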

downrightmike•about 15 hours ago

Bubble popped when they increased prices. IPO may help cover some of the costs, but AI is very elastic and can be swapped out for any other company second to second. Which is why I think they bought up ram and disks like they did, to starve out competition and local models.

grttq•about 14 hours ago

Correct. That’s exactly right.

The move to buy up ram is straight out of an industrial organisation textbook.

downrightmike•about 8 hours ago

It was the root concession that scale will not solve AGI

duped•about 18 hours ago

AI companies/users are filled with liars and grifters, so any numbers/outlook they report should be highly suspect.

supern0va•about 17 hours ago

I must admit that I am going to find it fascinating when we hit the point where it becomes nearly impossible to deny the efficacy of these tools. I have straight up had people, even in real life, suggest that I'm lying about my productivity gains or what I'm able to accomplish with them.

Like, I understand the reasonable arguments against (I even agree with a few), but it's clear that some people have fully inserted their head into the sand and just don't want to believe any of this could be true. Which will be harsh, since I think getting hit with this train all at once in the future is going to be a rougher ride than a slower coming-to-terms-with, even if the result is one we're unhappy with.

duped•about 15 hours ago

I don't deny their efficacy, I'm saying that there's a massive crop of grifters and liars building them and using them.

hansmayer•about 16 hours ago

In the meanwhile, Google AI search still says the next year after 2026 will be 2028.

bflesch•about 18 hours ago

There's a saying "the fish stinks from the head".

pier25•about 18 hours ago

Yeah I'll believe it when I see it. Revenue is increasing but so are their costs.

Back in 2024 their CEO claimed training costs would rise to $10-100B in the next years.

https://www.tomshardware.com/tech-industry/artificial-intell...

aspenmartin•about 17 hours ago

That's not that far off. It costs something like $100Ms to train a frontier coding agent model today, billions if you count the full pipeline. Combine that with the infra we're building out and the fact that multiple labs are building similarly scaled models, and the industry-wide costs of training frontier models could easily surpass $10B/yr in 2027.

pier25•about 17 hours ago

Yes, when he made that claim back in 2024 they were spending like $100M to train a model.

hansmayer•about 17 hours ago

Their CEO claims a lot of wild shit. He claimed in January this year that "in 6 months", i.e. about 2-3 weeks from now, AI will be doing all of SWE work. Let's hold these people accountable for a change!

aspenmartin•about 17 hours ago

> "in 6 months" that AI will be doing all of SWE work

I assume this is the quote you're referring to from Davos?

"I have engineers within Anthropic who say I don’t write any code anymore. I just let the model write the code, I edit it. I do the things around it… we might be six to twelve months away from when the model is doing most, maybe all of what SWEs do end to end."

That was in Jan, he said "might", and he said 6-12 months. Yes! Let's hold him accountable for saying reasonable things!

supern0va•about 17 hours ago

I work in big tech and probably 90% of code over the last month has been written by AI. And I suspect it's probably higher within Anthropic, which is probably what he's basing his opinion on.

So, he's closer to correct than not.

That said, your recollection is also flawed. It was in mid-March, and here are the relevant quotes:

>I think we’ll be there in three to six months—where AI is writing 90 percent of the code. And then in twelve months, we may be in a world where AI is writing essentially all of the code.

[...]

>But the programmer still needs to specify, you know, what are—what are the conditions of what you’re doing, what—you know, what is the overall app you’re trying to make, what’s the overall design decision? How do we collaborate with other code that’s been written? You know, how do we have some common sense on whether this is a secure design or an insecure design?

[...]

>So as long as there are these small pieces that a programmer, a human programmer, needs to do, the AI isn’t good at, I think human productivity will actually be enhanced. But on the other hand, I think that eventually all those little islands will get picked off by AI systems.

With another 3-4 months left on the clock, his prediction seems remarkably on point for at least certain organizations and domains.

I welcome you to also hold yourself accountable in the coming months if this trend continues. ;)

sampli•about 17 hours ago

Elon playbook

supern0va•about 17 hours ago

>according to Ed Zitron

So, unsourced vibes from a shady guy whose entire empire is built on being against AI?

I genuinely don't know how folks can continuously buy into anything he has to say after that Wired piece. The credibility there is seriously lacking.

Please, continue to be skeptical of the labs. But people need to stop talking about this dude as if he's the Holy Grail of the anti-AI movement. It's going to blow up in y'alls faces.

overgard•about 15 hours ago

Ed actually provides sources and goes into an incredible amount of detail as to how he came to his conclusions. The average AI booster just goes "I totally built ten businesses off vibe coding but I can't tell you anything because it's a SECRET!". And the mainstream tech media is so in the pocket of big tech and AI corporations that they might as well just publish their PR emails at this point. Yeah, I'll listen to Ed thank you very much.

I think it's telling that most critics don't address his actual points, but instead his credibility because he's a "hater".

supern0va•about 11 hours ago

Ed actually seems to make some really serious errors in his work. Tim Lee called out a particularly egregious one here, though it's one of many: https://x.com/binarybits/status/2050562429709377986

That said, I really mean it when I say that I don't actually think Ed is a good choice for the anti-AI movement. I think an actual opposition is useful, but he ain't it.

I really recommend you read the Wired profile if you haven't yet and form your own opinion: https://www.wired.com/story/ai-pr-ed-zitron-profile/

hansmayer•about 17 hours ago

> So, unsourced vibes from a shady guy whose entire empire is built on being against AI?

Actually he provides sources when he analyses stuff and imho much better than the usual corporate "Sam Altman says we should ask ChatGPT how to raise babies" crap. Also, I don't know many 'shady' guys who have built entire "empires", nor does he seem to actually have an empire. Usually being shady means you are kind of unknown and all. I am not glorifying Ed, don't even know him personally. I am not even impressed with his writing style much to be honest. But he brings important facts and information to light, which otherwise would have been lost in the cacophony of corporate media light treatment of these con-men. Holy Grail? Blowing up in our faces? WTF are you talking about?

supern0va•about 17 hours ago

>Actually he provides sources when he analyses stuff and imho much better than the usual corporate

You said it was likely an internal leak to the WSJ "according to Ed Zitron". Did Ed have a source for that, or was it just vibes?

hansmayer•about 18 hours ago

> I currently subscribe to the $100/month Max plan from Anthropic and the $100/month Pro plan from OpenAI. If you are a heavy user of coding agents these plans are a fantastic deal.

Agreed. But it's only a great deal because it is heavily subsidized, as you said yourself. Enjoy it while it lasts, but in my book, product-market fit means something along the lines of "a product which enjoys a loyal customer base, sold at a price perceived as fair by the customers, and generating profit". How many of these does your definition of product-market fit hit here?

bellowsgulch•about 18 hours ago

How will they stay profitable if every business lays off engineers because of AI and there are no engineers to use it? /s

827a•about 18 hours ago

[flagged]

tomhow•about 8 hours ago

We detached this comment from https://news.ycombinator.com/item?id=48297777 and marked it off topic.

The guidelines specifically ask us to avoid nasty comments like this. These lines in particular make it clear why your comment is out of line:

Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

Please don't fulminate. Please don't sneer, including at the rest of the community.

Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

Eschew flamebait. Avoid generic tangents.

https://news.ycombinator.com/newsguidelines.html

simonw•about 18 hours ago

Fun fact: the $20/month subscription fee for ChatGPT Plus - which set the standard for at least a couple of years - really was an arbitrary decision made based on a Google form: https://simonwillison.net/2025/Aug/12/nick-turley/

enraged_camel•about 18 hours ago

I wonder how Ed Zitron will shift goal posts this time, and how long it will take for that article, when published, to reach HN front page.

wewewedxfgdf•about 17 hours ago

Simon Willison just hit the "Publish to top of HN" button.

simonw•about 17 hours ago

Wish I'd hit that one the other day on this one, which I cared a lot more about: https://news.ycombinator.com/item?id=48228321