Discussion (82 Comments) | Read Original on HackerNews
And on Jess' comments on validating docs vs generating them... It's a traditional locking problem, with traditional solutions. And it's not as if the agent cannot read git, and realize when one thing is done first, in anticipation of the other by convention.
I'm quite senior: in fact, I have been a teammate of a couple of the people mentioned in this article. I suspect that they'd not question my engineering standards. And yet I've not seen any of that kind of debt in my LLM workflows: if anything, by most traditional forms of evaluating software quality, the projects I work on are better than what they were 5, 10 years ago, using the same metrics as back then. And it's not magic or anything; it comes down to making sure there are agents running that share those quality priorities. But I am getting work done, instead of spending time looking for attention in conferences.
> if anything, by most traditional forms of evaluating software quality, the projects I work on are better than what they were 5, 10 years ago, using the same metrics as back then.
In this side sentence you're introducing so much vagueness. Can you share insights to get some validation on your claim? What metrics are you using, and how does your code from 10, 5, and 0 years ago perform on them?
I feel throwing in a vague claim like that unnecessarily dilutes your message and distracts from the point. But, if you do have more to share I'd be curious to learn more.
- 40x speed improvement
- Painless env setup
- 20 Second deploy
- 90+% test coverage
- Ability to quickly refactor
- Documentation
(The original system that I wrote with one other programmer 20 years ago took 1.5+ years to write. Modern rewrite: 2 days)
I'm a proponent of architectural styles like MVC, SOLID, hexagonal architecture, etc, and in pre-LLM workflows, "human laziness" often led to technical debt: a developer might lazily leak domain logic into a controller or skip writing an interface just to save time.
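To make the "leak domain logic into a controller" and "skip writing an interface" shortcuts concrete, here is a minimal sketch of the tidy version. Every name in it (`OrderRepository`, `DiscountService`, `InMemoryOrders`, `discount_controller`) and the discount rule are hypothetical, made up for illustration and not taken from any project mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Protocol


class OrderRepository(Protocol):
    """The port (interface) that is tempting to skip under time pressure."""

    def total_for(self, customer_id: int) -> int: ...


@dataclass
class DiscountService:
    """Domain rule lives here, independent of HTTP or storage details."""

    repo: OrderRepository

    def discount_percent(self, customer_id: int) -> int:
        # Hypothetical rule: 10% off once lifetime spend reaches 1000.
        return 10 if self.repo.total_for(customer_id) >= 1000 else 0


class InMemoryOrders:
    """A throwaway adapter that satisfies the port structurally."""

    def __init__(self, totals: dict):
        self.totals = totals

    def total_for(self, customer_id: int) -> int:
        return self.totals.get(customer_id, 0)


def discount_controller(service: DiscountService, customer_id: int) -> dict:
    # The controller only translates between transport and domain;
    # it contains no business rule of its own.
    return {
        "customer_id": customer_id,
        "discount_percent": service.discount_percent(customer_id),
    }
```

The lazy version would put the `>= 1000` check directly in the controller and query storage inline. It is a few lines shorter, which is exactly the saving that used to make it tempting.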
The code I get the LLM to emit is a lot more compliant with those BUT there is a caveat that the LLMs do have a habit of "forgetting" the specific concerns of the given file/package/etc, and I frequently have to remind it.
The "metric" improvement isn't that the LLM is a better architect than a senior dev; it's that it reduces the cost of doing things the right way. The delta between "quick and dirty" and "cleanly architected" has shrunk to near zero, so the "clean" version becomes the path of least resistance.
I'm seeing fewer "temporary" kludges because the LLM almost blindly follows my requests.
It mostly works. CC's plan mode creates a plan by cleaning up first, then defining narrow, integrated steps. Mentioning "subtractive" and "yagni" appears to be a reliable enough way for an LLM to choose a minimal path.
To my mind these instructions remain incantations and I feel like an alchemist of old.
I’m trying out another, what I call the principle of path independence. It’s the idea that the code should reflect only the current requirements, and not the order in which functionality was added — in other words, if you should decide to rebuild the system again from scratch tomorrow, the code should look broadly similar to its current state. It sort of works even though this isn’t a real thing that’s in its training data.
works for me.
It works relatively well but not always.
Practical LLM Coding
Like self-driving cars, at least you remember the scenery along the way, but now it just teleports you to another place and then shows you the recording.
This kind of review is ineffective. Such ghosted code might be acceptable for small tools, but for databases or similar systems, it's really worrying.
I've now basically stopped granting the agent any write permissions and returned to how I wrote code 2 years ago, with manual QA. The thing is, it's actually more efficient in terms of tokens and results.
That's just my personal experience.
My counter is that technical intent, in the way he is describing it, only exists because we needed to translate human intent into machine language. You can still think deeply about problems without needing to formulate them as domain-driven abstractions in code. You could mind map it, or journal about it, or put post-it notes all over the wall. Creating object-oriented abstractions isn't magic.
It’s similar to how doing math in natural language without math notation is cumbersome and error-prone.
Using a formal language also helps you enter a kind of flow. And then details you did not think about before using the formal language may appear. Not everything can be prompted, just like Alex Honnold prepared his climb of El Capitan very carefully but it was only when he was on the rock that he made the real decisions. Same for Lindbergh when he crossed the Atlantic. The map is not the territory.
If you invent a formal language that is easy to read and easy to write, it may look like Python... Then someone will probably write an interpreter.
We have many languages, senior people who know how to use them, who enjoy coding and who don't have a "lack of productivity" problem. I don't feel the need to throw away everything we have to embrace what is supposed to be "the future". And since we need good devs to read LLM-generated code, how do we remain good devs if we don't write code anymore? What's the point of being up to date in language x if we don't write code? Remaining good at something without doing it is a mystery to me.
"A sufficiently detailed specification is code"
If you are thinking through deterministic code, you are thinking through the manipulation of bits in hardware. You are just doing it in a language which is easier for humans to understand.
There is a direct mapping of intent.
No I'm not. If I want the machine to evaluate 2+2, I don't know or care what bits in hardware it uses to do that (as long as it doesn't run out of memory), I just want the result to come back as 4.
When you think through what will happen as a result of deterministic code, you are also thinking through what the bits will do, albeit at a higher level of abstraction.
When you ask an LLM to do something, you have no guarantee that the intent you provide is accurately translated, and you have no guarantee you’ll get the result you want. If you want your answer to 2+2 to always be 4, you shouldn’t use a non deterministic LLM. To get that guarantee, the bit manipulation a machine does needs to be logically equivalent to the way you evaluate the question.
That doesn’t mean you can’t minimize intent distortion or cognitive debt while using LLMs, or that you can’t think through the logic of whatever problem you’re dealing with in the same structured way a formal language forces you to while using them. But one of my pet peeves is comparing LLMs to compilers. The nondeterminism of LLMs and lack of logical rigidity makes them fundamentally different.
The whole purpose of an abstraction is to not have to look underneath it to make sure what you did with the abstraction is still correct. You can be sure because you, or someone you trust, did the work of paying that debt once.
With LLMs you always need to verify the output, for every generation you need to pay that debt. So it is not an abstraction.
The interpreter is deterministic but LLMs aren't.
Reading the Hacker News comments, I kept thinking that programming is fundamentally about building mental models, and that the market, in the end, buys my mental model.
If we start from human intent, the chain might look something like this:
human intent -> problem model -> abstraction -> language expression -> compilation -> change in hardware
But abstraction and language expression are themselves subdivided into many layers. How much of those layers a programmer can afford not to know has a direct effect on that programmer’s position in the market. People often think of abstraction as something clean, but in reality it is incomplete and contextual. In theory it is always clean; in practice it is always breaking down.
Depending on which layer you live in, even when using the same programming language, the form of expression can become radically different. From that point of view, people casually bundle everything together and call it “abstraction” or “intent,” but in reality there is a gap between intent and abstraction, and another gap between abstraction and language expression. Those subtle friction points are not fully reducible.
Seen from that perspective, even if you write a very clear specification, there will always be something that does not reduce neatly. And perhaps the real difference between LLMs and humans lies in how they deal with that residue.
Martin frames the issue in a way that suggests LLM abstractions are bad, but I do not fully agree. As someone from a third-world country in Asia, I have seen a great deal of bad abstraction written in my own language and environment. In that sense, I often feel that LLM-generated code is actually much better than the average abstractions produced by my Asian peers. At the same time, when I look at really good programming from strong Western engineers, I find myself asking again what a good abstraction actually is.
The essay talks about TDD and other methodologies, but personally I think TDD can become one of the worst methodologies when the abstraction itself is broken. If the abstraction is wrong, do the tests really mean anything? I have seen plenty of cases where people kept chasing green tests while gradually destroying the architecture. I have seen this especially in systems involving databases.
The biggest problem with methodology is that it always tends to become dogma, as if it were something that must be obeyed. SOLID principles, for example, do not always need to be followed, but in some organizations they become almost religious doctrine. In UI component design, enforcing LSP too rigidly can actually damage the diversity and flexibility of the UI. In the end, perhaps what we call intent is really the ability to remain flexible in context and search for the best possible solution within that context.
From that angle, intent begins to look a lot like the reward-function-based learning of LLMs.
That being said, even model-based design (MBD) has largely been a failure, despite it being about mapping formal models to (formal-language) program code.
There is the famous bowling game TDD example where the result doesn't have a frame object, and they argue they proved you don't need one. That is wrong though: the example took just a couple of hours, and there is nothing so bad in a two-hour program that you will regret it. If you were doing a real bowling system with pin setters, support for 50 lanes and a bunch of other things that I, who don't work in that area, don't even know about, you will find places to regret things.
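For anyone who hasn't seen the kata, a frame-less scorer usually looks something like the sketch below. This is my own reconstruction of the general shape, not the code from the example being discussed, and it assumes the input is a complete, valid game given as a flat list of pins per roll.

```python
def score(rolls):
    """Score a complete ten-pin game from a flat list of pins per roll."""
    total = 0
    i = 0  # index of the first roll of the current frame
    for _ in range(10):  # frames exist only as loop iterations, not objects
        if rolls[i] == 10:
            # Strike: 10 plus the next two rolls as bonus.
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:
            # Spare: 10 plus the next roll as bonus.
            total += 10 + rolls[i + 2]
            i += 2
        else:
            # Open frame: just the two rolls.
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total
```

Here the "frame" is implicit in the index arithmetic. That is fine at this size, but as soon as you need per-frame display, lane state, or partial games, the missing concept has to be recovered from `i`.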
What I meant is that, like any powerful tool, there are situations where it shouldn't be used.
Thanks for the thoughtful comment.
It’s easier to keep the balance by keeping everything simple and maintaining a good hygiene in your codebase.
I agree! You often see this realized when projects slowly migrate to using more and more ctypes code to try and back out of that pit.
In a previous job, a project was spun up using Python because it was easier and the performance requirements weren't understood at that time. A year or two later it had become a bottleneck for tapeout, and when it was rewritten most of the abstract architecture was thrown out with it, since it was all Pythonic in a way that required a different approach in C++
I realize that most researchers use AI to assist with writing, but when the topic of your paper is "cognitive surrender", I struggle to take any content in there seriously.
This is disgusting
In short:
1. stacked-commits automation (cannot skip writing context/why/verify sections)
2. product specs (full ERD: https://excalidraw.com/#json=WT-oRUdyKBhAsDZJ3NwAR,WAbVgfO39...)
3. linking specs to code via SCIP indexes, and commits to ACs, later you can attach anything you want
A concrete example, I had a set of python models that defined a database schema for a given set of logical concepts.
I added a new logical concept to the system, very analogous to the existing logical set. Claude decided that it should just re-use the existing model set, which worked in theory, but caused the consumers to have to do all sorts of gymnastics to do type inference at runtime. It "worked", but it was definitely the wrong layer of abstraction.
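A rough sketch of what that kind of reuse tends to look like, versus giving the new concept its own model. The names (`Record`, `Order`, `Refund`) and fields are invented for illustration; the actual models in the comment above aren't shown.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Reused model: one class stretched to cover two concepts.
@dataclass
class Record:
    id: int
    kind: str  # "order" or "refund", only known at runtime
    amount_cents: int
    original_order_id: Optional[int] = None  # only meaningful for refunds


def describe_reused(r: Record) -> str:
    # Consumers have to inspect fields to find out what they are holding,
    # and guard against combinations the model cannot rule out.
    if r.kind == "refund":
        if r.original_order_id is None:
            raise ValueError("refund without original order")
        return f"refund of order {r.original_order_id}"
    return f"order {r.id}"


# Separate models: the type itself carries the distinction.
@dataclass
class Order:
    id: int
    amount_cents: int


@dataclass
class Refund:
    id: int
    amount_cents: int
    original_order_id: int  # required, so the invalid state cannot exist


def describe(item: Union[Order, Refund]) -> str:
    if isinstance(item, Refund):
        return f"refund of order {item.original_order_id}"
    return f"order {item.id}"
```

Both versions "work", but only the second lets a type checker tell a consumer what it has been handed.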
In the olden days we used Duff's devices and manually unrolled loops with duplicated code that we wrote ourselves.
Now, the compiler is "smart" enough to understand your intent and actually generates repeated assembly code that is duplicated. You don't care that it's duplicated because the compiler is doing it for you.
I've had some projects recently where I was using an LLM where I needed a few snippets of non-trivial computational geometry. In the old days, I'd have to go search for a library and get permission from compliance to import the library and then I'd have to convert my domain representations of stuff into the formats that library needed. All of that would have been cheaper than me writing the code myself, but it was non-trivial.
Now the LLM can write for me only the stuff I need (no extra big library to import) and it will use the data in the format I stored it in (no needing to translate data structures). The canon says the "right" way to do it would be to have a geometry library to prevent repeated code, but here I have a self contained function that "just works".
I've had several bugs that required manual intervention (yes, even with $YOUR_FAVORITE_MODEL -- I've tried them all at this point). After the first few sessions of deleting countless lines of pointless cruft, I quickly learned the benefits of preemptively trimming down the code by hand.
And even if they didn't, every line of extra code without sufficient abstraction adds cognitive overload.
I think the "cognitive bottlenecks" in software engineering live between artifacts, where code is simply one of them.
outcome → requirements → spec → acceptance criteria → executable proof → review
I'm making experimental tooling that automates the boring parts around those transitions, while keeping humans focused on validating that intent survived each step.
I'm kidding. But yes, I explicitly didn't model it yet. The bigger vision is there's a reason for Spec to exist, right?
And that would be Outcome.
> "We observed that users share 100+ characters long links too often and they are frustrated when it doesn't work / crop / browser address bar limitations"
So the outcome is: "Users no longer have to worry about long URLs". And then you have idea, a spec: "what if we let them create and use short URLs for sharing?" -> URL shortener app.
And yes, this ERD is easily expandable. I'd rather not add more fields but keep the "core" schema short and nice.
Things like outcome, observations, analytics, they can be simply extra tables linking to Spec, ACs, etc. Jira tickets, Datadog dashboards, Tableau analytics, whatever makes sense to teams. And it doesn't require you to setup a postgres instance. MVP would run on sqlite3.
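As a rough illustration of the "extra tables linking to Spec" idea running on sqlite3, here is a minimal sketch. The table and column names are assumptions made up for this example, not the schema from the ERD, and the acceptance criterion text is invented; only the outcome sentence is taken from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # no server to set up
conn.executescript(
    """
    -- Hypothetical "core" tables, kept deliberately small.
    CREATE TABLE spec (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE acceptance_criterion (
        id      INTEGER PRIMARY KEY,
        spec_id INTEGER NOT NULL REFERENCES spec(id),
        text    TEXT NOT NULL
    );
    -- An extra table added later; the core tables are untouched.
    CREATE TABLE outcome (
        id        INTEGER PRIMARY KEY,
        spec_id   INTEGER NOT NULL REFERENCES spec(id),
        statement TEXT NOT NULL
    );
    """
)

conn.execute("INSERT INTO spec (id, title) VALUES (1, 'URL shortener')")
conn.execute(
    "INSERT INTO acceptance_criterion (spec_id, text) VALUES (?, ?)",
    (1, "Short link redirects to the original URL"),
)
conn.execute(
    "INSERT INTO outcome (spec_id, statement) VALUES (?, ?)",
    (1, "Users no longer have to worry about long URLs"),
)

# Walk from a spec to the outcome that justifies it and the ACs that verify it.
rows = conn.execute(
    """
    SELECT s.title, o.statement, ac.text
    FROM spec AS s
    JOIN outcome AS o ON o.spec_id = s.id
    JOIN acceptance_criterion AS ac ON ac.spec_id = s.id
    """
).fetchall()
```

Links to tickets, dashboards, or commits could presumably follow the same pattern: one more table with a foreign key to `spec` or `acceptance_criterion`.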
I've also seen a lot of effort trying to link different systems together specifically for simpler context access for agents. "RAG enterprise intelligent search" it is.
What's concerning to me is that even Sourcegraph haven't thought about what I'm thinking since 2015: linking specs to code directly, via SCIP. I should be able to press a button "find specs", in addition to "find references" and "find implementations". And I strongly believe they are sitting on a gold mine right now.
From my experience, it all comes down to code, and so code was the first-class artifact for a long time. Up until I realized that code is only a lossy representation of the spec artifacts. And if nobody ever records spec as an artifact...
What I'm saying is that the pain is real, I've been here for a long enough time. And I should be able to at least use something like this even if the industry doesn't want to.
Here's an image you can open on the phone: https://pbs.twimg.com/media/HGjHvSsWIAAkhHL?format=jpg&name=...
I also did a post explaining reasoning behind this diagram: https://x.com/br11k_dev/status/2047105958451507268
But I'll make a proper post on HN once I have all ingredients ready!
- Minimal CLI tooling
- Jupyter Lab you can go through step by step, on example greenfield project (URL shortener app)
- Blog post on what I've been doing for last 2 months
I assure you, they do not.
My most recent Claude Code fix consisted of one line: calling `third_party_lib._connect()`. It reaches into the internals of an external library. The fix worked, but it is improper to depend on the specific implementation. The correct fix was about 20 lines.
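A self-contained sketch of why the one-liner is fragile. `Client` and `ClientV2` are stand-ins invented here to play the role of two releases of an external library; they are not the real library's API, and `supported_fix` is not the actual 20-line fix, just an illustration of depending only on the public surface.

```python
class Client:
    """Stand-in for an external library's client (hypothetical)."""

    def __init__(self):
        self.connected = False

    def _connect(self):
        # Leading underscore: private by convention, no stability promise.
        self.connected = True

    def open(self):
        # Public, documented entry point.
        self._connect()
        return self


class ClientV2:
    """A later release that reorganizes its internals but keeps open()."""

    def __init__(self):
        self.connected = False

    def _establish(self):  # the private helper was renamed
        self.connected = True

    def open(self):
        self._establish()
        return self


def quick_fix(client):
    # Works against Client, but depends on an implementation detail.
    client._connect()
    return client


def supported_fix(client):
    # Relies only on the public surface, so it survives the upgrade.
    return client.open()
```

With these stand-ins, `quick_fix(ClientV2())` raises `AttributeError`, while `supported_fix` works against both versions.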
(Tangentially, this is why I think LLMs are more useful for senior developers because junior developers tend to not have a sense for what's good quality and accept whatever works.)
This lines up with YAGNI, but most people believe the opposite, often using YAGNI to justify NOT building the necessary abstractions.
I don't think what Fowler says here is in favor of saddling the early versions of your system with abstractions before you have actually seen its use in practice, and its needs over time as requirements and conditions change.
From this "Laziness drives us to make the system as simple as possible (but no simpler!) — to develop the powerful abstractions that then allow us to do much more, much more easily." it's clear that when he talks of abstractions he means very basic, and as simple as possible, building blocks. Like having core, orthogonal, principles in the system.
Not the kind of piled-up software and design-pattern abstractions that, e.g., Java land used to build in the past.
That line (between your other values?) was uproarious; I apologise for not u*voting it, partially because I couldn't vocalise my peculiar fetish att (+ "gnarliness-pornstar" doesn't sound nearly as enticing as "AI-affordability-pornstar" X)