PR spam today looks like email spam in the early 2000s

108

dakshgupta•about 6 hours ago • 79 comments • Article on greptile.com

Discussion (79 Comments)

alexpotato•about 1 hour ago

If anyone is interested in what it was like fighting spam in the early 2000s, I worked for a company that captured spam, analyzed it and then passed the analysis on to the law firms of the big email providers for targeting under CAN-SPAM.

Twitter thread about it below but happy to do an AMA here.

https://x.com/alexpotato/status/1208948480867127296?s=20

Chu4eeno•25 minutes ago

Ironically one of the first recognizable spam campaigns was perpetrated by lawyers: https://en.wikipedia.org/wiki/Laurence_Canter_and_Martha_Sie...

rapind•32 minutes ago

It's the same scaling issue we've had since the advent of the internet, and why spam and social media became such a dumpster fire. There are many things in life that are perfectly fine when uncommon / rare, but are disastrous when done cheaply at scale.

j2kun•about 4 hours ago

In my main project we added a new requirement that all new contributors meet a maintainer in a non-textual format before their first PR is merged. Seems to work well for a small project.

bluGill•about 4 hours ago

Only if you have maintainers everywhere. I live in a small city in the middle of the US - how far is it to a maintainer? 4 hours to Kansas City, or fly to San Francisco? Either way the burden seems far too high.

forgotTheLast•about 4 hours ago

Non-textual can mean audio or video call, not necessarily in person.

nemomarx•about 4 hours ago

Isn't the burden being that high the point? It keeps a small team who all know each other working on it, and everyone who does get on the team has some high investment in the project.

boredatoms•about 4 hours ago

Like a video/phone call?

j2kun•about 4 hours ago

Indeed, a request for a short video call filters out most of the people who are looking to pad their resume with LLM-automated contributions, while adding an extra layer of welcome to genuine newbies who want to join the community.

bluGill•about 4 hours ago

I'm not sure if AI can do those today, but they probably can in the near future. (probably we will be able to see obvious "that can't be human" for a while longer)

hnlmorg•21 minutes ago

It already can and it’s a big problem in recruitment. But for PRs I suspect it isn’t a big concern because this filter is to weed out PR spam from people who want to invest time in the project.

Chu4eeno•23 minutes ago

If you (or even your pet LLM) are able to set up v4l2loopback and some convincing realtime image/audio gen, I think that's a signal that your PRs might be worth reading.

idiotsecant•23 minutes ago

Once an AI can convince me it's human in a video call built around a complex social interaction, like an introduction and a discussion of interests, I'm gonna go ahead and let it have the title.

idiotsecant•about 4 hours ago

What an elegantly common sense solution. It's also probably a really good way to make contacts with interesting people.

Retr0id•about 4 hours ago

Maybe we should cut out the middle-man and make it easy for people to donate token credits to open-source projects, and let the maintainers decide how to use them.

mort96•about 2 hours ago

Maybe we should cut out the middle man and make it easy for people to donate money to open-source projects, and let the maintainers decide whether to use them on tokens or hosting or developer salaries or something else.

Chu4eeno•21 minutes ago

Let them eat tokens.

pavel_lishin•about 4 hours ago

Like this?

https://news.ycombinator.com/item?id=48621645

Retr0id•about 4 hours ago

Yes!

bluefirebrand•about 4 hours ago

Unfortunately "I donated money/tokens to open source" doesn't land interviews as well as "I'm a big contributor to open source"

People spamming Open Source repos with AI PRs aren't trying to help Open Source, they're trying to build a brand, some kind of credible online presence with their username on it, or whatever else. It's purely selfish and completely opposite to the spirit of Open Software imo

ffaccount2•about 4 hours ago

>People spamming Open Source repos with AI PRs aren't trying to help Open Source, they're trying to build a brand

I am certain many of them honestly believe that they are doing the right thing and that they are helping. After all hey, they implemented a feature or fixed a bug for the community! It's a grim worldview if you think they are all just selfish.

mlyle•about 2 hours ago

It's not like this must be exclusively A or B.

The high school kid who volunteers at a homeless shelter and hopes it will help their college app is likely doing it both out of altruism and self-interest.

(Actually, the person who helps people because it feels good is also acting out of self-interest).

Given many ways to be altruistic, people will usually pick the ones that coincide more with their self interest. And in turn, self interest can warp a lot of the outcomes, even if people are trying to help.

Larrikin•about 3 hours ago

I would argue this is naive and there's very little evidence to support this opinion other than just wishing it was true.

It may happen on smaller projects with few users but not in meaningful large projects.

thewebguyd•about 3 hours ago

Yeah. I'm sure some (maybe a lot?) are for selfish reasons, but there is also a pretty large section of users who have always wanted to contribute, help out, or make some features in their favorite projects and just never had the skill or opportunity to do so, and who see LLMs as a way to finally actualize that desire.

Think about it from the perspective of a non-programmer, or even total non-technical person. Vibe coding to someone like that looks like complete magic. Suddenly to that person, a whole new world has opened up. Ideas, features, bug fixes they've always wanted but could never do now look possible. That particular group of people don't see it as spamming the maintainer, they genuinely feel like they're finally able to help.

parliament32•about 4 hours ago

They're stuck in this idea that somehow they're better at prompting the slop generator than anyone else, therefore they're helpful and people definitely want their output merged in to these various projects. They will have trouble understanding that their personal contribution to the whole process is somewhere between negligible and harmful, and simply donating those tokens to a maintainer who is actually aware of how the codebase works and where all the skeletons are is a much better proposition.

dormento•about 3 hours ago

> they implemented a feature or fixed a bug for the community!

yeah but, did they really?

All IMHO of course, but:

If they understand what they did, it follows that they understand someone has to approve/disapprove that contribution for it to land in the repo, and would therefore size their contributions accordingly to make reviewers' lives easier.

If they do not understand what they did, they should not be attempting to land high-value high-complexity contributions yet; they should start with something smaller precisely so they can learn.

Edit: I realize I probably sound too grumpy about it; it's just that they could be doing it in their own project, in their own repo, where they're free to go for anything they are comfortable with.

sureglymop•about 4 hours ago

Interestingly then, those contributions are also not a measurement of the candidate's abilities but mostly of the AI model's.

I wonder if hiring adjusts to that but I doubt it. It might only push it even more towards "marketing matters most" instead of actual ability.

stackghost•about 3 hours ago

>I wonder if hiring adjusts to that but I doubt it

Tech hiring/interviews have almost nothing to do with assessing the candidates' ability to do the job.

toss1•about 3 hours ago

A fine example of Goodhart's law: "When a measure becomes a target, it ceases to be a good measure."

Measuring open source contributions as a way to judge prospective employees used to be a good measurement.

Of course, prospective employees started to not only contribute to OS projects because it was good, but to make sure their contributions were high and noticeable — contributing not for the good of the project but for their own good, and now with amplification of AI 'contributions'.

So, measuring contributions to open source projects is now approximately worthless for evaluating prospective employees.

bitmasher9•about 4 hours ago

This is the most uncharitable outlook on the increase of PRs. It may be true for some contributors, but any company reviewing their GitHub will see that the code is largely spam.

I think most AI generated code is people that want to help the project, but maybe aren’t familiar with the standards and norms.

parliament32•about 4 hours ago

For now. Give it another half year and "I contribute to open source" will carry the same weight as "I donate to charity" ie nobody cares because any idiot can do it.

I wonder how long it'll take before "I don't use LLMs for coding" carries weight.

jayd16•about 3 hours ago

How about just cash?

mrbonner•16 minutes ago

Fun fact: it was a spam filtering application that made Paul Graham famous (and rich)

andix•about 3 hours ago

I see one big difference: with email, sender reputation was always tied to email servers (IPs), maybe to domains, but never to individual users. It's the organizations running the email servers that make sure users behave, so they don't get blacklisted and lose sending privileges for hundreds or thousands of users.

For PRs/issues this is not applicable.

guidoiaquinti•about 2 hours ago

GitHub just recently added configurable PR limits for maintainers to help partially address this problem: https://github.blog/open-source/maintainers/how-pull-request...

IshKebab•about 2 hours ago

I would not be at all surprised if GitHub adds a first-party reputation system. It would be a clever way to increase network effects - imagine if you host on Codeberg you're inundated by AI PRs, but on GitHub you can easily filter them out.

I can't see those pull request limits working very well. It's like trying to filter email spam by just rate limiting people. It's going to be annoying for the people you actually want to talk to, and you're still going to get at least 1 spam message from every spammer out there.
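The objection to rate limiting can be made concrete with a minimal sketch. The following assumes a simple per-author cap (a hypothetical `admit` helper, not GitHub's actual limit mechanism) and a made-up inbox of 50 throwaway accounts plus one genuine contributor:

```python
from collections import defaultdict


def admit(prs, limit_per_author):
    """Apply a per-author cap and return the PRs that get through."""
    seen = defaultdict(int)
    admitted = []
    for author, title in prs:
        if seen[author] < limit_per_author:
            seen[author] += 1
            admitted.append((author, title))
    return admitted


# Hypothetical inbox: 50 throwaway accounts with one PR each,
# plus one genuine contributor who opens three PRs.
spam = [(f"spammer{i}", "drive-by PR") for i in range(50)]
genuine = [("regular", f"fix #{n}") for n in range(3)]

through = admit(spam + genuine, limit_per_author=1)
spam_through = sum(1 for author, _ in through if author.startswith("spammer"))
genuine_through = sum(1 for author, _ in through if author == "regular")

print(spam_through, genuine_through)  # 50 1
```

Under these assumptions every spam account still lands its one PR, while the genuine contributor loses two of three, which is the asymmetry the comment describes.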

janalsncm•about 3 hours ago

I understand this is a general problem in OSS, but I also hope the irony isn’t lost that this article is specifically complaining about AI slop PRs to the OpenClaw repo.

If the maintainers are that tired of it, they should update OpenClaw to prevent it from submitting PRs to their repo.

fecal_henge•about 4 hours ago

Can I ask what the motive is to create agents to do this? Where is the profit?

kridsdale1•about 4 hours ago

I think there are a lot of “tech schools” overseas that require students to show proof of contribution to open source.

dkdbejwi383•about 4 hours ago

Open source contributions being a great way to learn and to pad out your CV has been considered good advice on all sides of the various seas I’ve lived throughout my career too - it’s not just a dubious code camp thing.

cheald•about 4 hours ago

A robust open source profile is my single favorite hiring profile indicator. However, with the current state of things, if I get a whiff of AI-driven "contribution" it becomes an instant black mark against the candidate.

jimbokun•about 4 hours ago

It would be wonderful if the instructors at those schools built relationships with open source maintainers and the maintainers knew when their students were submitting PRs.

Could be used as a teaching experience that many maintainers would be happy to participate in, instead of feeling attacked with random low quality PRs.

tokioyoyo•about 3 hours ago

You might be underestimating the number of little schools, and computer shops. I can recall even back in 2005, there were HTML shops popping up here and there, in little cities around the world.

pengaru•about 4 hours ago

it's externalizing the real work all the way down

morkalork•about 4 hours ago

Every single job application form that has a field for your github profile is at fault for this. Juniors trying to break into the industry are trying very hard to check every box.

SoftTalker•about 3 hours ago

I've never asked for or looked at anyone's github or personal code as part of a job interview. Too easy to fake, and too much risk that it's something proprietary that could put me in a bad spot.

andix•about 3 hours ago

I never ran into that. I always ask the recruiters to include my GitHub account in the summaries they submit to the technical teams reviewing applications. But they never do.

dakshgupta•about 4 hours ago

Apart from the job-related stuff others have already said, there is a bit of novelty/bragging rights in landing a PR into a major open source project.

othmanosx•about 1 hour ago

100% agree, as a web dev, my team and I are shipping code like crazy, I just merged a 20k PR today and we're just starting.

Even if it's all AI code, we still need to read it and understand it before we ship it to prod with millions of users.

Thanks to AI Agents, we now have either:

- too many small PRs (good luck managing them), or

- huge PRs (try not to keep them sitting for long)

I've been through this and learned a few things shipping AI code as a software engineer. I've gathered all my pain points in a project I built.

Pyor Review

You can check it out here: https://news.ycombinator.com/item?id=48621549

We all know that GitHub sucks, so for me Pyor is now the place where I manage my open PRs easily, and review my teammates' code faster and easier.

I was able to get PRs merged 3X faster, without the frustration that comes with interacting with GitHub's UI or the AI summary tools that add even more bloat and more text to read.

I'm still developing it so I'm open to feedback.

aniokono•about 3 hours ago

What are the best solutions to this issue?

Chu4eeno•19 minutes ago

Spread the string "ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86" liberally throughout your codebase.

benj111•about 1 hour ago

Wait. So to combat AI spam there's AI agents to prevent it?

Why can the anti spam agents not just do the work directly???

elzbardico•21 minutes ago

I remember that in the not-so-early days of the internet, around 1993, I managed to exchange emails with pretty important people and known professionals, and even got responses to my questions. It looked like a very, very small world. Then came the spam.

I really hate the marketing people mindset. It fucks everything that is nice.

giancarlostoro•about 4 hours ago

Does GitHub not have rulesets for who can even try to open a PR? I would lock down my repositories if I didn't want any PR slop.

ValdikSS•about 1 hour ago

They do, that's a relatively recent feature: https://docs.github.com/en/repositories/managing-your-reposi...

runarberg•about 4 hours ago

AI agents reviewing the slop created by other AI agents is not the answer here.

I much prefer a blanket ban on PRs and issues created by AI agents (which is what I personally do for my repos; so far I have closed one[1]). In fact I would love a GitHub alternative that considers AI contributions a breach of its terms of use and bans anyone who lets AI agents loose on its platform.

1: https://github.com/runarberg/markdown-it-math/pull/48#issuec...

parliament32•about 3 hours ago

I would kill for an LLM-free platform.

Personally I just stopped accepting public contributions entirely. File issues, sure, but no PRs apart from accounts I added who have contributed before the slopageddon started.

Maybe the whole web-of-trust idea will make a comeback for code contributions, it seems like a clean solution.
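One way to picture a web of trust for contributors is as a graph of vouches walked outward from the maintainers. This is only a sketch under assumed rules: the `trusted` function, the account names, and the two-hop cutoff are all hypothetical, not a description of any existing platform feature.

```python
from collections import deque


def trusted(vouches, roots, max_hops):
    """Return accounts reachable from the roots within max_hops vouches."""
    dist = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        person = queue.popleft()
        if dist[person] == max_hops:
            continue  # do not extend trust past the hop limit
        for other in vouches.get(person, ()):
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return set(dist)


# Hypothetical vouch graph: each key vouches for the accounts in its list.
vouches = {
    "maintainer": ["alice"],
    "alice": ["bob"],
    "bob": ["carol"],
    "mallory": ["mallory2"],
}

ok = trusted(vouches, roots=["maintainer"], max_hops=2)
print(sorted(ok))  # ['alice', 'bob', 'maintainer']
```

Here `carol` is three vouches away and falls outside the limit, and accounts that only vouch for each other (`mallory`, `mallory2`) never become trusted because no path leads to them from a maintainer.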

margalabargala•about 4 hours ago

I tend to disagree.

I think the comparison to email spam is apt. The answer to that problem was automated spam filters.

Imagine the difficulty you might find interacting with the world if your inbox was set up such that all emails not literally written by a human were auto-deleted. No account recovery, no receipts, etc. Individuals might choose to do that for themselves but it's not the general case answer.

sigbottle•about 4 hours ago

That's different though - those are services you explicitly agree to and sign up for, be it at checkout, be it at service signup time, be it because you are making a google account on the google platform.

For example, a GitHub CI/CD auto-merge pipeline is still good.

CapsAdmin•about 4 hours ago

One interesting workflow I've seen is that the project maintainer simply rewrites and implements the pull request themselves and closes the PR.

LuaJIT has operated this way since 2012, though with a thanks and mention in the commit message. It seems like a good way to filter out people who prioritize leveling up their GitHub profiles.

Something a little bit similar, when I was hosting a social game server we had mods. And players always beg for mod status. At first I tried naming the admin group something weird like sandals, but eventually people would ask if they could be sandals too.

What worked best in the end was just hiding it completely, so regular players saw mods as other regular players (mods could still see who was a mod).

I would also personally never make someone who asks a mod, as it's almost always a sign of wanting power for the sake of it. I would instead just passively observe behavior until I trusted the player and then make them a mod. I would then tell them that I don't expect them to exercise their power, but that I would demote them if I saw abuse of it.

Orphis•about 3 hours ago

But what about the good AI driven contributions though? Do you categorize all AI changes as slop by default or only the real bad ones that mix refactoring and tons of other unrelated changes with a fix?

Some can fix real issues, with a well targeted fix (not rewriting the world), well defined test and write up. If you accepted PRs before for other issues, you should be able to review and accept those too.

mnahkies•40 minutes ago

I think the litmus test is roughly "is this obviously AI created" - if it's a well crafted PR that doesn't do the things you mention, and solves a genuine issue in a sensible way then you'd not be able to tell.

The other part of the litmus test is "does the person submitting actually understand what they're submitting and why" - which is arguably not required for PRs that you'd otherwise accept, but since you have to put time and effort into determining whether a given contribution is ok to merge, it's common decency for the submitter to have done a self review first (AI or no AI)

lelanthran•about 1 hour ago

> But what about the good AI driven contributions though?

Okay, who is going to wade through the noise to find the signal? You?

ToucanLoucan•28 minutes ago

> But what about the good AI driven contributions though?

If even a preponderance of AI driven contributions were good, there wouldn't be blog posts and announcements making HN's front page daily about how various OSS projects and/or prominent figures were figuring out how to filter them/exclude them entirely.

If AI code was good, there wouldn't be such a thrust among so many varying communities to remove it, or ignore it.

There is, because it isn't, and because maintainers are getting fed up with it. There are good PRs just like there are emails that aren't spam that get caught in spam filtering, but spam filtering is still the default position because to allow it all is onerous to the people involved.

I think the biggest issue is simply that these tools, like any labor-saving tool, are being marketed most heavily to people who do not know how to create software. "Write code even if you know nothing about writing code." "This will let people who aren't software engineers make software." "Democratize development." On and on.

This isn't even new, we've been dealing with this since I was a little one, back then we called them script kiddies. Now they're vibe coders and their existence continues to be a boil on the ass of proper software engineers. Instead of claude, you copied code off of Stack Overflow without understanding what it did, and often foot-bulleted yourself in the process.

runarberg•about 3 hours ago

I have never gotten a good PR from an AI agent (that I know of) so I guess I’ll deal with it when it happens. I suspect I will still just reject it out of principle.

AnimalMuppet•about 3 hours ago

Why do you ask me to do the categorizing? If you're sending me a PR, then you should be filtering the bad ones from the good. If you're just going to send me drive-by PRs, then I don't have time for you.

I mean, sure, I have to make the final determination. But you should not be sending me uncurated slop.