Before performing the study, the developers in question expected the AI tools would lead to a 24 percent reduction in the time needed for their assigned tasks. Even after completing those tasks, the developers believed that the AI tools had made them 20 percent faster, on average. In reality, though, the AI-aided tasks ended up being completed 19 percent slower than those completed without AI tools.
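To make the percentages concrete, here is a quick back-of-the-envelope comparison against a hypothetical 60-minute baseline task (the baseline is an assumption for illustration; only the percentages come from the study, and "20% faster" is read here as a 20% time reduction):

```python
# Hypothetical baseline: a task that takes 60 minutes without AI tools.
baseline = 60.0

expected = baseline * (1 - 0.24)   # devs predicted a 24% time reduction: ~45.6 min
perceived = baseline * (1 - 0.20)  # devs believed a 20% speedup afterward: ~48.0 min
actual = baseline * (1 + 0.19)     # measured outcome, 19% slower: ~71.4 min

print(f"expected:  {expected:.1f} min")
print(f"perceived: {perceived:.1f} min")
print(f"actual:    {actual:.1f} min")
```

The gap between the perceived ~48 minutes and the measured ~71.4 minutes is the perception error the study highlights.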

  • JordanZ@lemmy.world · 2 days ago

    Coders spent more time prompting and reviewing AI generations than they saved on coding.

    See that’s the problem right there. You’re just supposed to take its output as gospel and move on. Skip this “reviewing” step and massive productivity gains await!

    Obviously /s

  • hedgehog@ttrpg.network · 2 days ago

    Ars points out that these findings contradict those of other experiments and then goes on to postulate as to why. I clicked on the link to the other experiment:

    when data is combined across three experiments and 4,867 developers, our analysis reveals a 26.08% increase (SE: 10.3%) in completed tasks among developers using the AI tool

    By comparison, this experiment considered 16 developers. That’s 0.3% as many as the experiments its findings contradict. Fortunately, the authors don’t claim their findings are broadly applicable. They even have a table that reads:

    | We do not provide evidence that | Clarification |
    | --- | --- |
    | AI systems do not currently speed up many or most software developers | We do not claim that our developers or repositories represent a majority or plurality of software development work |
    | AI systems do not speed up individuals or groups in domains other than software development | We only study software development |
    | AI systems in the near future will not speed up developers in our exact setting | Progress is difficult to predict, and there has been substantial AI progress over the past five years [2] |
    | There are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting | Cursor does not sample many tokens from LLMs, it may not use optimal prompting/scaffolding, and domain/repository-specific training/finetuning/few-shot learning could yield positive speedup |
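    A quick sanity check on those numbers (this is my own arithmetic, including the rough ±1.96 normal interval on the reported estimate, which the quoted study does not itself state):

    ```python
    # Sample sizes quoted above.
    metr_n, other_n = 16, 4867
    print(f"sample ratio: {metr_n / other_n:.2%}")  # ~0.33%, i.e. the "0.3%" figure

    # Reported effect: 26.08% increase with SE 10.3%.
    # Rough 95% interval assuming approximate normality (my assumption).
    est, se = 26.08, 10.3
    lo, hi = est - 1.96 * se, est + 1.96 * se
    print(f"rough 95% interval: [{lo:.1f}%, {hi:.1f}%]")  # ~[5.9%, 46.3%]
    ```

    Even the low end of that rough interval is a speedup, which is part of why the 19% slowdown result stands out.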

    That said, the study has been an interesting read so far. I highly recommend reading it directly rather than just the news posts about it. Check out their own blog post: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

    I personally find the psychological effect - the devs thought they were 20% faster even afterward - to be pretty interesting, as it suggests that even if more time overall is spent, use of AI could reduce cognitive load and potentially side effects like burnout.

    I’d like to see much larger scale studies set up like this, as well as studies of other real world situations. For example, how does this affect the amount of time it takes 10,000 different developers to onboard onto an unfamiliar repository?

    • Catoblepas@piefed.blahaj.zone · 2 days ago

      I personally find the psychological effect - the devs thought they were 20% faster even afterward - to be pretty interesting, as it suggests that even if more time overall is spent, use of AI could reduce cognitive load and potentially side effects like burnout.

      This assumes that lower estimated time = lower stress levels, when other factors could easily be throwing off time estimation. Think the trope of someone very busy at work who realizes they’ve worked through lunch or dinner. I would have expected people who spend 20% less mental effort on something to be less engaged and more bored by the passage of time, not less.

      Also, importantly, improving worker conditions is something that can reduce burnout without the burden of massive data centers. We don’t have to make a machine that produces the illusion of speech to pay people better.

      • errer@lemmy.world · 2 days ago

        Also: you can multitask with these things! Prompt it and let it cook for several minutes while you do something else. I feel like the people in this study must have been blankly staring at the code generating to get an overall slowdown…

    • usernamesAreTricky@lemmy.ml · 2 days ago

      Though also, when that earlier metastudy is broken down to more experienced developers, a similar pattern appears: a within-margin-of-error change, or a decrease, in productivity

      They are also not comparing the same metrics. The earlier study uses the number of commits and pull requests as its productivity metric; the other looks at time per task

      Number of commits, PRs, and similar metrics like lines of code aren’t great for measuring productivity in general, and especially not here. Usage patterns with AI could very easily change your commit and PR patterns without changing how much you are actually getting done