  • This tracks with my experience: I spent far more time double-checking Copilot's output than trusting it. It also almost always autocompleted way too much, way too often, but that could be a UI/UX issue rather than a functional one.

    However, by far the most egregious thing was that it made subtle but crucial errors that took me hours to fix, which made me lose faith in it entirely.

    For example, I had a CMake project & the AI autocompleted "target_link_directories" instead of "target_link_libraries". Having looked at CMake all day & never having used the *_directories command before, I couldn't figure out why I was getting config errors. I wasted orders of magnitude more time finding something that trivial than I would have spent writing the "boilerplate" code myself.
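
    A minimal sketch of the mix-up (myapp, foo & FOO_LIB_DIR are placeholders, not the project's real names):

        # what the autocomplete produced: adds a linker *search path*, links nothing
        target_link_directories(myapp PRIVATE ${FOO_LIB_DIR})

        # what was actually needed: links the library itself
        target_link_libraries(myapp PRIVATE foo)

    One word apart, and the broken version is still valid CMake, so nothing flags it until link/config time.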

    Looks like I am not alone:

    "Furthermore, the reliability of AI suggestions was inconsistent; developers accepted less than 44 percent of the code it generated, spending significant time reviewing and correcting these outputs."

    When I did find it & fix it, something interesting happened: maybe because AI sits too damn low in the uncanny valley, I got angry at it. If the same thing had been done by any other dev, we'd have laughed about it. Perhaps because I'd trust another dev (optimistically? Naïvely?) to improve & learn, I'd be gentler on them. A tool built on stolen knowledge by a trillion-dollar corp to create an uncaring stats machine didn't get much love from me.

  • The word "experienced" is missing from a lot of coverage.

    This tech is okay at a wide variety of things. A lot of the hate comes from people who can already do a thing the hard way. They're better at it. Surprise? A thing that was barely an exciting blog post five years ago has not yet rivaled the performance of a skilled human professional in any particular vocation.

    But there's probably something it can do better than you, personally. Not better than you could... just better than you can.

    The more interesting part here is that experienced coders expected it to help. Why.

  • Copying from the x-post:

    Research paper here: https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf

    I think these are interesting findings, and they definitely conform to my experience, but I'd always want more data and larger sample sizes.

    One point in particular stood out to me: the lack of LLM context. I have a new person on my team (junior level) who uses AI for everything, and it's obvious that the LLM is getting tripped up by a lack of context. Not all the required information is in a single repository; it also needs things like documentation, architecture that was only ever discussed aloud and never explicitly written down, the historical reasons we make decisions a certain way, or even just style guides. The LLM context window just isn't large enough right now for it to be a truly effective programmer on large, complex projects.
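
    As a rough back-of-the-envelope sketch of that point (an illustration, not a measurement): total the tokens across a source tree and compare the sum to a context window. The 128,000-token limit, the cl100k_base encoding & the file extensions are all assumptions for the sake of the sketch.

        import os
        import tiktoken  # pip install tiktoken

        CONTEXT_WINDOW = 128_000  # assumed context size; varies by model
        enc = tiktoken.get_encoding("cl100k_base")  # assumed tokenizer

        total = 0
        for root, _dirs, files in os.walk("."):
            for name in files:
                if not name.endswith((".py", ".cpp", ".h", ".cmake", ".md")):
                    continue  # assumed set of "relevant" files
                try:
                    with open(os.path.join(root, name), encoding="utf-8") as f:
                        total += len(enc.encode(f.read()))
                except (UnicodeDecodeError, OSError):
                    continue  # skip binaries & unreadable files

        print(f"~{total:,} tokens in this tree; window holds {CONTEXT_WINDOW:,}")
        # and that's one repo, before docs, tickets & undocumented tribal knowledge

    On any decent-sized codebase the total blows past the window long before you get to the unwritten context, which is exactly where the junior's AI keeps tripping.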