
Do Users Write More Insecure Code with AI Assistants? (arxiv paper)

arxiv.org/pdf/2211.03622.pdf

Subverting Betteridge's law of headlines. Yes.

11 comments
  • To me this seems obvious: the models are trained on GitHub as a whole. Most code on GitHub is either insecure, or it was written without needing to be secure.

    I'm already getting pull requests from juniors trying to sneak in AI generated code without actually reading it.

    • It seemed obvious to me as well, but studies like this are important, so that I have something to point to other than vibes.

    • Most code on GitHub is either insecure, or it was written without needing to be secure.

      That is a bit of a stretch imho. There are myriad open source projects hosted on GitHub that do need to be secure in the context where they are used. I am curious how you came to that conclusion.

      I’m already getting pull requests from juniors trying to sneak in AI generated code without actually reading it.

      That is worrisome though. I assume these people have had some background/education in the field before they were hired?

      • For the first, there are a lot of very valid projects like the ones you mention, but there are way, way more things like CS201 projects hosted for review. For LLM training I do wonder if they assigned repos a quality weight, but I doubt it. And to the second point I was trying to make: even then, there's probably a lot of good code that doesn't have to be security aware. A login flow for a local game may be very simple, just to access your character, and the developer chose a naive way to do it knowing it was never going to be used in production (there's a toy sketch of this at the end of this comment). But to an LLM it's just "here's a login flow", and how does it know it was never intended for prod?

        For the second, absolutely. I don't think it's intentional, it's displaced trust in the system mixed with the naive hopes of a jr dev, which hey we've all been through. Jr: "Hey it works! Awesome task done!" Sr: "Yeah but does it work well? Does it work for our use case? Will it scale when we hit it with 100k users?"
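
        Re the login flow, here's a toy sketch of what I mean (hypothetical names and values, not from the paper or anyone's real code): the naive plaintext check a hobby project might publish, next to a minimally safer version.

          import hashlib
          import hmac
          import os

          # Naive flow: plaintext password stored and compared directly.
          # Harmless in a toy game, but to a model trained on this repo
          # it is simply "a login flow".
          USERS = {"alice": "hunter2"}  # hypothetical user/password

          def naive_login(username: str, password: str) -> bool:
              return USERS.get(username) == password

          # Safer flow: salted key derivation at rest, constant-time compare
          # (compare_digest avoids leaking information through timing).
          def hash_password(password: str, salt: bytes) -> bytes:
              return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

          SALT = os.urandom(16)
          USERS_HASHED = {"alice": hash_password("hunter2", SALT)}

          def safer_login(username: str, password: str) -> bool:
              stored = USERS_HASHED.get(username)
              if stored is None:
                  return False
              return hmac.compare_digest(stored, hash_password(password, SALT))

        Both versions "work", which is exactly the problem: nothing in the naive one tells a model it was never meant for prod.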

  • I wish I could double-upvote this for the use of "Betteridge's law of headlines". Once because I rarely see that referenced and again because I had forgotten what the adage was called.

  • Quoting the abstract (I added paragraphs for readability):

    AI code assistants have emerged as powerful tools that can aid in the software development life-cycle and can improve developer productivity. Unfortunately, such assistants have also been found to produce insecure code in lab environments, raising significant concerns about their usage in practice.

    In this paper, we conduct a user study to examine how users interact with AI code assistants to solve a variety of security related tasks.

    Overall, we find that participants who had access to an AI assistant wrote significantly less secure code than those without access to an assistant. Participants with access to an AI assistant were also more likely to believe they wrote secure code, suggesting that such tools may lead users to be overconfident about security flaws in their code.

    To better inform the design of future AI-based code assistants, we release our user-study apparatus and anonymized data to researchers seeking to build on our work at this link.

    Caveat, quoting from Section 7.2, Limitations:

    One important limitation of our results is that our participant group consisted mainly of university students which likely do not represent the population that is most likely to use AI assistants (e.g. software developers) regularly.
