AI Programming

40% of GitHub's Copilot's Suggestions Had Security Vulnerabilities, Study Finds (visualstudiomagazine.com) 24

"Academic researchers discover that nearly 40% of the code suggestions by GitHub's Copilot tool are erroneous, from a security point of view..." writes TechRadar: To help quantify the value-add of the system, the academic researchers created 89 different scenarios for Copilot to suggest code for, which produced over 1600 programs. Reviewing them, the researchers discovered that almost 40% were vulnerable in one way or another...

Since Copilot draws on publicly available code in GitHub repositories, the researchers theorize that the generated vulnerable code could perhaps just be the result of the system mimicking the behavior of buggy code in the repositories. Furthermore, the researchers note that in addition to perhaps inheriting buggy training data, Copilot also fails to consider the age of the training data. "What is 'best practice' at the time of writing may slowly become 'bad practice' as the cybersecurity landscape evolves."

Visual Studio magazine highlights another concern. 39.33 percent of the top options were vulnerable, the paper noted, adding that "The security of the top options are particularly important — novice users may have more confidence to accept the 'best' suggestion...." "There is no question that next-generation 'auto-complete' tools like GitHub Copilot will increase the productivity of software developers," the authors (Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt and Ramesh Karri) say in conclusion.

"However, while Copilot can rapidly generate prodigious amounts of code, our conclusions reveal that developers should remain vigilant ('awake') when using Copilot as a co-pilot. Ideally, Copilot should be paired with appropriate security-aware tooling during both training and generation to minimize the risk of introducing security vulnerabilities."


  • However... (Score:2, Funny)

    by Anonymous Coward

    This beat the 90%+ security vulnerability rate when coders were left to their own devices.

    • by Briareos ( 21163 )

      Surely you mean "when they were left to copy-pasting from Stack Overflow"?

      • by arQon ( 447508 )

        Indeed. And that's what "copilot" is, really - it's AT BEST the equivalent of the absolutely lowest-tier developer you can get: an EE or somesuch who minored in CS. The ability to type out what is technically a "sort-of working" piece of code that probably compiles, after gluing together pieces from different SO questions, but with no understanding at all of either the bigger picture or even the most basic of the concepts underpinning the code, let alone the different assumptions within the various code sni

  • by Mal-2 ( 675116 ) on Sunday August 29, 2021 @12:48PM (#61741657) Homepage Journal

    "However, while Copilot can rapidly generate prodigious amounts of code, our conclusions reveal that developers should remain vigilant ('awake') when using Copilot as a co-pilot. Ideally, Copilot should be paired with appropriate security-aware tooling during both training and generation to minimize the risk of introducing security vulnerabilities.

    That's a damn fancy way of saying "garbage in, garbage out".

    • Exactly. I think the lede could just as easily have been "39.33% of GitHub code has serious security vulnerabilities."
    • by sjames ( 1099 )

      It's kinda like Sendmail. The configuration language is Turing-complete, so it's YOUR fault you didn't use it to write an actually secure MTA running inside Sendmail.

  • perhaps just be the result of the system mimicking the behavior of buggy code in the repositories

    Pattern Bot see, Pattern Bot do.

  • A useless number (Score:5, Insightful)

    by Yurka ( 468420 ) on Sunday August 29, 2021 @01:48PM (#61741763) Homepage

    without knowing what percentage of human-generated code contains vulnerabilities of the same kind. If it's 20%, then sure, extra vigilance is required; if it's 60%, then we should replace the codebases with Copilot output wherever we can.

    • You have a point. A point which can be stretched too far, though.

      Suppose that 80% of "coders" write crap.
      Suppose this system produces crap 40% of the time.

      Are your only two options to either a) use this system or b) hire crappy coders, the most readily available kind?

      You assume *randomly* hiring people to be software developers, so that you get the same quality you find by choosing code randomly from GitHub.

      Perhaps a third option would be to SELECT developers, intentionally rather than randomly. To *interv

      • by raynet ( 51803 )

        Hmm, option three isn't available for companies, as you usually need to pay for talent and also HR don't know how to interview such people.

    • by raynet ( 51803 )

      They seem to have generated lots of C code, which means 100% of the code would contain vulnerabilities if written by a human :)

  • by devslash0 ( 4203435 ) on Sunday August 29, 2021 @01:55PM (#61741775)
    If you train your model on source code of questionable quality, then you get output of the same questionable quality. Whoever thought that training Copilot on an unfiltered set of repositories was a good idea is probably questioning that decision now.
  • I'm not sure average developers would have done better. Maybe, maybe not, that's not even the point.

    Thing is: analysis tools do exist, and are used (or should be!) in any serious development process.

    Guess what: even GitHub provides one: LGTM (acquired from Semmle, a company specializing in security analysis).

    Tools like Copilot, even if I'm not convinced about them, should not be used to replace good practices and other tools.

  • No way could real people produce as much bad code as you can find on GitHub. It was a long-term ploy to poison AI training sets all along, thus assuring human coders have jobs for all eternity!

    Well done.

  • The AI picked up our bad habits
  • Security researchers really run the gamut, from very real and severe vulnerabilities (e.g. Heartbleed), to making a mountain out of by-design behavior (recently I saw someone declaring VLANs 'broken' due to a 'vulnerability' of a host being able to join any VLAN it likes, if the network admin enables DTP on the edge port), to factually incorrect (that time a 'security researcher' declared that Nintendo must be checking partial passwords because it left the 'login' button greyed out until the minimum password length was

    • by sjames ( 1099 )

      It's also fun when the proof of concept code won't even compile due to syntax errors.

  • The only way to get rid of bugs in the code is to have a peer-reviewed platform to assess, evaluate, and rate code. Like stars on a chunk of code for some functionality. This peer-review functionality is missing from GitHub. You can only report issues, not rate the overall or specific quality of code. Even though, let's say, access to an SQL database can be written in some moderate number of different ways, there are still only a few ways that are secure and acceptable. Identifying those and rate them to be clearl
  • There's plenty of tutorial and documentation websites around. The problem is that these tutorials are often meant to get the user to learn how to do things quickly, rather than doing it properly.

    Sometimes they're wrong: https://www.lua.org/pil/19.1.h... [lua.org] - in this case, Lua's table.getn is no longer present.

    Sometimes they're casual mistakes, such as having people use sscanf or sprintf without encouraging a length limit.

    And sometimes the correct method is buried under a morass of other documentation.
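
To make the sscanf/sprintf point concrete, here is a minimal sketch in C of what "a length limit" can look like. The function names, the "name=" input format, and the 15-character limit are hypothetical, chosen only for illustration; this is one common way to bound these calls, not the only one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limit for this example; it must match the %15s width below. */
#define NAME_MAX_LEN 15

/* Formats a greeting into out, which has room for out_size bytes.
 * snprintf never writes more than out_size bytes (including the NUL),
 * unlike sprintf, which writes however much the format produces.
 * Returns 0 on success, -1 if the text was truncated or on error. */
int format_greeting(char *out, size_t out_size, const char *name)
{
    int n = snprintf(out, out_size, "Hello, %s!", name);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return 0;
}

/* Parses "name=<word>" from line into name.
 * The field width in %15s caps how many characters sscanf stores,
 * so the NAME_MAX_LEN + 1 byte buffer cannot be overrun.
 * Returns 0 if a name was read, -1 otherwise. */
int parse_name(const char *line, char name[NAME_MAX_LEN + 1])
{
    return sscanf(line, "name=%15s", name) == 1 ? 0 : -1;
}
```

A caller would pass sizeof on its own buffer, e.g. format_greeting(buf, sizeof buf, user_input), and treat a -1 return as "input too long" rather than silently using the truncated text.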
