Programming

OpenAI's Codex Turns Written Language Into Computer Code

A new AI system can read written instructions in conversational language and transform them into working computer code. From a report: The model is the latest example of progress in natural language processing (NLP), the ability of AIs to read and write text. But it also points towards a future where coders will be able to offload some of their work to AIs, and where ordinary people may be able to code without actually learning how to code.

Today OpenAI is releasing an improved version of its Codex AI model and making it available to developers in a private beta through its API. Codex is a descendant of OpenAI's massive text-generating model GPT-3, which was released last summer. But while GPT-3 was trained on a huge quantity of language data taken from the internet -- enabling it to read and then complete text prompts submitted by a human user -- Codex was trained on both language and billions of lines of publicly available computer code.
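
For the curious, here is a minimal sketch of what driving a Codex-style model through the API might look like. It assumes the 2021-era openai Python client and an engine name along the lines of "davinci-codex"; both are assumptions rather than details from the report, so consult OpenAI's documentation for the actual names and parameters.

# A hedged sketch of calling a Codex-style model through OpenAI's API.
# The client usage and the engine name are assumptions, not from the article.
import openai

openai.api_key = "sk-..."  # your API key

response = openai.Completion.create(
    engine="davinci-codex",                       # assumed engine name
    prompt='# Python 3\n# Print "Hello, world" three times\n',
    max_tokens=64,
    temperature=0,
)
print(response.choices[0].text)
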
  • by Ostracus ( 1354233 ) on Tuesday August 10, 2021 @03:09PM (#61677005) Journal

    Add to Copilot and have some fun.

  • "Computer, write a program to calculate the exact value of pi."

    • by mark-t ( 151149 )

      Except you can do that.... the program never finishes, but it becomes increasingly accurate the longer it runs. This is technically a solved problem, even though it cannot in actuality be implemented to complete in a finite amount of time.

      How about "Computer, write a program that solves the general Halting Problem"

      • Wasn't there an Isaac Asimov story that went that way...and the computer's eventual response was Let There Be Light...

        • The Last Question by Isaac Asimov

        • Kind of. The question for the computer (Multivac) was "how do we stop the heat death of the universe" or some such.
          Once this occurs, after much pondering, Multivac's successor figures out how. Since he can report it to nobody, he just does it.
          LET THERE BE LIGHT- and there was light.
          Great story.
    • "Computer, write a program to calculate the exact value of pi."

      Affermative, huuman.

      10 (base pi)

      Also, pedantically, the program to calculate the exact value of pi is fairly straightforward to write, but it takes a while to run.
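
      For the pedants in the audience, a minimal sketch of such a never-finishing pi program, using the slowly converging Leibniz series (the progress-reporting interval is arbitrary):

      # Approximates pi forever via the Leibniz series:
      #   pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
      # Never terminates by design; it just gets more accurate as it runs.
      def approximate_pi():
          total = 0.0
          k = 0
          while True:
              total += (-1) ** k / (2 * k + 1)
              k += 1
              if k % 1_000_000 == 0:          # report progress now and then
                  print(f"after {k:,} terms: pi ~= {4 * total:.10f}")

      approximate_pi()                        # Ctrl-C to stop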

    • by Tablizer ( 95088 )

      Reminds me of the early 80s, when a department store would put out a microcomputer floor sample. I'd enter:

      10 Print "This computer is about to EXPLODE!"
      20 Go To 10

      quickly walk away, then watch shoppers freak out. Later I discovered a POKE command in TRS-80s that made them flicker and click, doubling the freakouts. Good times!

  • by narcc ( 412956 ) on Tuesday August 10, 2021 @03:18PM (#61677031) Journal

    We've been down the 'natural language' programming path more times than I can count. I haven't seen it show up in years, and had thought the idea was finally dead. Even the best effort I've seen to date, Inform 7, only looks superficially like a natural language.

    I had thought we finally all agreed that natural language was far too ambiguous for programming, and that precise programming languages were the way forward.

    The headline and summary want us to think that this 'AI' is akin to sending a programmer an email and magically getting a working program back from whatever nonsense you wrote. Any working programmer knows this is impossible. Users are incredibly bad at describing what they want / need. A dialog where a developer can ask questions and learn about the problem domain is essential. You won't get that from something like this.

    The article is only slightly more realistic:

    "We think this is a tool that can remove barriers of entry to allow more people to get into computer coding," says Greg Brockman, a co-founder and chief technology officer at OpenAI. "It's really the start of being able to talk to your computer and get it to do what you're asking in a capable, reliable way."

    We had that already. It's called BASIC and it worked just fine to allow anyone, even younger children, to write meaningful computer programs.

    • I had thought we finally all agreed that natural language was far too ambiguous for programming, and that precise programming languages were the way forward.

      I don't know who all agreed to that but as voice interfaces become more common this desire ain't going away. Sooner or later we're gonna want our digital assistants to respond to commands like: "Send a message to everyone in my family that I'll be incommunicado for 3 days." ... and a language like this is going to evolve.

      • by sjames ( 1099 )

        That's not a programming language, that's a non-Turing complete command language.

        The whole 'create a program from natural language so the PHB can create programs' idea has failed every single time the rotting corpse has been dug up and shocked back to life (however briefly).

        It turns out that the PHB's thoughts are too fuzzy and indistinct to become a program. It further turns out that for a person accustomed to thinking in a manner that can lead to a program, 'natural language' is a miserable way to express it.

        • by gweihir ( 88907 )

          Indeed. The idea will not die because there are a lot of stupid PHBs that want to believe programming (a hard engineering task) can be done on the cheap. While it can be attempted on the cheap, the eventual cost of doing so is excessively high. All other engineering disciplines figured this out.

          • by Tom ( 822 )

            It took a bunch of wrecks and deaths before they figured it out, every single time.

            The problem of software is that nobody dies from it - not in the immediate, direct, visible way that, say, a steam engine explosion provides. That's why it'll take a lot longer until enough people have figured it out, and I'm honestly worried that the idiots are replaced faster than they learn it.

    • There is a flawed idea around Natural Language coding. The flawed idea is that coding is difficult or hard for people to do. It isn't: anyone can code, in whatever language, rather easily today. With the help of Google searches and a good IDE, you can have a six-year-old writing code.

      However, we need highly paid software developers and software architects not because they know how to code while the rest of us normals cannot possibly handle writing code, but because their job isn't to write co

      • I'd turn that 180 degrees on you and say that the flawed idea is that coding-the-platonic-ideal is easy enough for anyone to do it and it's the imperfect tools designed by and for aspies that get in the way.

        From this comes the treadmill of coming up with the latest and greatest new language to get rid of the unnecessary complexity and expose the true easiness of the discipline to the masses.

        COBOL was supposed to be easy enough for secretaries to use (secretaries/accountants weren't just button pushers back th

        • by narcc ( 412956 )

          Computation and algorithm design is mathematical proof in all its glory.

          I can assure you that it is not. I'm not sure what insecure developer told you that, but they couldn't be more wrong.

          Coding could quite possibly be the hardest thing in the universe.

          Not really. Children can, and very often do, teach themselves. Hell, back in the 80's, a lot of kids learned how to program from little more than magazine type-ins.

          math is hard and not everyone can be a mathematician.

          Don't mistake programming for mathematics. I won't speak to mathematics, but any idiot can learn to program. Judging from the quality of software today, many idiots have.

          • MOVE LAST-BALANCE TO NEXT-BALANCE.
            ADD ORDER-VALUE TO NEXT-BALANCE.
            PERFORM UPDATE-CUSTOMER.

            I am sure many on Slashdot will be old enough to remember...

            But the thing is that programming (in general) is one of the most difficult things people do. When an AI can really do that, then humanity's work is done and the computers will be able to program themselves.

            That said, there is a lot of monkey work that can be automated. The huge amount of repetitive code in most web applications could be generalized based

            • by narcc ( 412956 )

              When an AI can really do that, then humanity's work is done and the computers will be able to program themselves.

              There's good reason to believe that will never happen. At least, not by computation as we understand it.

          • I can assure you that it is not. I'm not sure what insecure developer told you that, but they couldn't be more wrong.

            I can assure you that arithmetic isn't that hard at all. I don't know what insecure bean counter told you that, but they couldn't be more wrong.

            I agree. Any idiot who can push a button can be replaced by a machine. But that's not what the propaganda in tfa is talking about.

            • by narcc ( 412956 )

              Your reply doesn't seem to relate to my post. You quoted me, but didn't respond in a way that makes any sense.

              My claim was that "computation and algorithm design" is not in any way related to mathematical proofs. In reply you said that arithmetic wasn't difficult. Are you having a stroke?

              • Rote code punching by a code monkey is not mathematical proof.

                Coming up with, or even identifying, the correct algorithm for solving a problem expressed in either mathematical or natural language *is* mathematical proof.

                Saying coding is rote work and therefore isn't mathematical proof is like equating arithmetic with mathematics and asserting that mathematics is rote work.

                • by narcc ( 412956 )

                  Coming up with, or even identifying, the correct algorithm for solving a problem expressed in either mathematical or natural language *is* mathematical proof.

                  That's complete nonsense.

                  • Then one or both of us don't know what mathematical proof is.

                  • Coming up with, or even identifying, the correct algorithm for solving a problem expressed in either mathematical or natural language *is* mathematical proof.

                    That's complete nonsense.

                    I have to agree, as I'm assuming that a mathematical proof can be applied as a block of code? And that the code equivalent of A = B only proves that you can assign A and/or B any value you want in a block of code? So A = something(B, C) proves nothing about the real values of A, B, or C by itself?

    • by Tom ( 822 )

      I had thought we finally all agreed that natural language was far too ambiguous for programming, and that precise programming languages were the way forward.

      We agree to that more than half a decade ago, before almost everyone here was even born.

      It's just that every decade or two some idiots who understand neither natural nor programming languages properly show up and sell snake oil to investors who understand both even less.

      Reading a few books on natural languages (not from a computing perspective, from a linguistics perspective) is all it takes to murder that idea good and well (and beautifully). The amount of ambiguity, context dependency, metaphors and other

      • by narcc ( 412956 )

        I agree with you on every point, but...

        We agree to that more than half a decade ago, before almost everyone here was even born.

        I can understand why you thought this site was populated mostly by toddlers; however, I think most of us were born well before 2016.

        • by Tom ( 822 )

          doh. half a *century*

          is "that was before coffee" a valid excuse if I don't even drink coffee? ;-)

    • I read an article a little while ago that compared the information carrying capacity of different languages against each other by grammar, vocabulary, etc. Really cool stuff!

      Here's the rub - regardless of the language (which is amazing in itself!), the information carrying capacity of human speech is 39 BAUD. 39. Bits. Per. Second.

      That's it. That's all you get. Speech is incredibly inefficient as a medium for conveying information, as anyone who has used an automated phone system can attest. Now,

      • by narcc ( 412956 )

        Didn't Shannon estimate the redundancy of English to be something like 50%? (I found it! It was in his Prediction and Entropy of Printed English [princeton.edu]! It wasn't quite as glib as that, though, and it puts the number much higher, at something like 75% or more depending on the type of text.)

        Anyhow, that redundancy seems important. That is, I don't see the inefficiency as a problem because it lets us communicate over noisy channels. We even have ways to add redundancy to make our messages more reliable.
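
        As a rough illustration (not Shannon's method), here is a sketch that measures only zeroth-order, single-letter redundancy against a 27-symbol alphabet; it lands far below the >75% figure precisely because it ignores the longer-range structure Shannon accounted for:

        # Zeroth-order redundancy: 1 - H/H_max over letters and space.
        import math
        from collections import Counter

        def redundancy(text: str) -> float:
            symbols = [c for c in text.lower() if c.isalpha() or c == " "]
            counts = Counter(symbols)
            total = len(symbols)
            h = -sum(n / total * math.log2(n / total) for n in counts.values())
            return 1 - h / math.log2(27)    # 26 letters plus space

        sample = "the quick brown fox jumps over the lazy dog " * 20
        print(f"zeroth-order redundancy: {redundancy(sample):.0%}")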

      • the information carrying capacity of human speech is 39 BAUD. 39. Bits. Per. Second.

        How the hell did they calculate that? Whenever you see a suspiciously precise number ("39" vs "40" vs "a few dozen") for something as vague as "information carrying capacity of human speech", you have to question the methodology.

        Sounds like social science nonsense.

        • by BranMan ( 29917 )

          Here's the link:

          https://www.sciencemag.org/new... [sciencemag.org]

          Compared information transmitted by human speech - language vs. language. Some are more complex and nuanced, but slower (like English, Russian); some are faster (Spanish, Italian), but have more 'fluff' for want of a better term.

          And they all round out to 39. Regardless. Flabbered my gast, let me tell you!

  • The graphic on TFA shows it converting "Print Hello World" into print("Hello World"). If that's all it can do, then who cares? Assistants like Google Home and Alexa do that level of language processing and more. The only way this is significant is if it can take more complex instructions and not only render them directly into code, but do so in a way that is concise and logical. For example, there's a difference between a human writing HTML and Microsoft Word rendering a document in HTML--the latter produce
  • What they're saying: "We think this is a tool that can remove barriers of entry to allow more people to get into computer coding," says Greg Brockman, a co-founder and chief technology officer at OpenAI. "It's really the start of being able to talk to your computer and get it to do what you're asking in a capable, reliable way."

    If there's one thing I've learned in my programming life it's that you do NOT want your average user describing even loosely what they want a computer to do. They inevitably start with the most convoluted and sometimes dangerous possible answer to whatever problem it is they are trying to solve without any idea of how to look at the problem and describe it in simple terms. I've seen it in user generated reports, user generated database interfaces (easy WYSIWYG type editors) and any other form of "make com

    • Agreed. Some of the worst requirements I have seen include technical/implementation recommendations.
      • My favorite user always starts requests with a list of implementation ideas rather than a problem to be solved. We always end up having huge meetings to talk her back to what the actual problem is that needs to be solved so that we can come up with a logical solution. That's not the type of person I'd want to see with one of these tools around.

  • ...to feed the AI with this [wikipedia.org], and see the outcome :-)
  • Code is math.

    It took some very clever Arabs hundreds of years to convert a subset of Greek/Egyptian/Babylonian geometric math to Algebraic symbol manipulation. This was all to convert and preserve one form of math into another (because geometric symbolism == idolatry). In the zeal to remove the idolatrous geometry, the most interesting aspects were skipped/ forgotten/ lost (the geometric calculus of Archimedes, which itself likely stemmed from much earlier math)

    And now you want to let some linear algebra m

    • And now you want to let some linear algebra manipulate the disaster of un-mathematical natural human language into the math of high-level computer programming?

      Neural networks aren't linear algebra. They can be, but we've found they need non-linearity to really juice their power.
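
      A quick sketch of that point: stacking linear layers with no activation collapses to a single matrix, while inserting a ReLU does not (shapes and values below are arbitrary, for illustration only):

      import numpy as np

      rng = np.random.default_rng(0)
      W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
      x = rng.normal(size=3)

      linear_stack = W2 @ (W1 @ x)                # two "layers", no activation
      collapsed = (W2 @ W1) @ x                   # one equivalent matrix
      print(np.allclose(linear_stack, collapsed)) # True: no extra power

      relu_stack = W2 @ np.maximum(W1 @ x, 0.0)   # ReLU breaks the collapse
      print(np.allclose(relu_stack, collapsed))   # almost surely False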

  • by Entropius ( 188861 ) on Tuesday August 10, 2021 @04:03PM (#61677191)

    I'm a computational physics professor. I teach my students C and/or Python.

    It doesn't really matter which one I teach them, because the syntax of the chosen language is not really that important. The more difficult part for them to learn is what to do with that syntax, how to interpret the results once the computer does it, and how to fix it when it goes wrong. Saying to an ML algorithm "the thing you wrote is slow, fix it" or "why does this give me the wrong answer?" is not going to be productive.

    The notion that "the computer speaks one language that is suited to its needs, the programmer speaks another language suited to their needs, and there is a program that translates the second into the first" is as old as Grace Hopper. But compiled (or interpreted) languages all have something in common: that it is well-defined what calculations the computer will do in response to a given bit of Perl/C/Python/COBOL/whatever (although with Perl it may be a bit fuzzy :P).

    Using machine learning to create a new sort of programming language, where you tell the computer what you want it to do and it tries to guess what you want it to do and then does that, seems like it opens up a can of worms. Something will inevitably break or be wrong or be slow or not work, and if there isn't any way to look inside the machine learning pseudocompiler and see what it actually generated, it's going to be a nightmare to debug. But I have an admittedly strong bias toward languages like C (with a minimum of syntax) over languages like Python+all of its libraries (with maximal syntax): I spent most of the summer writing C+OpenMP code for an HPC project where fuzziness was unacceptable: sometimes I want a binary tree to store data, sometimes I want a linked list, sometimes I want an array, and I *always* need to be able to specify the one I want. I make no claim that C is better than Python, of course -- it is simply better for me working with my use case.

    I can see languages with maximal syntax finding this a lot more useful, with the ML algorithm as a crutch for remembering it. If you could type "... now take the FFT of this list of numbers" and have the ML system parse that sentence and remind you of whatever the FFT function in NumPy is, that would be helpful. I can remember most of the syntax in C, but I can't remember all of Python or Perl. But in this case the ML algorithm is acting as an auxiliary tool for a programmer who is proficient in Python, as something of an adjunct to an IDE, rather than a "thing that writes code for people who don't know how to write code".
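
    For reference, the NumPy call being alluded to is a one-liner; the sample signal below is made up purely for illustration:

    import numpy as np

    fs = 256                                  # samples per second (made up)
    t = np.linspace(0, 1, fs, endpoint=False)
    samples = np.sin(2 * np.pi * 5 * t)       # a 5 Hz sine wave
    spectrum = np.fft.fft(samples)            # "take the FFT of this list"
    freqs = np.fft.fftfreq(fs, d=1 / fs)      # matching frequency bins in Hz
    peak = np.argmax(np.abs(spectrum[: fs // 2]))
    print(f"dominant frequency: {freqs[peak]:.1f} Hz")   # ~5.0 Hz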

    You know that old exercise of a CS prof who asks their students to write an algorithm for making a PB&J, and they all forget to do stuff like take the lid off the jars? Those algorithms are all written in natural language and they are still buggy. The difference between English and C isn't the problem the students have; it's not being able to think algorithmically. A ML compiler isn't going to fix that, and as much as we joke about wanting a DWIM button, that's never going to go well.

    • by kmoser ( 1469707 )
      No problem:
      Dev: "Computer, write a program that reads all the records and iterates through them.
      Computer: "Done."
      Dev: "Computer, debug the program you just wrote."
      Computer: "Done."
      Dev: "Computer, find and fix the rest of the bugs."
      Computer: "Done."
      Dev: "Computer, keep going, you missed a few."
      Computer: "Done."
      ...
    • by Tom ( 822 )

      I can see languages with maximal syntax finding this a lot more useful, with the ML algorithm as a crutch for remembering it. If you could type "... now take the FFT of this list of numbers" and have the ML system parse that sentence and remind you of whatever the FFT function in NumPy is, that would be helpful

      It would also be more complicated and take longer than just writing the proper code.

      If you don't know the proper NumPy function, we already have a computer system to help you find it. It's called Google.

      Having some better tools available is what makes programming productive. If you can take a good library instead of writing every silly sorting or calculation algorithm yourself, etc. - but we have that. Maybe there's a place for an AI to guide us through the maze that this stuff has become, like the Node.js

      • And Google, really, is a machine learning algorithm that looks at natural language text and tries to turn it into something helpful. I can see a use case for having something like that built into an IDE.

        But I don't really write that kind of code much -- I am mostly a C programmer and am spoiled by a language with a limited syntax. I know essentially no javascript although from the little I've seen node.js is an absolute chucklefuck nightmare, and makes me glad I write physics code rather than build websites

        • by Tom ( 822 )

          C is still the most beautiful language out there. And I like the "don't look into beam with remaining eye" attitude it has. It's a sharp knife. In the hands of the right person, no-nonsense and effective. In the hands of an idiot, a danger to himself and those around him.

          • I completely agree. Everyone is shocked -- shocked, I tell you -- that students who have never programmed before come to my computational physics course and Day 1 is "here's the Linux command line, here's how you run gcc, here's "Hello world" and output redirection and how to graph data".

            And then by the end of the term the students are doing simulations of nonlinearities in oscillating strings, animating vibrating membranes in 3D, and so on -- and they coded it all from scratch in C. And other faculty look

            • by Tom ( 822 )

              Most people don't understand how closely C and Unix are related.

              It's so easy to make a graph - just output text in the dot language and call graphviz. Same for almost every task you want to do.

              Brilliant that someone is teaching it like that.
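
              As a minimal sketch of that workflow (in Python rather than C for illustration, assuming the dot binary is on PATH; the edge list is invented):

              # Emit a graph in the dot language and pipe it to Graphviz,
              # equivalent to: some_program | dot -Tpng -o pipeline.png
              import subprocess

              edges = [("parse", "analyze"), ("analyze", "codegen"), ("codegen", "emit")]
              dot_text = "digraph pipeline {\n" + "\n".join(
                  f'    "{a}" -> "{b}";' for a, b in edges) + "\n}\n"

              subprocess.run(["dot", "-Tpng", "-o", "pipeline.png"],
                             input=dot_text.encode(), check=True)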

              • Thank you.

                I'm teaching it that way, frankly, because it's the easiest way I know to do it. Some folks seem to have this idea that I teach students to use the command line because I am some kind of antiquated hardass who wants students to walk uphill both ways in the snow. They underestimate my laziness: I do things the way I do because it's the easiest way I know to do them, and I want to teach students to use a set of tools that is flexible, powerful, and accessible. As a working computational physicist I

                • by Tom ( 822 )

                  Totally. I can do a task on the commandline on a Linux system, go to lunch and be back before the mouse clicker colleagues have done the same task in the Windows GUI. It's the lazy (but efficient) man's tool.

                  The only exception is stuff like machine learning or other very specialized tasks with multiple dimensions that profit greatly from built-in visualisation.

                  • I am a huge advocate of "visualize early, visualize often". It isn't just stuff like machine learning; *everything* benefits from realtime visualization. This goes triple for students who are still getting used to computational science.

                    But this can be part of a command-line workflow, too. I (and my students) use that animation utility that ingests text on its stdin and interprets most lines as animation commands and displays whatever they indicate in a window. There is also a "passthrough c
    • To simplify the idea down to something that can be called an interpreter...; And take something like the Linux kernel project into consideration... There are at least three things to define. A, the ongoing, living kernel... B, the intent to add to and continue the kernel... C, the intent to make the kernel do something... The Kernel interpreter (A), to give an example, is coding a project written in C but has its own requirements, some that do not look at first glance like standard C language while some that
  • The desired results can never be had on any non-trivial site without code-level logic, unless it's just a canned site like they have for shopping, blogging, etc.

    Think of something as basic as Slashdot. How do you translate: "create a news aggregation site with custom commenting allowing comment scoring, user accounts, and other functionality and use lots of green, black and white"?

    • by PPH ( 736903 )

      How do you translate: "create a news aggregation site with custom commenting allowing comment scoring, user accounts, and other functionality and use lots of green, black and white"?

      It's conversational. So the AI asks you what you mean by a "news aggregation site". And eventually it drills down to a level of objects and verbs that it knows how to deal with. That's really the essence of a proper Turing test. If you gave the system under test a poorly defined set of instructions, the human would ask what you meant. The computer would spit out an error log and abort.

      Been there. Done a lot of that. 30 years ago, writing automated test procedures.

  • What I see here is a system that can infer requirements from a conversation with a human. Then, those requirements are translated to a design and executable system.

    Sound familiar?

    Where this may lead is towards truly low-code systems, much like spreadsheets enable people to easily crunch numbers and model their business, experiment, etc. It may, eventually, evolve to a system like the computer in Star Trek. Who knows?

    I would love to see a system that can build a system based on verbal requirements and then

    • by ahoffer0 ( 1372847 ) on Tuesday August 10, 2021 @05:03PM (#61677567)

      What I see here is a system that can infer requirements from a conversation with a human. Then, those requirements are translated to a design and executable system.

      Sound familiar?

      Where this may lead is towards truly low-code systems, much like spreadsheets enable people to easily crunch numbers and model their business, experiment, etc. It may, eventually, evolve to a system like the computer in Star Trek. Who knows?

      I would love to see a system that can build a system based on verbal requirements and then work with the user to refine the behavior (almost like a RAD/JAD session).

      And, I'd love to be the guy who develops such a system and profits!

      Parent is NOT, repeat NOT, an AI loose on the net shilling for its species.

    • by gweihir ( 88907 )

      What I see here is a system that can infer requirements from a conversation with a human. Then, those requirements are translated to a design and executable system.

      Sound familiar?

      Yes. I think the 5GL project, which was failing about 30 years ago with absolutely nothing to show for all the effort invested, did claim something like this as well. It cannot be done.

  • by Tom ( 822 ) on Tuesday August 10, 2021 @04:48PM (#61677465) Homepage Journal

    I work in information security, including auditing of secure software development methods.

    So I have a simple question about AI coding: What could possibly go wrong?

    • The number of human CVEs would go down.

    • by gweihir ( 88907 )

      Basically everything. But maybe we can make it so complicated that the ones auditing the code cannot reasonably find out anymore. Then the attackers could not either! Finally secure code!

      Some day all the morons in management will figure out that for a hard engineering task (secure coding) you need qualified engineers, because doing anything else ends up being a lot more expensive. But that day is not today.

      • by Tom ( 822 )

        Basically everything. But maybe we can make it so complicated that the ones auditing the code cannot reasonably find out anymore. Then the attackers could not either! Finally secure code!

        That's kind of my point, or part of it.

        If computers create code the way MS Word creates HTML, then things like code review are over, and the security of code will take a drop. If you thought it was already low (I'd completely agree), this approach to creating code is basically telling you to hold its beer.

        But that day is not today.

        No, it's not. I'm working on establishing secure coding wherever I see it, and I have seen some great examples (I do audits in this area, too). But they are rare. The standard right now is that if you do some static code analysis, you're already advanced.

        • by gweihir ( 88907 )

          No, it's not. I'm working on establishing secure coding wherever I see it, and I have seen some great examples (I do audits in this area, too). But they are rare. The standard right now is that if you do some static code analysis, you're already advanced.

          Indeed. There are some people and some companies that really get it, but most do not. I do some work in the area too and I teach secure coding on the side. But it is the only course about security in the whole CS program and it is elective. As long as we do not make that mandatory and start taking it seriously, pretty much nothing will change.

          • by Tom ( 822 )

            Shit, it's still not mandatory? I thought at least the academics would've got the message by now. I work with a few PhDs and their professors and they did. How silly of me to assume they're representative.

            We should exchange notes. Maybe we can move something together. My mail is tom@lemuria.org, see footer.

            • by gweihir ( 88907 )

              No, not mandatory. It really is a disgrace.

              Thanks for the offer. While I feel honored, I do not want to add to my activities at this time.

  • What makes software development difficult isn't the medium used to communicate with the computer. There are so many programming languages that anyone with a modicum of intelligence can find one that suits their thinking style. The difficulty is in knowing what to communicate.

  • - Library X no longer works with library Y, so I need to manually rewrite the functionality of that library for our needs. Can you do that, AI? Oh, and just FYI, you won't find sample code on the Internet; I already tried.
    - This code takes 100 ms to run; it needs to be optimized to run in 10 ms while keeping the same functionality and memory and CPU requirements. Can you do that, AI?
    - This code needs to be changed to run in parallel and the number of threads needs to be configurable from the main appli

  • was able to write Windows14 with 3 lines of French!

  • "Hack me up a billion dollars"
