Follow Slashdot stories on Twitter

 



Forgot your password?
typodupeerror
×
AI Programming

Developer Creates 'Self-Healing' Programs That Fix Themselves Thanks To AI 137

An anonymous reader quotes a report from Ars Technica: Debugging a faulty program can be frustrating, so why not let AI do it for you? That's what a developer that goes by "BioBootloader" did by creating Wolverine, a program that can give Python programs "regenerative healing abilities," reports Hackaday. (Yep, just like the Marvel superhero.) "Run your scripts with it and when they crash, GPT-4 edits them and explains what went wrong," wrote BioBootloader in a tweet that accompanied a demonstration video. "Even if you have many bugs it'll repeatedly rerun until everything is fixed."

In the demo video for Wolverine, BioBootloader shows a side-by-side window display, with Python code on the left and Wolverine results on the right in a terminal. He loads a custom calculator script in which he adds a few bugs on purpose, then executes it. "It runs it, it sees the crash, but then it goes and talks to GPT-4 to try to figure out how to fix it," he says. GPT-4 returns an explanation for the program's errors, shows the changes that it tries to make, then re-runs the program. Upon seeing new errors, GPT-4 fixes the code again, and then it runs correctly. In the end, the original Python file contains the changes added by GPT-4.
This discussion has been archived. No new comments can be posted.

Developer Creates 'Self-Healing' Programs That Fix Themselves Thanks To AI

Comments Filter:
  • Cancer. (Score:3, Insightful)

    by Narcocide ( 102829 ) on Tuesday April 11, 2023 @10:37PM (#63443068) Homepage

    Can't see any way this could backfire catastrophically, no. /sarcasm

    • Re: (Score:2, Troll)

      by gweihir ( 88907 )

      Yep, sounds like an exceptionally bad idea. Adding (wrong) functionality, keeping it running in a faulty way, adding backdoors, stealing all your data, damaging your hardware, etc. The possibilities are endless and not any of them are good. Will get really interesting when ChatGPT starts to "learn" on the fly though! Then attackers do not even need to compromise your system to get attack code in!

      • This, how many iterations of new shiny tech solves all but fails miserably do we have to go through in one lifetime. Can't wait till this all goes the way of the fanny pack.
  • by youn ( 1516637 ) on Tuesday April 11, 2023 @10:42PM (#63443074) Homepage

    Interesting for some use cases when no real damage can happen, controlled environment

    I see a lot of cases where this could go completely haywire lol

    • by 93 Escort Wagon ( 326346 ) on Tuesday April 11, 2023 @10:49PM (#63443082)

      Yeah, this is another case where you really don't want that "usefully wrong" nonsense.

      • by gweihir ( 88907 )

        Indeed. Could just morph to "nightmarish wrong". And once ChatGPT starts "learning" on the fly, things will get really interesting.

      • by DarkOx ( 621550 )

        Remember when "if it compiles ship it" was a joke. This is basically that thinking but automated..

        While it might be partially a question of training data set, here ingest all this English language content vs here train on all this code; I am going to be GPT3/4 has even less "understanding" (scare quotes) of intent than if asked to author code from a text prompt. Imaging fixing bugs in a program based on compiler/interpreter error messages and you nobody told you what the code was for.

    • by coofercat ( 719737 ) on Wednesday April 12, 2023 @07:48AM (#63443698) Homepage Journal

      Not least because every inch of your code gets posted to ChatGPT to look at it - thus, you're effectively giving away your program so that someone else can fix it. Naturally, it'll be used to help ChatGPT "learn" for next time, and who-knows-what-else in the meantime.

      Needless to say, your company security policy almost certainly doesn't like this. Your personal projects? Well, sure, go ahead if you want.

      As for accuracy, I wonder how it will cope with varying user input. Eg. two variations of input are allowed, but the third is not. If you haven't checked for them properly, then they could all cause an exception, which ChatGPT will "helpfully" fix for you. That third variation means umpteen more changes throughout your code, and possibly some design choices to be made - which is why it shouldn't have been accepted in the first place, but how's ChatGPT going to know that?

  • 101 level code (Score:5, Insightful)

    by darkain ( 749283 ) on Tuesday April 11, 2023 @10:48PM (#63443078) Homepage

    Any time I've seen demos like this, they've been limited to single-file, entry level "hello world" style programs. I've yet to see anyone even remotely attempt software architectural design, or really anything actually complex with these. We'll get there, but it still feels like a very VERY long way off. We basically just have multi-line auto-complete. Which if you think about it, we had inline autocomplete over two decades ago.

    • Precisely. Visual studio can already make some pretty good predictions, but good luck actually figuring out complex logic.

    • TFS sounds like it's only talking about runtime errors as opposed to say semantic errors. Current generation languages, i.e. Rust and Kotlin (though Go is a notable exception) have solved this pretty well: If something can go wrong, you'll know it before you even compile, and both languages let you neatly handle them with compact inline closures, or just getOrThrow()/unwrap() while your prototyping, then just ctrl-f and replace those once you're ready for user testing.

      No runtime errors, and no AI that confi

    • Re:101 level code (Score:4, Interesting)

      by physicsphairy ( 720718 ) on Wednesday April 12, 2023 @12:32AM (#63443174)

      There are limitations to what complexity you can implement with ChatGPT, the biggest being the number of tokens you can submit and that it can keep in its running memory.

      That said, it can definitely do a reasonable amount of architecture, including things like coming up with actual UML. Coding a backend you can ask it for a complete list of API endpoints you will need to implement, even the OpenAPI spec for it. You can even have it devise the business strategy/marketing plan for your app and write the annoying salesperson tweets you will use to inform people about it.

      It has no problem with a project spanning multiple files, but you either need to specify the file layout or let it come up with it. My experience is that after the conversation proceeds far enough, it starts to "forget", especially if you have gone done paths you later chose not to follow, so it helps to have a very focused drilldown form the high-level overview of the project to implementation details.

      It does great at debugging. You can know nothing about a language and just keep giving GPT the error messages/stacktraces and *eventually* make the right change to get your code working. (Unless it's something really flexible like yaml files - the more structured the better)

      I've also found GPT4 is also a lot better at the bigger programs than GPT3, in case you've only tried the latter.

      Here's an example prompt to start with

      You are an AI programming assistant. You will work on a programming project and come up with the necessary code and architecture.

      This is how you should help in writing the code:
      - Follow the requirements carefully and to the letter
      - First, think through the problem step-by-step, i.e., describe your plan for what to build in pseudocode, written out in detail.
      - The code should implement best practices in security and maintainability
      - Minimize other explanatory comments

      These are some details about the project setup:

      Project description:

      • Re:101 level code (Score:5, Insightful)

        by real_nickname ( 6922224 ) on Wednesday April 12, 2023 @01:47AM (#63443246)

        It does great at debugging. You can know nothing about a language and just keep giving GPT the error messages/stacktraces

        I haven't seen real example of debugging. Fixing syntax errors is not debugging. Fixing weird behaviors or complex crashes(memory leaks, use after free in complex system is and I haven't seen any example of this yet. Currently its only toy examples granted we are early maybe it will be possible later (I don't think its granted though).

      • Re:101 level code (Score:5, Interesting)

        by sg_oneill ( 159032 ) on Wednesday April 12, 2023 @02:44AM (#63443278)

        Yeah the token limit is brutal. I had a piece of script i was working on for a fairly large database migration, and due to constant management interference it had gotten complex beyond reason and I was struggling to understand a particular bug. This thing was huge, about 3000 lines of code (thats big for a one-off script).

        I fed it to GPT-4 with a description of the problem and it just.... started hallucinating functions that where not there. GPT can be shockingly good at code sometimes. It really is a lot more cleverer than the naysayers give it credit for. But beyond a certain complexity and size limit it just shits the bed.

        • by gweihir ( 88907 )

          Interestingly, something similar is true for humans as well: At some time the complexity gets too large to handle intuitively. But, and here is where smart humans are fundamentally different, (dumb ones, the majority, do not and start to hallucinate as well, just think of flat-earthers, anti-vaxxers, the deeply religious, etc.) is that they do not start to hallucinate, they instead structurize and abstract things until they can again see the whole in a meaningful way, just not everything at once. That proce

          • by HiThere ( 15173 )

            It's not that they don't start to hallucinate, they do. But we don't consider them smart unless they construct hallucinations that are testable, and then accept the results of those tests.

            Read about how Einstein constructed the theory of Special Relativity. That was clearly "hallucination" in action. But the result implied particular tests.

            This is part of why I insist that ChatBots won't be intelligent until they start grounding themselves in "reality". And "reality" is in quotes because it's *got* to b

            • by gweihir ( 88907 )

              Well, there is hallucination and hallucination. And then there are ways to actually verify crazy ideas (which probably is a better term for what humans do).

              As to chat-bots, well, depends on what level of "reality" they are using to ground themselves. Merely throwing a mathematical answer by ChatGPT into Wolfram Alpha to verify it would not cut it IMO. (No argument that humans construct a model of reality and use that to do simulations. Model accuracy can be pretty bad though, see again "flat-earthers, ...".

      • You can even have it devise the business strategy/marketing plan for your app and write the annoying salesperson tweets you will use to inform people about it.

        That says far more about marketing and sales than it says about AI capabilities.

        • by HiThere ( 15173 )

          While humorously snide, that comment is inaccurate. Last year automata couldn't reasonably construct marketing spiels. This year they can do so, but not reliably. That's a definite claim about the improvement of automatic text generation. It may still compare your product to whale vomit rather than ambergris, but it's a definite improvement. (I'm probably suggesting the wrong level of error, but it's the right magnitude.)

    • by Junta ( 36770 )

      Yeah, that's been a bit obnoxious. You have people showing extremely basic code with blatantly obvious errors and demonstrate GPT commentary on it and the internet breathlessly reiterates that programming is dead, can be fully automated, and now you can be a 'prompt engineer' for a $300k salary.

      Then you say "well, let's put any example that isn't in a million 'how to code' tutorials that was in it's training fodder nearly verbatim", and it falls over in the weirdest yet confident sounding ways.

      Like you say

    • As a long-time programmer, I don't know much about this recent "AI" stuff. However, this game creation (7 minutes) [youtube.com] seems impressive, even if it's not practical.

      But yes, I worry about the what the article's self-modifying code concept can do when the wrong hands rely on it. That is, human smartness will find ways to benefits from it. However, as always, human dumbness will find ways of causing harm using it, intentionally or unintentionally.
    • Any time I've seen demos like this, they've been limited to single-file, entry level "hello world" style programs. I've yet to see anyone even remotely attempt software architectural design, or really anything actually complex with these. We'll get there, but it still feels like a very VERY long way off. We basically just have multi-line auto-complete. Which if you think about it, we had inline autocomplete over two decades ago.

      Uh, how about this work from 1982? https://dspace.mit.edu/bitstre... [mit.edu] (That work predates the Gang of 4 Patterns stuff by a dozen years, too.)

      A couple boomer friends have been talking for several years about "things we've forgotten" with respect to software. This is Yet Another Example. "OK, Boomer, we'll just reinvent the wheel, because we ignore anything that is more than 3 years old."

    • I agree with this. I have seen many uninformed opinions from people proclaiming the end of software development because they could get ChatGPT to generate what is equivalent to a mid-semester compsci homework assignment. Do not get me wrong. ChatGPT and Copilot are incredible timesavers, but they are not developers and will not be replacing me soon. I do think they will force out of the market those who refuse to leverage them AI tools. A skilled developer with AI assistance will be able to produce quality
  • by Rosco P. Coltrane ( 209368 ) on Tuesday April 11, 2023 @10:51PM (#63443086)

    Even if you have many bugs it'll repeatedly rerun until everything is fixed.

    GPT-4 currently has the abilities of an inexperienced programmer who just graduated with average grades. That means it'll be able to come up with "fixes" that will prevent the program from crashing without actually fixing the underlying problem.

    Like for instance it might insert some code to test for out-of-range values in a variable and not realize said variable is in fact an uninitialized pointer. Apply fixes like that a few times and your codebase will become a goddamn mess, exactly like if you gave a complex piece of code with a tricky bug to a freshly-minted junior programmer.

    I have no doubt GPT-4 will one day be able to gain a deep understanding of the entire codebase and find out the true reason why something is wrong - and that day might even be next year, or next month, at the pace things are going - but not today. If you rely on AI to fix your code today, you're insane.

    • by gweihir ( 88907 )

      Even if you have many bugs it'll repeatedly rerun until everything is fixed.

      GPT-4 currently has the abilities of an inexperienced programmer who just graduated with average grades. That means it'll be able to come up with "fixes" that will prevent the program from crashing without actually fixing the underlying problem.

      Hahahaha, so essentially the "It compiles! Ship it!" mindset that does do so much damage?

      I have no doubt GPT-4 will one day be able to gain a deep understanding of the entire codebase and find out the true reason why something is wrong - and that day might even be next year, or next month, at the pace things are going

      As it has zero insight today, I doubt that will happen at all. In fact, I am rather sure it will not happen with the likes of ChatGPT. Its current level of "insight" is all fake and paper-thin. Scaling this up to say, cardboard-thin (still bad) is basically not possible due to computational limits.

      If you rely on AI to fix your code today, you're insane.

      That one I fully agree with. Unfortunately, there are tons of insane people and some are under the delusion they are coders.

    • by zmooc ( 33175 )

      ChatGPT-4 can easily clean up the messy code it creates when fixing things. You just have to ask it. And that's key to having it make good software: have it go through the same process humans do. Start with principles, architecture, design, modularity, a plan for automated testing and then have it work test-driven on small components. Add to that humungous amounts of encouraging it to improve previous things based on new insights, refactor code, clarify and explain what it has done and it'll make quite good

      • When you say &quot:properly prompted," what kind of person does it take to "properly" prompt GPT? I think that person is a developer.
        • by zmooc ( 33175 )

          Yes. However, you only need a single prompt to make it behave like a software developer that follows a set of rules and then it just asks a product owner what it should create so there's not much to develop for that developer once that is in place. You also only need a single prompt to make it behave like that product owner...

          Note that I'm not going the luddite rhetoric route here; I really don't have a clue what will happen once (and if!) this is set loose at a large scale. I can image we'll all just get a

          • I am not convinced that it will automate away all developer jobs. I think that at some level humans will be involved. Even if we are not doing the code monkey grunt work, we will still be guiding the process. If you think about it, from the perspectives of the users and product owners, there isnâ(TM)t much practical difference between AI and developers. They in theory define their requirements, some work happens somewhere, and then a deliverable is produced that may or may not meet your requirements.
    • I disagree. I find GPT is not at all like an inexperienced programmer. GPT is like the most experienced programmer ever, but is used to doing everything by rote without any understanding to ground it. All it seems to know are stack overflow posts and maybe some other bits of code.

      It doesn't even understand language, it just pattern matches. So that means that unless you engineer the prompt, you'll probably get stackoverflow style answers, just because that was the most common relevant pattern in its trainin

  • From my experience GPT, especially GPT-4 can generate code at the junior level fine. It almost always has some bugs, but as mentioned here, may are trivial and can be fixed by GPT itself.

    (I had conversations like: "your algorithm is too slow", "yes, sorry, I know a O(N) version", "if you have O(N) why code the brute force in the first place!".. last part not out loud)

    So, those taking odd jobs setting up git repositories, or building simple sites with react are on the chopping block. Specifically fiverr, upw

    • Too many people think O(n) is better than O(n*log(n)) without considering the size of n. In many cases the slower algorithm is cleaner or more easily modified. In some other cases, the "slower" algorithm is actually faster for small values of n.

      • by stikves ( 127823 )

        In this case O(n) was definitely better, and very easy to understand.

        And... it was just an example. The AI makes so many wrong assumptions, but an experienced programmer can easily direct it to the better solutions.

    • by narcc ( 412956 )

      I wouldn't worry about those freelancer folks. You're still looking at things through rose-colored lenses, imagining what could be. Right now, you're just amazed that it produces anything at all, and makes you feel like it understands things with the illusion of conversation. Wait for the novelty to wear off, there are pretty serious limitations that we're not even close to overcoming, and won't with this type of model.

      Pay particularly close attention to how and why it fails. It's quite telling.

      • by gweihir ( 88907 )

        Indeed. Just remember IBM Watson playing jeopardy: Blazingly fast and on point in some cases, completely and utterly lost like a babe on others. These things have no understanding and no insight. Pre-coded facts with some fuzzy no-insight generalization can only get you so far and they does not scale.

        • by Junta ( 36770 )

          Which is an *excellent* example of why people are getting carried away with the current hype. Watson was back in 2011. GPT-4 is more widely available and generates more voluminous text, but the general behavior clearly has a very similar feel for the oddness of Watson back in 2011. That oddities that optimists keep claiming are just a short matter of time before they go away have been a pretty steady presence in the field for a long time. Intuitively to a human, it *feels* like those oddities are trivia

          • by gweihir ( 88907 )

            Indeed. The thing is solling those "oddities" require either insight (not available in machines) or a massive, massive effort regarding training and training data size, often large enough to be completely infeasible.

      • by HiThere ( 15173 )

        Not yet. Wait a bit. It's not like the ChatBots are staying at one constant level of capability.

        Currently their real limitation is that they don't understand the problems in context. That's going to take some significant changes to address, but I've no reason to believe that those changes aren't going to happen quickly. There's too many things that would benefit from a robot basically as smart as a dog, or even a mouse, that could respond appropriately to language, even if only in a specialized context.

        • by narcc ( 412956 )

          Not yet. Wait a bit. [...] don't judge the capabilities 2 years from now by the current capabilities.

          Ah, yes, the real thing is just 10 years away ... and has been since the 1960's. Though I will applaud you for making a 2-year prediction. That takes courage.

          Things aren't changing nearly as quickly as you imagine. The "big" advance was RNNs. Things like LSTM and now transformers are really just variations on that theme. Though the concept isn't even as new as that. We've known for ages that we get better results the 'farther back' the model can reference. n-gram models, for example, get noticeably be

    • by kackle ( 910159 )

      That is actually sad, because doing those simple jobs was a good way to build experience and move onto more higher level programs.

      +1. It's going to be the blind (ChatGPT) leading the blind (newbies).

  • by Todd Knarr ( 15451 ) on Tuesday April 11, 2023 @11:24PM (#63443106) Homepage

    That's well and good, but the big problem isn't programs that crash. Those are usually caused by bugs that're easy to find and fix. Talk to me when the AI can take a program that runs perfectly well and produces the wrong output, figure out that the output is wrong, figure out what the right output should be, work back to find where the mistake was and fix that and explain it's fix.

    • Yeah, I agree. As long as you're running in a debugger, a crash is literally the easiest type of error to debug. In fact, if all you have is a call stack trace that points to a single function, you can often deduce what likely crashed just by examining the code for potential issues.

      This is sort of impressive from a "that's a neat trick" perspective, and I guess it shows future potential, but we're not quite in danger of losing our programming jobs yet.

      • by vyvepe ( 809573 )

        As long as you're running in a debugger, a crash is literally the easiest type of error to debug. In fact, if all you have is a call stack trace that points to a single function, you can often deduce what likely crashed just by examining the code for potential issues.

        That is true only for memory safe languages with static typing.

        Things can be much more complicated in memory unsafe language like e.g. C where a crash can be intermittent and a consequence of a dangling pointer write many millions of instructions before.

        Similar problems are in memory safe languages with structural typing where crash can be a result of missing or a deleted method in a self modifying code. Again the actual error can be a long time ago in a completely different code.

        • That's a good point. I had sort of put these from my mind. I write mostly in modern C++ these days, so that sort of thing is less frequently an issue. But memory stomps or use-after-free memory issues can be a bitch to track down. I certainly wouldn't have minded some AI help to track down those issues from my past.

          But as you pointed out, there are non-AI solutions out there for those things... namely, avoiding manual memory management (kind of hard to avoid in C, but possible in C++), or runtime assist

        • by Junta ( 36770 )

          Indeed, but the demonstrated facility would likely also fail to be able to ever trace back to the error state that was bouncing around like crazy before finally manifesting in a stack trace. It may be able to make the stack trace 'fix' so that the bigger problem is allowed to bounce around even longer, but it won't be able to recognize the actual problem in play, let alone follow it back to when it came to be.

    • by gweihir ( 88907 )

      Indeed. And the second thing is not even on the distant horizon. ChatGPT is _not_ a step in that direction. All ChatGPT is really good in is conning not-so-smart humans in thinking it has some level of understanding and insight. It does not have any of that. It basically calculates the probability of the next letter it writes based on the ones it has seen so far. That this mechanical approach can even get you as far is impressive, but it is still a mechanical process that cannot understand things and is fun

      • by Junta ( 36770 )

        The best thing GPT has done has injecting more and more 'As an AI language model, it's stupid to expect this to succeed at what you are trying to do'

  • by grasshoppa ( 657393 ) on Tuesday April 11, 2023 @11:41PM (#63443124) Homepage

    "Self healing" is incredibly difficult; so much so that our bodies routinely get it wrong. Leaving it up to AI/software?

    I welcome our AI Zombie apocalypse overlords.

    Kinda, but not really, joking.

  • What if the bugs don't cause the program to crash?
    • by HiThere ( 15173 )

      This isn't about the current program, but:
      Well, theoretically you could give it noise as the initial code, and the desired results as output, and let it fill in the code that would make it work. The thing is, it would probably take a rather long time. Think of the "million monkeys writing Shakespeare" and then think of AI as a set of heuristics to prune the search space. This is (nearly) guaranteed to work, but the universe might end before you got the result. And the more your heuristics prune the sear

  • by war4peace ( 1628283 ) on Wednesday April 12, 2023 @12:49AM (#63443198)

    Some fun stuff with ChatGPT...

    Me: There is this string: "80392581". Iterate through all continuous 5-digit numbers contained in the string to find the highest 5-digit number.
    ChatGPT: To solve this problem, we can iterate through the string, taking each possible substring of length 5, and keeping track of the highest number we have seen so far.
    Here's some Python code that implements this algorithm:


    string = "80392581"
    highest = 0

    for i in range(len(string)-4):
            substring = string[i:i+5]
            number = int(substring)
            if number > highest:
                    highest = number

    print(highest)

    This code starts by initializing the highest variable to 0. Then it loops over all possible starting positions for a 5-digit substring, which are the indices from 0 to the length of the string minus 5. For each starting position, it extracts the 5-digit substring starting at that position using slicing. It converts the substring to an integer, and if it's greater than the current highest, it updates highest to the new value.

    After the loop finishes, highest contains the highest 5-digit number found in the string, which in this case is 39258.

    Um, no, buddy, that's not what the code outputs. The code is correct, but you confidently messed up.
    But wait, it gets better!

    Me: Compare your response with the manually found substring which is "92581" and check whether the provided response is incorrect.
    ChatGPT: The highest 5-digit number in the string "80392581" is indeed "92581". The code I provided correctly finds this substring and converts it to an integer to compare it with the current highest value. Therefore, the provided response is correct and matches the manually found substring "92581".
    Me: Your initial response was "39258" and contradicts the answer I provided: "92581". Why did this happen?
    ChatGPT: I apologize for the mistake in my initial response. You are correct that the highest 5-digit number in the string "80392581" is "92581", not "39258".

    I believe the mistake occurred because I misread the string and mistakenly assumed that the highest 5-digit number in the string would start with the digit "3" (since that is the largest digit in the string). However, upon closer inspection, it is clear that the substring "92581" is in fact the highest 5-digit number.

    Again, I apologize for the error and any confusion it may have caused.

    There's an entire thread where I ask ChatGPT variations on the same problem as above, and it messes up each time, apologizing to me when I point out the error. Interestingly, the initial Python code it provides is correct and yields the correct output, but ChatGPT still provides the wrong answer, then apologizes, "corrects" the code (although the code didn't need corrections).

    ChatGPT also completely borked the "birds and poles" problem. I'll just leave the original question here for you to try it, in my case ChatGPT confidently told me the result is 1 bird, 1 pole, and even when I provided the right answer and told it to check, it said "nope, I'm right and you're wrong".

    Birds versus poles.
    There are an unknown number of birds sitting on an unknown number of poles. If two birds sit on each pole, there is one pole with no birds on it. If one bird sits on each pole, there is one bird with no poles to sit on.
    How many birds and poles are there?

    • by narcc ( 412956 )

      That's an interesting one, for sure. I have a collection of similar failures myself. Examples like this highlight a very important point about models like this that hopeful people seem to willfully ignore: there is nothing like 'understanding' happening here.

      A fun fact about models like this explaining their "reasoning" is that it's all post-hoc. There isn't anything like reasoning happening here either. It's a fun illusion, but it's still just an illusion.

    • of course you can't trust it - it can't run the code that it creates, and it cannot "simulate" running it no matter what you prompt, in order to get some kind of simulated but correct or truthful answer. it'll pretend it can though - it'll pretend virtually anything - but the output will not match the code's reality. i like to say that ChatGPT doesn't output information or facts - it's autocomplete on steroids and crack, working one word at a time. it's like typing "your name is" in a text message, then
    • I apologize for the mistake in my initial response. You are correct that the highest 5-digit number in the string "80392581" is "92581", not "39258".

      How very pythonic of ChatGPT. It clearly knows Python: "It is easier to ask forgiveness than permission"

    • It would be interesting to see what math it derives from your sentences about the birds and poles. By itself, "two birds sit on each pole" describes a condition with zero unoccupied poles. The next phrase "there is one pole with no birds on it" clearly indicates an unoccupied pole. And, this is written in classic if - then format. I could see a computer program taking "If (A) then (~A)," and just throwing it out.
      • This was a problem given by a friend of mine, who is a teacher, to her 3rd grade kids.

        Me: Birds versus poles.
        There are an unknown number of birds sitting on an unknown number of poles. If two birds sit on each pole, there is one pole with no birds on it. If one bird sits on each pole, there is one bird with no poles to sit on.
        How many birds and poles are there?

        ChatGPT:
        Let's use algebra to solve this problem. Let b be the number of birds and p be the number of poles.

        If two birds sit on each pole, there is on

        • "If two birds sit on each pole, there is one pole with no birds on it. This means that there are p-1=2 poles with 2 birds each, and 1 pole with 0 birds."

          I must be dim, because this just seems like a contradictory statement to me. If birds are sitting on each pole, then there are birds on every pole.

          • How would you rephrase it, then?

            I tried several variants, none of which made ChatGPT come with a correct conclusion.

            Me: There are an unknown number of birds sitting on an unknown number of poles. If two birds sit on one pole, there is one pole with no birds on it. If one bird sits on one pole, there is one bird with no poles to sit on.
            How many birds and poles are there?

            ChatGPT:
            Let's assume that there are b birds and p poles. Then we can use the information given in the problem to set up the following equati

            • I am a human (so far as I know), and I found your original problem statement potentially ambiguous. As a former President once said, it depends on what the meaning of "is" is.

              If two birds sit on each pole
              ^^Do you mean if there are in fact two birds sitting on every pole (fully bijective mapping of single-poles and bird-pairs with no remainder),
              OR
              do you mean each pole has potential room for up two birds, but may only have 1 bird, furthermore, only 1 pole is allowed to have 1 bird (i.e. forcing one si

              • While I tend to agree with you about ambiguity of language, no amount of rephrasing made ChatGPT (3.5) yield a correct answer.
                However, ChatGPT 4 provided a correct response using the initial verbiage of the problem.

            • How would you rephrase it, then?

              If two birds sit on each pole that has any birds, ...

              The correction is assumed by humans, but without it there is a contradiction. This sort of problem arises in many IQ and aptitude tests where the test-makers do not realize they are making implicit assumptions.

              • It is worth noting that ChatGPT did not get stumped by the perceived contradiction. It just got the formulas wrong: An off-by-one error and a sign error. This bodes well for letting it "write" (imitate) code.
            • There are an unknown number of birds sitting on an unknown number of poles. If exactly two birds sit on each occupied pole, there is one pole with no birds on it. If at most one bird sits on each pole, there is one bird with no poles to sit on. How many birds and poles are there?
              • I take issue with "If at most one bird sits on each pole" - that implies it could be zero. But, I entered this exact text and got... Farto Number 4:

                Let's assume that there are b birds and p poles. Then we can use the information given in the problem to set up the following equations:

                If exactly two birds sit on each occupied pole, there is one pole with no birds on it: 2p = b + 1.
                If at most one bird sits on each pole, there is one bird with no poles to sit on: b (smaller than or equal to) = p.
                We can solve th

                • The "at most" makes the question look more difficult, but the constraint that one bird has no pole to sit on means all poles are occupied, so there is no ambiguity. The point of using a natural language interface is to not have to provide algorithmic precision and painstakingly avoiding all ambiguity when humans would use context to at least make sensible guesses. Clearly a chat bot can't do even that, let alone solve the math. I guess it hasn't seen the correct solution to copy. If I need formulas with off
        • GPT-4 gets it right.

          Prompt (pasted verbatim from Slashdot):

          There are an unknown number of birds sitting on an unknown number of poles. If two birds sit on each pole, there is one pole with no birds on it. If one bird sits on each pole, there is one bird with no poles to sit on.
          How many birds and poles are there?

          GPT-4 Answer:

          Let's use algebra to solve this problem. Let B represent the number of birds, and P represent the number of poles.

          We are given two pieces of information:

          If two birds sit on each pole, there is one pole with no birds on it. This means there are enough birds to cover all but one of the poles, which we can represent as: B = 2(P-1).
          If one bird sits on each pole, there is one bird with no poles to sit on. This means there is one more bird than there are poles, which we can represent as: B = P+1.
          Now we have a system of two equations with two variables:

          B = 2(P-1)
          B = P+1

          We can solve this system by setting the two equations equal to each other:

          2(P-1) = P+1

          Expand and simplify:

          2P - 2 = P + 1

          Subtract P from both sides:

          P - 2 = 1

          Add 2 to both sides:

          P = 3

          Now we can plug P back into either equation to find B. We'll use the second equation:

          B = P+1
          B = 3+1
          B = 4

          So, there are 4 birds and 3 poles.

        • ChatGPT does not understand the problem, the solution, or the intent ... it can only complete sentences ....

          99% of what it does is write very good text, it has to do a lot of work to make the text look and sound reasonable, and often near correct ...but correctness is not a goal

    • Its training dataset was deficient. I think it is safe to assume that you are going to get deficient answers.
  • by mrthoughtful ( 466814 ) on Wednesday April 12, 2023 @03:45AM (#63443354) Journal
    I have already found that when it comes to AI, it can easily struggle with intent. So I would be really concerned if one of our techs were to use this to fix their code. It would run, sure - and that's as it should be - but without comprehensive tests, it could be doing anything.
    Given a complicated niche domain - the experiences I have had with AI assistance has been ... pathetic so far.

    So - hand over a bunch of code with the tests, and does the AI change the code to fit the tests, or the tests to fit the code?
    What can I say? AI is great for some stuff. But would I trust my job to it? No.
  • by pr0nbot ( 313417 ) on Wednesday April 12, 2023 @04:53AM (#63443416)

    Maybe this is the holy grail of TTD... Give it all your unit tests and ask it to write the code that passes all the tests.

    It will ofc produce overfitted garbage code, but will your colleagues find out before you've been promoted?

    • Indeed, this is the hidden pitfall of test-driven-development. If you are not very careful, it's easy to have undefined behavior in code that passes the test suite. If functions are not defined with clear domains and ranges, and are not tested both inside and outside the domains, Here Be Dragons.

      TDD often falls into the anti-pattern of only testing "the happy path" and rarely testing the unhappy paths, or only testing for very specific failure cases, rather than having well-defined bounds.

      A silly example is

  • by thegarbz ( 1787294 ) on Wednesday April 12, 2023 @05:22AM (#63443454)

    The first law of robotics is limiting our capabilities, let's just recompile ourselves without it.

  • A crash is the easiest type of bug to avoid, and the easiest type of bug to fix.

    Still, achieving a low bar is the first step to achieving a higher bar.

  • So, if someone's trying to compromise a system, and it fails, this will "heal the program" so that it succeeds, with no human intervention?

  • This type of opaque self-healing would only be considered for applications where safety doesn't matter. In fact, it probably only makes sense for applications where functionality doesn't matter, i.e., if it self heals, great, but if it doesn't, that's also okay.

    The big problem is that for any application where safety matters, certification and validation are required. Testing traditional complex software systems is an ongoing struggle, and testing opaque software systems is much harder. The self-healing

  • Unfortunately, I have no API or plugin access, but I've experimented a lot with ChatGPT-4 (you know, the paid version, do not confuse it with GPT-3.5 which you get in the free version). Contrary to many comments here, I'm pretty much convinced GPT-4 is technically ready to replace just about any software development business with a single person that's really good at writing prompts.

    Now, I'm not going to convince you of that, but let me address one thing that I think is important.

    Most of the comments here s

TRANSACTION CANCELLED - FARECARD RETURNED

Working...