AI Programming

Stack Overflow Touts New Programming Solutions Tool That Mines Crowd Knowledge (stackoverflow.blog)

Stack Overflow shares a new tool from a team of researchers that "takes the description of a programming task as a query and then provides relevant, comprehensive programming solutions containing both code snippets and their succinct explanations" -- the Crowd Knowledge Answer Generator (or CROKAGE): In order to reduce the gap between the queries and solutions, the team trained a word-embedding model with FastText, using millions of Q&A threads from Stack Overflow as the training corpus. CROKAGE also expanded the natural language query (task description) to include unique open source software library and function terms, carefully mined from Stack Overflow.
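As a rough illustration of the query-expansion step described above, here is a minimal Python sketch. It is not CROKAGE's code: the `API_TERMS` table, the `expand_query` function, and the `max_terms` cutoff are all hypothetical, and the real tool mines its term associations from Stack Overflow rather than using a hand-written table.

```python
# Hypothetical associations between task words and library/function terms.
# (Assumption: the real tool mines these from Stack Overflow threads.)
API_TERMS = {
    "read": ["BufferedReader", "readLine"],
    "file": ["File", "FileReader"],
    "json": ["JSONObject", "parse"],
}


def expand_query(query, api_terms=API_TERMS, max_terms=5):
    """Return the query tokens followed by up to max_terms unique API terms."""
    tokens = query.lower().split()
    extra = []
    for token in tokens:
        for term in api_terms.get(token, []):
            if term not in extra:  # keep each added term only once
                extra.append(term)
    return tokens + extra[:max_terms]
```

Under these assumptions, `expand_query("read file line")` keeps the three original words and appends the class and method names associated with "read" and "file", so that answers mentioning those names become easier to match.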

The team of researchers combined four weighted factors to rank the candidate answers... In particular, they collected the programming functions that potentially implement the target programming task (the query), and then promoted the candidate answers containing such functions. They hypothesized that an answer containing a code snippet that uses the relevant functions and is complemented with a succinct explanation is a strong candidate for a solution. To ensure that the written explanation was succinct and valuable, the team made use of natural language processing on the answers, ranking them most relevant by the four weighted factors. They selected programming solutions containing both code snippets and code explanations, unlike earlier studies. The team also discarded trivial sentences from the explanations...
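The weighted ranking described above might look something like the following sketch. The summary does not name the four factors or their weights, so the factor names (`semantic`, `lexical`, `api_match`, `has_explanation`) and the values in `WEIGHTS` are placeholders chosen for illustration, not the researchers' actual choices.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer_id: str
    # All four factors are hypothetical stand-ins, each scaled to [0, 1].
    semantic: float         # e.g. embedding similarity between query and answer
    lexical: float          # e.g. word overlap between query and answer
    api_match: float        # share of the expected functions found in the code
    has_explanation: float  # 1.0 if the answer pairs code with prose, else 0.0


# Placeholder weights (assumption); the article does not state the real ones.
WEIGHTS = {"semantic": 0.4, "lexical": 0.2, "api_match": 0.3, "has_explanation": 0.1}


def score(candidate, weights=WEIGHTS):
    """Weighted sum of the candidate's factor values."""
    return sum(w * getattr(candidate, name) for name, w in weights.items())


def rank(candidates, weights=WEIGHTS, top_k=3):
    """Return the top_k candidates, highest combined score first."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)[:top_k]
```

With these placeholder weights, an answer that uses the expected functions can outrank one that is only textually closer to the query, which matches the promotion of function-bearing answers the summary describes.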

The team analyzed the results of 48 programming queries processed by CROKAGE. The results outperformed six baselines, including the state-of-art research tool, BIKER. Furthermore, the team surveyed 29 developers across 24 coding queries. Their responses confirm that CROKAGE produces better results than that of the state-of-art tool in terms of relevance of the suggested code examples, benefit of the code explanations, and the overall solution quality (code + explanation).

The tool is still being refined, but it's "experimentally available" -- although "It's limited to Java queries for now, but the creators hope to have an expanded version open to the public soon."

It will probably be more useful than Stack Roboflow, a site that uses a neural network to synthesize fake Stack Overflow questions.

  • by Topwiz ( 1470979 ) on Sunday August 18, 2019 @07:51PM (#59100408)

    If it can't figure out the problem, it will report you as off-topic.

    • Possibly true. Training something on data from Stack Overflow is likely to result in a badly trained AI, especially if it takes the number of votes into account, as I regularly find that the best and most accurate answers have fewer votes.

  • no garbage collection since we are not american...

  • Is that a misspelling of Crockage? We have certainly entered the Age of Crock. The only reason that Stack Overflow exists is because of the shoddy state of Reference Manuals these days.

    In the olden days we had these things called Reference Manuals that documented in excruciating detail each and every command and option with a detailed description, the whole thing ordered in alphabetical order (the same went for programming languages and library functions). Also available were User Guides and User Manuals

    • Re:Misspelling? (Score:4, Informative)

      by ptaff ( 165113 ) on Sunday August 18, 2019 @08:22PM (#59100482) Homepage

      The only reason that Stack Overflow exists is because of the shoddy state of Reference Manuals these days

      Many times the answer is documented and easy to find, but not as easy as entering your question into a search engine. For instance, 3180 votes on StackOverflow for a grep question that is answered in the manual [stackoverflow.com].

      You can call that laziness or efficiency; your call.

      • 3181 votes now :D I up-voted it, too, hehe.

        Yeah, perhaps the guy asking the question is using it on Windows, or never heard about -h/--help or "man grep".

        If it was not on SO, we would not get sneaky comments like this one: "+1 - -5 is quicker to type than -A 5 -B 5" (mouche, Jul 12 '11 at 8:10) :P

        Sometimes it is actually quite fun to browse on SO ...

        • Too often there are people who don't understand the question giving the answers, and people who don't know enough to understand the question or the answer are upvoting it because it sounds like it fits the dogmatic view of the community. Thus some of the highest-voted answers tend to be highly misleading, an answer to a different question, or even outright wrong. E.g., I see C questions being answered as if they were C++ questions, OSX questions answered as if they were Linux questions, etc. The probl

      • by xonen ( 774419 )

        A lot of commands are only documented by man pages. And those man pages can be very, let's put this nicely, cryptic.

        It's quite normal to read a man page 5 times over and still have no clue how to use a certain command. It typically lacks examples, as apparently examples are not documentation. And more often than not, you'd have to guess how parameters are formatted. Is there a space separating option and parameter or not. Some parameters only work with the --fullname, even though a single-letter shorthand i

      • Now it is at 4241. Apparently, many more people failed to RTFM

    • In our days, the amount of knowledge is just to much.
      Luckily Java etc. still comes with Javadoc ... but you hardly can read and memorize everything.
      Many problems you encounter have a solution that spreads over many places of the documentation
      Even I use stackoverflow relatively often. It simply makes more sense to google (which leads you most likely to SO) and get either a good hint or complete solution in 5 minutes than working it out over days.
      My last problem was: Json encoded data transfer between a REST c

      • In our days, the amount of knowledge is just to much.

        Just to much, or just to many?

        • I'd say the information is overflowing (the stack). Get it? Stack overflow? Overflowing the stack? Lol, I crack myself up.
        • Not sure, saying "information is just to many" sounds odd to me ... many you say when you can count stuff, much you say when you can't.
          But perhaps you have a better rule, I'm German, how should I know which is really better?

          • Not sure, saying "information is just to many" sounds odd to me

            That's because you're fearful of Social Justice.

    • by guruevi ( 827432 )

      With programming languages like JavaScript these days, you'd need an entire public library just to get the basics captured in text.

    • In the olden days we had these things called Reference Manuals that documented in excruciating detail ...

      I don't want "excruciating detail". I want a simple clean working example.

    • All the replies saying this is wrong - sorry kids, but I remember the days when I had the full set of OS/2 manuals, and the huge box of 5 NT4 manuals. They told you everything the system did, in detail too. I remember the Amiga reference manuals too.

      You see, in those days people sat down, thought about what they wanted a thing to do, how it should do it properly, and then not only did it but wrote up how it all worked. And then they left it pretty much alone - they may have tweaked bits here and there but the ma

      • I blame rush-to-market. "If we don't get it out there now, we will be second!! We can't have that!! Get it out with bare bones functionality and we will do patch releases every week for the next 28 years! What could possibly go wrong?!?"
    • I assume it is a play on grokage or grokking.

      https://en.wikipedia.org/wiki/... [wikipedia.org]

    • Maybe, but Reference manuals don't always encompass experience, hence the humans in the loop.

  • More tools for you to do my job!

  • Meh.

    After you strip away all the jargon, hype, and terminology, it turns out that it's basically just a search engine with ranked results that are based on additional, deeper screening of the initial results.

    In other words the answer was there all along, it was just buried in all the useless chatter where people were asking you to check if your monitor was plugged in or some shit like that.

    It's good that it can (may) produce better results, but this is nothing revolutionary as far as I can tell.

    They wouldn'

  • by humankind ( 704050 ) on Sunday August 18, 2019 @11:23PM (#59100820) Journal

    I'm sorry your comment was removed because the moderators have deemed it to be redundant. Please see [this similar question posed 14 years ago that we believe addresses the same thing even though what you're talking about didn't exist then].

    • by AmiMoJo ( 196126 )

      Stack Exchange has become another MMORPG like Wikipedia. Level up at the expense of the competition by stealing their answers, down-voting them early, and posting simple but wrong solutions that attract up-votes from all the noobs who don't know better.

    • I have a top-8% account on SO and just got absolutely fed up with spending all my time blocking closes and reopening good questions that were closed for no reason. I gave up in the end and rarely visit. Despite the efforts to the contrary it has become a completely toxic place.

      If you examine who is closing questions and what questions/answers they link as the duplicate, it is obvious that many are straight up gaming SO to boost their own profiles.

  • If you're not paying, YOU'RE the product.
  • by Viol8 ( 599362 ) on Monday August 19, 2019 @06:02AM (#59101390) Homepage

    A) It's creating lazy lego brick programmers who have to use it to find a solution to every standard problem they encounter because they're too thick and/or lazy to work it out themselves.

    B) It's allowing organisations producing products and APIs to get away with incomplete or no documentation in the hope an insider will have posted an answer on slackoverflow (which they do, but proper documentation would be a hell of a lot better). Some examples: MacOS Sound API, any Mongo API
