×
AI

AI Can Write Code But Lacks Engineer's Instinct, OpenAI Study Finds 68

Leading AI models can fix broken code, but they're nowhere near ready to replace human software engineers, according to extensive testing [PDF] by OpenAI researchers. The company's latest study put AI models and systems through their paces on real-world programming tasks, with even the most advanced models solving only a quarter of typical engineering challenges.

The research team created a test called SWE-Lancer, drawing from 1,488 actual software fixes made to Expensify's codebase, representing $1 million worth of freelance engineering work. When faced with these everyday programming tasks, the best AI model â" Claude 3.5 Sonnet -- managed to complete just 26.2% of hands-on coding tasks and 44.9% of technical management decisions.

Though the AI systems proved adept at quickly finding relevant code sections, they stumbled when it came to understanding how different parts of software interact. The models often suggested surface-level fixes without grasping the deeper implications of their changes.

The research, to be sure, used a set of complex methodologies to test the AI coding abilities. Instead of relying on simplified programming puzzles, OpenAI's benchmark uses complete software engineering tasks that range from quick $50 bug fixes to complex $32,000 feature implementations. Each solution was verified through rigorous end-to-end testing that simulated real user interactions, the researchers said.
Programming

'New Junior Developers Can't Actually Code' (nmn.gl) 220

Junior software developers' overreliance on AI coding assistants is creating knowledge gaps in fundamental programming concepts, developer Namanyay Goel argued in a post. While tools like GitHub Copilot and Claude enable faster code shipping, developers struggle to explain their code's underlying logic or handle edge cases, Goel wrote. Goel cites the decline of Stack Overflow, a technical forum where programmers historically found detailed explanations from experienced developers, as particularly concerning.
Python

Are Fast Programming Languages Gaining in Popularity? (techrepublic.com) 158

In January the TIOBE Index (estimating programming language popularity) declared Python their language of the year. (Though it was already #1 in their rankings, it had showed a 9.3% increase in their ranking system, notes InfoWorld.) TIOBE CEO Paul Jansen says this reflects how easy Python is to learn, adding that "The demand for new programmers is still very high" (and that "developing applications completely in AI is not possible yet.")

In fact on February's version of the index, the top ten looks mostly static. The only languages dropping appear to be very old languages. Over the last 12 months C and PHP have both fallen on the index — C from the #2 to the #4 spot, and PHP from #10 all the way to #14. (Also dropping is Visual Basic, which fell from #9 to #10.)

But TechRepublican cites another factor that seems to be affecting the rankings: language speed. Fast programming languages are gaining popularity, TIOBE CEO Paul Jansen said in the TIOBE Programming Community Index in February. Fast programming languages he called out include C++ [#2], Go [#8], and Rust [#13 — up from #18 a year ago].

Also, according to the updated TIOBE rankings...

- C++ held onto its place at second from the top of the leaderboard.
- Mojo and Zig are following trajectories likely to bring them into the top 50, and reached #51 and #56 respectively in February.

"Now that the world needs to crunch more and more numbers per second, and hardware is not evolving fast enough, speed of programs is getting important. Having said this, it is not surprising that the fast programming languages are gaining ground in the TIOBE index," Jansen wrote. The need for speed helped Mojo [#51] and Zig [#56] rise...

Rust reached its all-time high in the proprietary points system (1.47%.), and Jansen expects Go to be a common sight in the top 10 going forward.

Oracle

Oracle's Ellison Calls for Governments To Unify Data To Feed AI (msn.com) 105

Oracle co-founder and chairman Larry Ellison said governments should consolidate all national data for consumption by AI models, calling this step the "missing link" for them to take full advantage of the technology. From a report: Fragmented sets of data about a population's health, agriculture, infrastructure, procurement and borders should be unified into a single, secure database that can be accessed by AI models, Ellison said in an on-stage interview with former British Prime Minister Tony Blair at the World Government Summit in Dubai.

Countries with rich population data sets, such as the UK and United Arab Emirates, could cut costs and improve public services, particularly health care, with this approach, Ellison said. Upgrading government digital infrastructure could also help identify wastage and fraud, Ellison said. IT systems used by the US government are so primitive that it makes it difficult to identify "vast amounts of fraud," he added, pointing to efforts by Elon Musk's team at the Department of Government Efficiency to weed it out.

The Internet

Brave Now Lets You Inject Custom JavaScript To Tweak Websites (bleepingcomputer.com) 12

Brave Browser version 1.75 introduces "custom scriptlets," a new feature that allows advanced users to inject their own JavaScript into websites for enhanced customization, privacy, and usability. The feature is similar to the TamperMonkey and GreaseMonkey browser extensions, notes BleepingComputer. From the report: "Starting with desktop version 1.75, advanced Brave users will be able to write and inject their own scriptlets into a page, allowing for better control over their browsing experience," explained Brave in the announcement. Brave says that the feature was initially created to debug the browser's adblock feature but felt it was too valuable not to share with users. Brave's custom scriptlets feature can be used to modify webpages for a wide variety of privacy, security, and usability purposes.

For privacy-related changes, users write scripts that block JavaScript-based trackers, randomize fingerprinting APIs, and substitute Google Analytics scripts with a dummy version. In terms of customization and accessibility, the scriptlets could be used for hiding sidebars, pop-ups, floating ads, or annoying widgets, force dark mode even on sites that don't support it, expand content areas, force infinite scrolling, adjust text colors and font size, and auto-expand hidden content.

For performance and usability, the scriptlets can block video autoplay, lazy-load images, auto-fill forms with predefined data, enable custom keyboard shortcuts, bypass right-click restrictions, and automatically click confirmation dialogs. The possible actions achievable by injected JavaScript snippets are virtually endless. However, caution is advised, as running untrusted custom scriptlets may cause issues or even introduce some risk.

Programming

This Was CS50: Crying Poor, Yale To Stop Offering Harvard's Famed CS50 Course (yaledailynews.com) 50

Slashdot has been covering Harvard's legendary introductory programming course "CS50" since it began setting attendance records in 2014.

But now long-time Slashdot reader theodp brings some news about the course's fate over at Yale. From Yale's student newspaper: After a decade of partnership with Harvard, Yale's CS50 course will no longer be offered starting in fall 2025.... One of Yale's largest computer science courses, jointly taught with Harvard University, was canceled during a monthly faculty meeting after facing budgetary challenges. [Yale's endowment is $40+ billion]... Since Yale started offering the course in 2015, CS50 has consistently seen enrollment numbers in the hundreds and was often the department's largest class.... According to [Yale instructor Ozan] Erat, the original [anonymous] donation that made CS50 possible ended in June 2024, and the cost of employing so many undergraduate learning assistants for the course had become unsustainable.
theodp reminds us that CS50 and its progeny "will continue to live on in all their glory in-person and online at Harvard and edX."
Programming

What Do Linux Kernel Developers Think of Rust? (thenewstack.io) 42

Keynotes at this year's FOSDEM included free AI models and systemd, reports Heise.de — and also a progress report from Miguel Ojeda, supervisor of the Rust integration in the Linux kernel. Only eight people remain in the core team around Rust for Linux... Miguel Ojeda therefore launched a survey among kernel developers, including those outside the Rust community, and presented some of the more important voices in his FOSDEM talk. The overall mood towards Rust remains favorable, especially as Linus Torvalds and Greg Kroah-Hartman are convinced of the necessity of Rust integration. This is less about rapid progress and more about finding new talent for kernel development in the future.
The reaction was mostly positive, judging by Ojeda's slides:

- "2025 will be the year of Rust GPU drivers..." — Daniel Almedia

- "I think the introduction of Rust in the kernel is one of the most exciting development experiments we've seen in a long time." — Andrea Righi

- "[T]he project faces unique challenges. Rust's biggest weakness, as a language, is that relatively few people speak it. Indeed, Rust is not a language for beginners, and systems-level development complicates things even more. That said, the Linux kernel project has historically attracted developers who love challenging software — if there's an open source group willing to put the extra effort for a better OS, it's the kernel devs." — Carlos Bilbao

- "I played a little with [Rust] in user space, and I just absolutely hate the cargo concept... I hate having to pull down other code that I do not trust. At least with shared libraries, I can trust a third party to have done the build and all that... [While Rust should continue to grow in the kernel], if a subset of C becomes as safe as Rust, it may make Rust obsolete..." Steven Rostedt

Rostedt wasn't sure if Rust would attract more kernel contributors, but did venture this opinion. "I feel Rust is more of a language that younger developers want to learn, and C is their dad's language."

But still "contention exists within the kernel development community between those pro-Rust and -C camps," argues The New Stack, citing the latest remarks from kernel maintainer Christoph Hellwig (who had earlier likened the mixing of Rust and C to cancer). Three days later Hellwig reiterated his position again on the Linux kernel mailing list: "Every additional bit that another language creeps in drastically reduces the maintainability of the kernel as an integrated project. The only reason Linux managed to survive so long is by not having internal boundaries, and adding another language completely breaks this. You might not like my answer, but I will do everything I can do to stop this. This is NOT because I hate Rust. While not my favourite language it's definitively one of the best new ones and I encourage people to use it for new projects where it fits. I do not want it anywhere near a huge C code base that I need to maintain."
But the article also notes that Google "has been a staunch supporter of adding Rust to the kernel for Linux running in its Android phones." The use of Rust in the kernel is seen as a way to avoid memory vulnerabilities associated with C and C++ code and to add more stability to the Android OS. "Google's wanting to replace C code with Rust represents a small piece of the kernel but it would have a huge impact since we are talking about billions of phones," Ojeda told me after his talk.

In addition to Google, Rust adoption and enthusiasm for it is increasing as Rust gets more architectural support and as "maintainers become more comfortable with it," Ojeda told me. "Maintainers have already told me that if they could, then they would start writing Rust now," Ojeda said. "If they could drop C, they would do it...."

Amid the controversy, there has been a steady stream of vocal support for Ojeda. Much of his discussion also covered statements given by advocates for Rust in the kernel, ranging from lead developers of the kernel and including Linux creator Linus Torvalds himself to technology leads from Red Hat, Samsung, Google, Microsoft and others.

Programming

C++ on Steroids: Bjarne Stroustrup Presents Guideline-Enforcing 'Profiles' For Resource and Type Safety (acm.org) 71

"It is now 45+ years since C++ was first conceived," writes 74-year-old C++ creator Bjarne Stroustrup in an article this week for Communications of the ACM. But he complains that many developers "use C++ as if it was still the previous millennium," in an article titled 21st Century C++ that promises "the key concepts on which performant, type safe, and flexible C++ software can be built: resource management, life-time management, error-handling, modularity, and generic programming...

"At the end, I present ways to ensure that code is contemporary, rather than relying on outdated, unsafe, and hard-to-maintain techniques: guidelines and profiles." To help developers focus on effective use of contemporary C++ and avoid outdated "dark corners" of the language, sets of guidelines have been developed. Here I focus on the C++ Core guidelines that I consider the most ambitious... My principal aim is a type-safe and resource-safe use of ISO standard C++. That is:

- Every object is exclusively used according to its definition
- No resource is leaked

This encompasses what people refer to as memory safety and much more. It is not a new goal for C++. Obviously, it cannot be achieved for every use of C++, but by now we have years of experience showing that it can be done for modern code, though so far enforcement has been incomplete... When thinking about C++, it is important to remember that C++ is not just a language but part of an ecosystem consisting of implementations, libraries, tools, teaching, and more.

WG21 (and others) are working on "profiles" to enforce guidelines (though they're "not yet available, except for experimental and partial versions"). But Stroustrup writes that the C++ Core Guidelines "use a strategy known as subset-of-superset." First: extend the language with a few library abstractions: use parts of the standard library and add a tiny library to make use of the guidelines convenient and efficient (the Guidelines Support Library, GSL).
Next: subset: ban the use of low-level, inefficient, and error-prone features.

What we get is "C++ on steroids": Something simple, safe, flexible, and fast; rather than an impoverished subset or something relying on massive run-time checking. Nor do we create a language with novel and/or incompatible features. The result is 100% ISO standard C++. Messy, dangerous, low-level features can still be enabled and used when needed.

Stroustrup writes that the C++ Core Guidelines focus on rules "we hope that everyone eventually could benefit from."
  • No uninitialized variables
  • No range or nullptr violations
  • No resource leaks
  • No dangling pointers
  • No type violations
  • No invalidation

Bjarne Stroustrup answered questions from Slashdot readers in 2014...


Chrome

Google's 7-Year Slog To Improve Chrome Extensions Still Hasn't Satisfied Developers (theregister.com) 30

The Register's Thomas Claburn reports: Google's overhaul of Chrome's extension architecture continues to pose problems for developers of ad blockers, content filters, and privacy tools. [...] While Google's desire to improve the security, privacy, and performance of the Chrome extension platform is reasonable, its approach -- which focuses on code and permissions more than human oversight -- remains a work-in-progress that has left extension developers frustrated.

Alexei Miagkov, senior staff technology at the Electronic Frontier Foundation, who oversees the organization's Privacy Badger extension, told The Register, "Making extensions under MV3 is much harder than making extensions under MV2. That's just a fact. They made things harder to build and more confusing." Miagkov said with Privacy Badger the problem has been the slowness with which Google addresses gaps in the MV3 platform. "It feels like MV3 is here and the web extensions team at Google is in no rush to fix the frayed ends, to fix what's missing or what's broken still." According to Google's documentation, "There are currently no open issues considered a critical platform gap," and various issues have been addressed through the addition of new API capabilities.

Miagkov described an unresolved problem that means Privacy Badger is unable to strip Google tracking redirects on Google sites. "We can't do it the correct way because when Google engineers design the [chrome.declarativeNetRequest API], they fail to think of this scenario," he said. "We can do a redirect to get rid of the tracking, but it ends up being a broken redirect for a lot of URLs. Basically, if the URL has any kind of query string parameters -- the question mark and anything beyond that -- we will break the link." Miagkov said a Chrome developer relations engineer had helped identify a workaround, but it's not great. Miagkov thinks these problems are of Google's own making -- the company changed the rules and has been slow to write the new ones. "It was completely predictable because they moved the ability to fix things from extensions to themselves," he said. "And now they need to fix things and they're not doing it."

Java

Oracle Starts Laying Mines In JavaScript Trademark Battle (theregister.com) 36

The Register's Thomas Claburn reports: Oracle this week asked the US Patent and Trademark Office (USPTO) to partially dismiss a challenge to its JavaScript trademark. The move has been criticized as an attempt to either stall or water down legal action against the database goliath over the programming language's name. Deno Land, the outfit behind the Deno JavaScript runtime, filed a petition with the USPTO back in November in an effort to make the trademarked term available to the JavaScript community. This legal effort is led by Node.js creator and Deno Land CEO Ryan Dahl, summarized on the JavaScript.tm website, and supported by more than 16,000 members of the JavaScript community. It aims to remove the fear of an Oracle lawsuit for using the term "JavaScript" in a conference title or business venture.

"Programmers working with JavaScript have formed innumerable community organizations," the website explains. "These organizations, like the standards bodies, have been forced to painstakingly avoid naming the programming language they are built around -- for example, JSConf. Sadly, without risking a legal trademark challenge against Oracle, there can be no 'JavaScript Conference' nor a 'JavaScript Specification.' The world's most popular programming language cannot even have a conference in its name." [...] In the initial trademark complaint, Deno Land makes three arguments to invalidate Oracle's ownership of "JavaScript." The biz claims that JavaScript has become a generic term; that Oracle committed fraud in 2019 when it applied to renew its trademark; and that Oracle has abandoned its trademark because it does not offer JavaScript products or services.

Oracle's motion on Monday focuses on the dismissal of the fraud claim, while arguing that it expects to prevail on the other two claims, citing corporate use of the trademarked term "in connection with a variety of offerings, including its JavaScript Extension Toolkit as well as developer's guides and educational resources, and also that relevant consumers do not perceive JavaScript as a generic term." The fraud claim follows from Deno Land's assertion that the material Oracle submitted in support of its trademark renewal application has nothing to do with any Oracle product. "Oracle, through its attorney, submitted specimens showing screen captures of the Node.js website, a project created by Ryan Dahl, Petitioner's Chief Executive Officer," the trademark cancellation petition says. "Node.js is not affiliated with Oracle, and the use of screen captures of the 'nodejs.org' website as a specimen did not show any use of the mark by Oracle or on behalf of Oracle."

Oracle contends that in fact it submitted two specimens to the USPTO -- a screenshot from the Node.js website and another from its own Oracle JavaScript Extension Toolkit. And this, among other reasons, invalidates the fraud claim, Big Red's attorneys contend. "Where, as here, Registrant 'provided the USPTO with [two specimens]' at least one of which shows use of the mark in commerce, Petitioner cannot plausibly allege that the inclusion of a second, purportedly defective specimen, was material," Oracle's motion argues, adding that no evidence of fraudulent intent has been presented. Beyond asking the court to toss the fraud claim, Oracle has requested an additional thirty days to respond to the other two claims.

The Courts

NetChoice Sues To Block Maryland's Kids Code, Saying It Violates the First Amendment (theverge.com) 27

NetChoice has filed (PDF) its 10th lawsuit challenging state internet regulations, this time opposing Maryland's Age-Appropriate Design Code Act. The Verge's Lauren Feiner reports: NetChoice has become one of the fiercest -- and most successful -- opponents of age verification, moderation, and design code laws, all of which would put new obligations on tech platforms and change how users experience the internet. [...] NetChoice's latest suit opposes the Maryland Age-Appropriate Design Code Act, a rule that echoes a California law of a similar name. In the California litigation, NetChoice notched a partial win in the Ninth Circuit Court of Appeals, which upheld the district court's decision to block a part of the law requiring platforms to file reports about their services' impact on kids. (It sent another part of the law back to the lower court for further review.)

A similar provision in Maryland's law is at the center of NetChoice's complaint. The group says that Maryland's reporting requirement lets regulators subjectively determine the "best interests of children," inviting "discriminatory enforcement." The reporting requirement on tech companies essentially mandates them "to disparage their services and opine on far-ranging and ill-defined harms that could purportedly arise from their services' 'design' and use of information," NetChoice alleges. NetChoice points out that both California and Maryland have passed separate online privacy laws, which NetChoice Litigation Center director Chris Marchese says shows that "lawmakers know how to write laws to protect online privacy when what they want to do is protect online privacy."

Supporters of the Maryland law say legislators learned from California's challenges and "optimized" their law to avoid questions about speech, according to Tech Policy Press. In a blog analyzing Maryland's approach, Future of Privacy Forum points out that the state made some significant changes from California's version -- such as avoiding an "express obligationâ to determine users' ages and defining the "best interests of children." The NetChoice challenge will test how well those changes can hold up to First Amendment scrutiny. NetChoice has consistently maintained that even well-intentioned attempts to protect kids online are likely to backfire. Though the Maryland law does not explicitly require the use of specific age verification tools, Marchese says it essentially leaves tech platforms with a no-win decision: collect more data on users to determine their ages and create varied user experiences or cater to the lowest common denominator and self-censor lawful content that might be considered inappropriate for its youngest users. And similar to its arguments in other cases, Marchese worries that collecting more data to identify users as minors could create a "honey pot" of kids' information, creating a different problem in attempting to solve another.

Programming

Should We Sing the Praises of Agile, or Bury It? (acm.org) 235

"Stakeholders must be included" throughout an agile project "to ensure the evolving deliverables meet their expectations," according to an article this week in Communications of the ACM.

But long-time Slashdot reader theodp complains it's a "gushing how-to-make-Agile-even-better opinion piece." Like other pieces by Agile advocates, it's long on accolades for Agile, but short on hard evidence justifying why exactly Agile project management "has emerged as a critical component for firms looking to improve project delivery speed and flexibility" and the use of Agile approaches is being expanded across other departments beyond software development. Indeed, among the three examples of success offered in the piece to "highlight the effectiveness of agile methods in navigating complex stakeholder dynamics and achieving project success" is Atlassian's use of agile practices to market and develop its products, many of which are coincidentally designed to support Agile practices and teams (including Jira). How meta.

Citing "recent studies," the piece concludes its call for stakeholder engagement by noting that "59% of organizations measure Agile success by customer or user satisfaction." But that is one of those metrics that can create perverse incentives. Empirical studies of user satisfaction and engagement have been published since the 1970's, and sadly one of the cruel lessons learned from them is that the easiest path to having satisfied users is to avoid working on difficult problems. Keep that in mind when you ponder why difficult user stories seem to languish forever in the Kanban and Scrum Board "Ice Box" column, while the "Complete" column is filled with low-hanging fruit. Sometimes success does come easy!

So, are you in the Agile-is-Heaven or Agile-is-Hell camp?

Programming

Slashdot Asks: Do You Remember Your High School's 'Computer Room'? (gatesnotes.com) 192

Bill Gates' blog has been updated with short videos about his upcoming book, including one about how his school ended up with an ASR-33 teletype that could connect their Seattle classroom to a computer in California. "The teachers faded away pretty quickly," Gates adds, "But about six of us stayed hardcore. One was Paul Allen..." — the future co-founder of Microsoft. And the experience clearly meant a lot to Gates. "Microsoft just never would've happened without Paul — and this teletype room."

In a longer post thanking his "brilliant" teachers, Gates calls his teletype experience "an encounter that would shape my entire future" and "opened up a whole new world for me." Gates also thanks World War II Navy pilot and Boeing engineer Bill Dougall, who "was instrumental in bringing computer access to our school, something he and other faculty members pushed for after taking a summer computer class... The fascinating thing about Mr. Dougall was that he didn't actually know much about programming; he exhausted his knowledge within a week. But he had the vision to know it was important and the trust to let us students figure it out."

Gates shared a similar memory about the computer-room's 20-something overseer Fred Wright, who "intuitively understood that the best way to get students to learn was to let us explore on our own terms. There was no sign-up sheet, no locked door, no formal instruction." Instead, Mr. Wright let us figure things out ourselves and trusted that, without his guidance, we'd have to get creative... Some of the other teachers argued for tighter regulations, worried about what we might be doing in there unsupervised. But even though Mr. Wright occasionally popped in to break up a squabble or listen as someone explained their latest program, for the most part he defended our autonomy...

Mr. Wright gave us something invaluable: the space to discover our own potential.

Any Slashdot readers have a similarly impactful experience? Share your own thoughts and memories in the comments.

Do you remember your high school's computer room?
Oracle

Oracle Faces Java Customer Revolt After 'Predatory' Pricing Changes (theregister.com) 136

Nearly 90% of Oracle Java customers are looking to abandon the software maker's products following controversial licensing changes made in 2023, according to research firm Dimensional Research.

The exodus reflects growing frustration with Oracle's shift to per-employee pricing for its Java platform, which critics called "predatory" and could increase costs up to five times for the same software, Gartner found. The dissatisfaction runs deepest in Europe, where 92% of French and 95% of German users want to switch to alternative providers like Bellsoft Liberica, IBM Semeru, or Azul Platform Core.
The Almighty Buck

UK Council Sells Assets To Fund Ballooning $50 Million Oracle Project (theregister.com) 83

West Sussex County Council is using up to $31 million from the sale of capital assets to fund an Oracle-based transformation project, originally budgeted at $3.2 million but now expected to cost nearly $50 million due to delays and cost overruns. The project, intended to replace a 20-year-old SAP system with a SaaS-based HR and finance system, has faced multiple setbacks, renegotiated contracts, and a new systems integrator, with completion now pushed to December 2025. The Register reports: West Sussex County Council is taking advantage of the so-called "flexible use of capital receipts scheme" introduced in 2016 by the UK government to allow councils to use money from the sale of assets such as land, offices, and housing to fund projects that result in ongoing revenue savings. An example of the asset disposals that might contribute to the project -- set to see the council move off a 20-year-old SAP system -- comes from the sale of a former fire station in Horley, advertised for $3.1 million.

Meanwhile, the delays to the project, which began in November 2019, forced the council to renegotiate its terms with Oracle, at a cost of $3 million. The council had expected the new SaaS-based HR and finance system to go live in 2021, and signed a five-year license agreement until June 2025. The plans to go live were put back to 2023, and in the spring of 2024 delayed again until December 2025. According to council documents published this week [PDF], it has "approved the variation of the contract with Oracle Corporation UK Limited" to cover the period from June 2025 to June 2028 and an option to extend again to the period June 2028 to 2030. "The total value of the proposed variation is $2.96 million if the full term of the extension periods are taken," the council said.

Slashdot Top Deals