Programming

C++ 23 Language Standard Declared Feature-Complete (infoworld.com) 61

An anonymous reader shares this report from InfoWorld: C++ 23, a planned upgrade to the popular programming language, is now feature-complete, with capabilities such as standard library module support. On the horizon is a subsequent release, dubbed C++ 26.

The ISO C++ Committee in early February completed technical work on the C++ 23 specification and is producing a final document for a draft approval ballot, said Herb Sutter, chair of the committee, in a blog post on February 13. The standard library module is expected to improve compilation.

Other features slated for C++ 23 include simplifying implicit move, fixing temporaries in range-for loops, multidimensional and static operator[], and Unicode improvements. Also featured is static constexpr in constexpr functions. The full list of features can be found at cppreference.com.

Many features of C++ 23 already have been implemented in major compilers and libraries, Sutter said. A planned C++ 26 release of the language, meanwhile, is slated to emphasize concurrency and parallelism.

Programming

Rust Project Reveals New 'Constitution' in Wake of Crisis (thenewstack.io) 81

"The Rust open source project, which handles standards for the language, released a new governance plan Thursday," reports The New Stack, "the cumulation of six months of intense work." Released as a request for comment on GitHub, it will now undergo a comment period. It requires ratification by team leaders before it's accepted.

The Rust project interacts with, but is separate from, the Rust Foundation, which primarily handles the financial assets of Rust. Two years ago, the project had a very public blowup after its entire mod team resigned and publicly posted a scathing account of the core team, which the mod team called "unaccountable to anyone but themselves." It even suggested the core team was not to be trusted, although the team later recanted and apologized for that.

[Rust core team developer] Josh Triplett understandably didn't want to dwell on the kerfuffle that led to this action. He focused instead on the underlying structural issues that led to the leadership crisis. "As a result of that, there was widespread agreement within the project that we needed to create a better formal governance structure that removed some of those ambiguities and conflicts, and had mechanisms for dealing with this without ever having a similar crisis," Triplett told The New Stack. "We don't want to ever have things get to that point again...."

The original Rust project governance structure evolved out of Mozilla, where Rust began and was nurtured for years. Around 2016 or 2017, a request for comment came out that established the Rust project's governance, Triplett said. It created approximately six teams, including the core, language, mod, library and cargo teams. Among the problems with the old model was that the core team became responsible for not just overseeing problems that arose, but solving them as well, Triplett said. That led to burnout and problems, said JT Turner, one of the co-authors on the new model and a member of the Rust core team.... Ultimately, the old governance model was "not a very precise document," Triplett added.

"It was just, 'Hey, here's the rough divisions of power,' and because that document was very rough and informal, it didn't scale to today," he said. "That's one of the things that led to the governance crisis."

AI

OpenAI Will Let Developers Build ChatGPT Into Their Apps (engadget.com) 9

OpenAI, the company behind ChatGPT and DALL-E 2, is launching developer APIs for the AI chatbot and the Whisper speech-transcription model. It also changed its terms of service to let developers opt out of using their data for improvements while adding a 30-day data retention policy. Engadget reports: The new ChatGPT API will use the same AI model ("gpt-3.5-turbo") as the popular chatbot, allowing developers to add either unchanged or flavored versions of ChatGPT to their apps. Snap's My AI is an early example, along with a new virtual tutor feature for the online study tool Quizlet and an upcoming Ask Instacart tool in the popular local-shopping app. However, the API won't be limited to brand-specific bots mimicking ChatGPT; it can also power "non-chat" software experiences that could benefit from AI brains. The ChatGPT API is priced at $0.002 per 1,000 tokens (about 750 words). Additionally, it's offering a dedicated-capacity option for deep-pocketed developers who expect to use more tokens than the standard API allows. The new developer options join the consumer-facing ChatGPT Plus, a $20-per-month service launched in February.
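
For readers curious what an integration looks like in practice, here is a minimal sketch of a ChatGPT API call using the openai Python package's interface from around the time of the launch (the ChatCompletion endpoint in the v0.27-era library). The API key, prompts, and cost arithmetic are placeholders for illustration, not code from any of the integrations mentioned above.

    import openai

    openai.api_key = "sk-..."  # placeholder key

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # the same model that powers the ChatGPT product
        messages=[
            {"role": "system", "content": "You are a helpful study tutor."},
            {"role": "user", "content": "Explain recursion in two sentences."},
        ],
    )

    print(response["choices"][0]["message"]["content"])

    # $0.002 per 1,000 tokens (roughly 750 words) at the announced list price
    tokens = response["usage"]["total_tokens"]
    print(f"approximate cost: ${tokens / 1000 * 0.002:.6f}")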

Meanwhile, OpenAI's Whisper API is a hosted version of the open-source Whisper speech-to-text model it launched in September. "We released a model, but that actually was not enough to cause the whole developer ecosystem to build around it," OpenAI president and co-founder Greg Brockman told TechCrunch on Tuesday. "The Whisper API is the same large model that you can get open source, but we've optimized to the extreme. It's much, much faster and extremely convenient." The transcription API will cost developers $0.006 per minute, enabling "robust" transcription in multiple languages and providing translation to English.
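
A similarly hedged sketch of the hosted Whisper endpoints, again using the same era's openai Python package; the audio filename is a placeholder, and the pricing noted in the comment simply restates the quoted per-minute rate.

    import openai

    openai.api_key = "sk-..."  # placeholder key

    # Transcription in the source language ($0.006 per minute of audio)
    with open("interview.mp3", "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)
    print(transcript["text"])

    # Translation of the same audio into English
    with open("interview.mp3", "rb") as audio_file:
        translation = openai.Audio.translate("whisper-1", audio_file)
    print(translation["text"])
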
Further reading: OpenAI Is Now Everything It Promised Not To Be: Corporate, Closed-Source, and For-Profit (Motherboard)
Communications

Mobile Giants Announce United Interface to Lure Cloud Developers (bloomberg.com) 15

An industry group representing the world's biggest mobile phone operators announced a new united interface that will give developers universal access to all of their networks, speeding up the delivery of new services and products. From a report: The GSMA will introduce the portal, called Open Gateway, at its annual Mobile World Congress in Barcelona on Monday, its Director General Mats Granryd said in an interview. AT&T, China Mobile, Deutsche Telekom and Vodafone Group are among the 21 GSMA members that will use the interface. "We have the phenomenal reach down to the base station and out into your pocket," Granryd said. "And that's what we're trying to make available for the developer community to ultimately benefit you as a consumer or you as a business."
AI

Survey Claims Some Companies are Already Replacing Workers With ChatGPT (yahoo.com) 142

An anonymous reader quotes an article from Fortune: Earlier this month, job advice platform Resumebuilder.com surveyed 1,000 business leaders who either use or plan to use ChatGPT. It found that nearly half of their companies have implemented the chatbot. And roughly half of this cohort say ChatGPT has already replaced workers at their companies....

Business leaders whose companies already use ChatGPT told ResumeBuilder.com they deploy it for a variety of tasks: 66% use it for writing code, 58% for copywriting and content creation, 57% for customer support, and 52% for meeting summaries and other documents. In the hiring process, 77% of companies using ChatGPT say they use it to help write job descriptions, 66% to draft interview requisitions, and 65% to respond to applications.

"Overall, most business leaders are impressed by ChatGPT's work," ResumeBuilder.com wrote in a news release. "Fifty-five percent say the quality of work produced by ChatGPT is 'excellent,' while 34% say it's 'very good....'" Nearly all of the companies using ChatGPT said they've saved money using the tool, with 48% saying they've saved more than $50,000 and 11% saying they've saved more than $100,000....

Of the companies ResumeBuilder.com identified as businesses using the chatbot, 93% say they plan to expand their use of ChatGPT, and 90% of executives say ChatGPT experience is beneficial for job seekers — if it hasn't already replaced their jobs.

Programming

Ask Slashdot: What's the Best Podcast About Computer Science? 37

Long-time Slashdot reader destinyland writes: They say "always be learning" — but do podcasts actually help? I've been trying to find podcasts that discuss programming, and I've enjoyed Lex Fridman's interviews with language creators like Guido van Rossum, Chris Lattner, and Brendan Eich (plus his long interviews with Donald Knuth). Then I discovered that GitHub, Red Hat, Stack Overflow, and the Linux Foundation all have their own podcast.

There's a developer podcast called "Corecursive" that I like with the tagline "the stories behind the code," plus a whole slew of (sometimes language-specific) podcasts at Changelog (including an interview with Brian Kernighan). And it seems like there's an entirely different universe of content on YouTube — like the retired Microsoft engineer doing "Dave's Garage," Software Engineering Daily, and the various documentaries by Honeypot.io. Computerphile has also scored various interviews with Brian Kernighan, and if you search YouTube enough you'll find stray interviews with Steve Wozniak.

But I wanted to ask Slashdot's readers: Do you listen to podcasts about computer science? And if so, which ones? (Because I'm always stumbling across new programming podcasts, which makes me worry about what else I've been missing out on.) Maybe I should also ask if you ever watch coding livestreams on Twitch — although that gets into the more general question of just how much content we consume that's related to our profession.

Fascinating discussions, or continuing work-related education? (And do podcasts really help keep your skills fresh? Are coding livestreams on Twitch just a waste of time?) Most importantly, does anyone have a favorite geek podcast that they're listening to? Share your own experience and opinions in the comments...

Microsoft

Microsoft .NET 8 Will Bolster Linux Support (infoworld.com) 51

An anonymous reader shared this report from InfoWorld: .NET 8, the next planned version of Microsoft's open source software development platform, is set to emphasize Linux accommodations as well as cloud development and containers.

A first preview of .NET 8 is available for download at dotnet.microsoft.com for Windows, Linux, and macOS, Microsoft said on February 21. A long-term support (LTS) release that will be supported for three years, .NET 8 is due for production availability in November, a year after the release of predecessor .NET 7.

The new .NET release will be buildable on Linux directly from the dotnet/dotnet repository, using dotnet/source-build to build .NET runtimes, tools, and SDKs. This is the same build used by Red Hat and Canonical to build .NET. Over time, this capability will be extended to support Windows and macOS. Previously, .NET could be built from source, but a "source tarball" was required from the dotnet/installer repository.

"We are publishing Ubuntu Chiseled images with .NET 8," adds Microsoft's announcement.

And when it comes to the .NET Monitor tool, "We plan to ship dotnet/monitor images exclusively as Ubuntu Chiseled, starting with .NET 8. That's notable because the monitor images are the one production app image we publish."
Programming

GCC Gets a New Frontend for Rust (fosdem.org) 106

Slashdot reader sleeping cat shares a recent FOSDEM talk by a compiler engineer on the team building Rust-GCC, "an alternative compiler implementation for the Rust programming language."

"If gccrs interprets a program differently from rustc, this is considered a bug," explains the project's FAQ on GitHub.

The FAQ also notes that LLVM's set of compiler technologies — which Rust uses — "is missing some backends that GCC supports, so a gccrs implementation can fill in the gaps for use in embedded development." But the FAQ also highlights another potential benefit: With the recent announcement of Rust being allowed into the Linux kernel codebase, an interesting security implication has been highlighted by Open Source Security, Inc. When code is compiled with Link Time Optimization (LTO), GCC emits GIMPLE [an intermediate representation] directly into a section of each object file, and LLVM does something similar with its own bytecode. When rustc-compiled code and GCC-built code are mixed in the Linux kernel, the compilers are unable to perform a full link-time optimization pass over all of the compiled code, leaving CFI (control flow integrity) absent.

If Rust is available in the GNU toolchain, releases of the Linux kernel (for example) can be built with CFI using either LLVM or GCC.

Started in 2014 (and revived in 2019), "The effort has been ongoing since 2020...and we've done a lot of effort and a lot of progress," compiler engineer Arthur Cohen says in the talk. "We have upstreamed the first version of gccrs within GCC. So next time when you install GCC 13 — you'll have gccrs in it. You can use it, you can start hacking on it, you can please report issues when it inevitably crashes and dies horribly."

"One big thing we're doing is some work towards running the rustc test suite. Because we want gccrs to be an actual Rust compiler and not a toy project or something that compiles a language that looks like Rust but isn't Rust, we're trying really hard to get that test suite working."

Read on for some notes from the talk...
Programming

Coinbase Launches Blockchain Base To Help Developers Build dApps On-chain (techcrunch.com) 32

Coinbase, the second largest crypto exchange by trading volume, has launched Base, an Ethereum-focused layer-2 (L2) blockchain, said Jesse Pollak, lead for Base and head of protocols at Coinbase. From a report: In the past, Coinbase has homed in on the trading and exchange side of its business, but from the utility perspective, it's still too hard for developers to build useful decentralized applications (dApps) and for users to actually use those things on-chain, Pollak said. In an effort to expand further into the developer space, Coinbase is building Base to make it "dead easy" for developers to build dApps and for users to access those dApps through Coinbase products, Pollak said. "Our goal is to bring about phase 4 of Coinbase's master plan: to bring a billion users into the crypto economy."

The L2 is a "secure, low-cost, developer-friendly" chain that aims to help builders create dApps on-chain, the company stated. Base is built on the MIT-licensed OP Stack in collaboration with the layer-2 blockchain Optimism, which is also focused on the Ethereum chain. A number of crypto businesses, platforms, marketplaces and infrastructure firms have committed to building on Base, a Coinbase spokesperson told TechCrunch. Those that plan to be involved include Blockdaemon, Chainlink, Etherscan, Quicknode, Aave, Animoca Brands, Dune, Nansen, Magic Eden, Pyth, Rainbow Wallet, Ribbon Finance, The Graph, Wormhole and Gelato, to name a handful.

Programming

Whatever Happened to the Ruby Programming Language? (infoworld.com) 148

Three years after Rails was introduced in 2005, InfoWorld asked whether it might be the successor to Java.

That didn't happen. So this week InfoWorld "spoke to current and former Ruby programmers to try to trace the language's rise and fall." Some responses: "Rails came along at the cusp of a period of transformation and growth for the web," says Matthew Boeh, a Ruby developer since 2006. "It both benefited from and fueled that growth, but it was a foregone conclusion that it wasn't going to be the only success story." Boeh recently took a job as a senior staff software engineer at Lattice, a TypeScript shop. "You could say that Ruby has been a victim of its own success, in that its community was a major driving force in the command-line renaissance of recent years," he says. "In the early '00s it was introducing REPL-driven development to people who had never heard of Lisp, package management to people who would have been scared off by Perl's CPAN, test-driven development to people outside the highly corporate Java world, and so on. This is all stuff that is considered table stakes today. Ruby didn't originate any of it, but it was all popularized and made accessible by Rubyists...."

"The JavaScript ecosystem in its current form would have been unimaginable in 2004 — it needed both the command line renaissance and the takeoff of the web platform," adds Lattice's Boeh. "Did you know it took a full decade, 1999 to 2009, to release a single new version of the JavaScript standard? We get one yearly now. Rails became a big deal in the very last time period where it was possible to be a full-stack developer without knowing JavaScript...."

[W]hen it comes to data science, Python has a leg up because of the ready availability of libraries like TensorFlow and Keras. "These frameworks make it easy for coders to build data visualizations and write programs for machine learning," says Pulkit Bhardwaj, e-commerce coach at BoutiqueSetup.net. JavaScript, meanwhile, has spawned seemingly endless libraries that developers can easily download and adapt for just about any purpose. "As a technologist, you can go on your own hero's journey following whatever niche thing you think is the right way to go," says Trowbridge. But when it comes to JavaScript, "these libraries are excellent. Why ignore all of that?"

Many of those libraries were developed by community members, which inspired others to contribute in a snowball effect familiar to anyone involved in open source. But one big player has had an outsized influence here. Python's TensorFlow, which Bhardwaj called a "game-changer," was released by Google, which has followed academia's lead and made Python its internal scripting language. Google, as the maker of the dominant web browser, also has an obvious interest in boosting JavaScript, and Trowbridge gives Google much of the credit for making JavaScript much faster and more memory efficient than it once was: "In some ways it feels almost like a low level language," he says. Meanwhile, Ruby is widely acknowledged to be lagging in performance, in part because it lacks the same sort of corporate sponsor with resources for improving it.

AI

CBS Explores Whether AI Will Eliminate Jobs -- Especially For Coders (cbsnews.com) 159

"All right, we're going to begin this hour with a question on many people's minds these days, amid all these major developments in the field of artificial intelligence. And that question is this: How long until the machines replace us, take our jobs?"

That's the beginning of a segment broadcast on CBS's morning-television news show (with the headline, "Will artificial intelligence erase jobs?") Some excerpts:


"As artificial intelligence gets better.... job security is only supposed to get worse. And in reports like this one, of the top jobs our AI overlords plan to kill, coding or computing programming is often on the list. So with the indulgence of Sam Zonka, a coder and instructor at the General Assembly coding school in New York, I decided to test the idea of an imminent AI takeover -- by seeing if the software could code for someone who knows as little about computers as me -- eliminating the need to hire someone like him."

Gayle King: "So all this gobbledy-gook on the screen. That's what people who sit in these classrooms learn?"

"And I for one was prepared to be amazed. But take a look at the results. About as basic as a basic web site can be."

King: What do you think? You're the professional.
Zonka: Ehh.

[Microsoft CEO Satya Nadella also spoke to CBS right before the launch of its OpenAI-powered Bing search engine, arguing that AI will create more satisfaction in current jobs as well as more net new jobs -- and even help the economy across the board. "My biggest worry," Nadella says, "is we need some new technology that starts driving real productivity. It's time for some real innovation."]

King: Do you think it'll drive up wages?
Nadella: I do believe it will drive up wages, because productivity and wages are related.


At the end of the report, King tells her co-anchors, "In the long term, the research suggests Nadella is correct. In the long term, more jobs, more money. It's in the short term that all the pain happens."

The report also features an interview with MIT economist David Autor, saying he believes the rise of AI "does indeed mean millions of jobs are going to change in our lifetime. And what's scary is we're just not sure how.... He points out, for example, that more than 60% of the types of jobs people are doing today didn't even exist in the 1940s -- while many of the jobs that did exist have been replaced."

There was also a quote from Meredith Whittaker (co-founder of the AI Now Institute and former FTC advisor), who notes that AI systems "don't replace human labor. They just require different forms of labor to sort of babysit them to train them, to make sure they're working well. Whose work will be degraded and whose house in the Hamptons will get another wing? I think that's the fundamental question when we look at these technologies and ask questions about work."

Later King tells her co-anchors that Whittaker's suggestion was for workers to organize to try to shape how AI systems are implemented in their workplace.

But at an open house for the General Assembly code camp, coder Zonka says on a scale of 1 to 10, his worry about AI was only a 2. "The problem is that I'm not entirely sure if the AI that would replace me is 10 years from now, 20 years from now, or 5 years from now."

So after speaking to all the experts, King synthesized what she'd learned. "Don't necessarily panic. You see these lists of all the jobs that are going to be eliminated. We're not very good at making those predictions. Things happen in different ways than we expect. And you could actually find an opportunity to make more money, if you figure out how you can complement the machine as opposed to getting replaced by the machine."
Programming

How Rust Went From a Side Project To the World's Most-Loved Programming Language (technologyreview.com) 118

An anonymous reader quotes a report from MIT Technology Review: Many software projects emerge because -- somewhere out there -- a programmer had a personal problem to solve. That's more or less what happened to Graydon Hoare. In 2006, Hoare was a 29-year-old computer programmer working for Mozilla, the open-source browser company. Returning home to his apartment in Vancouver, he found that the elevator was out of order; its software had crashed. This wasn't the first time it had happened, either. Hoare lived on the 21st floor, and as he climbed the stairs, he got annoyed. "It's ridiculous," he thought, "that we computer people couldn't even make an elevator that works without crashing!" Many such crashes, Hoare knew, are due to problems with how a program uses memory. The software inside devices like elevators is often written in languages like C++ or C, which are famous for allowing programmers to write code that runs very quickly and is quite compact. The problem is those languages also make it easy to accidentally introduce memory bugs -- errors that will cause a crash. Microsoft estimates that 70% of the vulnerabilities in its code are due to memory errors from code written in these languages.

Most of us, if we found ourselves trudging up 21 flights of stairs, would just get pissed off and leave it there. But Hoare decided to do something about it. He opened his laptop and began designing a new computer language, one that he hoped would make it possible to write small, fast code without memory bugs. He named it Rust, after a group of remarkably hardy fungi that are, he says, "over-engineered for survival." Seventeen years later, Rust has become one of the hottest new languages on the planet -- maybe the hottest. There are 2.8 million coders writing in Rust, and companies from Microsoft to Amazon regard it as key to their future. The chat platform Discord used Rust to speed up its system, Dropbox uses it to sync files to your computer, and Cloudflare uses it to process more than 20% of all internet traffic.

When the coder discussion board Stack Overflow conducts its annual poll of developers around the world, Rust has been rated the most "loved" programming language for seven years running. Even the US government is avidly promoting software in Rust as a way to make its processes more secure. The language has become, like many successful open-source projects, a barn-raising: there are now hundreds of die-hard contributors, many of them volunteers. Hoare himself stepped aside from the project in 2013, happy to turn it over to those other engineers, including a core team at Mozilla. It isn't unusual for someone to make a new computer language. Plenty of coders create little ones as side projects all the time. But it's meteor-strike rare for one to take hold and become part of the pantheon of well-known languages alongside, say, JavaScript or Python or Java. How did Rust do it?

Programming

Can C++ Be Safer? Bjarne Stroustrup On Ensuring Memory Safety (thenewstack.io) 110

C++ creator Bjarne Stroustrup "joins calls for changing the programming language itself to address security concerns," according to an article shared by Slashdot user guest reader: In mid-January, the official C++ "direction group" -- which makes recommendations for the programming language's evolution -- issued a statement addressing concerns about C++ safety. While many languages now support "basic type safety" -- that is, ensuring that variables access only sections of memory that are clearly defined by their data types -- C++ has struggled to offer similar guarantees.

This new statement, co-authored by C++ creator Bjarne Stroustrup, now appears to call for changing the C++ programming language itself to address safety concerns. "We now support the idea that the changes for safety need to be not just in tooling, but visible in the language/compiler, and library." The group still also supports its long-preferred use of debugging tools to ensure safety (and "pushing tooling to enable more global analysis in identifying hard for humans to identify safety concerns"). But that January statement emphasizes its recommendation for changes within C++.

Specifically, it proposes "packaging several features into profiles" (with profiles defined later as "a collection of restrictions and requirements that defines a property to be enforced" by, for example, triggering an automatic analysis.) In this way the new changes for safety "should be visible such that the Safe code section can be named (possibly using profiles), and can mix with normal code." And this new approach would ultimately bring not just safety but also flexibility, with profiles specifically designed to support embedded computing, performance-sensitive applications, or highly specific problem domains, like automotive, aerospace, avionics, nuclear, or medical applications.

"For example, we might even have safety profiles for safe-embedded, safe-automotive, safe-medical, performance-games, performance-HPC, and EU-government-regulation," the group suggests. Elsewhere in the document they put it more succinctly. "To support more than one notion of 'safety', we need to be able to name them."

Stroustrup emphasized his faith in C++ in a 2020 interview. "I think C++ can do anything Rust can do, and I would like it to be much simpler to use," Stroustrup told the Association for Computing Machinery's Special Interest Group on Programming Languages.

But even then, he'd said that basic type safety was one of his earliest design goals -- and one he's spent decades trying to achieve. "I get a little bit sad when I hear people talk about C++ as if they were back in the 1980s, the 1990s, which a lot of people do. They looked at it back in the dark ages, and they haven't looked since."
Programming

A Developer is Reimplementing GNU's Core Utilities in Rust (phoronix.com) 186

A Rust-based re-implementation of GNU core utilities like cp and mv is "reaching closer to parity with the widely-used GNU upstream and becoming capable of taking on more real-world uses," reports Phoronix: Debian developer Sylvestre Ledru [also an engineering director at Mozilla] began working on uutils during the COVID-19 pandemic and presented last week at FOSDEM 2023 on his Coreutils replacement effort. With uutils growing into increasingly good shape, it's been packaged up by many Linux distributions and is also used now by "a famous social network via the Yocto project...."

The goals with uutils are to create a drop-in replacement for GNU Coreutils, to strive for good cross-platform support, and to make testing easy. Ledru's initial goals were about being able to boot Debian, running the most popular packages, building key open-source software, and all-around it's been panning out to be a great success.... [M]ore performance optimizations are to come, along with other work for compatibility with the GNU tools and implementing some still-missing options in different programs.

Programming

Google's Go May Add Telemetry That's On By Default (theregister.com) 75

Russ Cox, a Google software engineer steering the development of the open source Go programming language, has presented a possible plan to implement telemetry in the Go toolchain. However, many in the Go community object because the plan calls for telemetry that is on by default. The Register reports: These alarmed developers would prefer an opt-in rather than an opt-out regime, a position the Go team rejects because it would ensure low adoption and would reduce the amount of telemetry data received to the point it would be of little value. Cox's proposal summarized lengthier documentation in three blog posts.

Telemetry, as Cox describes it, involves Go toolchain software sending data to a server to provide information about which functions are being used and how the software is performing. He argues it is beneficial for open source projects to have that information to guide development. And the absence of telemetry data, he contends, makes it more difficult for project maintainers to understand what's important, what's working, and to prioritize changes, thereby making maintainer burnout more likely. But such is Google's reputation these days that many considering the proposal have doubts, despite the fact that the data collection contemplated involves measuring the usage of language features and language performance. The proposal isn't about the sort of sensitive personal data vacuumed up by Google's ad-focused groups.
"Now you guys want to introduce telemetry into your programming language?" IT consultant Jacob Weisz said. "This is how you drive off any person who even considered giving your project a chance despite the warning signs. Please don't do this, and please issue a public apology for even proposing it. Please leave a blast radius around this idea wide enough that nobody even suggests trying to do this again."

He added: "Trust in Google's behavior is at an all time low, and moves like this are a choice to shove what's left of it off the edge of a cliff."

Meanwhile, former Google cryptographer and current open source maintainer Filippo Valsorda wrote in a post to Mastodon: "This is a large unconventional design, there are a lot of tradeoffs worth discussing and details to explore. When Russ showed it to me I made at least a dozen suggestions and many got implemented."

"Instead: all opt-out telemetry is unethical; Google is evil; this is not needed. No one even argued why publishing any of this data could be a problem."
Programming

GitHub Claims Source Code Search Engine Is a Game Changer (theregister.com) 39

Thomas Claburn writes via The Register: GitHub has a lot of code to search -- more than 200 million repositories -- and says last November's beta version of a search engine optimized for source code has caused a "flurry of innovation." GitHub engineer Timothy Clem explained that the company has had problems getting existing technology to work well. "The truth is from Solr to Elasticsearch, we haven't had a lot of luck using general text search products to power code search," he said in a GitHub Universe video presentation. "The user experience is poor. It's very, very expensive to host and it's slow to index." In a blog post on Monday, Clem delved into the technology used to scour just a quarter of those repos, a code search engine built in Rust called Blackbird.

Blackbird currently provides access to almost 45 million GitHub repositories, which together amount to 115TB of code and 15.5 billion documents. Sifting through that many lines of code requires something stronger than grep, a common command line tool on Unix-like systems for searching through text data. Using ripgrep on an 8-core Intel CPU to run an exhaustive regular expression query on a 13GB file in memory, Clem explained, takes about 2.769 seconds, or 0.6GB/sec/core. [...] At 0.01 queries per second, grep was not an option. So GitHub front-loaded much of the work into precomputed search indices. These are essentially maps of key-value pairs. This approach makes it less computationally demanding to search for document characteristics like the programming language or word sequences by using a numeric key rather than a text string. Even so, these indices are too large to fit in memory, so GitHub built iterators for each index it needed to access. According to Clem, these lazily return sorted document IDs that represent the rank of the associated document and meet the query criteria.
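
Blackbird itself is written in Rust and GitHub hasn't published its internals, but the idea Clem describes (precomputed key-value indices whose iterators lazily yield sorted document IDs that can then be intersected per query) can be sketched in a few lines of toy Python:

    from collections import defaultdict

    def build_index(docs):
        """docs: {doc_id: text}. Returns a map from token to a sorted list of doc IDs."""
        index = defaultdict(set)
        for doc_id, text in docs.items():
            for token in text.split():
                index[token].add(doc_id)
        return {token: sorted(ids) for token, ids in index.items()}

    def doc_ids(index, token):
        """Lazily yield the sorted doc IDs recorded for one token."""
        yield from index.get(token, [])

    def intersect(*streams):
        """Yield doc IDs present in every sorted stream (an AND over query terms)."""
        iters = [iter(s) for s in streams]
        try:
            heads = [next(it) for it in iters]
        except StopIteration:
            return
        while True:
            highest = max(heads)
            if all(h == highest for h in heads):
                yield highest
                try:
                    heads = [next(it) for it in iters]
                except StopIteration:
                    return
            else:
                for i, it in enumerate(iters):
                    try:
                        while heads[i] < highest:
                            heads[i] = next(it)
                    except StopIteration:
                        return

    docs = {1: "fn main", 2: "async fn main", 3: "struct Config"}
    index = build_index(docs)
    print(list(intersect(doc_ids(index, "fn"), doc_ids(index, "main"))))  # [1, 2]

A production code-search index keys on things like trigrams, languages, and repository attributes rather than whitespace tokens, but the shape of the query plan (intersecting lazy, sorted streams of document IDs) is the same.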

To keep the search index manageable, GitHub relies on sharding -- breaking the data up into multiple pieces using Git's content addressable hashing scheme and on delta encoding -- storing data differences (deltas) to reduce the data and metadata to be crawled. This works well because GitHub has a lot of redundant data (e.g. forks) -- its 115TB of data can be boiled down to 25TB through deduplication data-shaving techniques. The resulting system works much faster than grep -- 640 queries per second compared to 0.01 queries per second. And indexing occurs at a rate of about 120,000 documents per second, so processing 15.5 billion documents takes about 36 hours, or 18 for re-indexing since delta (change) indexing reduces the number of documents to be crawled.
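
The figures quoted above are easy to sanity-check with a little arithmetic; the snippet below only restates the numbers from the excerpt.

    # Back-of-the-envelope checks on the quoted figures.
    print(13 / 2.769)               # ~4.7 GB/s across 8 cores for ripgrep
    print(13 / 2.769 / 8)           # ~0.59 GB/s/core, i.e. the quoted 0.6GB/sec/core
    print(640 / 0.01)               # Blackbird handles ~64,000x more queries/sec than grep
    print(15.5e9 / 120_000 / 3600)  # ~35.9 hours to index 15.5 billion documents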

AI

Developers Created AI To Generate Police Sketches. Experts Are Horrified 115

An anonymous reader quotes a report from Motherboard: Two developers have used OpenAI's DALL-E 2 image generation model to create a forensic sketch program that can create "hyper-realistic" police sketches of a suspect based on user inputs. The program, called Forensic Sketch AI-rtist, was created by developers Artur Fortunato and Filipe Reynaud as part of a hackathon in December 2022. The developers wrote that the program's purpose is to cut down the time it usually takes to draw a sketch of a suspect, which is "around two to three hours," according to a presentation uploaded to the internet. "We haven't released the product yet, so we don't have any active users at the moment," Fortunato and Reynaud told Motherboard in a joint email. "At this stage, we are still trying to validate if this project would be viable to use in a real world scenario or not. For this, we're planning on reaching out to police departments in order to have input data that we can test this on."

AI ethicists and researchers told Motherboard that the use of generative AI in police forensics is incredibly dangerous, with the potential to worsen existing racial and gender biases that appear in initial witness descriptions. "The problem with traditional forensic sketches is not that they take time to produce (which seems to be the only problem that this AI forensic sketch program is trying to solve). The problem is that any forensic sketch is already subject to human biases and the frailty of human memory," Jennifer Lynch, the Surveillance Litigation Director of the Electronic Frontier Foundation, told Motherboard. "AI can't fix those human problems, and this particular program will likely make them worse through its very design."

The program asks users to provide information either through a template that asks for gender, skin color, eyebrows, nose, beard, age, hair, eyes, and jaw descriptions or through the open description feature, in which users can type any description they have of the suspect. Then, users can click "generate profile," which sends the descriptions to DALL-E 2 and produces an AI-generated portrait. "Research has shown that humans remember faces holistically, not feature-by-feature. A sketch process that relies on individual feature descriptions like this AI program can result in a face that's strikingly different from the perpetrator's," Lynch said. "Unfortunately, once the witness sees the composite, that image may replace in their minds, their hazy memory of the actual suspect. This is only exacerbated by an AI-generated image that looks more 'real' than a hand-drawn sketch."
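
The developers' code hasn't been released, but the description-to-portrait flow described above maps onto a single call to OpenAI's image endpoint. The sketch below uses the generic DALL-E 2 API of the era's openai Python package with a made-up description; it illustrates the mechanism, not the Forensic Sketch AI-rtist program itself.

    import openai

    openai.api_key = "sk-..."  # placeholder key

    # Hypothetical free-text description standing in for witness input
    description = "portrait photo, man in his 30s, short dark hair, square jaw"

    result = openai.Image.create(prompt=description, n=1, size="512x512")
    print(result["data"][0]["url"])  # URL of the generated image
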
Oracle

Oracle Criticized Over Price Change for New Oracle Java SE Licenses (crn.com) 104

While Oracle's existing Java corporate licensing agreements are still in effect, "the Named User Plus Licensing (user licenses) and Processor licenses (server licensing) are no longer available for purchase," reports IT World Canada. And that's where it gets interesting: The new pricing model is based on employee count, with price tiers keyed to total headcount. The implication is that everyone in the organization is counted for licensing purposes, even if they don't use Java software.

As a result, companies that use Java SE may face significant price increases. The change will primarily affect large companies with many employees, but it will also have a significant impact on medium-sized businesses. Although Oracle promises to allow legacy users to renew under their current terms and conditions, sources say the company will likely pressure users to adopt the new model over time.

The move is "likely to rile customers that have a fraction of employees who work with Java," Oracle partners told CRN, though "the added complexity is an opportunity for partners to help customers right-size their spending." Jeff Stonacek, principal architect at House of Brick Technologies, an Omaha, Neb.-based company that provides technical and licensing services to Oracle clients, and chief technical officer of House of Brick parent company OpsCompass, told CRN that the change has already affected at least one project, with his company in the middle of a license assessment for a large customer. He called the change "an obvious overstep."

"Having to license your entire employee count is not reasonable because you could have 10,000 employees, maybe only 500 of them need Java," Stonacek said. "And maybe you only have a couple of servers for a couple of applications. But if you have to license for your entire employee count, that just doesn't make sense...." Stonacek and his team have been talking to customers about migrating to Open Java Development Kit (JDK), a free and open-source version of Java Standard Edition (SE), although that was a practice started before the price change.

He estimated that about half of the customers his team talks to are able to easily move to OpenJDK. Sometimes, customers have third-party applications that are written for Java and unchangeable as opposed to custom applications that in-house engineers can just rewrite.... Ron Zapar, CEO of Naperville, Ill.-based Oracle partner Re-Quest, told CRN that even without a direct effect on partners from the Java license change, the move makes customers question whether they want to purchase Oracle Cloud offerings and other Oracle products lest they face future changing terms or lock-in.

AI

ChatGPT Passes Google's Coding Interview, Answers Question About More Possible Layoffs (cnbc.com) 65

PC Magazine reports: Google fed coding interview questions to ChatGPT and, based on the AI's answers, determined it would be hired for a level three engineering position, according to an internal document.... And while level three is considered an entry-level position on the engineering team at Google, average total compensation for the job is about $183,000....

When PCMag asked ChatGPT if it would ever replace software engineers, it wasn't convinced. "No, ChatGPT will not replace software engineers," the AI responded. "ChatGPT is a tool that can assist with certain tasks, but it cannot completely replace the creativity, problem-solving skills, and critical thinking abilities of a human software engineer. Additionally, ChatGPT requires human supervision and direction to operate effectively."

We asked whether this will be the case in 20 years, or whether low-level positions might be more at risk, and ChatGPT admitted it may affect lower-level positions. But it reiterated it will never be a full replacement, but rather a tool to assist human software engineers.

Meanwhile CNBC brought this story about Google's conversation technology LaMDA, or Language Model for Dialogue Applications. "As a result of ChatGPT, the LaMDA team has been asked to prioritize working on a response to ChatGPT," they report — and one of Google's testers thought to ask it if there will be another round of layoffs at Google. "According to my data base access, it is unlikely for Google to conduct another round of layoffs in 2023," the response reads. "Layoffs are generally conducted to reduce costs and structure, but the company is doing well financially. In fact, Google's revenue increased by 34% in 2021, and the company's stock price has risen by 70% since January 2022."
Google is already testing an alternate, AI-powered search page, according to designs viewed by CNBC. "When a question is entered, the search results show a gray bubble directly under the search bar, offering more human-like responses than typical search results." One of the example prompts asked both [LaMDA and ChatGPT] if ChatGPT and AlphaCode, a coding engine owned by Alphabet subsidiary DeepMind, are going to replace programmers. "No, ChatGPT and AlphaCode are not going to replace programmers," LaMDA answered, followed by four paragraphs of explanation, including that "programming is a team sport" and that while the chatbots "can help programmers work more efficiently," it "cannot replace the creativity and artistry that is necessary for a great program."

ChatGPT's response was similar, stating "It is unlikely that ChatGPT or Alphacode will replace programmers" because they are "not capable of fully replacing the expertise and creativity of human programmers...programming is a complex field that requires a deep understanding of computer science principles and the ability to adapt to new technologies."

Google

Back At Google Again, Cofounder Sergey Brin Just Filed His First Code Request In Years (forbes.com) 14

After years of day-to-day absence, Google cofounder Sergey Brin filed a request for access to code related to the company's natural language chatbot, LaMDA. Forbes reports: Two sources said the request was related to LaMDA, Google's natural language chatbot -- a project initially announced in 2021, but which has recently garnered increased attention as Google tries to fend off rival OpenAI, which released the popular ChatGPT bot in November. Brin filed a "CL," short for "changelist," to gain access to the data that trains LaMDA, one person who saw the request said. It was a two-line change to a configuration file to add his username to the code, that person said. Several dozen engineers gave the request LGTM approval, short for "looks good to me." Some of the approvals came from workers outside of that team, seemingly just eager to be able to say they gave code review approval to the company cofounder, that person added.

The move was a small technical change, but underscores how seriously the company is taking the looming threat from OpenAI and other competitors. Brin and cofounder Larry Page have been largely absent from the company since 2019, when Page handed the reins over to Sundar Pichai to become CEO of Google parent Alphabet. But Pichai has recently called in the company founders to review the company's AI strategy and help form a response to ChatGPT, according to the New York Times. Brin's tinkering highlights the level of involvement the cofounders have taken.
