×
Programming

Are Python Libraries Riddled With Security Holes? (techradar.com) 68

"Almost half of the packages in the official Python Package Index (PyPI) repository have at least one security issue," reports TechRadar, citing a new analysis by Finnish researchers, which even found five packages with more than a thousand issues each... The researchers used static analysis to uncover the security issues in the open source packages, which they reason end up tainting software that use them. In total the research scanned through 197,000 packages and found more than 749,000 security issues in all... Explaining their methodology the researchers note that despite the inherent limitations of static analysis, they still found at least one security issue in about 46% of the packages in the repository. The paper reveals that of the issues identified, the maximum (442,373) are of low severity, while 227,426 are moderate severity issues. However, 11% of the flagged PyPI packages have 80,065 high severity issues.
The Register supplies some context: Other surveys of this sort have come to similar conclusions about software package ecosystems. Last September, a group of IEEE researchers analyzed 6,673 actively used Node.js apps and found about 68 per cent depended on at least one vulnerable package... The situation is similar with package registries like Maven (for Java), NuGet (for .NET), RubyGems (for Ruby), CPAN (for Perl), and CRAN (for R). In a phone interview, Ee W. Durbin III, director of infrastructure at the Python Software Foundation, told The Register, "Things like this tend not to be very surprising. One of the most overlooked or misunderstood parts of PyPI as a service is that it's intended to be freely accessible, freely available, and freely usable. Because of that we don't make any guarantees about the things that are available there..."

Durbin welcomed the work of the Finnish researchers because it makes people more aware of issues that are common among open package management systems and because it benefits the overall health of the Python community. "It's not something we ignore but it's also not something we historically have had the resources to take on," said Durbin. That may be less of an issue going forward. According to Durbin, there's been significantly more interest over the past year in supply chain security and what companies can do to improve the situation. For the Python community, that's translated into an effort to create a package vulnerability reporting API and the Python Advisory Database, a community-run repository of PyPI security advisories that's linked to the Google-spearheaded Open Vulnerability Database.

Programming

Free Software Foundation Will Fund Papers on Issues Around Microsoft's 'GitHub Copilot' (fsf.org) 111

GitHub's new "Copilot" tool (created by Microsoft and OpenAI) shares the autocompletion suggestions of an AI trained on code repositories. But can that violate the original coder's license? Now the Free Software Foundation (FSF) is calling for a closer look at these and many other issues...

"We already know that Copilot as it stands is unacceptable and unjust, from our perspective," they wrote in a blog post this week, arguing that Copilot "requires running software that is not free/libre (Visual Studio, or parts of Visual Studio Code), and Copilot is Service as a Software Substitute. These are settled questions as far as we are concerned."

"However, Copilot raises many other questions which require deeper examination..." The Free Software Foundation has received numerous inquiries about our position on these questions. We can see that Copilot's use of freely licensed software has many implications for an incredibly large portion of the free software community. Developers want to know whether training a neural network on their software can really be considered fair use. Others who may be interested in using Copilot wonder if the code snippets and other elements copied from GitHub-hosted repositories could result in copyright infringement. And even if everything might be legally copacetic, activists wonder if there isn't something fundamentally unfair about a proprietary software company building a service off their work.

With all these questions, many of them with legal implications that at first glance may have not been previously tested in a court of law, there aren't many simple answers. To get the answers the community needs, and to identify the best opportunities for defending user freedom in this space, the FSF is announcing a funded call for white papers to address Copilot, copyright, machine learning, and free software.

We will read the submitted white papers, and we will publish ones that we think help elucidate the problem. We will provide a monetary reward of $500 for the papers we publish.

They add that the following questions are of particular interest:
  • Is Copilot's training on public repositories infringing copyright? Is it fair use?
  • How likely is the output of Copilot to generate actionable claims of violations on GPL-licensed works?
  • How can developers ensure that any code to which they hold the copyright is protected against violations generated by Copilot?
  • Is there a way for developers using Copilot to comply with free software licenses like the GPL?
  • If Copilot learns from AGPL-covered code, is Copilot infringing the AGPL?
  • If Copilot generates code which does give rise to a violation of a free software licensed work, how can this violation be discovered by the copyright holder on the underlying work?
  • Is a trained artificial intelligence (AI) / machine learning (ML) model resulting from machine learning a compiled version of the training data, or is it something else, like source code that users can modify by doing further training?
  • Is the Copilot trained AI/ML model copyrighted? If so, who holds that copyright?
  • Should ethical advocacy organizations like the FSF argue for change in copyright law relevant to these questions?

Education

Texas Instruments' New Calculator Will Run Programs Written in Python (dallasnews.com) 126

"Dallas-based Texas Instruments' latest generation of calculators is getting a modern-day update with the addition of programming language Python," reports the Dallas Morning News: The goal is to expand students' ability to explore science, technology, engineering and math through the device that's all-but-required in the nation's high schools and colleges...

Though most of the company's $14 billion in annual revenue comes from semiconductors, its graphing calculator remains its most recognized consumer product. This latest TI-84 model, priced between $120 to $160 depending on the retailer, was made to accommodate the increasing importance of programming in the modern world.

Judging by photos in their press release, an "alpha" key maps the calculator's keys to the letters of the alphabet (indicated with yellow letters above each key). One page on its web site also mentions "Menu selections" that "help students with discovery and syntax." (And the site confirms the calculator will "display expressions, symbols and fractions just as you write them.")

There's even a file manager that "gives quick access to Python programs you have saved on your calculator. From here, you can create, edit, run and manage your files." And one page also mentions something called TI Connect CE software application, which "connects your computer and graphing calculator so they can talk to each other. Use it to transfer data, update your operating system, download calculator software applications or take screenshots of your graphing calculator."

I'm sure Slashdot's readers have some fond memories of their first calculator. But these new models have a full-color screen and a rechargeable battery that can last up to a month on a single charge. And Texas Instruments seems to think they could even replace computers in the classroom. "By adding Python to the calculators many students are already familiar with and use in class, we are making programming more accessible and approachable for all students," their press release argues, "eliminating the need for teachers to reserve separate computer labs to teach these important skills.
Programming

After YouTube-dl Incident, GitHub's DMCA Process Now Includes Free Legal Help (venturebeat.com) 30

"GitHub has announced a partnership with the Stanford Law School to support developers facing takedown requests related to the Digital Millennium Copyright Act (DMCA)," reports VentureBeat: While the DMCA may be better known as a law for protecting copyrighted works such as movies and music, it also has provisions (17 U.S.C. 1201) that criminalize attempts to circumvent copyright-protection controls — this includes any software that might help anyone infringe DMCA regulations. However, as with the countless spurious takedown notices delivered to online content creators, open source coders too have often found themselves in the DMCA firing line with little option but to comply with the request even if they have done nothing wrong. The problem, ultimately, is that freelance coders or small developer teams often don't have the resources to fight DMCA requests, which puts the balance of power in the hands of deep-pocketed corporations that may wish to use DMCA to stifle innovation or competition. Thus, GitHub's new Developer Rights Fellowship — in conjunction with Stanford Law School's Juelsgaard Intellectual Property and Innovation Clinic — seeks to help developers put in such a position by offering them free legal support.

The initiative follows some eight months after GitHub announced it was overhauling its Section 1201 claim review process in the wake of a takedown request made by the Recording Industry Association of America (RIAA), which had been widely criticized as an abuse of DMCA... [M]oving forward, whenever GitHub notifies a developer of a "valid takedown claim," it will present them with an option to request free independent legal counsel.

The fellowship will also be charged with "researching, educating, and advocating on DMCA and other legal issues important for software innovation," GitHub's head of developer policy Mike Linksvayer said in a blog post, along with other related programs.

Explaining their rationale, GitHub's blog post argues that currently "When developers looking to learn, tinker, or make beneficial tools face a takedown claim under Section 1201, it is often simpler and safer to just fold, removing code from public view and out of the common good.

"At GitHub, we want to fix this."
Security

Software Downloaded 30,000 Times From PyPI Ransacked Developers' Machines (arstechnica.com) 26

Open source packages downloaded an estimated 30,000 times from the PyPI open source repository contained malicious code that surreptitiously stole credit card data and login credentials and injected malicious code on infected machines, researchers said on Thursday. Ars Technica reports: In a post, researchers Andrey Polkovnichenko, Omer Kaspi, and Shachar Menashe of devops software vendor JFrog said they recently found eight packages in PyPI that carried out a range of malicious activity. Based on searches on https://pepy.tech, a site that provides download stats for Python packages, the researchers estimate the malicious packages were downloaded about 30,000 times. [...] Different packages from Thursday's haul carried out different kinds of nefarious activities. Six of them had three payloads, one for harvesting authentication cookies for Discord accounts, a second for extracting any passwords or payment card data stored by browsers, and the third for gathering information about the infected PC, such as IP addresses, computer name, and user name. The remaining two packages had malware that tries to connect to an attacker-designated IP address on TCP port 9009, and to then execute whatever Python code is available from the socket. It's not now known what the IP address was or if there was malware hosted on it.

Like most novice Python malware, the packages used only a simple obfuscation such as from Base64 encoders. Karas told me that the first six packages had the ability to infect the developer computer but couldn't taint the code developers wrote with malware. "For both the pytagora and pytagora2 packages, which allows code execution on the machine they were installed, this would be possible." he said in a direct message. "After infecting the development machine, they would allow code execution and then a payload could be downloaded by the attacker that would modify the software projects under development. However, we don't have evidence that this was actually done."

Programming

'Programming Is Hard' Considered Harmful (acm.org) 526

theodp writes: The commonly held belief that programming is inherently hard lacks sufficient evidence," begins CS Prof Brett Becker in [an article published in the journal Communications of the ACM]. "Stating this belief can send influential messages that can have serious unintended consequences including inequitable practices. [...] Language is a powerful tool. Stating that programming is hard should raise several questions but rarely does. Why does it seem routinely acceptable -- arguably fashionable -- to make such a general and definitive statement? Why are these statements often not accompanied by supporting evidence? What is the empirical evidence that programming, broadly speaking, is inherently hard, or harder than possible analogs such as calculus in mathematics? Even if that evidence exists, what does it mean in practice? In what contexts does it hold? To whom does it, and does it not, apply?"

Becker concludes: "Blanket messages that 'programming is hard' seem outdated, unproductive, and likely unhelpful at best. At worst they could be truly harmful. We need to stop blaming programming for being hard and focus on making programming more accessible and enjoyable, for everyone.

Chrome

Researchers Found a Malicious NPM Package Using Chrome's Password-Recovery Tools (threatpost.com) 13

Threatpost reports on "another vast software supply-chain attack" that was "found lurking in the npm open-source code repository...a credentials-stealing code bomb" that used the password-recovery tools in Google's Chrome web browser. Researchers caught the malware filching credentials from Chrome on Windows systems. The password-stealer is multifunctional: It also listens for incoming commands from the attacker's command-and-control (C2) server and can upload files, record from a victim's screen and camera, and execute shell commands...

ReversingLabs researchers, who published their findings in a Wednesday post, said that during an analysis of the code repository, they found an interesting embedded Windows executable file: a credential-stealing threat. Labeled "Win32.Infostealer.Heuristics", it showed up in two packages: nodejs_net_server and temptesttempfile. At least for now, the first, main threat is nodejs_net_server. Some details:

nodejs_net_server: A package with 12 published versions and a total of more than 1,300 downloads since it was first published in February 2019...finally upgrading it last December with a script to download the password-stealer, which the developer hosts on a personal website. It was subsequently tweaked to run TeamViewer.exe instead, "probably because the author didn't want to have such an obvious connection between the malware and their website," researchers theorized...

ReversingLabs contacted the npm security team on July 2 to give them a heads-up about the nodejs_net_server and tempdownloadtempfile packages and circled back once again last week, on Thursday, since the team still hadn't removed the packages from the repository. When Threatpost reached out to npm Inc., which maintains the repository, a GitHub spokesperson sent this statement: "Both packages were removed following our investigation...."

Open Source

Audacity's New Owner Is In Another Fight With the Open Source Community (arstechnica.com) 48

An anonymous reader quotes a report from Ars Technica: Muse Group -- owner of the popular audio-editing app Audacity -- is in hot water with the open source community again. This time, the controversy isn't over Audacity -- it's about MuseScore, an open source application that allows musicians to create, share, and download musical scores (especially, but not only, in the form of sheet music). The MuseScore app itself is licensed GPLv3, which gives developers the right to fork its source and modify it. One such developer, Wenzheng Tang ("Xmader" on GitHub) went considerably further than modifying the app -- he also created separate apps designed to bypass MuseScore Pro subscription fees. After thoroughly reviewing the public comments made by both sides at GitHub, Ars spoke at length with Muse Group Head of Strategy Daniel Ray -- known on GitHub by the moniker "workedintheory" -- to get to the bottom of the controversy.

While Xmader did, in fact, fork MuseScore, that's not the root of the controversy. Xmader forked MuseScore in November 2020 and appears to have abandoned that fork entirely; it only has six commits total -- all trivial, and all made the same week that the fork was created. Xmader is also currently 21,710 commits behind the original MuseScore project repository. Muse Group's beef with Xmader comes from two other repositories, created specifically to bypass subscription fees. Those repositories are musescore-downloader (created November 2019) and musescore-dataset (created March 2020). Musescore-downloader describes itself succinctly: "download sheet music from musescore.com for free, no login or MuseScore Pro required." Musescore-dataset is nearly as straightforward: it declares itself "the unofficial dataset of all music sheets and users on musescore.com." In simpler terms: musescore-downloader lets you download things from musescore.com that you shouldn't be able to; musescore-dataset is those files themselves, already downloaded. For scores that are in the public domain or that users have uploaded under Creative Commons licenses, this isn't necessarily a problem. But many of the scores are only available by arrangement between the score owner and Muse Group itself -- and this has several important implications.

Just because you can access the score via the app or website doesn't mean you're free to access it anywhere, anyhow, or redistribute that score yourself. The distribution agreement between Muse Group and the rightsholder allows legitimate downloads, but only when using the site or app as intended. Those agreements do not give users carte blanche to bypass controls imposed on those downloads. Further, those downloads can often cost the distributor real money -- a free download of a score licensed to Muse Group by a commercial rightsholder (e.g., Disney) is generally not "free" to Muse Group itself. The site has to pay for the right to distribute that score -- in many cases, based on the number of downloads made. Bypassing those controls leaves Muse Group on the hook either for costs it has no way to monetize (e.g., by ads for free users) or for violating its own distribution agreements with rightsholders (by failing to properly track downloads).

Databases

The Case Against SQL (scattered-thoughts.net) 297

Long-time Slashdot reader RoccamOccam shares "an interesting take on SQL and its issues from Jamie Brandon (who describes himself as an independent researcher who's built database engines, query planners, compilers, developer tools and interfaces).

It's title? "Against SQL." The relational model is great... But SQL is the only widely-used implementation of the relational model, and it is: Inexpressive, Incompressible, Non-porous. This isn't just a matter of some constant programmer overhead, like SQL queries taking 20% longer to write. The fact that these issues exist in our dominant model for accessing data has dramatic downstream effects for the entire industry:

- Complexity is a massive drag on quality and innovation in runtime and tooling
- The need for an application layer with hand-written coordination between database and client renders useless most of the best features of relational databases

The core message that I want people to take away is that there is potentially a huge amount of value to be unlocked by replacing SQL, and more generally in rethinking where and how we draw the lines between databases, query languages and programming languages...

I'd like to finish with this quote from Michael Stonebraker, one of the most prominent figures in the history of relational databases:

"My biggest complaint about System R is that the team never stopped to clean up SQL... All the annoying features of the language have endured to this day. SQL will be the COBOL of 2020..."

It's been interesting to follow the discussion on Twitter, where the post's author tweeted screenshots of actual SQL code to illustrate various shortcomings. But he also notes that "The SQL spec (part 2 = 1732) pages is more than twice the length of the Javascript 2021 spec (879 pages), almost matches the C++ 2020 spec (1853) pages and contains 411 occurrences of 'implementation-defined', occurrences which include type inference and error propagation."

His Twitter feed also includes a supportive retweet from Rust creator Graydon Hoare, and from a Tetrane developer who says "The Rust of SQL remains to be invented. I would like to see it come."
Programming

New Study Verifies Safety of Rust (eurekalert.org) 132

Slashdot reader Beeftopia writes: Rust has two modes: its default, safe mode, and an unsafe mode. In its default, safe mode, Rust prevents memory errors, such as "use-after-free" errors. It also prevents "data races" which is unsynchronized access to shared memory. In its unsafe mode (via use of the "unsafe" block), in which some of its APIs are written, it allows the use of potentially unsafe C-style features. The key challenge in verifying Rust's safety claims is accounting for the interaction between its safe and unsafe code. This article from April's issue of Communications of the ACM provides an overview of Rust and investigates its safety claims.
The article is co-authored by Ralf Jung, a prominent postdoctoral researcher in the 'Foundations of Programming' research group at the Max Planck Institute for Software Systems. And (spoiler alert) Jung has just received one of two 'Honorable Mentions' for the 'Dissertation Award' of the 'Association for Computing Machinery' (ACM), reports a nonprofit site operated by the American Association for the Advancement of Science: In his dissertation, Ralf Jung now provides the first formal proof that the safety promises of Rust actually hold. "We were able to verify the safety of Rust's type system and thus show how Rust automatically and reliably prevents entire classes of programming errors," says Ralf Jung.

In doing so, he also successfully addressed a special aspect of the programming language: "The so-called 'type safety' goes hand in hand with the fact that Rust imposes restrictions on the programmer and does not allow everything that the programmer wants to do. Sometimes, however, it is necessary to write an operation into the code that Rust would not accept because of its type safety," the computer scientist continues. "This is where a special feature of Rust comes into play: programmers can mark their code as 'unsafe' if they want to achieve something that contradicts the programming language's safety precautions. Together with international collaborators, including my thesis advisor Derek Dreyer, we developed a theoretical framework that allows us to prove that Rust's safety claims hold despite the possibility of writing 'unsafe' code," Jung says.

This proof, called RustBelt, is complemented by Ralf Jung with a tool called Miri, with which 'unsafe' Rust code can be automatically tested for compliance with important rules of the Rust specification - a basic requirement for correctness and safety of this code. "While RustBelt was a great success, especially in academic circles, Miri is already established in industry as a tool for security testing of programs written in Rust," explains Ralf Jung.... The ACM states: "Through Jung's leadership and active engagement with the Rust Unsafe Code Guidelines working group, his work has already had profound impact on the design of Rust and laid essential foundations for its future."

Privacy

Apple's IDFA Change Has Triggered 15% To 20% Revenue Drops For iOS Developers (venturebeat.com) 120

AmiMoJo shares a report from VentureBeat: Apple critics such as Epic Games CEO Tim Sweeney have complained about Apple's alleged anticompetitive behavior with the App Store. But Consumer Acquisition's Brian Bowman has frequently sounded the alarm on Apple's decision to favor user privacy over targeted ads by changing access to its Identifier for Advertisers (IDFA). Based on Consumer Acquisition's analysis of $300 million in paid social ad spending, IDFA has had a devastating impact, Bowman said in an interview with GamesBeat. In a report issued today, Bowman said that iOS advertisers are experiencing a 15% to 20% revenue drop and inflation in unattributed organic traffic.

Starting in April, Apple began releasing iOS 14.5, which prompted users to answer whether they would allow their data to be tracked for advertising purposes. Apple believes this puts privacy front and center. But Consumer Acquisition and many of its game developer advertisers worry it will break personalized advertising. Only 20% of consumers are saying yes to Apple's App Tracking Transparency prompt, which means they will enable apps to personalize ads by tracking their personal data. For the traffic Bowman's company evaluated, performance has faded. Across paid social platforms, downstream event optimization and "lookalike audience performance" is also eroding. [...] Bowman believes -- or at least holds out hope -- that Apple will roll back or soften the IDFA changes by Black Friday.

Privacy

Tor Project Hopes to Replace 'Complex', 'Fragile' C Code With Rust (yahoo.com) 107

CoinDesk reports that "A project is in the works to make the Tor Client more adaptable and easier for third parties to use, with some help from Zcash Open Major Grants (ZOMG)." ZOMG announced on Tuesday that it is awarding the privacy-focused Tor Project a $670,000 grant to continue to develop Arti, a Rust coding language implementation of the Tor Client... Arti should make it simpler for third parties to embed and customize the Tor Client than the current implementation in the C coding language... "Arti is a project to make an improved version of Tor that will be more reliable, more secure, and easier for other software to use," said Nick Mathewson, chief network architect and co-founder of the Tor Project. "We hope that within the next several years, Arti will become the preferred implementation of the Tor protocols...."

"Onion routing has just had its 25th anniversary in May, and although Tor is a great set of privacy tools, the C program 'tor' itself (note the lowercase t) is beginning to show its age," Mathewson said. "We've found over the recent years that the complexity of the existing C code, and the fragility of the C language, make it unnecessarily difficult to improve the code while maintaining our security and privacy guarantees....

"Roughly half of Tor's security issues since 2016 would have been impossible in Rust, and many of the other issues would have been much less likely, based on our informal audit," he said...

The funding will go toward developer salaries as they develop Arti. Mathewson said the goal with this round of funding is to advance Arti to the point where it is ready for general use, testing and embedding.

Java

Developers Finally Moving Away from Java 8 to Java 11 (sdtimes.com) 61

SD Times takes another look at the uptake of Java 11 Previous reports of the Java community found that developers were still mainly using Java 8 and didn't adopt newer versions, but according to Snyk's JVM Ecosystem Report 2021, that is starting to change. This year, 61.5% of respondents are using Java 11 somewhere in production, and almost 12% are using the latest release, which was Java 15 during the survey. "This is huge, because it shows that developers do upgrade their Java version beyond Java 8 to some extent. The mantra that most Java developers are comfortable staying on Java 8 seems to be slowly breaking apart," Snyk stated in the report.

However, half of the Java 11 users — which is currently the most used version — still use Java 8 in their production stack...

In addition, almost half of developers (44%) use the free AdoptOpenJDK distribution in production as one of their JDKs and 48% use it in development.

"Other findings are that Java is still by far the most popular language by a long shot and Snyk stated it will probably remain that way in the foreseeable future and that JetBrains IntellIJ IDEA still remains dominant as an IDE in the Java ecosystem."
Programming

Could Python Overtake C and Java as the Most Popular Programming Language? (zdnet.com) 170

The TIOBE index of programming language popularity celebrates 20 years of continuous publishing this month. Started as a hobbyist project back in 2001, the site estimates each programming language's popularity by counting search engine results for the phrase <language> programming (indirectly counting each listing for developers, courses, and third-party vendors).

When it was started 20 years ago, the top languages were Java, C, and C++.

20 years later, the top languages are now C, Java, Python, and C++

And "The difference between position 1 and position 3 is only 0.67%." This means that the next few months will be exciting. What language is going to win this battle? Python seems to have the best chances to become number 1, thanks to its market leadership in the booming field of data mining and artificial intelligence.
ZDNet also noted the trends: Searches for C were down 4.83 percentage points compared to last July. Java searches were down 3.93% over the period, while Python gained 1.86%.

The top 10 languages behind C, Java and Python are C++, C#, Visual Basic, Javascript, PHP, Assembly Language, and SQL.

But they also have this to say about TIOBE's calculations: It's a different methodology to developer analyst RedMonk, which looks at language usage on software projects hosted on GitHub and discussions on the developer Q&A site, Stack Overflow.

RedMonk's Q1 2021 rankings place JavaScript in top place, followed by Python and Java.


Other interesting moves this month:
  • C++ gained more than 0.5% getting closer to the top 3
  • Rust rose from #30 to #27
  • Go rose from #20 to #13
  • TypeScript rose from #45 to #37
  • Haskellrose rose from #49 to #39

Programming

Mixed Reactions to GitHub's AI-Powered Pair Programmer 'Copilot' (github.blog) 39

Reactions are starting to come in for GitHub's new Copilot coding tool, which one site calls "a product of the partnership between Microsoft and AI research and deployment company OpenAI — which Microsoft invested $1 billion into two years ago." According to the tech preview page: GitHub Copilot is currently only available as a Visual Studio Code extension. It works wherever Visual Studio Code works — on your machine or in the cloud on GitHub Codespaces. And it's fast enough to use as you type. "Copilot looks like a potentially fantastic learning tool — for developers of all abilities," said James Governor, an analyst at RedMonk. "It can remove barriers to entry. It can help with learning new languages, and for folks working on polyglot codebases. It arguably continues GitHub's rich heritage as a world-class learning tool. It's early days but AI-assisted programming is going to be a thing, and where better to start experiencing it than GitHub...?"

The issue of scale is a concern for GitHub, according to the tech preview FAQ: "If the technical preview is successful, our plan is to build a commercial version of GitHub Copilot in the future. We want to use the preview to learn how people use GitHub Copilot and what it takes to operate it at scale." GitHub spent the last year working closely with OpenAI to build Copilot. GitHub developers, along with some users inside Microsoft, have been using it every day internally for months.

[Guillermo Rauch, CEO of developer software provider Vercel, who also is founder of Vercel and creator of Next.js], cited in a tweet a statement from the Copilot tech preview FAQ page, "GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code that it suggests is uniquely generated and has never been seen before."

To that, Rauch simply typed: "The future."

Rauch's post is relevant in that one of the knocks against Copilot is that some folks seem to be concerned that it will generate code that is identical to code that has been generated under open source licenses that don't allow derivative works, but which will then be used by a developer unknowingly...

GitHub CEO Nat Friedman has responded to those concerns, according to another article, arguing that training an AI system constitutes fair use: Friedman is not alone — a couple of actual lawyers and experts in intellectual property law took up the issue and, at least in their preliminary analysis, tended to agree with Friedman... [U.K. solicitor] Neil Brown examines the idea from an English law perspective and, while he's not so sure about the idea of "fair use" if the idea is taken outside of the U.S., he points simply to GitHub's terms of service as evidence enough that the company can likely do what it's doing. Brown points to passage D4, which grants GitHub "the right to store, archive, parse, and display Your Content, and make incidental copies, as necessary to provide the Service, including improving the Service over time." "The license is broadly worded, and I'm confident that there is scope for argument, but if it turns out that Github does not require a license for its activities then, in respect of the code hosted on Github, I suspect it could make a reasonable case that the mandatory license grant in its terms covers this as against the uploader," writes Brown. Overall, though, Brown says that he has "more questions than answers."
Armin Ronacher, the creator of the Flask web framework for Python, shared an interesting example on Twitter (which apparently came from the game Quake III Arena) in which Copilot apparently reproduces a chunk of code including not only its original comment ("what the fuck?") but also its original copyright notice.
PlayStation (Games)

PlayStation Is Hard To Work With, Devs Say (kotaku.com) 42

After yesterday's industry-wide discussion of the cost of being visible on Sony's PlayStation Store, Kotaku has heard from multiple independent developers and publishers expressing similar frustrations and fury. From a report: There were two main responses to our article yesterday highlighting one independent developer's frustrations with working with Sony to sell games on the PlayStation store. The first was a confusing number of people convinced that this was somehow part of an underground conspiracy to destroy Sony. The second was many indie game developers and publishers getting in touch to say that, yes, wow, Sony are far harder to work with and sell games through than anywhere else.

It's not possible to rationalize with the former group. We had confirmed hard figures on Sony's fees for getting any visibility on the PlayStation's in-built store, so we reported them. The conspiracy, disappointingly, ends there. However, the information about just how much worse it is for indies to work with Sony than Microsoft or Nintendo keeps piling in. "Oh yeah, so there's Nintendo who supports you," one such response begins. "[Then] Microsoft who supports you and [then] there is Sony who supports its own AAA machine and gives a fuck about everyone else."

As Bloomberg reported in April, Sony shows extraordinary caution even with the games it makes itself, with an obsessive focus on blockbuster success. According to that article, the Japanese corporation is moving away from developing smaller in-house games, so fixated are they on only the largest games. It seems this lack of interest in smaller titles extends to third-party developers attempting to sell their games on the system. "Sony does not understand what indie means," an independent publisher tells me under the condition of anonymity, via Twitter DMs. "Not at all. For them indie is something in the lower million budgets."

Programming

Python Implementation Pyston Aims To Speed Up the Programming Language's Code for Web Applications (techrepublic.com) 55

An anonymous reader shares a report: When Kevin Modzelewski and his colleagues at Dropbox set out to create Pyston in 2014, they had a very simple objective: to lower the costs of running Python code on Dropbox's servers, by making the code itself faster. "We were growing exponentially, so our server cost was growing exponentially," Modzelewski tells TechRepublic. "If we could get Python running faster, we would spend less money running Python." The original cost reduction initiative at Dropbox snowballed into a bigger project for Modzelewski when the company moved away from Python in 2017 and cancelled the Pyston project. He had realized while working on the language that there was a strong demand for faster Python among the developer community, and while there were plenty of tools around for improving the performance in smaller applications, there were none designed for big, business logic-type applications such as Dropbox.

"There's a lot of tools out there for helping you run Python faster, but there weren't any that were a good fit for Dropbox's use case," says Modzelewski. "This was an area of the Python market where a lot of money was being spent, but not very many tools were being developed for helping. It was under served." Fast forward to today and Pyston is now in version 2.2, and has been open-sourced, with Modzelewski and fellow developer Marius Wachtler now leading the project as co-founders. The latest implementation promises a 30% performance improvement over Python 3.8.8, with a key benefit being that developers can simply drop their Python applications into Pyston and get going, without having to rewrite their code. It's also a "completely separate thing" to what Modzelewski and fellow developers built for Dropbox some seven years ago.

Microsoft

Microsoft and OpenAI Have a New AI Tool That Will Give Coding Suggestions To Software Developers (cnbc.com) 39

Microsoft on Tuesday announced an artificial intelligence system that can recommend code for software developers to use as they write code. From a report: Microsoft is looking to simplify the process of programming, the area where the company got its start in 1975. That could keep programmers who already use the company's tools satisfied and also attract new ones. The system, called GitHub Copilot, draws on source code uploaded to code-sharing service GitHub, which Microsoft acquired in 2018, as well as other websites. Microsoft and GitHub developed it with help from OpenAI, an AI research start-up that Microsoft backed in 2019.

Researchers at Microsoft and other institutions have been trying to teach computers to write code for decades. The concept has yet to go mainstream, at times because programs to write programs have not been versatile enough. The GitHub Copilot effort is a notable attempt in the field, relying as it does on a large volume of code in many programming languages and vast Azure cloud computing power. Nat Friedman, CEO of GitHub, describes GitHub Copilot as a virtual version of what software creators call a pair programmer -- that's when two developers work side by side collaboratively on the same project. The tool looks at existing code and comments in the current file and the location of the cursor, and it offers up one or more lines to add. As programmers accept or reject suggestions, the model learns and becomes more sophisticated over time. The new software makes coding faster, Friedman said in an interview last week. Hundreds of developers at GitHub have been using the Copilot feature all day while coding, and the majority of them are accepting suggestions and not turning the feature off, Friedman said.

Java

Drinking Coffee May Cut Risk of Chronic Liver Disease, Study Suggests (theguardian.com) 74

An anonymous reader quotes a report from The Guardian: From espresso to instant, coffee is part of the daily routine for millions. Now research suggests the brew could be linked to a lower chance of developing or dying from chronic liver disease. Chronic liver disease is a major health problem around the world. According to the British Liver Trust, liver disease is the third leading cause of premature death in the UK, with deaths having risen 400% since 1970. Writing in the journal BMC Public Health, Roderick and colleagues report how they analyzed data from 494,585 participants in the UK Biobank -- a project designed to help unpick the genetic and environmental factors associated with particular conditions. All participants were aged 40 to 69 when they signed up to the project, with 384,818 saying they were coffee drinkers at the outset compared with 109,767 who did not consume the beverage.

The team looked at the liver health of the participants over a median period of almost 11 years, finding 3,600 cases of chronic liver disease, with 301 deaths, and 1,839 cases of simple fatty liver disease. The analysis revealed that after taking into account factors such as body mass index, alcohol consumption, and smoking status, those who drank any amount of coffee, and of any sort, had a 20% lower risk of developing chronic liver disease or fatty liver disease (taken together) than those who did not consume the brew. The coffee drinkers also had a 49% lower risk of dying from chronic liver disease. The team said the magnitude of the effect increased with the amount of coffee consumed, up to about three to four cups a day, "beyond which further increases in consumption provided no additional benefit." A reduction in risk was also found when instant, decaffeinated and ground coffee were considered separately -- although the latter linked to the largest effect.

Slashdot Top Deals