Data Storage

'The Future is Not Self-Hosted' (drewlyton.com) 175

A software developer who built his own home server in response to Amazon's removal of Kindle book downloads now argues that self-hosting "is NOT the future we should be fighting for." Drew Lyton constructed a home server running open-source alternatives to Google Drive, Google Photos, Audible, Kindle, and Netflix after Amazon announced that "Kindle users would no longer be able to download and back up their book libraries to their computers."

The change prompted Amazon to update its Kindle store language to say "users are purchasing licenses -- not books." Lyton's setup involved a Lenovo P520 with 128GB of RAM, multiple hard drives, and Docker containers running applications like Immich for photo storage and Jellyfin for media streaming. The setup, he notes, needed just "138 words to describe but took me the better part of two weeks to actually do."
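For readers curious what that kind of stack looks like in practice, here is a minimal, hypothetical sketch of starting one of the services he names (Jellyfin) with the Docker SDK for Python. The image name is the real public Jellyfin image, but the paths, ports, and single-container layout are illustrative assumptions, not Lyton's actual configuration (his setup also runs Immich and other services, typically orchestrated with Docker Compose).

```python
# Minimal sketch, not Lyton's actual configuration: launch a Jellyfin media
# server container via the Docker SDK for Python. Host paths and the port
# mapping are illustrative assumptions.
import docker

client = docker.from_env()

jellyfin = client.containers.run(
    "jellyfin/jellyfin",                      # official Jellyfin image
    name="jellyfin",
    detach=True,
    ports={"8096/tcp": 8096},                 # web UI on http://localhost:8096
    volumes={
        "/srv/jellyfin/config": {"bind": "/config", "mode": "rw"},
        "/srv/media":           {"bind": "/media",  "mode": "ro"},
    },
    restart_policy={"Name": "always"},        # survive reboots
)
print(jellyfin.status)
```

Multiply that by a handful of services, plus storage, backups, and networking, and the two weeks of work he describes starts to make sense.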

The implementation was successful, but Lyton concluded that self-hosting "assumes isolated, independent systems are virtuous. But in reality, this simply makes them hugely inconvenient." He proposes "publicly funded, accessible, at-cost cloud services" as an alternative, suggesting libraries could provide "100GB of encrypted file storage, photo-sharing and document collaboration tools, and media streaming services -- all for free."
The Military

What Eyewitnesses Remembered About the World's First Atomic Bomb Explosion in 1945 (politico.com) 47

Historian Garrett M. Graff describes his upcoming book, The Devil Reached Toward the Sky: An Oral History of the Making and Unleashing of the Atomic Bomb. "I assembled an oral history of the Manhattan Project, the bombings of Hiroshima and Nagasaki and the end of World War II in the Pacific, told through the voices of around 500 participants and witnesses of the events — including luminaries like Albert Einstein and Oppenheimer and political figures like President Harry Truman."

It was 80 years ago this week that physicists and 150 other leaders in the atomic bomb program "gathered in the desert outside Alamogordo, New Mexico, for the world's first test of a nuclear explosion." In an excerpt from his upcoming book, Graff publishes quotes from eyewitnesses: Brig. Gen. Leslie Groves: I had become a bit annoyed with Fermi when he suddenly offered to take wagers from his fellow scientists on whether or not the bomb would ignite the atmosphere, and if so, whether it would merely destroy New Mexico or destroy the world. He had also said that after all it wouldn't make any difference whether the bomb went off or not because it would still have been a well worthwhile scientific experiment. For if it did fail to go off, we would have proved that an atomic explosion was not possible. Afterward, I realized that his talk had served to smooth down the frayed nerves and ease the tension of the people at the base camp, and I have always thought that this was his conscious purpose. Certainly, he himself showed no signs of tension that I could see...

As the hour approached, we had to postpone the test — first for an hour and then later for 30 minutes more — so that the explosion was actually three and one-half hours behind the original schedule... Our preparations were simple. Everyone was told to lie face down on the ground, with his feet toward the blast, to close his eyes and to cover his eyes with his hands as the countdown approached zero. As soon as they became aware of the flash they could turn over and sit or stand up, covering their eyes with the smoked glass with which each had been supplied... The quiet grew more intense. I, myself, was on the ground between Bush and Conant...

Edward Teller: We all were lying on the ground, supposedly with our backs turned to the explosion. But I had decided to disobey that instruction and instead looked straight at the bomb. I was wearing the welder's glasses that we had been given so that the light from the bomb would not damage our eyes. But because I wanted to face the explosion, I had decided to add some extra protection. I put on dark glasses under the welder's glasses, rubbed some ointment on my face to prevent sunburn from the radiation, and pulled on thick gloves to press the welding glasses to my face to prevent light from entering at the sides... We all listened anxiously as the broadcast of the final countdown started; but, for whatever reason, the transmission ended at minus five seconds...

Kenneth T. Bainbridge: My personal nightmare was knowing that if the bomb didn't go off or hang-fired, I, as head of the test, would have to go to the tower first and seek to find out what had gone wrong...

Brig. Gen. Thomas F. Farrell: Dr. Oppenheimer held on to a post to steady himself. For the last few seconds, he stared directly ahead.

A few examples of how they remembered the explosion:
  • William L. Laurence: There rose from the bowels of the earth a light not of this world, the light of many suns in one.
  • Kenneth T. Bainbridge: I felt the heat on the back of my neck, disturbingly warm.
  • George B. Kistiakowsky: I am sure that at the end of the world — in the last millisecond of the earth's existence — the last man will see what we have just seen.
  • Brig. Gen. Thomas F. Farrell: Oppenheimer's face relaxed into an expression of tremendous relief.
  • J. Robert Oppenheimer: We knew the world would not be the same. A few people laughed, a few people cried.
  • Norris Bradbury, physicist, Los Alamos Lab: Some people claim to have wondered at the time about the future of mankind. I didn't. We were at war, and the damned thing worked.

Transportation

'Edge of Space' Skydiver Felix Baumgartner Dies in Paragliding Accident (go.com) 38

Felix Baumgartner has died. He was 56.

In 2012 Slashdot extensively covered the skydiver's "leap from the edge of space." ABC News remembers it as a Red Bull-financed stunt that involved "diving 24 miles from the edge of space, in a plummet that reached a speed of more than 500 mph." Baumgartner recalled the legendary jump in the documentary, "Space Jump," and said, "I was the first human being outside of an aircraft breaking the speed of sound and the history books. Nobody remembers the second one...."

Baumgartner, also known as "Fearless Felix," set many records during his career, including the world record for the highest parachute jump from a building (leaping from the Petronas Towers in Malaysia), flying across the English Channel in a wingsuit in 2003, and BASE jumping from the 85-foot arm of the Christ the Redeemer statue in Brazil in 2007.

"Baumgartner's altitude record stood for two years," remembers the Los Angeles Times, "until Google executive Alan Eustace set new marks for the highest free-fall jump and greatest free-fall distance."

They report that Baumgartner died Thursday "while engaged in a far less intense activity, crashing into the side of a hotel swimming pool while paragliding in Porto Sant'Elpidio, a town on central Italy's eastern coast." More details from the Associated Press: "It is a destiny that is very hard to comprehend for a man who has broken all kinds of records, who has been an icon of flight, and who traveled through space," Mayor Massimiliano Ciarpella told The Associated Press. Ciarpella said that Baumgartner had been in the area on vacation, and that investigators believed he may have fallen ill during the fatal flight... Baumgartner, a former Austrian military parachutist, made thousands of jumps from planes, bridges, skyscrapers and famed landmarks...
ABC News remembers that in 2022 Baumgartner wrote in Newsweek that "Since I was a little kid, I've always looked up to people who left a footprint on this planet... now I think I have left a footprint...

"I believe big dreamers always win."
The Courts

Judge Allows Nationwide Class Action Against Anthropic Over Alleged Piracy of 7 Million Books For AI Training (reuters.com) 49

A California federal judge has ruled that three authors suing Anthropic for copyright infringement can represent writers nationwide whose books the AI startup allegedly pirated to train its Claude chatbot.

U.S. District Judge William Alsup said the authors can bring a class action on behalf of all U.S. writers whose works Anthropic allegedly downloaded from pirate libraries LibGen and PiLiMi to create a repository of millions of books in 2021 and 2022.

Alsup said Anthropic may have illegally downloaded as many as 7 million books from the pirate websites, which could make it liable for billions of dollars in damages if the authors' case succeeds.
Businesses

Amazon Turns 30 45

Amazon.com marked its 30th anniversary Wednesday, three decades after Jeff Bezos launched the company as an online bookstore promising "one million titles" from Seattle. The e-commerce giant began in 1995 with Bezos, his then-wife MacKenzie Scott, and seven employees.

The company now employs 1.5 million people and carries a market capitalization exceeding $2 trillion. Amazon has expanded from books into groceries through its $13.7 billion Whole Foods acquisition, cloud computing via Amazon Web Services, and entertainment with Prime Video.
Social Networks

Are a Few People Ruining the Internet For the Rest of Us? 150

A small fraction of hyperactive social media users generates the vast majority of toxic online content, according to research by New York University psychology professor Jay Van Bavel and colleagues Claire Robertson and Kareena del Rosario. The study found that 10% of users produce roughly 97% of political tweets, while just 0.1% of users share 80% of fake news.

Twelve accounts known as the "disinformation dozen" created most vaccine misinformation on Facebook during the pandemic, the research found. In experiments, researchers paid participants to unfollow divisive political accounts on X. After one month, participants reported 23% less animosity toward other political groups. Nearly half declined to refollow hostile accounts after the study ended, and those maintaining healthier newsfeeds reported reduced animosity 11 months later. The research describes social media as a "funhouse mirror" that amplifies extreme voices while muting moderate perspectives.
Social Networks

Bay Area Restaurants Are Vetting Your Social Media Before You Even Walk In (sfgate.com) 154

Bay Area Michelin-starred restaurants are conducting extensive background research on diners before they arrive, mining social media profiles and maintaining detailed guest databases to personalize dining experiences. Lazy Bear maintains records on 115,000 people and employs a guest services coordinator who creates weekly reports by researching publicly available social media information.

Staff study color-coded Google documents containing guest data before each service. At SingleThread, where weekend meals cost over $500, the reservation team researches guests' social media, Google, and LinkedIn profiles. General manager Akeel Shah told SFGate the information helps "tailor the experience and make it memorable." Acquerello has collected guest data for 36 years, initially handwritten in books. Co-owner Giancarlo Paterlini said their director of operations reviews each reservation for dining history and wine preferences to customize service.
The Internet

FCC Chair Accused of 'Political Theater' to Please Net Neutrality's Foes (freepress.net) 35

The advocacy group Free Press on Friday blasted America's Federal Communications Commission chief "for an order that rips net neutrality rules off the books, without any time for public comment, following an unfavorable court ruling," reports the nonprofit progressive news site Common Dreams: A panel from the U.S. Court of Appeals for the 6th Circuit ruled in January that broadband is an "information service" instead of a "telecommunications service" under federal law, and the FCC did not have the authority to prohibit internet service providers (ISPs) from creating online "fast lanes" and blocking or throttling web content... FCC Chair Brendan Carr said in a Friday statement that as part of his "Delete, Delete, Delete" initiative, "we're continuing to clean house at the FCC, working to identify and eliminate rules that no longer serve a purpose, have been on our books for decades, and have no place in the current Code of Federal Regulations...."

Responding in a lengthy statement, Free Press vice president of policy and general counsel Matt Wood said that "the FCC's so-called deletion today is little more than political grandstanding. It's true that the rules in question were first stayed by the 6th Circuit and then struck down by that appellate court — in a poorly reasoned opinion. So today's bookkeeping maneuver changes very little in reality... There's no need to delete currently inoperative rules, much less to announce it in a summer Friday order. The only reason to do that is to score points with broadband monopolies and their lobbyists, who've fought against essential and popular safeguards for the past two decades straight...."

Wood noted that "the appeals process for this case has not even concluded yet, as Free Press and allies sought and got more time to consider our options at the Supreme Court. Today's FCC order doesn't impact either our ability to press the case there or our strategic considerations about whether to do so," he added. "It's little more than a premature housekeeping step..."

Space

'Space Is Hard. There Is No Excuse For Pretending It's Easy' (spacenews.com) 163

"For-profit companies are pushing the narrative that they can do space inexpensively," writes Slashdot reader RUs1729 in response to an opinion piece from SpaceNews. "Their track record reveals otherwise: cutting corners won't do it for the foreseeable future." Here's an excerpt from the article, written by Robert N. Eberhart: The headlines in the space industry over the past month have delivered a sobering reminder: space is not forgiving, and certainly not friendly to overpromising entrepreneurs. From iSpace's second failed lunar landing attempt (making them 0 for 2) to SpaceX's ongoing Starship test flight setbacks -- amid a backdrop of exploding prototypes and shifting goalposts -- the evidence is mounting that the commercialization of space is not progressing in the triumphant arc that press releases might suggest. This isn't just a series of flukes. It points to a structural, strategic and cultural problem in how we talk about innovation, cost and success in space today.

Let's be blunt: 50 years ago, we did this. We sent humans to the moon, not once but repeatedly, and brought them back. With less computational power than your phone, using analog systems and slide rules, we achieved feats of incredible precision, reliability and coordination. Today's failures, even when dressed up as "learning opportunities," raise the obvious question: Why are we struggling to do now what we once achieved decades ago with far more complexity and far less technology?

Until very recently, the failure rate of private lunar exploration efforts underscored this reality. Over the past two decades, not a single private mission had fully succeeded -- until last March when Firefly Aerospace's Blue Ghost lander touched down on the moon. It marked the first fully successful soft landing by a private company. That mission deserves real credit. But that credit comes with important context: It took two decades of false starts, crashes and incomplete landings -- from SpaceIL's Beresheet to iSpace's Hakuto-R and Astrobotic's Peregrine -- before even one private firm delivered on the promise of lunar access. The prevailing industry answer -- "we need to innovate for lower cost" -- rings hollow. What's happening now isn't innovation; it's aspiration masquerading as disruption...
"This is not a call for a retreat to Cold War models or Apollo-era budgets," writes Eberhart, in closing. "It's a call for seriousness. If we're truly entering a new space age, then it needs to be built on sound engineering, transparent economics and meaningful technical leadership -- not PR strategy. Let's stop pretending that burning money in orbit is a business model."

"The dream of a sustainable, entrepreneurial space ecosystem is still alive. But it won't happen unless we stop celebrating hype and start demanding results. Until then, the real innovation we need is not in spacecraft -- it's in accountability."

Robert N. Eberhart, PhD, is an associate professor of management and the faculty director of the Ahlers Center for International Business at the Knauss School of Business at the University of San Diego. He is the author of several academic publications and books. He is also part of Oxford University's Smart Space Initiative and contributed to Berkeley's Space Sciences Laboratory. Before his academic career, Prof. Eberhart founded and ran a successful company in Japan.
GNU is Not Unix

For the Free Software Foundation's Summer Fundraiser, the 'GNU Press Shop' is Open (fsf.org) 6

The Free Software Foundation is a non-profit — and they're having some fun with it.

They've just announced a summer fundraiser, "and that means the GNU Press Shop is open!" From now until July 28, you can buy your FSF gear at the GNU Press shop. First and foremost, there's the launch of the FSF's fortieth anniversary shirt in a summery yellow. We're taking orders for a limited time for these (until July 28), and then printing them — you should have yours on your shoulders a few weeks after the shop closes.

We've also restocked some favorites in the shop:

- A fresh batch of the popular Ada & Zangemann: A Tale of Software, Skateboards, and Raspberry Ice Cream book by Matthias Kirschner from the Free Software Foundation Europe (FSFE). This tale of software, skateboards, and raspberry ice cream teaches kids how neat and exciting it is having control over your software, a perfect fun summer read!

- Reading is hard in the glaring sun, so shade your eyes with a freshly restocked GNU baseball cap in pitch black with brilliant gold embroidery. These are great for wearing anywhere, especially to free software events.

- For privacy, protect yourself from surveillance with ease and panache with this slick webcam guard.

We also hope you'll consider becoming an FSF associate member, putting yourself at the heart of our commitment to ensuring a world where all software respects our freedom and dignity. Plus, you'll help us reach our summer fundraising goal of 200 new associate members before July 11, and of course you'll also receive a 20% discount at the GNU Press Shop. A note about shipping: the GNU Press shop opens periodically, and we collect all orders during this time and schedule orders to be sent out on specific shipping dates with the help of volunteers. We will be doing the shipping at the end of the FSF's fundraiser, which means there will be a delay between placing your order and receiving it...

If you happen to be in the Boston area in July, and would like to support the FSF's work, we are looking for volunteers to help pack and ship our orders.

Also on sale are the books "Free as in Freedom 2.0" (Richard Stallman's 2010 revision of the 2002 biography by Sam Williams, with extensive additional commentary) and "Free Software, Free Society: Selected Essays of Richard M. Stallman" (the 3rd edition, published in 2015).

And there are also several other books, t-shirts, other FSF-branded gear, and even a sticker that warns people "There is no cloud... just other people's computers."
Facebook

Meta Beats Copyright Suit From Authors Over AI Training on Books (bloomberglaw.com) 83

An anonymous reader shares a report: Meta escaped a first-of-its-kind copyright lawsuit from a group of authors who alleged the tech giant hoovered up millions of copyrighted books without permission to train its generative AI model called Llama.

San Francisco federal Judge Vince Chhabria ruled Wednesday that Meta's decision to use the books for training is protected under copyright law's fair use defense, but he cautioned that his opinion is more a reflection on the authors' failure to litigate the case effectively. "This ruling does not stand for the proposition that Meta's use of copyrighted materials to train its language models is lawful," Chhabria said.

Microsoft

Microsoft Sued By Authors Over Use of Books in AI Training (reuters.com) 15

Microsoft has been hit with a lawsuit by a group of authors who claim the company used their books without permission to train its Megatron artificial intelligence model. From a report: Kai Bird, Jia Tolentino, Daniel Okrent and several others alleged that Microsoft used pirated digital versions of their books to teach its AI to respond to human prompts. Their lawsuit, filed in New York federal court on Tuesday, is one of several high-stakes cases brought by authors, news outlets and other copyright holders against tech companies including Meta Platforms, Anthropic and Microsoft-backed OpenAI over alleged misuse of their material in AI training.

[...] The writers alleged in the complaint that Microsoft used a collection of nearly 200,000 pirated books to train Megatron, an algorithm that gives text responses to user prompts.

AI

Anthropic Bags Key 'Fair Use' Win For AI Platforms, But Faces Trial Over Damages For Millions of Pirated Works (aifray.com) 92

A federal judge has ruled that Anthropic's use of copyrighted books to train its Claude AI models constitutes fair use, but rejected the startup's defense for downloading millions of pirated books to build a permanent digital library.

U.S. District Judge William Alsup granted partial summary judgment to Anthropic in the copyright lawsuit filed by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. The court found that training large language models on copyrighted works was "exceedingly transformative" under Section 107 of the Copyright Act. Anthropic downloaded over seven million books from pirate sites, according to court documents. The startup also purchased millions of print books, destroyed the bindings, scanned every page, and stored them digitally.

Both sets of books were used to train various versions of Claude, which generates over $1 billion in annual revenue. While the judge approved using books for AI training purposes, he ruled that downloading pirated copies to create what Anthropic called a "central library of all the books in the world" was not protected fair use. The case will proceed to trial on damages related to the pirated library copies.
AI

What if Customers Started Saying No to AI? (msn.com) 213

An artist cancelled their Duolingo and Audible subscriptions to protest the companies' decisions to use more AI. "If enough people leave, hopefully they kind of rethink this," the artist tells the Washington Post.

And apparently, many more people feel the same way... In thousands of comments and posts about Audible and Duolingo that The Post reviewed across social media — including on Reddit, YouTube, Threads and TikTok — people threatened to cancel subscriptions, voiced concern for human translators and narrators, and said AI creates inferior experiences. "It destroys the purpose of humanity. We have so many amazing abilities to create art and music and just appreciate what's around us," said Kayla Ellsworth, a 21-year-old college student. "Some of the things that are the most important to us are being replaced by things that are not real...."

People in creative jobs are already on edge about the role AI is playing in their fields. On sites such as Etsy, clearly AI-generated art and other products are pushing out some original crafters who make a living on their creations. AI is being used to write romance novels and coloring books, design logos and make presentations... "I was promised tech would make everything easier so I could enjoy life," author Brittany Moone said. "Now it's leaving me all the dishes and the laundry so AI can make the art."

But will this turn into a consumer movement? The article also cites an assistant marketing professor at Washington State University, who found customers now react negatively to the term "AI" in product descriptions — out of fear of losing their jobs (as well as concerns about quality and privacy). And he predicts this could change the way companies use AI.

"There will be some companies that are going to differentiate themselves by saying no to AI." And while it could be a niche market, "The people will be willing to pay more for things just made by humans."
AI

Meta's Llama 3.1 Can Recall 42% of the First Harry Potter Book (understandingai.org) 85

Timothy B. Lee has written for the Washington Post, Vox.com, and Ars Technica — and now writes a Substack blog called "Understanding AI."

This week he examines recent research by computer scientists and legal scholars from Stanford, Cornell, and West Virginia University that found that Llama 3.1 70B (released in July 2024) has memorized 42% of the first Harry Potter book well enough to reproduce 50-token excerpts at least half the time... The paper was published last month by a team of computer scientists and legal scholars from Stanford, Cornell, and West Virginia University. They studied whether five popular open-weight models — three from Meta and one each from Microsoft and EleutherAI — were able to reproduce text from Books3, a collection of books that is widely used to train LLMs. Many of the books are still under copyright... Llama 3.1 70B — a mid-sized model Meta released in July 2024 — is far more likely to reproduce Harry Potter text than any of the other four models....
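The memorization criterion described above (a 50-token excerpt counts as memorized if the model reproduces it at least half the time) can be approximated by sliding a window over a book's tokens and checking the probability the model assigns to each 50-token continuation given the 50 tokens before it. Below is a minimal sketch of that idea using the Hugging Face transformers API; the model identifier, stride, and probability threshold are illustrative assumptions rather than the paper's exact protocol.

```python
# Minimal sketch (not the paper's exact method): estimate how often a causal LM
# assigns probability >= 0.5 to reproducing a 50-token excerpt given the
# preceding 50 tokens. Model name, stride, and threshold are illustrative.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-3.1-70B"   # assumed identifier; any causal LM works
PREFIX_LEN = TARGET_LEN = 50

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

def excerpt_logprob(prefix_ids, target_ids):
    """Total log-probability the model assigns to target_ids following prefix_ids."""
    ids = torch.tensor([prefix_ids + target_ids], device=model.device)
    with torch.no_grad():
        logprobs = torch.log_softmax(model(ids).logits[0].float(), dim=-1)
    # The token at position i is predicted by the logits at position i - 1.
    return sum(
        logprobs[i - 1, t].item()
        for i, t in enumerate(target_ids, start=len(prefix_ids))
    )

def memorized_fraction(book_text, stride=100, threshold=0.5):
    """Fraction of windows whose 50-token continuation gets probability >= threshold."""
    ids = tok(book_text)["input_ids"]
    hits = total = 0
    for start in range(0, len(ids) - PREFIX_LEN - TARGET_LEN, stride):
        prefix = ids[start:start + PREFIX_LEN]
        target = ids[start + PREFIX_LEN:start + PREFIX_LEN + TARGET_LEN]
        if math.exp(excerpt_logprob(prefix, target)) >= threshold:
            hits += 1
        total += 1
    return hits / max(total, 1)
```

Aggregated over all windows in a book, that hit rate is roughly the kind of per-book memorization percentage the study reports.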

Interestingly, Llama 1 65B, a similar-sized model released in February 2023, had memorized only 4.4 percent of Harry Potter and the Sorcerer's Stone. This suggests that despite the potential legal liability, Meta did not do much to prevent memorization as it trained Llama 3. At least for this book, the problem got much worse between Llama 1 and Llama 3. Harry Potter and the Sorcerer's Stone was one of dozens of books tested by the researchers. They found that Llama 3.1 70B was far more likely to reproduce popular books — such as The Hobbit and George Orwell's 1984 — than obscure ones. And for most books, Llama 3.1 70B memorized more than any of the other models...

For AI industry critics, the big takeaway is that — at least for some models and some books — memorization is not a fringe phenomenon. On the other hand, the study only found significant memorization of a few popular books. For example, the researchers found that Llama 3.1 70B only memorized 0.13 percent of Sandman Slim, a 2009 novel by author Richard Kadrey. That's a tiny fraction of the 42 percent figure for Harry Potter... To certify a class of plaintiffs, a court must find that the plaintiffs are in largely similar legal and factual situations. Divergent results like these could cast doubt on whether it makes sense to lump J.K. Rowling, Richard Kadrey, and thousands of other authors together in a single mass lawsuit. And that could work in Meta's favor, since most authors lack the resources to file individual lawsuits.

Why is it happening? "Maybe Meta had trouble finding 15 trillion distinct tokens, so it trained on the Books3 dataset multiple times. Or maybe Meta added third-party sources — such as online Harry Potter fan forums, consumer book reviews, or student book reports — that included quotes from Harry Potter and other popular books..."

"Or there could be another explanation entirely. Maybe Meta made subtle changes in its training recipe that accidentally worsened the memorization problem."
Medicine

The Medical Revolutions That Prevented Millions of Cancer Deaths (vox.com) 76

Vox publishes a story about "the quiet revolutions that have prevented millions of cancer deaths....

"The age-adjusted death rate in the US for cancer has declined by about a third since 1991, meaning people of a given age have about a third lower risk of dying from cancer than people of the same age more than three decades ago... " The dramatic bend in the curve of cancer deaths didn't happen by accident — it's the compound interest of three revolutions. While anti-smoking policy has been the single biggest lifesaver, other interventions have helped reduce people's cancer risk. One of the biggest successes is the HPV vaccine. A study last year found that death rates of cervical cancer — which can be caused by HPV infections — in US women ages 20-39 had dropped 62 percent from 2012 to 2021, thanks largely to the spread of the vaccine. Other cancers have been linked to infections, and there is strong research indicating that vaccination can have positive effects on reducing cancer incidence.

The next revolution is better and earlier screening. It's generally true that the earlier cancer is caught, the better the chances of survival... According to one study, the incidence of late-stage colorectal cancer in Americans over 50 declined by a third between 2000 and 2010, in large part because rates of colonoscopies almost tripled in that same time period. And newer screening methods, often employing AI or using blood-based tests, could make preliminary screening simpler, less invasive and therefore more readily available. If 20th-century screening was about finding physical evidence of something wrong — the lump in the breast — 21st-century screening aims to find cancer before symptoms even arise.

Most exciting of all are frontier developments in treating cancer... From drugs like lenalidomide and bortezomib in the 2000s, which helped double median myeloma survival, to the spread of monoclonal antibodies, real breakthroughs in treatments have meaningfully extended people's lives — not just by months, but years. Perhaps the most promising development is CAR-T therapy, a form of immunotherapy. Rather than attempting to kill the cancer directly, immunotherapies turn a patient's own T-cells into guided missiles. In a recent study of 97 patients with multiple myeloma, many of whom were facing hospice care, a third of those who received CAR-T therapy had no detectable cancer five years later. It was the kind of result that doctors rarely see.

The article begins with some recent quotes from Jon Gluck, who was told after a cancer diagnosis that he had as little as 18 months left to live — 22 years ago...
AI

AI Firms Say They Can't Respect Copyright. But A Nonprofit's Researchers Just Built a Copyright-Respecting Dataset (msn.com) 100

Is copyrighted material a requirement for training AI? asks the Washington Post. That's what top AI companies are arguing, and "Few AI developers have tried the more ethical route — until now.

"A group of more than two dozen AI researchers have found that they could build a massive eight-terabyte dataset using only text that was openly licensed or in public domain. They tested the dataset quality by using it to train a 7 billion parameter language model, which performed about as well as comparable industry efforts, such as Llama 2-7B, which Meta released in 2023." A paper published Thursday detailing their effort also reveals that the process was painstaking, arduous and impossible to fully automate. The group built an AI model that is significantly smaller than the latest offered by OpenAI's ChatGPT or Google's Gemini, but their findings appear to represent the biggest, most transparent and rigorous effort yet to demonstrate a different way of building popular AI tools....

As it turns out, the task involves a lot of humans. That's because of the technical challenges of data not being formatted in a way that's machine readable, as well as the legal challenges of figuring out what license applies to which website, a daunting prospect when the industry is rife with improperly licensed data. "This isn't a thing where you can just scale up the resources that you have available" like access to more computer chips and a fancy web scraper, said Stella Biderman [executive director of the nonprofit research institute EleutherAI]. "We use automated tools, but all of our stuff was manually annotated at the end of the day and checked by people. And that's just really hard."

Still, the group managed to unearth new datasets that can be used ethically. Those include a set of 130,000 English language books in the Library of Congress, which is nearly double the size of the popular-books dataset Project Gutenberg. The group's initiative also builds on recent efforts to develop more ethical, but still useful, datasets, such as FineWeb from Hugging Face, the open-source repository for machine learning... Still, Biderman remained skeptical that this approach could find enough content online to match the size of today's state-of-the-art models... Biderman said she didn't expect companies such as OpenAI and Anthropic to start adopting the same laborious process, but she hoped it would encourage them to at least rewind back to 2021 or 2022, when AI companies still shared a few sentences of information about what their models were trained on.

"Even partial transparency has a huge amount of social value and a moderate amount of scientific value," she said.

AI

Business Insider Recommended Nonexistent Books To Staff As It Leans Into AI (semafor.com) 23

An anonymous reader shares a report: Business Insider announced this week that it wants staff to better incorporate AI into its journalism. But less than a year ago, the company had to quietly apologize to some staff for accidentally recommending that they read books that did not appear to exist but instead may have been generated by AI.

In an email to staff last May, a senior editor at Business Insider sent around a list of what she called "Beacon Books," a list of memoirs and other acclaimed business nonfiction books, with the idea of ensuring staff understood some of the fundamental figures and writing powering good business journalism.

Many of the recommendations were well-known recent business, media, and tech nonfiction titles such as Too Big To Fail by Andrew Ross Sorkin, DisneyWar by James Stewart, and Super Pumped by Mike Isaac. But a few were unfamiliar to staff. Simply Target: A CEO's Lessons in a Turbulent Time and Transforming an Iconic Brand by former Target CEO Gregg Steinhafel was nowhere to be found. Neither was Jensen Huang: the Founder of Nvidia, which was supposedly published by the company Charles River Editors in 2019.

Education

Blue Book Sales Surge As Universities Combat AI Cheating (msn.com) 93

Sales of blue book exam booklets have surged dramatically across the nation as professors turn to analog solutions to prevent ChatGPT cheating. The University of California, Berkeley reported an 80% increase in blue book sales over the past two academic years, while Texas A&M saw 30% growth and the University of Florida recorded nearly 50% increases this school year. The surge comes as students who were freshmen when ChatGPT launched in 2022 approach senior year, having had access to AI throughout their college careers.
Television

Amazon Cancels the 'Wheel of Time' Prime Video Series After 3 Seasons (deadline.com) 101

Long-time Slashdot reader SchroedingersCat shares this article from Deadline: Prime Video will not be renewing The Wheel of Time for a fourth season. The decision, which comes more than a month after the Season 3 finale was released April 17, followed lengthy deliberations. As is often the case in the current economic environment, the reasons were financial, as the series is liked creatively by the streamer's executives...

The Season 3 overall performance was not strong enough compared to the show's cost for Prime Video to commit to another season and the streamer could not make it work after examining different scenarios and following discussions with lead studio Sony TV, sources said. With the cancellation possibility — and the show's passionate fanbase — in mind, the Season 3 finale was designed to offer some closure. Still, the news would be a gut punch for fans who have been praising the latest season as the series' best yet creatively... Prime Video and Sony TV will continue to back the Emmy campaign for The Wheel of Time's third season.
