AI

Slashdot Asks: How Are You Using ChatGPT? 192

OpenAI's ChatGPT has taken the world by storm with its ability to give solutions to complex problems almost instantly and with nothing more than a text prompt. Up until yesterday, ChatGPT was based on GPT-3.5, a deep learning language model with an impressive 175 billion parameters. Now, it's based on GPT-4 (available to ChatGPT Plus subscribers), which is capable of solving even more complex problems with greater accuracy (40 percent more likely to give factual responses). It's also capable of receiving images as a basis for interaction, instead of just text. While the company has chosen not to reveal how large GPT-4 is, it claims the model scored in the 88th percentile or above on a number of tests, including the Uniform Bar Exam, LSAT, SAT Math and SAT Evidence-Based Reading & Writing exams.

ChatGPT is extremely capable, but its responses largely depend on the questions or prompts you enter. In other words, the better you describe and phrase the problem or question, the better the results. We're already starting to see companies require that new hires know not only how to use ChatGPT but also how to extract the most out of it.

That being said, we'd like to know how Slashdotters are using the chatbot. What are some of your favorite prompts? Have you used it to become more efficient at work? What about for coding? Please share specific prompts, too, to help others get similar results.
AI

OpenAI Announces GPT-4 (theverge.com) 56

After months of rumors and speculation, OpenAI has announced GPT-4: the latest in its line of AI language models that power applications like ChatGPT and the new Bing. From a report: The company claims the model is "more creative and collaborative than ever before," and "can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem solving abilities." OpenAI says it's already partnered with a number of companies to integrate GPT-4 into their products, including Duolingo, Stripe, and Khan Academy. The new model will also be available on ChatGPT Plus and as an API.

In a research blog post, OpenAI said the distinction between GPT-4 and its predecessor GPT-3.5 is "subtle" in casual conversation (GPT-3.5 is the model that powers ChatGPT), but that the differences between the systems are clear when faced with more complex tasks. The company says these improvements can be seen in GPT-4's performance on a number of tests and benchmarks, including the Uniform Bar Exam, LSAT, SAT Math and SAT Evidence-Based Reading & Writing exams. In the exams mentioned, GPT-4 scored in the 88th percentile and above; a full list of exams and scores can be seen here. Speculation about GPT-4 and its capabilities has been rife over the past year, with many suggesting it would be a huge leap over previous systems. "People are begging to be disappointed and they will be," said OpenAI CEO Sam Altman in an interview in January. "The hype is just like... We don't have an actual AGI and that's sort of what's expected of us."

Earth

MIT Team Makes a Case For Direct Carbon Capture From Seawater, Not Air 131

The oceans soak up enormous quantities of carbon dioxide, and MIT researchers say they've developed a way of releasing and capturing it that uses far less energy than direct air capture -- with some other environmental benefits to boot. New Atlas reports: According to IEA figures from 2022, even the more efficient air capture technologies require about 6.6 gigajoules of energy, or 1.83 megawatt-hours per ton of carbon dioxide captured. Most of that energy isn't used to directly separate the CO2 from the air; it goes into heat to keep the absorbers at operating temperatures, or into electricity to compress large amounts of air to the point where the capture operation can be done efficiently. But either way, the costs are out of control, with 2030 price estimates ranging between US$300 and $1,000 per ton. According to Statista, there's not a nation on Earth currently willing to tax carbon emitters even half of the lower estimate; first-placed Uruguay taxes it at US$137/ton. Direct air capture is not going to work as a business unless its costs come way down.

It turns out there's another option: seawater. As atmospheric carbon concentrations rise, carbon dioxide begins to dissolve into seawater. The ocean currently soaks up some 30-40% of all humanity's annual carbon emissions, and maintains a constant free exchange with the air. Suck the carbon out of the seawater, and it'll suck more out of the air to re-balance the concentrations. Best of all, the concentration of carbon dioxide in seawater is more than 100 times greater than in air. Previous research teams have managed to release CO2 from seawater and capture it, but their methods have required expensive membranes and a constant supply of chemicals to keep the reactions going. MIT's team, on the other hand, has announced the successful testing of a system that uses neither, and requires vastly less energy than air capture methods.

In the new system, seawater is passed through two chambers. The first uses reactive electrodes to release protons into the seawater, which acidifies the water, turning dissolved inorganic bicarbonates into carbon dioxide gas, which bubbles out and is collected using a vacuum. Then the water's pushed through to a second set of cells with a reversed voltage, calling those protons back in and turning the acidic water back to alkaline before releasing it back into the sea. Periodically, when the active electrode is depleted of protons, the polarity of the voltage is reversed, and the same reaction continues with water flowing in the opposite direction. In a new study published in the peer-reviewed journal Energy & Environmental Science, the team says its technique requires an energy input of 122 kJ/mol, equating by our math to 0.77 MWh per ton. And the team is confident it can do even better: "Though our base energy consumption of 122 kJ/mol-CO2 is a record-low," reads the study, "it may still be substantially decreased towards the thermodynamic limit of 32 kJ/mol-CO2."
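The quoted energy figures are easy to sanity-check. Here is a quick back-of-the-envelope conversion in Python (assuming a CO2 molar mass of about 44 g/mol and 3.6 GJ per MWh) that reproduces the 1.83 MWh/ton air-capture figure, the 0.77 MWh/ton figure for the MIT system, and the roughly 0.2 MWh/ton thermodynamic floor:

```python
# Sanity-check the energy figures quoted above.
# Assumptions: CO2 molar mass ~44.01 g/mol; 1 MWh = 3.6e9 J.

CO2_MOLAR_MASS_G = 44.01
J_PER_MWH = 3.6e9

def kj_per_mol_to_mwh_per_ton(kj_per_mol: float) -> float:
    """Convert an energy cost in kJ per mol of CO2 to MWh per metric ton of CO2."""
    mols_per_ton = 1e6 / CO2_MOLAR_MASS_G        # grams in a metric ton / grams per mol
    joules_per_ton = kj_per_mol * 1e3 * mols_per_ton
    return joules_per_ton / J_PER_MWH

# Direct air capture: 6.6 GJ per ton is about 1.83 MWh per ton.
dac_mwh_per_ton = 6.6e9 / J_PER_MWH

# MIT seawater system: 122 kJ/mol works out to about 0.77 MWh per ton.
mit_mwh_per_ton = kj_per_mol_to_mwh_per_ton(122)

# Thermodynamic limit: 32 kJ/mol works out to about 0.20 MWh per ton.
limit_mwh_per_ton = kj_per_mol_to_mwh_per_ton(32)
```

By this arithmetic the seawater route needs well under half the energy of today's best air capture even before further optimization.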
Education

Steep Declines In Data Science Skills Among Fourth- and Eighth-Graders Across America, Study Finds (phys.org) 228

A new report (PDF) from the Data Science 4 Everyone coalition reveals that data literacy skills among fourth- and eighth-grade students have declined significantly over the last decade, even as these skills have become increasingly essential in our modern, data-driven society. Phys.Org reports: Based on data from the latest National Assessment of Educational Progress results, the report uncovered several trends that raise concerns about whether the nation's educational system is sufficiently preparing young people for a world reshaped by the rise of big data and artificial intelligence. Key findings include:

- The pandemic decline is part of a much longer-term trend. Between 2019 and 2022, scores in the data analysis, statistics, and probability section of the NAEP math exam fell by 10 points for eighth-graders and by four points for fourth-graders. Declining scores are part of a longer-term trend, with scores down 17 points for eighth-graders and down 10 points for fourth-graders over the last decade. That means today's eighth-graders have the data literacy of sixth-graders from a decade ago, and today's fourth-graders have the data literacy of third-graders from a decade ago.

- There are large racial gaps in scores. These gaps exist across all grade levels but are at times most dramatic in the middle and high school levels. For instance, fourth-grade Black students scored 28 points lower -- the equivalent of nearly three grade levels -- than their white peers in data analysis, statistics, and probability.

- Data-related instruction is in decline. Every state except Alabama reported a decline or stagnant trend in data-related instruction, with some states -- like Maryland and Iowa -- seeing double-digit drops. The national share of fourth-grade math teachers reporting "moderate" or "heavy" emphasis on data analysis dropped five percentage points between 2019 and 2022.

Education

The End of Grading (wired.com) 231

How the irrational mathematics of measuring, ranking, and rating distort the value of stuff, work, people -- everything. From a report: More irrational even than pi, assessing people amounts to quantifying a relationship between unknown, usually unknowable things. Every measurement, the mathematician Paul Lockhart reminds us in his book Measurement, is a comparison: "We are comparing the thing we are measuring to the thing we are measuring it with." What thing do we use to measure undergraduates? What aspects can be compared? Quality or quantity? Originality or effort? Participation or progress? Apples and oranges at best. Closer to bananas and elephants. Even quantitative tests mark, at most, a comparison between what the test-maker thought the student should know and the effectiveness of instruction. Grades become the permanent records of these passing encounters.

And how do we grade the grader? When a physicist friend found out that a first-year Harvard student he knew -- a math star in high school -- got an F in physics, he said: "Harvard should be ashamed of itself." A Harvard grad himself, he believed that schools fail students far more often than students fail schools. Some STEM profs, I'm told, tell the class at the outset that half of them will fail. I give that teacher an F. I'm not alone in my discomfort with the irrational business of ranking, rating, and grading. The deans of Yale's and Harvard's law schools recently removed themselves from the rankings of US News & World Report, followed by Harvard Medical School and scores of others. "Rankings cannot meaningfully reflect ... educational excellence," Harvard dean George O. Daley explained. Rankings lead schools to falsify data and make policies designed to raise rankings rather than "nobler objectives." The very thing that's been eating education is now devouring everything else. My doctor recently urged me to get an expensive diagnostic test because it "makes our numbers look good." Her nurse asked me to rank my pain on a totem pole of emojis. Then after the visit, to rate my experience. The numbers are all irrational. And rather like the never-ending digits of pi, there seems to be no end to them.

Google

Google Begins Testing Its Own ChatGPT-Style AI (gizmodo.com) 19

Google is rushing to release its own artificial intelligence products in the wake of OpenAI's ChatGPT. From a report: The search engine pioneer is working hard and fast on a "code red" effort to respond to ChatGPT with a large language chatbot and testing new ways to incorporate that AI-powered bot into search, according to a report from CNBC. The new report backs up earlier news from the New York Times and elsewhere, which outlined a rapid re-alignment in Google's priorities in direct response to the rise of ChatGPT. CEO Sundar Pichai reportedly re-assigned employees and "upended" meetings to boost the amount of resources going towards the company's AI development.

CNBC's Tuesday account offers further details. Google's new chatbot, reportedly named "Apprentice Bard," is based on the company's pre-existing LaMDA (Language Model for Dialogue Applications) technology. The application looks and functions similarly to ChatGPT: Users input a question in natural language and receive a generated text response as an answer. But Apprentice Bard seemingly has a couple of important skills beyond what ChatGPT can do. For one, it can draw on recent events and information, according to CNBC, unlike ChatGPT which is limited to online information from before 2021. And it may be better at achieving that elusive AI accuracy. For instance, LaMDA correctly responded to a math riddle that ChatGPT failed to grasp, as recorded in company documents viewed by CNBC.

Education

Why This Teacher Has Adopted an Open ChatGPT Policy (npr.org) 113

An anonymous reader quotes a report from NPR: Ethan Mollick has a message for the humans and the machines: can't we all just get along? After all, we are now officially in an A.I. world and we're going to have to share it, reasons the associate professor at the University of Pennsylvania's prestigious Wharton School. "This was a sudden change, right? There is a lot of good stuff that we are going to have to do differently, but I think we could solve the problems of how we teach people to write in a world with ChatGPT," Mollick told NPR. [...] This year, Mollick is not only allowing his students to use ChatGPT, they are required to. And he has formally adopted an A.I. policy into his syllabus for the first time.

He teaches classes in entrepreneurship and innovation, and said the early indications were that the move was going great. "The truth is, I probably couldn't have stopped them even if I didn't require it," Mollick said. This week he ran a session where students were asked to come up with ideas for their class project. Almost everyone had ChatGPT running and was asking it to generate projects, then interrogating the bot's ideas with further prompts. "And the ideas so far are great, partially as a result of that set of interactions," Mollick said. He readily admits he alternates between enthusiasm and anxiety about how artificial intelligence can change assessments in the classroom, but he believes educators need to move with the times. "We taught people how to do math in a world with calculators," he said. Now the challenge is for educators to teach students how the world has changed again, and how they can adapt to that.

Mollick's new policy states that using A.I. is an "emerging skill"; that it can be wrong and students should check its results against other sources; and that they will be responsible for any errors or omissions provided by the tool. And, perhaps most importantly, students need to acknowledge when and how they have used it. "Failure to do so is in violation of academic honesty policies," the policy reads. [...] "I think everybody is cheating ... I mean, it's happening. So what I'm asking students to do is just be honest with me," he said. "Tell me what they use ChatGPT for, tell me what they used as prompts to get it to do what they want, and that's all I'm asking from them. We're in a world where this is happening, but now it's just going to be at an even grander scale." "I don't think human nature changes as a result of ChatGPT. I think capability did."

AI

ChatGPT Passes MBA Exam Given By a Wharton Professor (nbcnews.com) 155

An anonymous reader quotes a report from NBC News: New research (PDF) conducted by a professor at the University of Pennsylvania's Wharton School found that the artificial intelligence-driven chatbot GPT-3 was able to pass the final exam for the school's Master of Business Administration (MBA) program. Professor Christian Terwiesch, who authored the research paper "Would Chat GPT3 Get a Wharton MBA? A Prediction Based on Its Performance in the Operations Management Course," said that the bot scored between a B- and B on the exam.

The bot's score, Terwiesch wrote, shows its "remarkable ability to automate some of the skills of highly compensated knowledge workers in general and specifically the knowledge workers in the jobs held by MBA graduates including analysts, managers, and consultants." The bot did an "amazing job at basic operations management and process analysis questions including those that are based on case studies," Terwiesch wrote in the paper, which was published on Jan. 17. He also said the bot's explanations were "excellent." The bot is also "remarkably good at modifying its answers in response to human hints," he concluded.

While Chat GPT3's results were impressive, Terwiesch noted that Chat GPT3 "at times makes surprising mistakes in relatively simple calculations at the level of 6th grade Math." The present version of Chat GPT is "not capable of handling more advanced process analysis questions, even when they are based on fairly standard templates," Terwiesch added. "This includes process flows with multiple products and problems with stochastic effects such as demand variability." Still, Terwiesch said ChatGPT3's performance on the test has "important implications for business school education, including the need for exam policies, curriculum design focusing on collaboration between human and AI, opportunities to simulate real world decision making processes, the need to teach creative problem solving, improved teaching productivity, and more."
The latest findings come as educators become increasingly concerned that AI chatbots like ChatGPT could inspire cheating. Earlier this month, New York City's education department banned access to ChatGPT. While the education department cited "safety and accuracy" as reasons for the decision, the Washington Post notes that some teachers are "in a near-panic" about the technology enabling students to cheat on assignments.

Yesterday, for example, The Stanford Daily reported that a large number of Stanford students have already used ChatGPT on their final exams. It's prompting anti-plagiarism software Turnitin to build a tool to detect text generated by AI.
Microsoft

Bill Gates Discusses AI, Climate Change, and His Time at Microsoft (gatesnotes.com) 112

Bill Gates took his 11th turn answering questions in Reddit's "Ask Me Anything" forum this week — and occasionally looked back on his time at Microsoft: Is technology only functional for you nowadays, or is there still a hobby aspect to it? Do you for instance still do nerdy or geeky things in your spare time; e.g. write code?

Yes. I like to play around and code. The last time my code shipped in a Microsoft product was 1985 — so a long time ago. I can no longer threaten when I think a schedule is too long that "I will come in and code it over the weekend."


Mr Gates, with the benefit of hindsight regarding your years of involvement with Microsoft, what is the single biggest thing you wish you had done differently?

I was CEO until 2000. I certainly know a lot now that I didn't back then. Two areas I would change would be our work in phone operating systems (Android won) and trying to settle the antitrust lawsuit sooner.

Gates posted all of his responses on his personal web site Gates Notes — and there was also some discussion about AI's coming role in our future. Asked for his opinion about generative AI and how it will impact the world, Gates said: "I am quite impressed with the rate of improvement in these AIs. I think they will have a huge impact. Thinking of it in the Gates Foundation context, we want to have tutors that help kids learn math and stay interested. We want medical help for people in Africa who can't access a doctor. I still work with Microsoft some, so I am following this very closely."

Do you think that using technology to push teachers and doctors out of jobs will have a positive impact on our world? What about, instead, we use AI to give equitable access to education and training for more human teachers and doctors, without the $500,000 price tag. Do you think that might have a more positive impact on, ya know, humans?

I think we need more teachers and doctors, not less. In the Foundation's work, the shortage of doctors means that most people never see a doctor and they suffer because of that. We want class sizes to be smaller. Digital tools can help although their impact so far has been modest.


[W]hat are your views on OpenAI's ChatGPT?

It gives a glimpse of what is to come. I am impressed with this whole approach and the rate of innovation....


Many years ago, I think around 2000, I heard you say something on TV like, "people are vastly overestimating what the internet will be like in 5 years, and vastly underestimating what it will be like in 10 years." Is any mammoth technology shift at a similar stage right now? Any tech shift — not necessarily the Internet

AI is the big one. I don't think Web3 was that big or that metaverse stuff alone was revolutionary, but AI is quite revolutionary....


What are you excited about in the year ahead?

First being a grandfather. Second being a good friend and father. Third progress in health and climate innovation. Fourth helping to shape the AI advances in a positive way.

Gates also offered an update on TerraPower's molten-salt thorium reactors, shared his thoughts on veganism, and made predictions about climate change. "I still believe we can avoid a terrible outcome. The pace of innovation is really picking up even though we won't make the current timelines or avoid going over 1.5.... The key on climate is making the clean products as cheap as the dirty products in every area of emission — planes, concrete, meat etc."

Gates also revealed what kind of smartphone he uses (a foldable Samsung Fold 4), what he thought of the latest Avatar ("good"), and that his favorite bands include U2. "I loved Bono's recent book and he is a good friend."

And he said he believes that the very rich "should pay a lot more in taxes." But in addition, Gates said, "they should give away their wealth over time. It has been very fulfilling for me and is my full-time job."
Math

UK PM Rishi Sunak To Propose Compulsory Math To Students Up To 18 (cnbc.com) 110

U.K. Prime Minister Rishi Sunak will on Wednesday announce plans to force school pupils in England to study math up to the age of 18, according to a Downing Street briefing. The initiative attempts to tackle innumeracy and better equip young people for the workplace. CNBC reports: In his first speech of 2023, Sunak is expected to outline plans for math to be offered through alternative qualification routes. Comparatively, traditional A-Levels subject-based qualifications allow high school students in England to elect academic subjects to study between the ages of 16 and 18. [...] Sunak's education proposals would only affect pupils in England. Education is a devolved issue, with Welsh, Scottish and Northern Irish authorities managing their own systems.

School-based education in England is only compulsory up to the age of 16, after which children can choose to pursue further academic qualifications such as A-Levels or alternative qualifications, or vocational training. The prime minister is expected to say in his Wednesday speech that the issue of mandatory math is "personal" for him. "Every opportunity I've had in life began with the education I was so fortunate to receive. And it's the single most important reason why I came into politics: to give every child the highest possible standard of education," he will say.

Sunak attended prestigious fee-paying institutions -- the Stroud School and Winchester College -- before studying at Oxford University. He is expected to acknowledge that the planned overhaul will be challenging and time consuming, with work beginning during the current parliamentary term and finishing in the next.

Programming

MIT's Newest fMRI Study: 'This is Your Brain on Code' (mit.edu) 9

Remember when MIT researchers did fMRI brain scans measuring the blood flow through brains to determine which parts were engaged when programmers evaluated code? MIT now says that a new paper (by many of the same authors) delves even deeper: Whereas the previous study looked at 20 to 30 people to determine which brain systems, on average, are relied upon to comprehend code, the new research looks at the brain activity of individual programmers as they process specific elements of a computer program. Suppose, for instance, that there's a one-line piece of code that involves word manipulation and a separate piece of code that entails a mathematical operation. "Can I go from the activity we see in the brains, the actual brain signals, to try to reverse-engineer and figure out what, specifically, the programmer was looking at?" asks Shashank Srikant, a PhD student in MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). "This would reveal what information pertaining to programs is uniquely encoded in our brains." To neuroscientists, he notes, a physical property is considered "encoded" if they can infer that property by looking at someone's brain signals.

Take, for instance, a loop — an instruction within a program to repeat a specific operation until the desired result is achieved — or a branch, a different type of programming instruction that can cause the computer to switch from one operation to another. Based on the patterns of brain activity that were observed, the group could tell whether someone was evaluating a piece of code involving a loop or a branch. The researchers could also tell whether the code related to words or mathematical symbols, and whether someone was reading actual code or merely a written description of that code...
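For readers who don't program, the two constructs the study distinguishes look like this. This is a generic Python illustration, not the actual stimuli shown to the study's participants:

```python
def sum_to(n: int) -> int:
    """A loop: repeat an operation (here, accumulating a total) until a condition is met."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def classify(x: int) -> str:
    """A branch: switch between operations depending on a condition."""
    if x % 2 == 0:
        return "even"
    else:
        return "odd"
```

Per the study, evaluating code dominated by one construct or the other produces distinguishably different patterns of brain activity.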

The team carried out a second set of experiments, which incorporated machine learning models called neural networks that were specifically trained on computer programs. These models have been successful, in recent years, in helping programmers complete pieces of code. What the group wanted to find out was whether the brain signals seen in their study when participants were examining pieces of code resembled the patterns of activation observed when neural networks analyzed the same piece of code. And the answer they arrived at was a qualified yes. "If you put a piece of code into the neural network, it produces a list of numbers that tells you, in some way, what the program is all about," Srikant says. Brain scans of people studying computer programs similarly produce a list of numbers. When a program is dominated by branching, for example, "you see a distinct pattern of brain activity," he adds, "and you see a similar pattern when the machine learning model tries to understand that same snippet."

But where will it all lead? They don't yet know what these recently-gleaned insights can tell us about how people carry out more elaborate plans in the real world.... Creating models of code composition, says O'Reilly, a principal research scientist at CSAIL, "is beyond our grasp at the moment." Lipkin, a BCS PhD student, considers this the next logical step — figuring out how to "combine simple operations to build complex programs and use those strategies to effectively address general reasoning tasks." He further believes that some of the progress toward that goal achieved by the team so far owes to its interdisciplinary makeup. "We were able to draw from individual experiences with program analysis and neural signal processing, as well as combined work on machine learning and natural language processing," Lipkin says. "These types of collaborations are becoming increasingly common as neuro- and computer scientists join forces on the quest towards understanding and building general intelligence."
Education

MPs and Peers Do Worse Than 10-Year-Olds in Maths and English Sats 108

MPs and peers tasked with completing a year 6 Sats exam have scored lower results on average than the country's 10-year-olds. From a report: MPs including Commons education select committee chair Robin Walker took part in the exams, invigilated by 11-year-olds, at a Westminster event organised by More Than A Score, who campaign for the tests to be scrapped. Only 44% of the cross-party group of parliamentarians dubbed the Westminster Class of 2022 achieved the expected standard in maths and just 50% had achieved the expected standard in spelling, punctuation and grammar.

Across the country, 59% of pupils aged 10 and 11 reached the expected standard in the Sats tests of maths, reading and writing this year, down from 65% in 2019, the previous time the tests were taken. Detailed figures published by the Department for Education in the summer revealed disadvantaged children had a steeper fall than their better-off peers. Walker took part in the Big SATS Sit-In Westminster alongside his Conservative colleagues Flick Drummond and Gagan Mohindra; Labour MPs Ian Byrne and Emma Lewell-Buck with the Green party's Lady Bennett to experience the high-stakes nature of the exams. More Than A Score hope the politicians will take the high-pressured experience away with them and realise that "the exams only judge schools but do not help children's learning" at that age.
Math

Computer Program For Particle Physics At Risk of Obsolescence (quantamagazine.org) 105

"Maintenance of the software that's used for the hardest physics calculations rests almost entirely with a retiree," reports Quanta magazine, saying the situation "reveals the problematic incentive structure of academia." Particle physicists use some of the longest equations in all of science. To look for signs of new elementary particles in collisions at the Large Hadron Collider, for example, they draw thousands of pictures called Feynman diagrams that depict possible collision outcomes, each one encoding a complicated formula that can be millions of terms long. Summing formulas like these with pen and paper is impossible; even adding them with computers is a challenge. The algebra rules we learn in school are fast enough for homework, but for particle physics they are woefully inefficient.

Programs called computer algebra systems strive to handle these tasks. And if you want to solve the biggest equations in the world, for 33 years one program has stood out: FORM. Developed by the Dutch particle physicist Jos Vermaseren, FORM is a key part of the infrastructure of particle physics, necessary for the hardest calculations. However, as with surprisingly many essential pieces of digital infrastructure, FORM's maintenance rests largely on one person: Vermaseren himself. And at 73, Vermaseren has begun to step back from FORM development. Due to the incentive structure of academia, which prizes published papers, not software tools, no successor has emerged. If the situation does not change, particle physics may be forced to slow down dramatically...
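FORM has its own specialized language, so as a rough illustration only, here is a toy sketch in Python of the basic job a computer algebra system performs: storing an expression as a collection of terms and merging like terms as expressions are multiplied out. (Real systems like FORM do this bookkeeping for expressions millions of terms long, with far more sophisticated algorithms.)

```python
from collections import defaultdict
from itertools import product

# A toy "computer algebra" kernel: a polynomial in x and y is stored as
# {(x_power, y_power): coefficient}.

def poly_mul(p: dict, q: dict) -> dict:
    """Multiply two polynomials term by term, merging like terms as we go."""
    out = defaultdict(int)
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        out[(e1[0] + e2[0], e1[1] + e2[1])] += c1 * c2
    return dict(out)

def poly_pow(p: dict, n: int) -> dict:
    """Raise a polynomial to an integer power by repeated multiplication."""
    result = {(0, 0): 1}  # the constant polynomial 1
    for _ in range(n):
        result = poly_mul(result, p)
    return result

# Expand (x + y)^3: expect x^3 + 3x^2y + 3xy^2 + y^3.
x_plus_y = {(1, 0): 1, (0, 1): 1}
cube = poly_pow(x_plus_y, 3)
```

Scaled up from four terms to millions, with efficient sorting and term storage, this is the kind of symbolic algebra FORM was built to make tractable.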

Without ongoing development, FORM will get less and less usable — only able to interact with older computer code, and not aligned with how today's students learn to program. Experienced users will stick with it, but younger researchers will adopt alternative computer algebra programs like Mathematica that are more user-friendly but orders of magnitude slower. In practice, many of these physicists will decide that certain problems are off-limits — too difficult to handle. So particle physics will stall, with only a few people able to work on the hardest calculations.

In April, Vermaseren is holding a summit of FORM users to plan for the future. They will discuss how to keep FORM alive: how to maintain and extend it, and how to show a new generation of students just how much it can do. With luck, hard work and funding, they may preserve one of the most powerful tools in physics.

Thanks to long-time Slashdot reader g01d4 for submitting the story.
Education

Amazon To Shut Down Its Online Learning Platform in India (techcrunch.com) 9

Amazon will be shutting down Amazon Academy, an online learning platform it launched in India for high-school students last year. From a report: The retailer says it will wind down the edtech service in the country in a phased manner starting August 2023. Those who signed up for the current academic batch will receive a full refund, it said. Amazon officially launched Academy, previously called JEE Ready, early last year, but had been testing the platform since mid-2019. Academy sought to help students prepare for entry into the nation's prestigious engineering colleges. The service offered curated learning material, live lectures, mock tests and comprehensive assessments to help students learn and practice math, physics and chemistry and prepare for the Joint Entrance Examinations (JEE), a government-backed engineering entrance assessment conducted in India for admission to various engineering colleges.
Facebook

Meta's Latest Large Language Model Survived Only Three Days Online (technologyreview.com) 57

On November 15 Meta unveiled a new large language model called Galactica, designed to assist scientists. But instead of landing with the big bang Meta hoped for, Galactica has died with a whimper after three days of intense criticism. Yesterday the company took down the public demo that it had encouraged everyone to try out. From a report: Meta's misstep -- and its hubris -- show once again that Big Tech has a blind spot about the severe limitations of large language models. There is a large body of research that highlights the flaws of this technology, including its tendencies to reproduce prejudice and assert falsehoods as facts.

Galactica is a large language model for science, trained on 48 million examples of scientific articles, websites, textbooks, lecture notes, and encyclopedias. Meta promoted its model as a shortcut for researchers and students. In the company's words, Galactica "can summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more." But the shiny veneer wore through fast. Like all language models, Galactica is a mindless bot that cannot tell fact from fiction. Within hours, scientists were sharing its biased and incorrect results on social media.

Programming

Should Functional Programming Be the Future of Software Development? (ieee.org) 186

The CTO of a software company argues the software industry's current trajectory "is toward increasing complexity, longer product-development times, and greater fragility of production systems" — not to mention nightmarish problems maintaining code.

"To address such issues, companies usually just throw more people at the problem: more developers, more testers, and more technicians who intervene when systems fail. Surely there must be a better way," they write in IEEE Spectrum. "I'm part of a growing group of developers who think the answer could be functional programming...." Today, we have a slew of dangerous practices that compromise the robustness and maintainability of software. Nearly all modern programming languages have some form of null references, shared global state, and functions with side effects — things that are far worse than the GOTO ever was. How can those flaws be eliminated? It turns out that the answer has been around for decades: purely functional programming languages....

Indeed, software based on pure functions is particularly well suited to modern multicore CPUs. That's because pure functions operate only on their input parameters, making it impossible to have any interactions between different functions. This allows the compiler to produce optimized code that runs on multiple cores efficiently and easily....
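The article names no particular language, but the claim about pure functions and multicore CPUs can be sketched in a few lines of Rust (the function `square` is a hypothetical example, not from the article):

```rust
use std::thread;

// A pure function: its result depends only on its inputs, so
// concurrent calls can never interfere with one another.
fn square(x: i64) -> i64 {
    x * x
}

fn main() {
    let inputs = vec![1i64, 2, 3, 4];

    // Because `square` touches no shared state, each call can safely
    // run on its own thread with no locks or coordination.
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|x| thread::spawn(move || square(x)))
        .collect();

    let results: Vec<i64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    println!("{:?}", results);
}
```

Since no call can observe or modify another's state, the compiler and runtime are free to schedule the calls in any order or in parallel without changing the result.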

Functional programming also has a solution to Hoare's "billion-dollar mistake," null references. It addresses that problem by disallowing nulls. Instead, there is a construct usually called Maybe (or Option in some languages). A Maybe can be Nothing or Just some value. Working with Maybes forces developers to always consider both cases. They have no choice in the matter. They must handle the Nothing case every single time they encounter a Maybe. Doing so eliminates the many bugs that null references can spawn.
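Rust's `Option` type plays exactly the role of Maybe described above. A minimal sketch (the `lookup_age` and `describe` functions are made-up examples):

```rust
// A value that may be absent: Option replaces the null reference.
fn lookup_age(name: &str) -> Option<u32> {
    match name {
        "alice" => Some(34),
        _ => None, // absence is explicit in the type, not a hidden null
    }
}

// The compiler rejects any code that uses the Option without
// addressing the None case, so "forgot to check for null" cannot compile.
fn describe(name: &str) -> String {
    match lookup_age(name) {
        Some(age) => format!("{name} is {age}"),
        None => format!("{name} is unknown"), // handling None is mandatory
    }
}

fn main() {
    println!("{}", describe("alice"));
    println!("{}", describe("bob"));
}
```

Deleting the `None` arm of the second `match` is a compile error, which is precisely the "no choice in the matter" guarantee the article describes.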

Functional programming also requires that data be immutable, meaning that once you set a variable to some value, it is forever that value. Variables are more like variables in math...
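Immutability by default can also be shown in a short Rust sketch (the `discount` function is a hypothetical example):

```rust
// "Changing" data means deriving a new value from the old one; the
// input slice is untouched, so no other code can observe a change.
fn discount(prices: &[i32]) -> Vec<i32> {
    prices.iter().map(|p| p - 1).collect()
}

fn main() {
    // Bindings are immutable by default; `total` is forever 10,
    // like a variable in a mathematical equation.
    let total = 10;
    // total += 1; // rejected by the compiler: cannot assign twice

    let prices = vec![3, 5, 7];
    let cheaper = discount(&prices);
    println!("{total} {:?} {:?}", prices, cheaper);
}
```

The original `prices` vector survives unchanged after the call, which is what makes immutable data safe to share freely between functions and threads.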

Pure functional programming solves many of our industry's biggest problems by removing dangerous features from the language, making it harder for developers to shoot themselves in the foot.... I anticipate that the adoption of pure functional languages will improve the quality and robustness of the whole software industry while greatly reducing time wasted on bugs that are simply impossible to generate with functional programming. It's not magic, but sometimes it feels like that, and I'm reminded of how good I have it every time I'm forced to work with a non-functional codebase.

Math

Why Mathematicians Study Knots (quantamagazine.org) 19

Far from being an abstract mathematical curiosity, knot theory has driven many findings in math and beyond. Quanta magazine: Knot theory began as an attempt to understand the fundamental makeup of the universe. In 1867, when scientists were eagerly trying to figure out what could possibly account for all the different kinds of matter, the Scottish mathematician and physicist Peter Guthrie Tait showed his friend and compatriot Sir William Thomson his device for generating smoke rings. Thomson -- later to become Lord Kelvin (namesake of the temperature scale) -- was captivated by the rings' beguiling shapes, their stability and their interactions. His inspiration led him in a surprising direction: Perhaps, he thought, just as the smoke rings were vortices in the air, atoms were knotted vortex rings in the luminiferous ether, an invisible medium through which, physicists believed, light propagated.

Although this Victorian-era idea may now sound ridiculous, it was not a frivolous investigation. This vortex theory had a lot to recommend it: The sheer diversity of knots, each slightly different, seemed to mirror the different properties of the many chemical elements. The stability of vortex rings might also provide the permanence that atoms required. Vortex theory gained traction in the scientific community and inspired Tait to begin tabulating all knots, creating what he hoped would be equivalent to a table of elements. Of course, atoms are not knots, and there is no ether. By the late 1880s Thomson was gradually abandoning his vortex theory, but by then Tait was captivated by the mathematical elegance of his knots, and he continued his tabulation project. In the process, he established the mathematical field of knot theory.

We are all familiar with knots -- they keep shoes on our feet, boats secured to docks, and mountain climbers off the rocks below. But those knots are not exactly what mathematicians (including Tait) would call a knot. Although a tangled extension cord may appear knotted, it's always possible to disentangle it. To get a mathematical knot, you must plug together the free ends of the cord to form a closed loop. Because the strands of a knot are flexible like string, mathematicians view knot theory as a subfield of topology, the study of malleable shapes. Sometimes it is possible to untangle a knot so it becomes a simple circle, which we call the "unknot." But more often, untangling a knot is impossible.

Math

Math Scores Fell In Nearly Every State, Reading Dipped On National Exam (nytimes.com) 196

U.S. students in most states and across almost all demographic groups have experienced troubling setbacks in both math and reading, according to an authoritative national exam released on Monday, offering the most definitive indictment yet of the pandemic's impact on millions of schoolchildren. The New York Times reports: In math, the results were especially devastating, representing the steepest declines ever recorded on the National Assessment of Educational Progress, known as the nation's report card, which tests a broad sampling of fourth and eighth graders and dates to the early 1990s. In the test's first results since the pandemic began, math scores for eighth graders fell in nearly every state. A meager 26 percent of eighth graders were proficient, down from 34 percent in 2019. Fourth graders fared only slightly better, with declines in 41 states. Just 36 percent of fourth graders were proficient in math, down from 41 percent.

Reading scores also declined in more than half the states, continuing a downward trend that had begun even before the pandemic. No state showed sizable improvement in reading. And only about one in three students met proficiency standards, a designation that means students have demonstrated competency and are on track for future success. And for the country's most vulnerable students, the pandemic has left them even further behind. The drops in their test scores were often more pronounced, and their climbs to proficiency are now that much more daunting.

Classic Games (Games)

How a Mathematician-Magician Revealed a Casino Loophole (bbc.com) 102

It's the tale of a company manufacturing precision card-shuffling machines for casinos — and a gang of hustlers who used a hidden video camera to film the shuffler's insides. "The images, transmitted to an accomplice outside in the casino parking lot, were played back in slow motion to figure out the sequence of cards in the deck," remembers the BBC, "which was then communicated back to the gamblers inside. The casino lost millions of dollars before the gang were finally caught."

So the company turned for help to a mathematician/magician: The executives were determined not to be hacked again. They had developed a prototype of a sophisticated new shuffling machine, this time enclosed in an opaque box. Their engineers assured them that the machine would sufficiently randomise a deck of cards with one pass through the device, reducing the time between hands while also beating card-counters and crooked dealers. But they needed to be sure that their machine properly shuffled the deck. They needed Persi Diaconis.

Diaconis, a magician-turned-mathematician at Stanford University, is regarded as the world's foremost expert on the mathematics of card shuffling. Throughout the surprisingly large scholarly literature on the topic, his name keeps popping up like the ace of spades in a magician's sleight-of-hand trick. So, when the company executives contacted him and offered to let him see the inner workings of their machine — a literal "black box" — he couldn't believe his luck. With his collaborator Susan Holmes, a statistician at Stanford, Diaconis travelled to the company's Las Vegas showroom to examine a prototype of their new machine.

The pair soon discovered a flaw. Although the mechanical shuffling action appeared random, the mathematicians noticed that the resulting deck still had rising and falling sequences, which meant that they could make predictions about the card order. To prove this to the company executives, Diaconis and Holmes devised a simple technique for guessing which card would be turned over next. If the first card flipped was the five of hearts, say, they guessed that the next card was the six of hearts, on the assumption that the sequence was rising. If the next card was actually lower — a four of hearts, for instance — this meant they were in a falling sequence, and their next guess was the three of hearts. With this simple strategy, the mathematicians were able to correctly guess nine or 10 cards per deck — one-fifth of the total — enough to double or triple the advantage of a competent card-counter....
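The guessing rule described above is simple enough to sketch in Rust. This is a paraphrase of the strategy as the article describes it, not the mathematicians' actual code, and cards are abstracted to rank numbers (the `next_guess` function is a hypothetical name):

```rust
// Guess the next card from the cards seen so far, per the
// rising/falling-sequence rule: assume the current run is rising
// until a lower card shows we are in a falling run, then guess
// one step further in that direction.
fn next_guess(seen: &[i32]) -> i32 {
    let last = *seen.last().expect("need at least one revealed card");
    let rising = match seen.len() {
        1 => true, // first guess: assume a rising sequence
        _ => seen[seen.len() - 1] > seen[seen.len() - 2],
    };
    if rising { last + 1 } else { last - 1 }
}

fn main() {
    // The article's example: a five is flipped, then a four.
    assert_eq!(next_guess(&[5]), 6);    // assume rising: guess the six
    assert_eq!(next_guess(&[5, 4]), 3); // falling run: guess the three
    println!("guesses match the article's example");
}
```

Against a truly random deck this rule would do no better than chance; it only pays off because the flawed shuffler left long rising and falling runs intact.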

The executives were horrified. "We are not pleased with your conclusions," they wrote to Diaconis, "but we believe them and that's what we hired you for." The company quietly shelved the prototype and switched to a different machine.

The article also explains why seven shuffles "is just as close to random as can be" — rendering further shuffling largely ineffective.
Communications

US Opts To Not Rebuild Renowned Puerto Rico Telescope (apnews.com) 130

The National Science Foundation announced Thursday that it will not rebuild a renowned radio telescope in Puerto Rico, which was one of the world's largest until it collapsed nearly two years ago. The Associated Press reports: Instead, the agency issued a solicitation for the creation of a $5 million education center at the site that would promote programs and partnerships related to science, technology, engineering and math. It also seeks the implementation of a research and workforce development program, with the center slated to open next year in the northern mountain town of Arecibo where the telescope was once located. The solicitation does not include operational support for current infrastructure at the site that is still in use, including a 12-meter radio telescope or the Lidar facility, which is used to study the upper atmosphere and ionosphere to analyze cloud cover and precipitation data.

The decision was mourned by scientists around the world who used the telescope at the Arecibo Observatory for years to search for asteroids, planets and extraterrestrial life. The 1,000-foot-wide (305-meter-wide) dish also was featured in the Jodie Foster film "Contact" and the James Bond movie "GoldenEye." The reflector dish and the 900-ton platform hanging 450 feet above it previously allowed scientists to track asteroids headed to Earth, conduct research that led to a Nobel Prize and determine if a planet is potentially habitable.

The Arecibo Observatory collapsed in on itself in December 2020, after the telescope suffered two major cable malfunctions in the two months prior. The National Science Foundation released shocking footage of the moment when support cables snapped, causing the massive 900-ton structure suspended above Arecibo to fall onto the observatory's iconic 1,000-foot-wide dish.
