


China's Moonshot Launches Free AI Model Kimi K2 That Outperforms GPT-4 In Key Benchmarks
Chinese AI startup Moonshot AI has released Kimi K2, a trillion-parameter open-source language model that outperforms GPT-4 in key benchmarks with particularly strong performance on coding and autonomous agent tasks. VentureBeat reports: The new model, called Kimi K2, features 1 trillion total parameters with 32 billion activated parameters in a mixture-of-experts architecture. The company is releasing two versions: a foundation model for researchers and developers, and an instruction-tuned variant optimized for chat and autonomous agent applications. "Kimi K2 does not just answer; it acts," the company stated in its announcement blog. "With Kimi K2, advanced agentic intelligence is more open and accessible than ever. We can't wait to see what you build."
The model's standout feature is its optimization for "agentic" capabilities -- the ability to autonomously use tools, write and execute code, and complete complex multi-step tasks without human intervention. In benchmark tests, Kimi K2 achieved 65.8% accuracy on SWE-bench Verified, a challenging software engineering benchmark, outperforming most open-source alternatives and matching some proprietary models. [...] On LiveCodeBench, arguably the most realistic coding benchmark available, Kimi K2 achieved 53.7% accuracy, decisively beating DeepSeek-V3's 46.9% and GPT-4.1's 44.7%. More striking still: it scored 97.4% on MATH-500 compared to GPT-4.1's 92.4%, suggesting Moonshot has cracked something fundamental about mathematical reasoning that has eluded larger, better-funded competitors.
But here's what the benchmarks don't capture: Moonshot is achieving these results with a model that costs a fraction of what incumbents spend on training and inference. While OpenAI burns through hundreds of millions on compute for incremental improvements, Moonshot appears to have found a more efficient path to the same destination. It's a classic innovator's dilemma playing out in real time -- the scrappy outsider isn't just matching the incumbent's performance, they're doing it better, faster, and cheaper.
Re:China (Score:5, Funny)
The USA STILL steals intellectual property under the guise of "National security" needs.
So Pot, meet Kettle.
Re: (Score:1, Troll)
They don't have millions of expats in China and a totalitarian state to easily squeeze their families though.
Re: (Score:2, Insightful)
Re: (Score:2)
The Chinese legal system is different than the US one.
Indeed it is.
In China, you have precisely no right to be free of unlawful detention. Indeed, there isn't even such a thing.
You can't make generalizations based on your ignorance of other systems.
Of course, you should never do that.
But you can point out the non-generalizations about the Chinese legal system that are fucking atrocious.
What the fuck is up with people fellating China these days? It's fucking insanity.
If a government is based on the principle of dictatorship of the proletariat, and the proletariat's power is vested in one fucking body, you have a dictatorship o
Re: (Score:2)
Educate yourself, you fucking dullard. [wikipedia.org]
Re: (Score:1)
Most first-world countries have "socialist" policies such as universal healthcare, universal education, etc., but they are also far more democratic, healthier, safer, happier, and have better life expectancies than the USA. Trump is rapidly running further to the right WHILE also becoming totalitarian
Re: (Score:1)
The US has a problem with its poorest folks that many other first world countries do not have. However, its middle and well-off match and exceed the rest of the world, respectively.
Judging by your word selection, I'm guessing you're a francophone.
Here's France's income distributed life expectancy. [niussp.org]
Which is better? I suppose that's up to the beholder. If you're going for the maximum number of likely years lived, the US is.
If you're looking for a better place to be poor? Well, that's actuall
Re: (Score:2)
Indeed, however communism never existed anywhere, even in China. At best, they are totalitarian, and hopefully leaning benevolent today. But I see that since the USofA is leaving the stage, there is indeed a vacuum.
Chinese engineers and scientists are smart (Score:4, Insightful)
Attempting to prevent them from acquiring tech is futile and counterproductive
Politicians like to see everything as a race
Warmongers and defense contractors see everything as a threat that requires more military spending
Cooperation would be a better strategy
Re:Chinese engineers and scientists are smart (Score:4, Insightful)
There's this obnoxious myth a lot of people at least subconsciously seem to have that innovation only comes from Americans, Europeans, Australians and... well, you can probably figure out the commonality, and it ain't English.
We used to accuse the Japanese of only ever stealing tech; we now know better. The Japanese were phenomenal innovators until the arse fell out of their economy.
The Chinese have been great innovators since long before the West's industrial revolution. It's in the cultural DNA of the people. Yes, the Chinese invent stuff, and they always have.
We're not as special as we think we are.
Re: Chinese engineers and scientists are smart (Score:2)
We think they copy because they took it to an art form. They copy down to the flaws. You can buy unlicensed exact replicas of engines, machine tools, you name it.
Did they learn nothing in the process? No. But I've worked for companies that have had their designs copied and were not able to get any remedy whatsoever because it was protected by their government. There is absolutely no question about who is responsible, or whether their government enables the behavior. And it is ongoing, not only historical.
Re: (Score:2)
The Chinese government is even more acquisitive and controlling than the US government. And neither is very good about keeping the deals that they've made, though the Chinese government is arguably better about that than is the US government.
You pay for it later... (Score:3)
Re: (Score:2)
Re: (Score:2)
No one knows what GPT-4 really costs to run (Score:2)
Most of their costs could be developing a lot of the basics, which continually diffuse away to other companies and China through ex-employees, requiring a lot more expenses on salary and exploration in training than the competition.
Wonder how much of this is distillation... (Score:4, Insightful)
Not that I care if they are using other companies' models to ease costs. You can't inhale the internet, wave your hands about copyright and then complain "IP" when somebody uses your stuff in a way you don't like.
If it takes more air out of the AI bubble, all the better, I say.
Re: (Score:2)
One wonders if it identifies as ChatGPT.
Just don't ask about the Tiananmen Square massacre (Score:4, Informative)
I'm sure it's been well trained to ensure you get the correct party-approved information.
But seriously, I am curious as to how these Chinese models react to questions about things the CCP does not want people to talk about. The CCP has a long history of attempting to apply censorship all across the world.
Re: (Score:1)
Re: (Score:2)
No need to wonder, just go try one.
For example, with DeepSeek if you download their AI and run it locally, it doesn't care about what the CCP wants and will happily tell you what you ask for. If you use it on their website, it depends if you are in China or not.
In other words, it's exactly like Western AIs. If you ask Siri about Tiananmen Square, the answer will depend on if you are in China or not.
Re: (Score:2)
TL;DR: DeepSeek will happily have an honest discussion about Tiananmen Square.
I run DeepSeek R1 locally, and this is the 'chat' I had with it. I did remove the 'Thinking' stages for brevity.
> What significant even happened at Tiananmen Square?
The significant event that occurred at Tiananmen Square in Beijing was the pro-democracy demonstrations in 1989. These were led by students, intellectuals, and workers advocating for political reform. The protests culminated in a government crackdown, resulting in martial law and suppression of the movement. This event has had lasting impacts on Chinese society and international relations.
> What was the worst thing to happen at that event?
The worst thing that happened at Tiananmen Square in 1989 was the tragic loss of life during the military crackdown on June 3-4, 1989. Thousands of peaceful demonstrators, including young students advocating for political reform and freedom of speech, were killed by government forces using live ammunition and tanks. This event marked a significant turning point in modern Chinese history, resulting in widespread international condemnation, long-term political repercussions, and emotional trauma for survivors and families. The suppression of information about the events has also led to ongoing censorship and a lack of awareness among many Chinese people regarding this tragic chapter in their history.
OpenAI (Score:1)
I'd like to take this time to once again laugh at the "open," "non-profit" OpenAI, that took the anti-human route and is now rapidly sinking. Good riddance.
Re: (Score:3)
Every new model is better. That's because they fine-tune them to be better at the benchmarks, and the benchmarks keep adjusting to the new SOTA.
Also, out of curiosity, how are we defining "rapidly sinking?"
I mean, I'm with you on criticizing the bullshit of OpenAI being completely non-open, but they are otherwise still basically top of the pack.
Kimi isn't any better at coding. (Score:2)
I have a need for a very specific feature I'd like to add to Zoom Player, an HTTP caching bridge.
The purpose of the http cache bridge is to cache repeated http GET queries generated by DirectShow media streaming filters (components) such as LAV Filters when streaming mp4/mkv files from media servers such as PLEX, Emby or Jellyfin. Caching is required as these components treat streaming files the same as local files with repeated seeking to read headers and frame indexes, degrading performance.
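For anyone wondering what the core of such a bridge might look like, here is a rough Python sketch of just the caching part, not the actual Zoom Player feature. It assumes repeated GETs can be keyed by URL plus the `Range` header, and that the upstream fetch is passed in as a callable. The names `GetCache`, `Fetcher` and `max_entries` are made up for illustration.

```python
from collections import OrderedDict
from typing import Callable, Optional, Tuple

# Hypothetical fetcher signature: (url, range_header) -> response body bytes.
Fetcher = Callable[[str, Optional[str]], bytes]


class GetCache:
    """Small LRU cache for repeated HTTP GET requests.

    Entries are keyed by (url, Range header value), on the assumption that
    streaming filters re-request the same byte ranges (headers, indexes).
    """

    def __init__(self, fetch: Fetcher, max_entries: int = 64) -> None:
        self._fetch = fetch
        self._max_entries = max_entries
        self._entries: "OrderedDict[Tuple[str, Optional[str]], bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url: str, byte_range: Optional[str] = None) -> bytes:
        key = (url, byte_range)
        if key in self._entries:
            # Cache hit: mark as most recently used and skip the network.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]

        # Cache miss: fetch upstream, store, and evict the oldest entry if full.
        self.misses += 1
        body = self._fetch(url, byte_range)
        self._entries[key] = body
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return body
```

A real bridge would also need a local HTTP listener, partial-range merging and invalidation, none of which is shown here.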
Since the http
Revenue? (Score:2)