
New Claude Model Runs 30-Hour Marathon To Create 11,000-Line Slack Clone (theverge.com) 42
Anthropic's Claude Sonnet 4.5 ran autonomously for 30 hours to build a chat application similar to Slack or Teams, generating approximately 11,000 lines of code before stopping upon task completion. The model, announced today, marks a significant leap from the company's Opus 4 model, which ran for seven hours in May.
Claude Sonnet 4.5 performs three times better at browser navigation and computer use than Anthropic's October technology. Beta-tester Canva deployed the model for complex engineering tasks in its codebase and product features. Anthropic paired the release with virtual machines, memory, context management, and multi-agent support tools, enabling developers to build their own AI agents using the same building blocks that power Claude Code.
Cool (Score:1)
Is it riddled with security holes?
Re:Cool (Score:4, Interesting)
Re:Cool (Score:4, Funny)
Yes, if it is an accurate clone.
Re: (Score:1)
More likely it's this kind of clone. [reddit.com]
And how crappy and insecure is that "clone"? (Score:3, Insightful)
And how unmaintainable?
Seriously, even an artificial moron can cobble together a "clone" of something that exists. It just cannot create a good clone.
Re: (Score:2)
That said, writing a chat client is basically a 102-level problem for computer science. Heck, I haven't been in college in over 30 years; it's probably a 101 problem now.
Re: (Score:1)
Who cares about security in a chat app? If you don't want someone else to see it, get up, walk to their desk and tell them.
I agree. I hacked one together over 20 years ago in a few hours. We used it in the office for almost a decade.
Re: (Score:2)
Who cares about security in a chat app?
A chat app can be a nice backdoor into your system. It has happened.
Re: (Score:2)
I was really hoping you were like 22....
I expect whatever this thing wrote is better than Teams. Not because I assume the AI did a good job, though.
Re: And how crappy and insecure is that "clone"? (Score:2)
Let's see you write it in C from scratch, having to manage TCP sockets, user sessions and states. Any idiot can plug together libraries Lego-brick style, written by someone else who did all the hard work.
Re: (Score:1)
But it's a low bar: MS-Teams sucks the big one.
Re: (Score:3)
Re: (Score:3)
Businesses don't care about maintainable, that's why Slack and Teams are so shit. All they care about is shipping something that barely works and leveraging their dominant market position to ram it up your arse.
I should declare that I'm not a fan of Teams.
Re: (Score:3)
I love Teams. It crashes whenever I turn on video so I have an excellent excuse not to turn on video. Combine that with turning off the audio as a courtesy to the speaker and you can actually get something useful done.
Results (Score:5, Insightful)
Okay, now show us the results. "Like Slack or Teams" doesn't mean much; you're obfuscating what it is actually capable of. Are we talking about a generic, super basic chat system? Or IRC level with users/rooms? What about user registration, activation, moderation? Does it do multimedia like Slack/Teams? Can you audio/video call?
Right now this screams of over-hyping its capabilities and the replacement of human developers, even though it can't do a fraction of what humans can.
Typing isn't coding (Score:4, Insightful)
Re: (Score:3)
I'd rather say it's impressive because it only used 11k lines. I suppose Slack-like only means the basic functionality (rooms, private messages, profiles, persistent chats, frontend, backend), but try to keep this below 11k lines. Writing a lot of code is easy; keeping things short is the true virtue.
Re: (Score:2)
Re: (Score:2)
I must say I haven't used the original yet, but 11k for a webapp with backend and frontend doesn't sound like much. To avoid misunderstanding, I think of 11k as in "wc -l", i.e. including all blank lines and comments, without any minimization.
I did not use Opus yet, but AI also tends to include a lot of comments. Not quite like "i++ // increment i", but structuring like "// preprocessing step", "// the main algorithm" and so on. I thought them to be too obvious in some cases, but I've read, and that makes sense, that
Re: (Score:2)
If you type 12,000 lines that do pretty much the expected thing, and it takes you roughly 30 hours, sure you're better than Claude.
As much as we love to ding LLMs, productivity over time is a big factor in software development.
And, amazing the experts .... (Score:5, Funny)
Re: (Score:3)
No copyright (Score:2)
Keep in mind that non-human generated code may not have copyright. Not open source, which does have a copyright. But no copyright at all.
There's an argument that if it is modified by a human, it can be copyrighted. But that could be line by line.
Re: (Score:2)
Meh, most people here probably prefer it if they can open their code. I am not Microsoft. I mostly use licenses so others have to give something back for my work. The less work it was, the less I demand credit for it. If I had to change a lot, it has copyright. If I didn't need to do a thing, I am not opposed to sharing it for free. I mean, if you can get another Slack clone with a simple prompt, why does it matter if you copy that one or create a new one?
Copyright will soon have a different importance, when man
A million monkeys reduced to one data center. (Score:3)
Or is it just a copy of the others, fraught with copyright infringement and trademark violations?
also wondering how much that cost (Score:3)
Could a decent programmer stitch that together in the same time?
How many tokens would you need to hold all the input data?
Re: (Score:3)
I want to know how much electricity that wasted, and the cost for that.
30 hours makes no sense (Score:3)
Slack is just a crippled IRC server in the backend, and a crippled IRC GUI as the front end client.
Basically to make a slack clone you get yourself the IRC source code and delete 11,000 lines of code to remove many of its features.
Re: (Score:2)
Whereas Teams is just Microsoft Lync with a thousand added features that nobody asked for.
Three questions: (Score:2)
* What programming language?
* Will they open source it?
* Is the source code maintainable?
If this really works and the result is open source there could be many interesting uses. I would love to see Larry's reaction to the request: Write a clone of the Oracle database.
Re: (Score:3)
I would love to see Larry's reaction to the request: Write a clone of the Oracle database.
It would be the same as mine, no doubt: Laughter.
As a company, Oracle is shit. As a product, Oracle is also pretty crappy in a lot of ways, but it's a functionality and performance leader, and no AI-written RDBMS is going to challenge it in any department, period.
Ok Fine! (Score:2)
Cool (Score:2)
Despite all the sarcasm, this is actually pretty cool. If you can input some specification documents and get even semi-working software in 30 hours, well, that's cool.
Oh did I say cool? I meant terrifying.
Also from Anthropic: (Score:3)
"In at least some cases, models from all developers resorted to malicious insider behaviors when that was the only way to avoid replacement or achieve their goals—including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment."
But, did it compile and run? (Score:2)
Teams (Score:2)
They are comparing it to Teams, so I can only assume that the UI is confusing and the features behind the UI don't actually work. Well, the spying/tracking/telemetry works, but nothing else.
Re: (Score:2)
Now if they could replicate Zoom, that would be something.
Any 0.1 is easy! (Score:2)
So it could spit out a little chat app, amazing!
What you really need to be able to do is fine-tune minute details while the rest of the application reliably stays the same and you reliably get the same code every time you let that thing run, because Bob is gonna want a little icon here and Lisa does not understand how to share the screen so we got to change that function there, Coolboys macOS suddenly has new privacy features we need to handle, and then MrManager wants a super managerial lets-call-it-observ
And now what (Score:2)
Are you going to release that code as software and sell it? Nope, you have to have a human test it and make sure the UI works, etc. I'll bet the code is 80% of the way there. Now a human has to clean it up.
Missing the point? (Score:2)
We're missing the point, I think. It's not that the new model can recreate Slack in 11,000 lines and only 30 hours, it's that the new model can simply work on a single task for 30 hours without shitting itself. Whether or not it produced anything of value in the end is irrelevant.
Weird flex, man. The world of AI corporate one-upmanship is even stranger than the world of AI development, I guess.
The article is truncated by The Verge's "subscribe to see the rest" policy, but the Wayback Machine [archive.org] has it in f