Cassandra Rewritten In C++, Ten Times Faster
urdak writes: At Cassandra Summit opening today, Avi Kivity and Dor Laor (who had previously written KVM and OSv) announced ScyllaDB — an open-source C++ rewrite of Cassandra, the popular NoSQL database. ScyllaDB claims to achieve a whopping 10 times more throughput per node than the original Java code, with sub-millisecond 99%ile latency. They even measured 1 million transactions per second on a single node. The performance of the new code is attributed to writing it in Seastar — a C++ framework for writing complex asynchronous applications with optimal performance on modern hardware.
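For anyone puzzled by the "99%ile latency" figure in the summary: it is the latency that 99% of requests finish under. Below is a minimal sketch using the nearest-rank method. The function name and the sample latencies are invented for illustration and have nothing to do with ScyllaDB's actual measurements.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 99 fast requests and one slow outlier.
latencies_ms = [0.2] * 99 + [50.0]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
p100 = percentile(latencies_ms, 100)
```

With these made-up numbers the 99th percentile is still sub-millisecond even though the worst request took 50 ms, which is why tail percentiles and maximums are usually reported separately.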
First post (Score:5, Funny)
Because it was written in Seastar
%ile? Are We Texting? (Score:5, Insightful)
Seriously. WTF?
Re:%ile? Are We Texting? (Score:5, Funny)
Re: (Score:2)
Well, let's see. % means it's a conversion code, l means the converted quantity is a long, i means it's an integer, so a long integer, but e means it's a float to be converted to exponential notation. But it was supposed to be an integer. Does not compute.
Well, yes and no - this is obviously a python string formatting operator and ile is the variable they want to format into the string; the error is laughable, really: 99 is not a valid string. Talk about clueless.
Lies! (Score:5, Funny)
That is a lie!
I think they mean the C++ port is 10X SLOWER than Java.
Java is faster than C,C++ everyone knows that!
Maybe if they ran the code on a java interpreter, written in java, running on a java interpreter...
More recursive use of java == more speed!
Why slow a system down with all that C++ bloatware?
In other News (Score:5, Funny)
Oracle has just launched a new series of patent infringement lawsuits. Oracle allegations include reverse engineering Java to improve the speed of applications like Cassandra, benchmarking Java without permission. They are seeking an immediate cease and desist order, in addition to immediate financial relief for sustaining PPS (More commonly known as Poopy Pants Syndrome.).
Re:Lies! (Score:5, Informative)
It comes from an old (15+ years) defense of Java. The claim was that Java was no longer slow thanks to JIT, with HotSpot making it possible for Java code to run faster than equivalent code written in C or C++.
OP is playing the part of a turn-of-the-century die-hard Java zealot cracking under the harsh light of reality, desperately clinging to their long-cherished beliefs.
Re: (Score:2, Insightful)
And back in 1997 I remember telling a C.S. prof. that java was running like a narcoticized slug. True I was running a 66 MHz '486 at home and the university labs were Sparc 1+'s (and since the profs were running some kind of global process 'lightly' on each, they ran slower than molasses in January anyway), but Java seemed to slow them even more. He told me all the stuff about just in time compilers, byte code yadda yadda. In the end, Java is a flavor of the month from 1997. I like javascript though. I
Re: (Score:3)
And *that* is why I think JavaScript should be the first language that new devs learn. I agree it sucks as a language, but it is SOOOO accessible with plenty of examples (good and bad) online. But the barrier to entry is essentially nil. Almost any computer (and certainly any computer released since Windows 95) comes with everything you need to get started --- a text editor and a browser (and modern browsers also include developer tools).
I'll take my offtopic mod back to my cave, now.
Re: (Score:2)
And it would have been funny (or at least, less unfunny) 10 years ago. Right now it's the equivalent of making jokes about what fools people who believed in flat earth were. The people made fun of no longer exist.
Re: (Score:3, Informative)
Not all true. Over the years I have compared "slow" languages Lisp, Java, and .Net to the "fast" C. For various odd reasons the slow languages were faster each time.
The modern JIT compilers have a huge advantage over C because they can do whole-program optimization and they can aggressively inline. Sure, one can declare C methods inline, but I compared Java and .Net to real production code where the programmers forgot to. So in practice the slow languages really were faster. And in-lined routines ki
Re:Lies! (Score:5, Informative)
Yeah, it's more to do with using a framework that helps with the aggressive use of computer resources than being in one language over another.
Some of the latency gains might be down to C++ vs Java, but the throughput is probably because the CPU is less idle.
Re: (Score:3)
Except....
The modern JIT compilers do not do the kind of performance optimisations that they could, in theory, do. It simply costs too much in development time to cater for all the combinations and possibilities, or it costs more in CPU time to calculate the optimisations than it would save in executing the slower code.
GC is faster than malloc for allocating, but when it comes to deallocating, it's a lot slower. Obviously, it has to do a lot more, like compacting the heap, which is a pretty slow operation.
M
Re: (Score:2)
The people made fun of no longer exist.
That's only because they fell off the damn rim, silly.
Re: (Score:3, Insightful)
The claim was that Java was no longer slow thanks to JIT, with HotSpot making it possible for Java code to run faster than equivalent code written in C or C++.
Really? Sounds a bit rich to claim that an interpreted language would be faster than a compiled one, but I suppose if your interpreted program calls into some really well-written libraries, and your compiled program doesn't....
Be that as it may, I don't think it is at all relevant any more. In many practical situations, Java is fast enough, and the fact that it defines and complies with a huge number of valuable standards - and is portable across HW and OS - is the main selling point. It is not a bad language t
Re:Lies! (Score:5, Informative)
Really? Sounds a bit rich to claim that an interpreted language would be faster than a compiled one,
The reasoning is because any bottleneck in code will be in a loop (or recursion, or whatever).
Java is roughly only interpreted on the first iteration of a loop, when it gets compiled by JIT. After that, it's assembly code, just like C.
Add to that, there are some optimizations that can be done at run-time by the JIT that can't be done at compile time.
These are typically the reasons people claim Java is faster than C or C++.
Also, it seems the Java creators at Sun were really competitive and got upset when people said their language was slower than C++, so they spent a lot of time optimizing the efficiency of their standard library, more than the C++ compiler writers of the time.
Re: (Score:2)
Add to that, there are some optimizations that can be done at run-time by the JIT that can't be done at compile time.
The problem with the statement is that it misses the HUGE number of over-expensive optimizations which can't be done at run-time because, duh, they are slow and very resource-consuming.
Which is why Java's HotSpot might produce fast code - but the code is typically several times larger than what a normal compiler would produce. (Consider a simple example: It is easy to unroll the loop - but optimizing the resulting code duplication to improve the i/d-cache usage is a no-go at run-time.) The consequence is that HotSpot has to
Re:Lies! (Score:5, Informative)
Why are you talking about an interpreted program? We're specifically talking about JIT-compiled Java. Modern JITs use trace-based optimisation, which means that they will generate straight-line binary code for hot paths that span multiple method calls and returns. This is something that an AoT-compiled implementation can't do without a lot of profiling information. A JIT compiler can also optimise based on assumptions that are true for one phase of the program, then throw away the result if it stops being true for a later phase.
There are also other trades. For example, if you're writing memory-safe C++ and sharing pointers across threads, then you're going to be using std::shared_ptr, which performs an atomic operation (MESI bus traffic) on every assignment. In a typical JVM, copying pointers doesn't require atomic operations, but the cost of this is the GC pass. Depending on your workload, the GC cost can be a lot cheaper, a lot more expensive, or about the same as doing it correctly with smart pointers.
Unfortunately, a big part of the current 'Java is slow' claim is from idiots who don't understand that different GC implementations are all on a spectrum trading throughput for latency and who then build big distributed systems where tail latency in the edge nodes is important, then run a throughput-optimised stop-the-world collector on the edges and wonder why it sucks.
Re:Lies! (Score:4, Interesting)
High performance software requires several things, among them good native code generation and good libraries. Java used to have neither, then it got the JIT. Unfortunately, Java's semantics and built-in data types make writing high performance software in it really hard.
C++ started out with good native code generation, and its standard library and built in data types make writing high performance software a bit easier if you know what you are doing. Most C++ programmers don't know what they are doing, though, so their software ends up bloated and inefficient anyway.
Re: (Score:3)
Still, there's no good reason for it to be that much faster, unless the Java was also incredibly crappily written (which is quite likely). They are both compiled languages, written at about the same level. Something seriously wrong must have been going on in that code.
Note: This is from a professional C++ developer, who also happens to have done his Master's thesis on Compiler Construction. Admittedly, I don't care that much about Java, except in theory. Have only used it a few times. But either there was
Garbage collected virtual machines! (Score:5, Insightful)
Almost as fast as native! Maybe even faster for some tasks!
sure
Out of points or would mod up (Score:3)
Sans sarcasm I would've also accepted: "duh"
Re:Garbage collected virtual machines! (Score:5, Interesting)
-Bjarne Stroustrup
Re: Garbage collected virtual machines! (Score:5, Insightful)
Re:Garbage collected virtual machines! (Score:5, Insightful)
Most of what they've done seems to be rearchitecting, not getting a simple speed boost from using an unmanaged language. They're bypassing the OS to get more locality and cache retention. Those problems would not be addressed by merely rewriting in C++.
For one, they've replaced the OS network stack with an in-process one, where each thread gets its own NIC queue so they can have "zero-copy, zero-lock, and zero-context-switch[es] [seastar-project.org]"
They're also keeping more data in memory and eschewing reliance on the OS file cache. It seems like they're taking every opportunity to use the in-memory representation to avoid using sstables. They try harder than Cassandra to update instead of invalidate [scylladb.com] that cache on writes.
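The "update instead of invalidate" distinction can be shown with a toy sketch. The class and method names here are hypothetical and say nothing about how ScyllaDB or Cassandra actually structure their row caches. The sketch only counts how often a read has to fall back to the backing store under each policy.

```python
class InvalidatingCache:
    """On write, drop the cached entry; the next read must go back to storage."""

    def __init__(self, storage):
        self.storage = storage      # stand-in for the on-disk tables
        self.cache = {}
        self.storage_reads = 0

    def write(self, key, value):
        self.storage[key] = value
        self.cache.pop(key, None)   # invalidate

    def read(self, key):
        if key not in self.cache:
            self.storage_reads += 1
            self.cache[key] = self.storage[key]
        return self.cache[key]


class UpdatingCache(InvalidatingCache):
    """On write, refresh the cached entry in place so reads keep hitting memory."""

    def write(self, key, value):
        self.storage[key] = value
        self.cache[key] = value     # update


def run(cache_cls, rounds=5):
    """Alternate writes and reads of one key; return how many reads hit storage."""
    cache = cache_cls(storage={})
    for i in range(rounds):
        cache.write("row", i)
        assert cache.read("row") == i
    return cache.storage_reads


invalidate_reads = run(InvalidatingCache)
update_reads = run(UpdatingCache)
```

In this write-then-read pattern the invalidating cache goes back to storage on every round, while the updating cache never does. A real database would also have to worry about memory pressure and consistency, which this sketch ignores.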
Re:Garbage collected virtual machines! (Score:5, Informative)
The headline is rather misleading. This isn't just a plain port of the code from Java to C++ to get a magical 10x speedup. Amongst other things they appear to be running an entire TCP stack in userspace and using special kernel drivers to avoid interrupts. This is the same team that produced OSv, an entirely new kernel written in C++ that gets massive speedups over Linux ..... partly by doing things like not using memory virtualisation at all. Fast but unsafe. These guys are hard core in a way more advanced way than just "hey let's switch languages".
Re: (Score:3)
As for Java vs C++, I expect that most enterprises put stability and portability well above raw speed when developing software. Modern JVMs do just in time compila
Because it was written in Seastar or C++ (Score:2, Insightful)
This is the trademark reason why Java shouldn't be used in performance sensitive environments in the first place.
As for would it have been any faster if it was written in C or straight ASM, probably not worth chasing down that extra 1%. Generally the justification for straight C or ASM is to remove runtime bloat, and you'd first have to give up using any frameworks to get there.
Just to remind potential programmers. Learn C before you learn any other programming language, otherwise you will not understand why
Re: Because it was written in Seastar or C++ (Score:2, Interesting)
Cassandra is nothing to sneeze at since it outperforms other db-engines (which are written in C, like MySQL).
Anyhow, you use the right tool for the job, and the big question is: would ScyllaDB even exist if Cassandra wasn't written first?
Re: Because it was written in Seastar or C++ (Score:5, Insightful)
Cassandra is nothing to sneeze at since it outperforms other db-engines (which are written in C, like MySQL).
Cassandra and MySQL are very different types of databases designed to handle different tasks. It's like saying a hammer is better than the saw without mentioning what job needs to be done with it.
Re:Because it was written in Seastar or C++ (Score:5, Insightful)
Just to remind potential programmers. Learn C before you learn any other programming language, otherwise you will not understand why your code's performance is terrible.
It may not be apparent even then. Java looks an awful lot like C++ at the code level. So... what's different? Java (and other managed languages like C#) have a bunch of neat features like reflection and automatic memory management, which inherently comes at the cost of runtime efficiency. Simply learning C or C++ won't point out exactly why those languages are so much faster than managed languages. You can write nearly the same code in C++, Java, and C#, and you'll see C++ win performance benchmarks - at least in all but the most contrived examples.
Among the more significant differences are that C++ compilers are extremely good at optimizing, and C++ code generally compiles down to better cache-coherent structures than other languages. The difference is in the language itself, which adheres to a zero-cost principle, in that you don't pay for features you don't use. A lot of C++ abstractions are eliminated *entirely* at runtime, and are only used to protect the code's integrity during the compilation phase. We were told for years that native-equivalent performance was just around the corner or even already here, and it just never really happened outside of small, contrived benchmarks.
I don't think it's necessary to always learn C or C++ first, although I do think it's worthwhile to learn it at some point, simply because there's a lot of it out there. I'm primarily a C++ programmer myself, but I tend to be a bit more pragmatic about language preference. Use the language that's right for the job. For example, C is a *horrible* choice if you're writing a simple application that needs to do a bunch of string processing. In many cases, high performance isn't even a consideration; correctness, security, and development speed matter more.
Re:Because it was written in Seastar or C++ (Score:5, Interesting)
I would say that 95% of all people I know in person who learned C first, and not Assembler, Pascal, SmallTalk, or Lisp, are extremely bad at advanced language concepts like functional or OO programming. Most of them shifted to scripting and operating servers and don't "code". A minority is doing embedded programming in C++ which mainly looks like C.
The idea that learning C first has any advantage is complete bollocks, a /. myth.
I started with C in 1987 ... on Sun Solaris (after 6 years Assembler, Pascal and BASIC) ... 1989 I switched to C++. I never looked back.
Only masochists would look back at C of that period.
ANSI C is much better ... but still: when I see a self proclaimed C genius with 30 years experience program Java or C++ ... shudder.
Re:Because it was written in Seastar or C++ (Score:4, Insightful)
I would say that 95% of all people I know in person who learned C first, and not Assembler, Pascal, SmallTalk, or Lisp, are extremely bad at advanced language concepts like functional or OO programming. Most of them shifted to scripting and operating servers and don't "code". A minority is doing embedded programming in C++ which mainly looks like C.
Almost no one learns to program in assembler, Pascal, SmallTalk, or Lisp as their first language these days. It's all Python now, or Java.
Re: (Score:3)
Not sure why people learning Pascal, assembler, or Lisp first would be better at OO. There's nothing OO about any of those. I would turn that around and say that 95% of programmers are bad at OO programming, period, regardless of what language they started with. Most folks frequently forget what are, IMO, som
Re: (Score:3)
I would turn that around and say that 95% of programmers are bad at OO programming, period, regardless of what language they started with.
That's because 95% of written OO solutions don't fit an OO domain. There is this myth that OO is the best we have, but in reality OO is very counter-intuitive to the human brain. Most OO solutions would be better off structured. The human brain handles that much better than OO.
Re: (Score:2)
*shrugs*
I find functional programming to be dubious as a general tool, because it doesn't map onto the way people think about doing things. When I think about how to cook a meal, I think about putting particular ingredients together. I don't think about creating a list of items and combining operators, then magically evaluating those combinators all at once and getting a cake. The entire notion of lazy evaluation is simply antithetical to the way most people think about doing things. Functional program
Re: (Score:3)
Your characterization of functional programming is pretty astonishing, to me as a person who was most recently employed writing and maintaining software in Clojure. The following are all surely idioms, features or possibilities of one or another FP language/approach, but none of them are essential to FP:
1. "creating a list of items and combining operators, then magically evaluating those combinators all at once and getting a cake" — I think the reality is that this can be overdone (and sometimes it is
Re: (Score:2)
Ask anyone and they'll tell you that you're doing it wrong, I'm doing it wrong, that guy over there is doing it wrong... The only one not doing it wrong is the person you asked. Ask someone else, and they'll tell you the first guy is doing it wrong!
This is a notable truth. The thing with (even mildly complex) OOP (and programming in general) is that there is not one perfect solution. The solutions only differ in the tradeoffs one has to deal with at that time and in the future. The (very hard) challenge is thus to choose a solution (a set of design patterns, a sensible entity model, etc.) that has the least troublesome tradeoffs for the project at hand.
Given that people tend to become fanboys of pretty much everything they have some extended experienc
Re: (Score:2)
But how long after C89 was ratified did it take angel'o'sphere's employer or school to acquire a compiler that conforms to C89?
Re: (Score:2)
I doubt any school/university around that time in Germany used C as a teaching language.
C is horrible as a teaching language. Typical languages were Pascal, to a small extent C++, then Java, Eiffel, Sather. Most European universities prefer to use their own pet languages for teaching, e.g. Sather-K in my case at University of Karlsruhe (TH), now called KIT. In Switzerland it was Oberon, in France partly Eiffel, in Scandinavia Simula and BETA.
A few years after I started studying (I learned programming in school
Re:Because it was written in Seastar or C++ (Score:5, Insightful)
That is only true if you haven't written a string processing library. Which pretty much anyone who is going to address tasks like this will do, presuming they just don't go out and find one already written. Same thing for lists, dictionaries, trees, GEOdata, IPs, etc. Whatever. There's nothing that says one has to use C's built-in model for strings, either. Make a better one. It was one of the first things I did, and I did it in assembler, as soon as I ran into the convention of an EOT embedded in the actual text being the end marker -- I thought it was stupid then, and I didn't think a zero was any smarter when C first came to my attention lo those many decades ago. It's also a bear trap anyone can throw a bear into with regard to vulnerabilities -- one that can be entirely obviated by a decent string handling module.
C isn't a bad language to do *anything* in. It's just a language that requires you to be competent, or better, and to address it through the lens of that competence in order to get enough out of it to make the result and the effort expended worth the candle. And no, if the programmer doesn't write in such a way as to almost always create generally reusable components, I'd not be willing to apply the appellation "competent" to the programmer.
C's key inherent characteristics are portability, leanness and close-to-the-metal speed. It doesn't hold your hand. It's a language for experienced, skilled programmers when we're talking about creating actual products that are expected to perform in the wild. Lean code isn't nearly the issue it used to be, but it's still "nice" to have.
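The complaint above about end-marker strings (an EOT or a zero embedded in the text) versus a better string model can be illustrated with a small sketch. This is a Python stand-in, not the commenter's assembler or C library. The helper names and the 4-byte little-endian header are assumptions chosen for the example.

```python
import struct

def c_style_length(buf: bytes) -> int:
    """Length as a sentinel-terminated reader sees it: bytes before the first NUL."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end

def pack_prefixed(payload: bytes) -> bytes:
    """Store a 4-byte little-endian length followed by the raw bytes."""
    return struct.pack("<I", len(payload)) + payload

def unpack_prefixed(buf: bytes) -> bytes:
    """Read the length header, then exactly that many bytes."""
    (length,) = struct.unpack_from("<I", buf, 0)
    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("buffer shorter than its declared length")
    return payload

data = b"abc\x00def"                      # payload with an embedded zero byte
seen_by_sentinel = c_style_length(data)   # stops at the embedded zero
round_tripped = unpack_prefixed(pack_prefixed(data))
```

The sentinel-based reader silently truncates the payload at the embedded zero, while the length-prefixed form round-trips all seven bytes and can reject a buffer that is shorter than it claims to be.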
Re: (Score:2)
Lean code may be an issue if performance is critical for your application/system.
If you are writing an app to manage photos, you had better favor reliability over performance or you might receive death threats from moms and dads whose baby photos have vanished.
Re: (Score:2)
Barring specialized service providers, compute costs are seldom the biggest item in company expenses.
Re: (Score:2)
That is only true if you haven't written a string processing library.
Memory management is still a pain though, because a lot of times you want to create a lot of new strings when you are doing splicing and inserting etc.
Re: (Score:2)
Certainly there are things which are more easily parsed by regular expression.
Re: (Score:3)
I'm not trying to slam C. You can do just about anything in C - that's one of its strengths. I'm just pointing out that it's not the *optimal* choice for certain types of tasks, in my opinion. C has advantages in its relative simplicity, portability, and power. Moreover, it works very well as a "least common denominator" language, in that nearly every other language can easily interop with it because of its stable ABI. This is why nearly every OS and many widely-used libraries are written in pure C.
Re: (Score:3, Interesting)
1. C is not portable, it's tied to the architectures/OSs/APIs the programmer chose to target at write time.
2. Leanness and close-to-the-metal speed are irrelevant in most business scenarios (time to market rules, cores and memory are commodity, see ABAP and related monsters successfully running most of the world's transactions regardless of C).
3. C is not a language meant to implement business solutions, it's a wrapper for ASM for idiots who can't write ASM themselves. (rhetorical)
4. Writing string processing l
Re: (Score:2)
I agree, legacy technology is the best case use for C.
Re: (Score:3)
> Among the more significant differences are that C++ compilers are extremely good at optimizing,
LOL. No they aren't [youtube.com]
Mike Acton gave an excellent talk Code Clinic 2015: How to Write Code the Compiler Can Actually Optimize where he picked an integer sequence to optimize the run-time to calculate the sequence. Techniques include: memoization, and common sub-term recognition. For 20 values pre-optimization time was: 31 seconds, post-optimization time was: 0.01 seconds.
Linked above.
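Memoization, one of the techniques named above, is easy to show in isolation. The sketch below uses a Fibonacci-style recurrence purely as a stand-in; it is not the integer sequence from the talk, and the call counters are added only to make the saved work visible.

```python
from functools import lru_cache

calls = {"naive": 0, "memo": 0}

def naive(n):
    """Recompute every sub-term each time it is needed."""
    calls["naive"] += 1
    return n if n < 2 else naive(n - 1) + naive(n - 2)

@lru_cache(maxsize=None)
def memo(n):
    """Same recurrence, but each distinct n is computed once and then cached."""
    calls["memo"] += 1
    return n if n < 2 else memo(n - 1) + memo(n - 2)

naive_result = naive(20)
memo_result = memo(20)
```

Both versions return the same value, but the naive one executes its body 21,891 times for n = 20 while the memoized one runs it 21 times. This is an algorithmic change the programmer has to ask for; no mainstream C++ compiler will introduce it on its own.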
Re: (Score:2)
C++ compilers are pretty good at optimizing the code you write (subject to aliasing issues inherited from C, and so forth). They're not functional compilers that construct all kinds of state behind your back in hopes that it will be useful. If your complaint is that they don't do things like memoize results for you, somebody probably has a nice C++11 header that will, and most C++ developers will ask why you think a C++ compiler should memoize things without you asking.
Re: (Score:3)
The points are twofold:
1. Naive use of algorithms and OOP without understanding the data flow will always be slower than understanding and optimizing for the (data) cache usage.
Pitfalls of Object Oriented Programming [cat-v.org]
2. C/C++ compilers do a really shitty job of optimizing even trivial code.
CppCon 2014: Mike Acton "Data-Oriented Design and C++" [youtube.com]
Mike demonstrates a simple example where a bool member flag is used as a test. MSVC does a horrible job at O2; Clang does a much better job, but still crappy. (Note: U
Re: (Score:2)
Your first link doesn't say anything about compilers -- it's about cache locality and branch prediction, and how the application's architecture can make those more or less of an issue.
I watched part of the first Mike Acton video you linked, and was reminded why I hate watching videos to try to learn something: People talk too slowly and basically never organize their information for efficient understanding or consumption. I'm not about to sit through another one to find out he is mostly complaining about h
Re: (Score:3)
You have to consider that compilers are also going to perform a wide variety of micro-optimizations that humans simply couldn't do on a massive scale, over millions of lines of code. No one would argue that a compiler can radically restructure your algorithms during optimization, because it doesn't know which side-effects are acceptable and which are not. So, yes, human programmers still need to be aware of how to structure code for best results on a given platform.
Of course, you can always find specific
Re: (Score:2)
You may also be interested in my followup [slashdot.org]
I linked to a second trivial case, which in practice tends to show up time and time again in typical game code, using member bools, where C++ compilers fall completely over.
Compilers have a very narrow range where they are very good. Outside that domain, they suck at generating optimal code.
As I say there are 3 levels of optimizations:
1. Micro-optimization aka bit twiddling
2. Algorithm optimization
3. Data-flow optimization. Data Orientation Design focusing on the commo
Re: (Score:2)
Yep, I don't disagree with you. When I talked about optimizations, I was of course only talking about case 1. Anything above that certainly requires human-level work, and typically a substantial effort and deeper knowledge of the compiler and platform.
I wonder if the compiler does a better job if const is properly used? It's meant as a compiler hint, so that the compiler can be more aggressive because it knows there are supposed to be no side effects in the functions.
Also, I'd have a serious talk with a
Re: (Score:2)
At the micro level, yes, agreed.
At the macro level. Not even close.
Re: (Score:2)
On average they are better at optimizing than a human could ever be.
What? No way! If you want to do better than the compiler, follow these steps:
1) Compile your program, get the assembly output from the compiler.
2) Find improvements (this will not be hard). Profile to determine how much faster your changes are.
3) After many repetitions of this, you will get good enough to write faster than the compiler without getting the assembly output.
Compilers do better than an average human writing assembly, but that's only because most humans aren't good at writing assembly. It'
Re: (Score:2)
It may not be apparent even then. Java looks an awful lot like C++ at the code level. So... what's different? Java (and other managed languages like C#) have a bunch of neat features like reflection and automatic memory management, which inherently comes at the cost of runtime efficiency. Simply learning C or C++ won't point out exactly why those languages are so much faster than managed languages. You can write nearly the same code in C++, Java, and C#, and you'll see C++ win performance benchmarks - at least in all but the most contrived examples.
All that stuff comes at a cost, but not a 1000% cost.
If you're seeing that much of a speedup, then it's likely you were doing silly things in the Java version.
I'd rather write C than Java, but let's be honest about the performance: Java's not that much worse.
Re: (Score:2)
C++ code generally compiles down to better cache-coherent structures than other languages
It's possible to do cache-coherent programming in managed runtimes - it's mostly about knowing the rules though, and people who've never used unmanaged languages are less likely to know the rules.
Everything that's an "object" is a pointer, and they'll be scattered all over the heap. The only way you get cache-coherency is using structs, which not all managed languages have, or arrays, which even managed languages allocate as a contiguous block of memory.
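The layout difference described above (objects scattered on the heap versus a contiguous array) can be sketched in any managed runtime. Here is a Python version, assuming CPython's standard `array` module, which stores raw machine values back to back. The `Point` class and the column names are hypothetical.

```python
from array import array

class Point:
    """One heap object per point; a list of these holds references, not values."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

def sum_x_objects(points):
    """Walk a list of references and chase each one to reach its x field."""
    return sum(p.x for p in points)

def sum_x_columns(xs):
    """Walk one contiguous column of packed doubles."""
    return sum(xs)

n = 1000
points = [Point(float(i), float(-i)) for i in range(n)]  # array of references
xs = array("d", (float(i) for i in range(n)))            # packed doubles, x column
ys = array("d", (float(-i) for i in range(n)))           # packed doubles, y column

total_objects = sum_x_objects(points)
total_columns = sum_x_columns(xs)
packed_bytes = xs.itemsize * len(xs)                     # size of the x column data
```

This shows the layout, not a benchmark: CPython still boxes each value as it iterates, so the speed benefit is far smaller than it would be in Java, C#, or C++. The point is only that the column version keeps all the x values in one block of memory.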
I've written sorted collection classes in Java that us
Re: (Score:2)
Just to remind potential programmers. Learn C before you learn any other programming language, otherwise you will not understand why your code's performance is terrible.
C doesn't let you understand why people's code has terrible performance. C has the same problems that make code slow - reference semantics. It's a mistake to conflate low level with "performance". For example, std::sort is faster than qsort, and you can't understand why just by understanding C.
Another justification of C (Score:2)
Generally the justification for straight C or ASM is to remove runtime bloat, and you'd first have to give up using any frameworks to get there.
Another is if you have to security audit the result and protect it from attack, as in OSes. C++ can generate stuff that isn't obvious from the local source code - thanks to definitions, overridings, and the like. (Linus makes this point - it's why the Linux kernel is in C and will stay there for the foreseeable future.)
But that shouldn't be enough of an issue here
User mode memory mapped network stack (Score:2)
The real reason is much more nuanced than language differences between C++ and Java. The Seastar network architecture bypasses the kernel TCP/IP stack entirely and instead implements a user-mode TCP/IP stack using DPDK, which allows user mode to poll the network card's packet buffer directly over memory-mapped I/O. The user-mode stack runs on a single core only, but you could run multiple instances on multiple cores. It can scale linearly because there is very little shared state across cores.
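The "very little shared state across cores" idea can be sketched as a shared-nothing key partitioning scheme. This is a single-threaded toy, not Seastar's or DPDK's API. The class names, the CRC32 hash, and the shard count of four are all assumptions made for illustration.

```python
import zlib

class Shard:
    """State owned by exactly one core; nothing here is shared, so no locks."""

    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.rows = {}

def owner(key: str, shard_count: int) -> int:
    """Map a key to its owning shard with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count

class Node:
    """Routes every request to the single shard that owns the key."""

    def __init__(self, shard_count):
        self.shards = [Shard(i) for i in range(shard_count)]

    def put(self, key, value):
        self.shards[owner(key, len(self.shards))].rows[key] = value

    def get(self, key):
        return self.shards[owner(key, len(self.shards))].rows.get(key)

node = Node(shard_count=4)   # hypothetical: one shard per core on a 4-core box
for i in range(100):
    node.put(f"user:{i}", i)
rows_per_shard = [len(s.rows) for s in node.shards]
```

Because each key lives in exactly one shard, adding cores adds shards without adding contention. In a real system each shard would run on its own pinned thread with its own network queue, which this sketch does not attempt.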
C++ with custom network s
Re: (Score:2)
As for would it have been any faster if it was written in C or straight ASM, probably not worth chasing down that extra 1%.
Assembly can give you huge performance gains, much more than 1%. One of the reasons why is because it gives you more control of caching. Paul Hsieh has written quite a bit on this topic, it's worth checking out.
Re: (Score:2)
learn a proper object oriented language before you write any serious programs
Although I agree with the general gist of your comment; I could not disagree more strongly about learning a "proper object oriented language". I would rather have you learn functional programming than object oriented; but optimally both. The problem is that a lot of object oriented code is redundant and bloated, generally stemming from kingdom of the nouns [blogspot.de] type syndrome. Hybrid languages like C++ or JavaScript are genius in that you can mix functional and object oriented programming to form a concise soluti
LMDB (Score:2)
yaaaa... but are they using Lightning Memory Database (LMDB) as the back-end? http://developers.slashdot.org... [slashdot.org] https://en.wikipedia.org/wiki/... [wikipedia.org]
Yes, but is it web scale? (Score:2)
Sure, but is it Web scale?
Re: (Score:2)
Reference [youtube.com] for the unfamiliar.
Now returns null pointers in half the time! (Score:3, Funny)
They also boosted performance by never freeing memory, too!
Re: (Score:3)
It's a miracle! C++ makes disks spin 10x faster! (Score:2, Interesting)
Databases are usually I/O bound, and improving the storage structure or network protocol matters more than spot optimization of code. A more likely statement is that ScyllaDB performed ten times faster than Cassandra in one particular benchmark for which Cassandra has not yet been specifically optimized, and is ten percent faster in an average case.
In either case, good luck maintaining speed and stability after 5 releases when you implement every corner case of every feature and have to deal with legac
Re:It's a miracle! C++ makes disks spin 10x faster (Score:5, Insightful)
Databases used to be disk bound, sure. But these days we have huge RAM caches and SSDs - no spinning disks. It's very common for the vast majority of requests to be served entirely from cache. Read the guys' site - it looks like they know what they're doing.
Imagine if Redis was ten times slower or ten times faster. It would matter.
Re: (Score:3)
Yes. It's now easy to scale to a million or more IOPS on a single server. That makes the CPU the bottleneck again.
I find it depressing... (Score:3, Insightful)
I find it depressing that so little attention is paid to efficient computing. People now just throw memory and cycles at problems because they can with passable results. But I wonder how much more we could get out of our machines if software was carefully crafted from bottom to top.
Re: (Score:2)
You're looking in the wrong place. Try prying open your smoke alarm or CO detector. It's probably got a PIC12F or 10F, or an ATtiny or something. Those things have 1k words of program memory and a few tens of bytes of RAM (typically). People pay a lot of attention to efficient computing. Too big a program means the next model up, or an extra 5c per unit, which clobbers the profit margin. Too inefficient and the non-replaceable battery lasts 3 years, not 5.
Or look at the compute bound stuff. FFmpeg gets faster year on y
Rewrites are easier than the first strike (Score:5, Insightful)
Wow, two years ago everyone here told us that NoSQL is evil and tried to convince us that we should stick to MySQL.
Now everyone tells us Java is evil, because a rewrite in C++ is faster.
What a surprise.
If I rewrote Cassandra from scratch, in Java, it would also be faster than the current code.
Why? Because all the learning the original team did over a course of a decade I can reuse and improve on.
Keep in mind, the rewrite uses a new framework and new concepts for concurrency. Concurrency is one of the core areas where computing will certainly make lots of progress in the future.
For my part, I'm waiting for a Lucene rewrite, regardless of the language. Probably the worst OSS code I have ever seen ... actually the worst code regardless of OSS or closed source.
Re: (Score:2, Insightful)
Great (In the unlikely event you could actually single-handedly rewrite it at all.) Now make it 10 times faster. Nice Red Herring post though!
Re: (Score:2)
I did not say I would make it 10 times faster; I said I would make it faster.
It is a no-brainer that the speed increase has nothing to do with the language used but with better architecture and better approaches. That can easily be repeated with a Java rewrite.
Why should it not be possible to rewrite it single-handedly in a reasonable time?
Re: (Score:2)
Perhaps that is the conclusion one reaches without a brain, but people with an actual brain and an understanding of the fundamental differences between C++ and Java know that is complete bullshit, and recognize you as the incompetent posturing for recognition with no ability to back your ignorant bullshit up with actual skills
Re: (Score:2)
Perhaps you should read the many posts here in this thread that explain why the new C++ rewrite is faster. Or simply go to the vendor's web page. It is very well explained :D
You, sir, are just an ignorant idiot without a brain. Otherwise you would have grasped what I implied with my previous comment :D ... sorry to use your own wording on you, but it seemed fitting.
Re: (Score:2)
No doubt your grasp of programming languages is similar to your grasp of the English language. Nobody is arguing that re-architecting isn't a major advantage. The point is that one cannot attribute the entire improvement to it, and claiming that you could use Java and achieve similar results represents your blatant blathering to the world that you lack an understanding of language architectures. Put Bjarne Stroustrup and James Gosling together
Re: (Score:2)
Would it be sufficient to pick a badly coded C++ application and write a better implementation in Java? That way Java would shine like C++ does in this case.
Did it in the past. 6 times faster, to be exact, and we are talking about Java 1.3 back then, which barely had a basic Symantec(?) JIT, no HotSpot yet.
I was trying to explain to my boss that the original application was braindead and that I had just optimized the SQL queries it was using while rewriting the code, but he touted 'that new java technology making everything 5-10 times faster' to the client anyway...
Re: (Score:2)
I will admit that I don't quite understand the fuss about "NoSQL".
It's just a two column table with a primary key and a data blob. Congratulations, I guess. Yes, a specialized piece of software for this might be fast, but it's not anything new or innovative. It's just a two column table with a bow on top.
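Taken literally, the "primary key plus data blob" model described above fits in a few lines. The sketch below is only that literal reading, using hypothetical names (`Blob`, `KeyBlobStore`, `put`, `get`). It is an in-memory toy and says nothing about how Cassandra or any other NoSQL store is actually implemented.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical names throughout: an opaque value the store never interprets.
using Blob = std::vector<std::uint8_t>;

// A "two column table": column one is the primary key, column two is the blob.
class KeyBlobStore {
public:
    // Insert a row, or overwrite the blob if the key already exists.
    void put(const std::string& key, Blob value) {
        rows_[key] = std::move(value);
    }

    // Look up by primary key. There is no other way to query the data.
    std::optional<Blob> get(const std::string& key) const {
        auto it = rows_.find(key);
        if (it == rows_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, Blob> rows_;
};
```

What the sketch leaves out (persistence, replication, partitioning across nodes, richer data models) is where the actual engineering in these systems goes.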
Re: (Score:2)
There are plenty of different variations of NoSQL databases: column stores like Cassandra (with an unlimited number of columns, *cough* *cough* how useful would a store with only two columns be?), document-based ones like MongoDB, XML-based ones, or simply graph-based ones (e.g. OrientDB and Neo4j).
Re: (Score:3)
I do agree that a language-to-language rewrite would yield impressive gains... but that's not the whole of it. Cassandra is an edge case ... and yes, the Lucene code could use some love (contribute some patches??)
C++ isn't necessarily the best choice for everything, just like a McLaren F1 isn't the optimum choice to pick up groceries. But if your requirements dictate that performance is a chief priority, it most certainly is.
I've written many Java and C++ systems at scale. Java simply does not excel at maxi
Re: (Score:2)
Fads are fads. Today's 'best practices' are tomorrow's 'horrible mistakes'.
You're right that rewrites often result in a significantly better product. I suspect that, in addition to your reason, the two are correlated.
Re: (Score:2)
> Wow, two years ago everyone here told us that NoSQL is evil and tried to convince us that we should stick to MySQL.
The proper alternative to NoSQL would be an SQL database, rather than the misleadingly named MySQL database.
Re: (Score:2)
A SQL database is only an alternative to a NoSQL database if having chosen the latter was an architecture mistake.
I have never seen a use of a NoSQL database where it made sense to switch to a relational (SQL) one.
If you have examples, I guess many are eager to hear them.
Re: (Score:2)
Are you saying that MySQL is neither SQL nor NoSQL?
But is it web scale? MongoDB is web scale... (Score:2, Funny)
I will only use MongoDB because it is web scale.
Re: (Score:2)
What optimizations are valid on AMD/Intel x86-64 but not on ARM AArch64?
Re: (Score:2)