Firefox Mozilla Programming

Asm.js Gets Faster 289

mikejuk writes "Asm.js is a subset of standard JavaScript that is simple enough for JavaScript engines to optimize. Now Mozilla claims that with some new improvements it is at worst only 1.5 times slower than native code. How and why? The problem with JavaScript as an assembly language is that it doesn't support the range of datatypes that are needed for optimization. This is good for human programmers because they can simply use a numeric variable and not worry about the difference between int, int32, float, float32 or float64. JavaScript always uses float64, which provides maximum precision but not always maximum efficiency. The big single improvement that Mozilla has made to its SpiderMonkey engine is to add a float32 numeric type to asm.js. This allows float32 arithmetic in a C/C++ program to be translated directly into float32 arithmetic in asm.js. This is also backed up by an earlier float32 optimization introduced into Firefox that benefits JavaScript more generally. Benchmarks show that while Firefox with the float32 type (firefox-f32) is still nearly always slower than native code, it is now approaching the typical speed range of native code. Mozilla thinks this isn't the last speed improvement they can squeeze from JavaScript. So who needs native code now?"
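For readers who haven't seen one, an asm.js module marks itself with a "use asm" directive and uses Math.fround to tell the engine a value is a float32. A minimal hand-written sketch (real asm.js is normally machine-generated by Emscripten; the module and function names here are illustrative):

```javascript
// Sketch of an asm.js-style module. The fround() coercions annotate
// values as float32, letting the engine use single-precision hardware
// instead of defaulting everything to float64.
function MyModule(stdlib) {
  "use asm";
  var fround = stdlib.Math.fround;

  function scale(x, factor) {
    x = fround(x);             // parameter annotated as float32
    factor = fround(factor);   // parameter annotated as float32
    return fround(x * factor); // single-precision multiply
  }

  return { scale: scale };
}

var m = MyModule({ Math: Math });
console.log(m.scale(2.5, 4.0)); // 10
```

Because the directive is just a string, the same code runs unchanged as ordinary JavaScript in engines that don't recognize asm.js.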
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Sunday December 22, 2013 @06:27PM (#45762631)

    Umm, anyone who wants their code to not run substantially slower. Seriously, do you front end programmers really think nobody does numerical simulations or other performance-sensitive work? In my line of work, I'd kill for the opportunity to make my code 1.5 times faster!

  • by Sycraft-fu ( 314770 ) on Sunday December 22, 2013 @07:25PM (#45763067)

    I get pissed when I hear programmers say "Oh, memory is cheap, we don't need to optimize!" Yes, you do. In the server world these days we usually don't run things on physical hardware; we run them in VMs. The fewer resources a given VM uses, the more VMs we can pack on a system. So if you have some crap code that gobbles up tons of memory, that's memory that can't go to other things.

    It is seriously like some programmers can't think outside the confines of their own system/setup. They have 16GB of RAM on their desktop, so they write some sprawling mess that uses 4GB. They don't think this is an issue; after all, "16GB was super cheap!" Heck, they'll look at a server, see 256GB in it, and say "Why are you worried?" I'm worried because your code doesn't get its own 256GB server; it gets to share that with 100, 200, or even more other things. I want to pack in services as efficiently as possible.

    The less CPU, memory, disk, etc. a given program uses, the more a system can do, and conversely, the less powerful the system needs to be. For a single-user system, like an end-user computer, it would always be nice if we could make them less powerful, because that means less power-hungry. If we could make everything run 1.5 times as fast, what that would really mean is we could cut CPU power by that amount and not affect the user experience. That means longer battery life, less heat, less waste, smaller devices, etc.

  • by Anonymous Coward on Sunday December 22, 2013 @07:33PM (#45763095)

    Asm.js is not JavaScript. It's Mozilla's way of hacking together a very specific optimization for JS-targeting compilers such as Emscripten, because they don't want to take the sane route of just implementing PPAPI and Google's Native Client sandbox, both of which don't work well with single-process browsers. From a developer perspective it's fairly trivial to target both asm.js and PNaCl (Google's Native Client, except with LLVM bitcode), or target one and write a polyfill for the other. In either case, both of these environments are for executing C/C++ native code in the browser with minimal slowdown; they don't touch run-of-the-mill JS anyway.

  • Re:don't we know it (Score:4, Informative)

    by Concerned Onlooker ( 473481 ) on Sunday December 22, 2013 @07:36PM (#45763117) Homepage Journal

    Websites are no less than distributed applications. If you had been paying attention you would have noticed that website development has gotten a lot more rigorous than in the old days.

  • by Anonymous Coward on Sunday December 22, 2013 @07:43PM (#45763161)

    Tracing garbage collected languages will always be slower because:

    1) The tracing adds a ridiculous amount of unnecessary work; and

    2) While allocation is at best O(N) for GCd and regular programs, deallocation can (and often is) made O(1) using memory pools in C and C++ programs, something that can't be done in GCd languages because the collector doesn't understand semantic interdependencies.

    For ref-counted collectors #2 still applies.

    Unless and until some unforeseen, miraculous breakthrough happens in language design, GCd languages will always be slower when it comes to memory management. And because memory management is so critical for complex applications, GCd languages will effectively always be slower, period.

    But the only people who care that GCd languages are slower are people who only know GCd languages. I extensively code in C and Lua, and I love both.
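The pool point above is, incidentally, exactly how Emscripten-compiled code sidesteps the JavaScript GC: it does its own allocation inside one big pre-allocated typed-array heap. A toy bump allocator (names illustrative) showing the O(1) bulk free:

```javascript
// Sketch of a bump/pool allocator over a pre-allocated heap, similar in
// spirit to how Emscripten manages C/C++ memory inside asm.js code.
// Allocation advances a pointer; freeing the whole pool resets it in O(1).
function Pool(bytes) {
  var heap = new ArrayBuffer(bytes);
  var top = 0;
  return {
    // Returns a Float64Array view of `count` doubles, or null if exhausted.
    alloc: function (count) {
      var size = count * 8;
      if (top + size > bytes) return null;
      var view = new Float64Array(heap, top, count);
      top += size;
      return view;
    },
    // O(1): every outstanding allocation is invalidated at once.
    reset: function () { top = 0; }
  };
}

var pool = new Pool(1024);
var a = pool.alloc(16);
a[0] = 3.14;
console.log(a[0]);   // 3.14
pool.reset();        // constant-time "deallocation" of everything
```

The trade-off is the one the parent names: the program, not the collector, must know that nothing in the pool is still live when reset() is called.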

  • Maximum precision? (Score:5, Informative)

    by raddan ( 519638 ) * on Sunday December 22, 2013 @07:48PM (#45763197)
    Let's just open up my handy Javascript console in Chrome...

    (0.1 + 0.2) == 0.3
    false

    It doesn't matter how many bits you use in floating point. It is always an approximation. And in base-2 floating point, the above will never be true.

    If they're saying that JavaScript is within 1.5x of native code, they're cherry-picking the results. There's a reason why people who care have a rich set of numeric datatypes [ibm.com].
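The same console demo extends to float32 via Math.fround (the primitive behind SpiderMonkey's new float32 support); rounding to fewer bits doesn't make 0.1 representable either:

```javascript
// Base-2 floating point cannot represent 0.1 or 0.2 exactly at any width.
console.log(0.1 + 0.2 === 0.3);        // false
console.log(0.1 + 0.2);                // 0.30000000000000004

// Math.fround rounds a float64 to the nearest float32 value:
console.log(Math.fround(0.1) === 0.1); // false: float32 0.1 differs from float64 0.1
console.log(Math.fround(0.5) === 0.5); // true: 0.5 is exact at both widths
```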
  • by dshk ( 838175 ) on Sunday December 22, 2013 @08:21PM (#45763371)

    deallocation can (and often is) made O(1) using memory pools in C and C++ programs, something that can't be done in GCd languages

    I believe current Java (not Javascript!) virtual machines do exactly this. They do escape analysis, and free a complete block of objects in a single step. This works out of the box, there is no need for memory pools or any other special constructs.

  • by Pinhedd ( 1661735 ) on Sunday December 22, 2013 @08:32PM (#45763419)

    Take a look at the image at the following link

    http://www.anandtech.com/show/6355/intels-haswell-architecture/8 [anandtech.com]

    That's the backend of the Haswell microarchitecture. Note that 4 of the 8 execution ports have integer ALUs on them, allowing up to 4 scalar integer operations to begin execution every cycle (including multiplication). Two of these share a port with a vector integer unit, which can be exploited to perform an obnoxious amount of integer math at once. There are only two scalar FP units, one for multiplication on port 0 and one for addition on port 1.

    The same FP hardware is used to perform scalar, vector, and fused FP operations, but taking advantage of this requires a compiler that is smart enough to exploit those instructions when targeting a Haswell microprocessor specifically, and fast enough to do it quickly. Exploiting multiple identical execution units in a dynamically scheduled machine requires no extra effort from the compiler.

    Microprocessors used in PCs have always been very integer-heavy for practical reasons (heavy FP hardware just isn't needed for most applications), and mobile devices are even more integer-heavy for power-consumption reasons.

    Using FP64 for all data types is obnoxiously lazy, and it makes me want to strangle front-end developers.
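For what it's worth, JavaScript code can already opt into float32 storage with typed arrays; a quick sketch of the memory difference being complained about here:

```javascript
// Float32Array stores each element in 4 bytes instead of float64's 8,
// halving memory traffic for large numeric buffers.
var n = 1 << 20; // one million elements
var f32 = new Float32Array(n);
var f64 = new Float64Array(n);
console.log(f32.byteLength); // 4194304
console.log(f64.byteLength); // 8388608

// Stores round to the nearest float32; reads widen back to float64:
f32[0] = 0.1;
console.log(f32[0] === Math.fround(0.1)); // true
```

Until the asm.js float32 work described in the story, the arithmetic between load and store was still performed in float64 even when both ends were float32.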

  • by doublebackslash ( 702979 ) <doublebackslash@gmail.com> on Sunday December 22, 2013 @09:33PM (#45763707)

    Just fine, at least I do. Just different sets of optimizations to keep in mind, as well as different expectations. I don't think any reasonable person would approach the two problems the same way, but it all boils down to basic computer science.

    Light up pin 1 when the ADC says voltage is dropping which indicates that pressure is too low on the other side of the PPC. Compare that to indexing a few gigs of text into a search engine. Completely different goals, completely different expectations. I'm not master of the embedded domain, but I don't think it is a dark art.

    Perhaps I'm looking at it the wrong way, or perhaps my experience is unique or at least rare, but in my eyes it is all the same thing at different scales. Tell me my app is using too much memory and I'll first look at how I can reduce memory pressure, then I'll tell you what is and isn't possible and give you a list of sacrifices that would be needed to reduce it (time to refactor, concurrent operations, latency because of disk, etc. Not just talking about capabilities but the whole deal). Find the balance and go for it. On the embedded side the same sorts of compromises are made, but the scale is just so much smaller. Finite number of IO pins, time to optimize your code to accommodate a new feature, meeting real-time deadlines, writing something in ASM to get around a GOD FREAKING AWFUL EMBEDDED COMPILER, etc.

    I dunno, do I have my head on straight here? It all seems fairly straightforward in the end. Specialists can do their bits faster than someone less familiar but with equal skill and understanding. That's what earns the bucks: getting things done in a timely fashion.

    ((Or at heart I'm an embedded guy. Possible!))

  • by Pinhedd ( 1661735 ) on Monday December 23, 2013 @05:06AM (#45765261)

    Ideal virtual machines are indistinguishable from networked servers. Most x86 VMMs don't quite reach this level of isolation, but the VMMs used on IBM's PowerPC based servers and mainframes do.

    From the perspective of system security, a single compromised application risks exposing to an attacker data used by other applications which would normally be outside of the scope of the compromised application. Most of these issues can be addressed through some simple best practices such as proper use of chroot and user restrictions, but those do not scale well and do not address usability concerns. A good example is the shared hosting that grew dominant in the early 2000s while x86 virtualization was still in its infancy. It was common to see web servers with dozens if not hundreds of amateur websites running on them at once. For performance reasons a web server would have read access to all of the web data; a vulnerability in one website allowing arbitrary script execution would allow an attacker to read data belonging to other websites on the same server.

    From the perspective of users, a system designed to run 100 applications from 20 different working groups does not provide a lot of room for rapid reconfiguration. Shared resource conflicts, version conflicts, permissions, mounts, network access, etc... it gets extremely messy extremely quickly. Addressing this requires a lot of administrative overhead and every additional person that is given root privileges is an additional person that can bring the entire system down.

    Virtual machines, on the other hand, give every user their own playground, including full administrative privileges, in which they can screw around to their heart's content without the possibility of screwing up anything else or compromising anything that is not part of their environment. Everyone gets to be a god in their own little sandbox.

    Now, that doesn't mean that the entire operating system needs to be duplicated for every single application. Certain elements such as the kernel and drivers can be factored out and shared across all environments. Solaris provides OS-level virtualization in which a single kernel manages multiple fully independent "zones" with greatly reduced overhead. Linux Containers is a very similar approach that has garnered some recent attention.
