Dynamic Cross-Processor Binary Translation 179

GFD writes: "EETimes has a story about software that dynamically translates the binary of a program targeted for one processor (say x86) to another (say MIPS). Like Transmeta, they have incorporated optimization routines and claim that they have improved execution times between one RISC architecture and another by 25%. This may break the hammerlock that established architectures have on the market and open the door for a renaissance in computer architecture."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    AFAIK the Linux port of Basilisk II does exactly this with 68k code on x86, and has for quite some time (a year or so, maybe more). As to the user who was asking why you would want to do this sort of thing, emulators are a perfect example. Basilisk II is a stellar example IMHO; my mid-range PCs run 68k code under Basilisk II much faster than any real 68k Mac I've ever seen
  • by Anonymous Coward

    Actually, it sounds like what DEC developed for translating legacy binaries from VAX to Alpha.

    Think of it this way: a regular compiler converts source code into object code; this product takes object code for machine A and outputs object code for machine B. It's basically a compiler that parses object code as its input.

    Normal emulation parses object code and then pretends to be the machine it was written for; this has to be done every time you want to run the program. With a translator like this, you translate the object code just once.

    Of course, if you have the source, you don't need this, at worst you need a cross compiler.
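
    To make the "compiler that parses object code" analogy concrete, here's a rough sketch in Python of the difference (toy instruction set and program, invented purely for illustration; nothing to do with Transitive's actual product): the interpreter re-decodes every instruction on every run, while the translator decodes once, emits host code, and caches it.

      # Toy guest program: address -> (opcode, operand)
      GUEST_PROGRAM = {
          0: ("LOAD", 5),            # acc = 5
          1: ("ADD", 7),             # acc += 7
          2: ("STORE", "result"),    # mem["result"] = acc
      }

      def interpret(program, mem):
          """Emulation: decode every instruction, every time the code runs."""
          acc = 0
          for addr in sorted(program):
              op, arg = program[addr]
              if op == "LOAD":
                  acc = arg
              elif op == "ADD":
                  acc += arg
              elif op == "STORE":
                  mem[arg] = acc
          return mem

      _translation_cache = {}

      def translate(program):
          """Translation: decode once, emit 'machine B' code, reuse it forever."""
          key = id(program)
          if key not in _translation_cache:
              lines = ["def _block(mem):", "    acc = 0"]
              for addr in sorted(program):
                  op, arg = program[addr]
                  if op == "LOAD":
                      lines.append(f"    acc = {arg}")
                  elif op == "ADD":
                      lines.append(f"    acc += {arg}")
                  elif op == "STORE":
                      lines.append(f"    mem[{arg!r}] = acc")
              lines.append("    return mem")
              scope = {}
              exec("\n".join(lines), scope)        # the one-time translation step
              _translation_cache[key] = scope["_block"]
          return _translation_cache[key]

      print(interpret(GUEST_PROGRAM, {}))   # decodes 3 instructions on every call
      print(translate(GUEST_PROGRAM)({}))   # decoded once; later calls run the cached code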

  • by Anonymous Coward
    Since the first Power Macs, the MacOS has been translating 68k code to PPC code on the fly, and also doing some optimization.
  • by Anonymous Coward
    There are many examples of this, with source code, at http://www.cybervillage.co.uk/acorn/emulation/dynrcomp.htm [cybervillage.co.uk]

    So far nobody's sued for patent infringement, and there should be plenty of prior art if anyone does. Of course, that won't stop assholes like TechSearch from harassing people anyway.

  • by Anonymous Coward
    A DAG is not the same thing as a tree. Try this [uwaterloo.ca] for a refresher.
  • NeXTStep/OpenStep, Darwin and Mac OS X support cross-platform binaries that let you take a single app and run it across processors. You do (generally) have to run it on the same operating system, because of the APIs, but as long as the APIs are there, it works.

    In reality, the compiler produces multiple binary executable files, but they're all contained within an application bundle (or package, I can never keep the terminology straight), which is really a folder containing lots of files, but which appears in the GUI to be a single double-clickable application. Localization can be done this way too; you just include all your text strings in each language, and choose whatever's appropriate based on the user's OS-wide preferences.

    And by the way, the old Mac ROM is no longer an issue on Mac OS X, and on classic Mac OS it uses a file on the hard drive instead of querying the actual ROM whenever possible. This was one of the changes made when the original iMac was released.


  • If POSIX was so great, why do we have autoconf?

    To pick up the small shit, obviously. And to deal with systems that predate POSIX. Autoconf is not, fortunately, smart enough to make porting to Really Stupid systems possible.

    Don't bother answering that one, anyone who has written "portable" software knows the Real Answer.

    The implication that I haven't is false. The point is valid: the vast majority of non-mainframe vendors (even Apple, finally) are within some fairly small delta of POSIX; for the sake of demonstration let's say 1 meter. On the same scale, Microsoft is somewhere near Mars. If they don't want to follow standards, there's no reason to bother developing software for their systems, since they've elected to make it unnecessarily difficult to do so. There's just no excuse for that. Be fucked its developers over and went bankrupt because the developers walked away. No reason the same thing won't happen to our favourite Keeper of Evil.

  • Agreed, the whole exercise is pointless. Anybody who doesn't even know what kind of CPU and OS his or her computer is running is 99.99999% certainly using some DOS variant on a peecee. Anybody using a non-peecee can simply look at the label on the front. And it's not too hard to know what OS you're using - Unix users have a pretty good idea, as do Mac users (at least they can say "MacOS"). DOS users can safely be assumed to support a certain subset of functionality, and so on.

    Let's put this in perspective. People are generally capable of going out and downloading the appropriate binary for their system when presented with a list of options. It's not much of a leap from there to downloading an appropriate compiler for their platform. And typing "make" ain't too tough. Bottom line: Even inexperienced computer users are generally capable of reading and following simple directions.

  • Sometime around 1995, Apple's Macintosh system went from the 680x0 architecture to the RISC-based PPC architecture. Since there were no applications, including the MacOS itself, written for this new architecture, just about nothing was PowerMac native at the PPC Macs' introduction. Despite this annoying problem, most casual software ran faster and the OS felt far more responsive. Apple managed to do this by re-writing only a few routines in the MacOS. Basically they followed the 90/10 rule and re-wrote the most-used 10% of the code.

    This gave the first speed boost.

    Later, the gurus over at Connectix created a program called Speed Doubler which aimed to speed up your Mac by 2x. On Power Macs, one of the ways this was done was by using a super cool dynamic recompiling emulator that would save decoded instructions in a sort of cache and simply execute those instead of going through the emulation process all over again. This product was marketed as speeding up your Mac by 2X, and while it didn't, the speed increase it offered was still remarkable enough for most PowerMac users to justify purchasing it (or at least pirating it :-); I still run its descendant to this very day on my old 5200). Later on, Apple included its own version inside the so-called PCI PowerMacs and up.

    This was the second speed boost.

    Over the course of the next few years, Apple embarked on an aggressive OS development schedule which, among other things, brought more PPC code to the system. It was funny, because the new PPC code would almost always just cancel out the fact that the OS was slower because of these new features, but if you read the literature the OS was getting faster with each revision by leaps and bounds. If you follow the posts, Apple's OS should have become self-aware sometime around 1999.

    The funny thing is, to this day, the MacOS still has an absurd amount of 68k code left in it, yet it's faster than MacOS X by leaps and bounds despite the fact that MacOS X is 68k-free...

    Anyways... As a member of the Mac community up until recently, I've heard all sorts of promising emulation technologies. They never work. The best I've ever gotten in real-world performance is 1/8 the speed. Forget it. This technology is not worth following. Go visit my web page to see new promising technology :-)
  • I thought about something similar. Perhaps it was brought up some months ago when HP started telling people about Dynamo. I was thinking how interesting it could be to apply this as a post-compilation optimization technique. There is every reason to believe that the same techniques can be used to analyze and improve native code performance too.

    In fact, this could be an interesting kernel module development project. Allow the Linux kernel to optimize running executables on the fly. Create permanent entries in the filesystem to store optimized binaries and perhaps a running binary diff so that the optimizer could undo something if need be. If enough of the research is in the literature and someone in the audience is frantically searching for a PhD research project...
  • My opinion is that the whole history of computing has been moving mainframe technology down to the personal computer. Virtual Memory and emulation combined together seamlessly will in time be the norm on personal computers, at least I hope so. The idea is that the user can just click on an old Macintosh application or an old DOS application and automatically boot into the appropriate virtual machine with emulated CPU.

    Once upon a time Detroit said that what people wanted was more and more power out of their automobiles. Then the Japanese came up with the astoundingly brilliant innovation that what people really wanted was reliability and fuel economy. Right now the computer industry more or less says that we need more and more CPU power, but I think they are going to eventually find that what consumers want is reliability and compatibility. They don't want to be told "Program X will not run on your computer". Instead they will sacrifice a little performance to have something that will be able to run all of their older games and other programs.
  • Digital was doing this with a tool called VEST in about 1990-1991. VEST would translate VAX binaries into Alpha binaries... When the first version of OpenVMS/Alpha came out, several of the OS tools were VESTed, rather than ported. The EDT editor is one I can think of.


    there are 3 kinds of people:
    * those who can count

  • A DAG is not the same thing as a tree. Try this [uwaterloo.ca] for a refresher.

    What the AC said (not having any mod points at the moment). (All trees are DAGs, but not all DAGs are trees - a node in a DAG can have more than one parent (e.g. Java inheritance puts classes in a tree, but C++ is a DAG)).
  • OH SORRY.

    FX!32, or whatever it was that DEC and MS co-developed to run x86 apps on Alphas running NT
  • The project leader Cristina Cifuentes

    So that's where she got to. She taught me first year CompSci at Tas Uni (Hobart) in 1996. Small world.

  • If you were a friend who sent me a neat program that you'd written and wanted to show it off (complete with !!!!!'s in the title), I'd just delete it.

    Remember that rule about not running attachments, even those sent to you by friends.

    Subject: Check out this great program I wrote

    Dood, this is leet. Run this.

    Attachment: leetapp.exe

    P.S. I wrote this in leetspeak first because it was funny, I thought, and Slashdot said the following:

    Lameness filter encountered. Post aborted.

    Reason: Junk character post.

    Maybe this should be cross linked to all those comments about Microsoft adding McHappyLinks to your web site in their browser, or censorship.

    Personally I think it's quite reasonable to use 'leetspeak' sarcastically in an attempt at humour, but obviously the autofilters of Slashdot don't agree. Bastion of free fucking speech indeed.

  • You can do this w/ SGI's native compiler. They have a tool called SpeedShop. Using a command like

    ssrun -ideal foo

    (or something like that: books @ work) and then using another tool with a command like:

    prof -feedback_somethin foo.ideal.m1234

    where the foo.ideal.m1234 is a file created as the process (w/ pid 1234) ran, you get some feedback file called foo.fdb.

    Then, you recompile your app using a switch where you give it the name of the feedback file. Voila! Unfortunately, I have only tried it with a toy executable, and performance actually decreased slightly compared to using -Ofast=ip32_12k and other options. I dunno if gcc has something similar. I certainly don't remember seeing it in any of the documentation.

  • No, my point was to develop something that would translate the OS and all other binaries as a whole. Think of ALL your binaries as one system. Now binary cross-compile. And yes, you'd have to identify BIOS calls and do something equivalent on another system. The reason why I chose Windows as an example was because it is a huge mess of cross-dependent code and, judging by its stability, not all the code is kosher. If your binary cross-compiler can handle Windows, it can be presumed to be good.
  • Anyone else have more to say about FX!32? I'd be interested in more info.

    FX!32 was pretty cool. It was made up of 3 basic components: an emulator, a binary translator, and a system call hack.

    The emulator was used the first time an x86 program was run. It was pretty dog slow, as most instruction-level emulators are.

    The binary translator was used to recompile that binary code into native alpha code, eliminating the emulation overhead. This was done on the fly; chunks of code that were emulated were translated and stored to eliminate the need for further emulation.

    Finally, the system calls were trapped and mapped to the native Alpha system calls, meaning you took basically no performance penalty for any windows API calls.

    Together, this let you run x86 applications at about 70% of native speed (once you were out of emulation). It worked really well. Unfortunately, those NT Alphas weren't cost-effective when compared to their x86 counterparts, except in some pretty specialized, floating-point-intensive fields.

  • You're somewhat correct. I'm really not satisfied with the optimisation matrix they present at that (pr-heavy) site. The only optimisation flag they used was -O. I'd like to see the impact of the runtime-bbt engine on hairier optimised code. I've not dug deeply enough to find out if other information has been released.

    However, this is tangential to the real purpose of Dynamo. It's primarily intended as the second stage of a dynamic instruction translation engine. The fact that they could run native code on it and see performance improvements was a nice bonus. :-)
  • I think you are wrong. The horsepower costs battery life and THAT is what the consumer wants more of when using a cellphone.

    The second reason is that JVMs are not that universal now either on normal computers. Has anyone seen browsers, real games (not tic-tac-toe or the like) or compilers in Java? I don't think that cellphone manufacturers are going to leap ahead of their desktop predecessors in making more effective use of things such as MIMD, multiple cores, etc.

    I think Java is bloated, just like C++. What we need is better compilers so we can more effectively use the available processing power.
  • They did a really clever thing of identifying long "runs" of code that nobody ever jumped into the middle of, then they treated them as one big instruction with a lot of side effects, and optimized them as a block. (Not one instruction at a time, but the whole mess into the most optimal set of new platform instructions they could.)

    "Basic-block analysis", which is basically what this is, is a common technique that all good compilers perform on source code or intermediate representations of source code for optimization. This technique has probably been around since the 1960's.

    It was quite clever. It's also quite patented, and has been since before the Power PC came out. (And in a sane world those patents would have expired by now, but with patent lengths going the way of copyright...)

    The patent application must have read "basic-block analysis ...but on machine code!", to match all of the "...but on the web!" patents that have been granted in recent years. Innovation is truly dead.
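
    For anyone who hasn't met the term, here's a rough sketch in Python of the leader-finding step of basic-block analysis (the toy instruction list is invented for illustration): a "leader" is the first instruction, any branch target, and any instruction following a branch; a block is the run from one leader up to the next, which control can only ever enter at the top.

      # Toy program: (opcode, branch target or None)
      program = [
          ("load",  None),   # 0
          ("add",   None),   # 1
          ("jnz",   5),      # 2  conditional branch to 5
          ("mul",   None),   # 3
          ("jmp",   6),      # 4  unconditional branch to 6
          ("sub",   None),   # 5
          ("store", None),   # 6
      ]

      BRANCHES = {"jmp", "jnz"}

      def basic_blocks(prog):
          leaders = {0}                                  # first instruction
          for pc, (op, target) in enumerate(prog):
              if op in BRANCHES:
                  leaders.add(target)                    # branch target starts a block
                  if pc + 1 < len(prog):
                      leaders.add(pc + 1)                # fall-through starts a block
          starts = sorted(leaders)
          return list(zip(starts, starts[1:] + [len(prog)]))

      # Each (start, end) run can be optimized as one big "instruction",
      # since nothing ever jumps into the middle of it.
      print(basic_blocks(program))    # [(0, 3), (3, 5), (5, 6), (6, 7)]
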
  • I believe Alphas did something like this. The only real problem is that unless the underlying hardware is very much like that of the software's native platform, the OS likely won't boot. To make stuff like this work you'd either have to do the whole motherboard in software (like VirtualPC) or sell it as an add-on (Orange Micro and Apple used to do this); neither solution offers comparable performance to native hardware.
  • You know a DAG isn't a tree, right?
  • I would have thought that endianness (sp?) would be one of the easier problems to solve with this approach...
  • No, I think you'll find a directed acyclic graph is a directed graph with no cycles. Nothing particularly to do with GOTOs.

  • VMWare is an emulator - kind of. I'm running it now, and it emulates most of the hardware (you only have to look at the Control Panel - "Display: VMWare"), but not the processor itself.

  • One of the points in this statement was that this would free us from depending on existing architectures. Sure, but this revolution would just put us in the hands of one company instead of the few that are there today. This really doesn't seem like any kind of freedom, does it?
  • The algorithms which you program a computer with form a general data-dependency DAG. This is what is at issue here. A tree is not sufficient to store this data, nor do the relevant algorithms work on a tree.
  • They're using a directed acyclic graph representation to do register dependency and scheduling analysis. There are a series of transformations which you can apply to such graphs which correspond to loop unrolling and other optimizations, and you can use this to find faster programs. But I suppose you don't care about things like that. You just want to make fun of a technology you could never develop yourself.
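
    A toy illustration of why that structure is a DAG rather than a tree (the instructions are invented for the example): the value loaded by i1 feeds two later instructions, so that node has two consumers (the "more than one parent" case a tree cannot express), and any topological order of the graph is a legal schedule.

      # name -> (toy three-address instruction, names it depends on)
      instrs = {
          "i1": ("r1 = load a",   []),
          "i2": ("r2 = load b",   []),
          "i3": ("r3 = r1 + r2",  ["i1", "i2"]),
          "i4": ("r4 = r1 * 2",   ["i1"]),        # i1 feeds both i3 and i4: not a tree
          "i5": ("store r3, r4",  ["i3", "i4"]),
      }

      def schedule(instrs):
          """List scheduling: emit an instruction once all of its inputs are ready."""
          done, order = set(), []
          while len(order) < len(instrs):
              for name, (text, deps) in instrs.items():
                  if name not in done and all(d in done for d in deps):
                      order.append(text)
                      done.add(name)
          return order

      for line in schedule(instrs):
          print(line)
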
  • Frankly, I must disagree with you -- Debian *does* have mechanisms for automated source (as well as binary) distribution, complete with dependency handling and the like.

    Indeed, apt-get has wowed a great many of my (formerly) Windows-using friends. No installers needed! No library conflicts! etc...

    If by "Unix" having a "great, working system" you mean all major unices having such a system, you're quite right that they don't. However, there are some excellent solutions in place.
  • Nonsense. Why not? This solution doesn't attack issues of competing API standards; it attacks the problem of different CPU architectures -- a problem which open source /does/ solve. You're right to claim that open source is not a silver bullet -- but it certainly is another solution to the primary problem which this development addresses.


    (I'll grant you that this is *not* true for everything else the poster mentioned -- Java, .NET, etc).

  • Why would anyone want this? Insist on source-available applications and you'll never get burned by this. You can just rebuild your applications from source on whatever system you happen to be using, and as an added bonus you'll be using a compiler that understands the target platform rather than relying on hacks.

    There is more information content in the original code for an optimizer to make use of than there is in a binary (or assembly). If this were not the case, wouldn't optimizers run *after* the assembly translation is done? In fact, all reasonable compilers run the vast majority of their optimizations *before* the translation occurs, and only a few small peephole optimizations are done on translated or nearly-translated code. The unfortunate (for them) facts are that:

    • Optimizations done after translation has finished are of limited value and generally produce only very small performance gains
    • There is no reason to translate binaries, with all the difficulty this entails, when it's much easier to simply recompile.
    • In many cases simple binary translation is ineffective anyway, since other properties of the systems are likely to differ (for example, different operating systems use different system calls, syscall numbers, or calling conventions). This requires a great deal of effort (consider replacing one 5-instruction system call with 582 instructions to make 7 different syscalls and include a large chunk of compatibility code to substitute for a system call that the target lacks) to work around, and it's difficult to get it completely right.

    The verdict: don't fall for this. Even if it works, and even if it has no effect on performance in the common case, there's no benefit. The only useful things that can come of this are the magic peephole optimizations they might be using, which should go into general-purpose compilers.
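
    The system-call point in the list above deserves a picture. A hypothetical sketch (the guest syscall number and its semantics are invented for illustration) of what such a compatibility layer ends up doing: one guest call fans out into several host calls plus extra bookkeeping the guest expects.

      import os

      GUEST_SYS_CREATE = 0x21          # invented guest syscall number

      _handles = {}                    # guest-style handle -> host file descriptor
      _next_handle = 1

      def emulate_guest_syscall(number, path):
          """Replace one guest system call with a pile of host calls plus state."""
          global _next_handle
          if number == GUEST_SYS_CREATE:
              fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)   # host call 1
              os.ftruncate(fd, 0)                                    # host call 2
              handle = _next_handle                                  # guest expects a handle,
              _next_handle += 1                                      # not a file descriptor
              _handles[handle] = fd
              return handle
          raise NotImplementedError(f"guest syscall {number:#x} not mapped")

      print(emulate_guest_syscall(GUEST_SYS_CREATE, "/tmp/translated.out"))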

  • some of my friends and relatives (gasp!) DON'T EVEN HAVE A COMPILER INSTALLED ON THEIR COMPUTER!!!

    Why not? High-quality compilers are available, with source if desired, at zero cost.

    Your arguments regarding optimization also apply to distributing files as Java byte code, but the simple fact is, for most applications, nobody gives a damn about optimization anymore anyway!

    People who love to brag about their leet computers think this. Anybody who actually has to do work on them does not. Java is suckass-slow to the point of uselessness, and there's no excuse for wasting CPU power just to be lazy.

    For the few cases in which cycles are that critical, shouldn't the code be written in hand-optimized assembly and made available in system libraries anyway?

    Of course. Unfortunately I believe, unlike you, that there are more shades of code than just "performance-critical" and "non-performance-critical." The 90-10 rule is quite valid in most cases, and inner loops and such should be optimized and the best algorithms available used. But what about the parts of the application that aren't in the system libraries? What if my need for speed isn't just in strcmp(3) but also an AVL tree, an XML parser, and (insert foo here)? These should be written in a compiled high-performance language like C (never Java and almost never C++). It isn't any harder to do this right.

    Have you tried lately to write a non-trivial application where the same source compiles on both Linux and Windows?

    No, why would I? Dozens of vendors got together several years ago to define a standard (heard of POSIX?) to make sure this could be done with a minimum of pain. I can't help it if Microsoft was too busy (drawing mustaches on Larry Ellison|calling Scott McNealy a liar|embracing and extending its mother) that week. Want to run useful code? Use a real OS; there are plenty to choose from.

  • "directed acyclic graph", abbreviated "DAG", is a generic term describing a data structure. You might also call it a "tree". That their internal data is stored in such a structure is almost implied by the nature of the software. That is how compilers generally store their internal data.

    That's what this is: a compiler. Its input is a machine language instead of a high level language. This is interesting, but not necessarily all that useful. It solves a piece of the problem, but not the hardest piece.

    The ability to take an existing piece of code and run a static optimizer on it might be interesting, but I suspect that such a device would exercise enough previously undiscovered bugs in the targeted software to make its use as anything but a testing & debugging tool rather impractical.

    The idea you suggest resurfaces every few years. A while back it was called "thin binaries"; in the late nineties it was called "Java". In any case, it never takes off quite as well as everyone seems to think it should, simply because the processor is only one part of the machine, and not the hardest one to emulate.

    -Mars
  • Well... yes. But I didn't want to spend half an hour discussing the issue. How would *you* have explained it?

    -Mars
  • FX!32 was Digital's software for the Alpha that allowed you to run software written for x86 on Alpha processors under NT.

    It used dynamic recompilation of the sort mentioned here, and from what I've heard, was at a pretty acceptable speed. It also did run-time optimization, or as Transmeta would put it, code morphing.

    I believe there was also an FX!32 compatibility layer for Digital Unix and later Linux, although support was slightly more sketchy. If I remember correctly, this was around the time that Digital made it possible to use libraries compiled for Digital Unix under a Linux environment.

    Anyone else have more to say about FX!32? I'd be interested in more info.

  • Dynamo dynamically optimizes binaries; an equivalent in the Java world is IBM's Jalapeno VM. Unfortunately, the Dynamo approach is only feasible on the HP architecture, because the PA-RISC chip has an absurdly large i-cache (and is extremely aggressive in branch prediction).
  • Nobody made NT on Alpha software

    Well, Microsoft had ported their entire BackOffice suite (Exchange, SQL, etc) over to Alpha. There were also versions of IIS, VisualBasic (for DCOM), Netscape Enterprise, Oracle, Lotus Domino, and so on.

    So, people made the (server) software, just that nobody bought it.

    (My theory was that NT networks generally scale out across multiple boxes rather than up to larger boxes, meaning that few NT shops needed more than a 4-way Intel box, which was the only point at which the Alphas started to get price-competitive.)

    We seriously considered Alpha/NT servers at one job I worked at back in 1995-6. It met our software checklist, but the DEC sales engineers couldn't even get their damn server to boot on two successive visits. Then the Pentium Pro shipped. Game Over.
  • And don't forget DOSEMU and Wine while you are at it, as both of those are using some sort of 'binary translation' as well.

    I think Bochs is great as it allows Intel binaries to run on all sorts of other platforms. You just need a super-fast PC to get some performance out of it...

    I don't want a lot, I just want it all!
    Flame away, I have a hose!

  • The comment that this is not suitable for hand-optimized loops in DSPs plainly means that this is an emulator. What would be cool instead is if someone made a binary cross-compiler that would go through your hard disk's binaries and convert them from, say, x86 to PPC, so that you could take your hard drive from an x86 system, put it in a Mac and have Windows boot natively (modulo ROM issues on Macs). All without access to Windows source code.
  • You can only get so far with narrow-vision algorithmic optimization, as proven by the failure of 40 years of research. (Failure, only as defined as producing code as good as a human can).

    For certain classes of problems, compilers do a hell of a lot better job than people do. I don't think most people are that great at optimizing assembly code by doing things like properly sharing registers. In general, when it comes to optimizing memory access, compilers will beat 99% of all programmers out there.

    But for picking a better algorithm, no, compilers suck. But I don't know of any active research in having a compiler change the algorithm implemented by the programmer, so you're using a straw man here. Then again, it's been 4 years since I've done any compiler research...

    -jon

  • Users primarily want functionality. Besides, I doubt the use of Java will have a large impact on it. Most energy in e.g. mobile phones is used when actually using the connection.

    As for your questions: I've seen an MPEG player written in Java, I believe the Java compiler is a Java program, and I have seen a few nice games in Java, although it isn't Quake of course. I have just spent my afternoon hacking away in NetBeans, a great Java IDE and of course written in Java.

    I just think you should revise your opinion regarding bloatedness. The very least you could consider is wondering why the heck all these mobile hardware guys are deploying Java despite your argument. Presumably they know what they are doing and maybe your arguments are not valid?
  • How about footprint? From the article it sounds like this emulation will require a couple of MB.

    I'm in no way an embedded expert, but I was of the impression that RAM is expensive in the embedded and small form-factor world.

    I can see that this technology might be a time-to-market saver if you have a load of assembler written for one embedded CPU and want to move it quickly to a new platform.

    Hmm, how about the interface with support and I/O chips? This thing is, from what I understand, only a cpu emulator/translator. If you change the platform you will probably have to write new drivers for the other chips.
  • I wonder if the specs for DAG will be open so that code can be compiled directly to it, optimized, and then distributed, saving the first two steps in the process. I can see commercial software vendors being all over this idea.

    Target CPU neutral binary formats have been around for a while. OSF has ANDF [opengroup.org] (Architecture-Neutral Distribution Format). Also check SDE [inf.ethz.ch] (Semantic Dictionary Encoding).

    A hardware neutral distribution format is not the complete solution, though. The target platform that you want to run the executable on has to provide the software environment and APIs that the executable needs.

    So, it is only suitable for distributing CPU-neutral but OS/environment-dependent user-space applications. What it really does is to save you the job of recompiling an application for PPC/x86/SPARC/whatever-Linux. This would certainly make life easier for non-x86 Linux users, but it is not a general solution for making applications platform-independent.

    If you want a run-anywhere solution you also need to define a runtime environment, which is exactly what Java does.
  • Isn't that _exactly_ what FX32 did? AFAIK it dynamically translated binary code for Pentium to Alpha processors with runtime optimization.

    Yep, apart from it did it for 386 and up --> Alpha.

    I think it was released in or before 1997.

    Before - 1996 or 1995, IIRC.

    Simon
  • Why not? High-quality compilers are available, with source if desired, at zero cost.

    Because >80% of users wouldn't know how to install Cygwin or DJGPP if you gave them a manual and a CS degree? Because even if they did get them installed, they wouldn't know how to use them?

    almost never C++

    XML parsers do tend to be written in C++, though. Believe it or not, it can be a fast language.

    heard of POSIX?

    Yeah, heard of GUIs? POSIX != cross-platform applications. Even terminal-oriented Unix-specific apps aren't always source portable without #ifdefs. (As I recently discovered while trying to move some user-management apps from Linux to Solaris.)

    Want to run useful code? Use a real OS; there are plenty to choose from.

    Well, there's a brilliant response. He makes the valid point that most people won't benefit from having the source, so you propose that they switch to a "real OS"? So not only do you not like Windows, and not only do you think other people shouldn't use it, but you're actually opposed to people trying to make it easier to develop cross-platform apps.

  • Why not recompile it natively?

    Ah, that is the question.

    Some believe that dynamic optimization can do things that static optimization can't do. For instance, you can straighten code that used to have a lot of branches in it. You can't do that statically because you don't know which branches will be taken most often. You could do every possible straightening and include them all with the binary, but that would probably be prohibitively large. You could profile the code and use that to direct a recompilation, but then that's nothing more than really-slow dynamic compilation.

    So, once dynamic optimization technology has advanced, it may outperform statically-optimized code even in the same architecture.

    Now, what happens if you dynamically optimize the dynamic optimizer...
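
    A toy sketch of the straightening idea (the control-flow graph and the 90% bias are invented; this is not anyone's real engine): profile which way each branch actually goes, then lay the observed hot path out as one fall-through trace so the common case takes no branches at all.

      import random

      # block -> (taken successor, fall-through successor); None ends execution
      cfg = {"A": ("B", "C"), "B": ("D", "E"),
             "C": (None, None), "D": (None, None), "E": (None, None)}

      taken_counts = {b: 0 for b in cfg}
      total_counts = {b: 0 for b in cfg}

      def profile_run():
          """Pretend execution: record which successor each block branched to."""
          block = "A"
          while block is not None:
              taken, fallthrough = cfg[block]
              total_counts[block] += 1
              went_taken = random.random() < 0.9        # the taken edge is the hot one
              if went_taken:
                  taken_counts[block] += 1
              block = taken if went_taken else fallthrough

      def hot_trace(start):
          """After profiling, chain each block's most likely successor into one trace."""
          trace, block = [], start
          while block is not None:
              trace.append(block)
              taken, fallthrough = cfg[block]
              block = taken if taken_counts[block] * 2 > total_counts[block] else fallthrough
          return trace

      for _ in range(1000):
          profile_run()
      print(hot_trace("A"))    # most likely ['A', 'B', 'D']: the straightened layout
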
  • The problem is caused by binary distribution.

    The solution is source distribution.

    Compilers know more about the program than translators do, and they also allow linking to native libraries. Can a translator do that?

    Yet another proprietary solution to fix another problem caused by proprietary solutions.

  • Mike Van Emmerik is still working on the project, as are 4 students. The project leader, Cristina Cifuentes, is currently doing research at Sun Labs on commercial extensions of this work. There will be an open source project at the end of the year apparently; the code has already been released under a BSD-style license but it is not publicly available as yet. The funding from Sun was a gift to Dr Cifuentes simply because they liked what she was doing. I was just a happy employee when I wrote that broken backend.
  • Uhh... Are you a troll, or are you just really stupid? Have you actually TRIED gzipping a file more than once?

    Original Tar file: 30720 bytes
    Gzipped tar file: 6895 bytes
    Gzipped gzipped tar file: 6923 bytes

    Basically, you get ONE chance to properly discover all of the redundant data. After that, it's pretty much an uphill battle.
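
    A quick way to reproduce that for yourself (the input data here is made up; any redundant file shows the same effect):

      import gzip

      original = b"all work and no play makes jack a dull boy\n" * 700

      once  = gzip.compress(original)
      twice = gzip.compress(once)      # gzipping the already-gzipped data

      # The first pass shrinks the data dramatically; the second pass comes out
      # slightly LARGER, because the redundancy is already gone and gzip is now
      # compressing near-random bytes plus its own header.
      print(len(original), len(once), len(twice))
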
  • I'm not sure if EETimes is oversimplifying, or if Transitive Technologies is filling heads with BS.

    "...Translation, sometimes called software emulation..."
    Translation != emulation

    "...Crusoe specifically takes X86 code...In contrast, Transitive's...[fluffy adjectivies]...can, in theory, be tailored for many processor pairs.."
    Crusoe isn't X86 specific, and it can be tailored for many processor pairs in reality, not just in theory.

    "...We have seen accelerations of code of 25 percent..." doesn't mean that everything runs 25% faster. I don't even hear Transitive technologies saying that it does.

    I wonder how many more companies will come up with new and innovative techniques like this now that Transmeta has become very noticeable? I wonder how long before the cash-strapped Transmeta starts filing patent infringement suits? (Please Linus, make them play fair!)
  • I completely agree about the dynamic content bit -- and I don't think the right answer is for us to have to mail around tarballs of C source and makefiles.

    And while you're correct in spirit, in that converting from Java byte codes to native code is fundamentally the same problem as converting from native Platform A code to native Platform B code, the latter is orders of magnitude more difficult. Java was designed for just such a conversion, and so is about as simple as you can get.

    Real Programs, on the other hand, are unbelievably complex to the point that simply deciding "Is this byte code or data (or both)?" is literally unsolvable in the general case. For comparison, the distinction between code and data is obvious in Java. Plus, for "real" emulation you have to emulate all the weird I/O ports and other basic hardware, none of which is trivial.

    Comparing a Java VM to serious hardware emulation because they are both emulators is like comparing "Hello World" to Macbeth because they are both English text -- technically correct, but not a really meaningful comparison.
  • by jasno ( 124830 )
    Now if they could figure out a way to deal with endianness, and the other 99% of the platform specific stuff in most code, it might be worth something...
  • VMWare isn't an emulator.
  • I believe that the technologies involved in the Dynam(o/ite) projects could speed up Java to the point where it would become a feasible possibility to write Java-only applications.

    And it isn't already?

  • Yeah, the compiler only needs to read the programmer's mind to find out his intentions when doing I/O with binary data. Other than that, it's very simple...
  • If you're thinking in terms of desktop systems and software written in high-level languages, you're right. But the target market of this company is the embedded systems world, where the code is typically hand-optimized assembly and even custom-made instruction sets for systems that are built from heterogeneous proprietary systems. Some proprietary chips are better than others, and often you don't know which is the best solution until you've already implemented the whole thing.

    For the telecom industry, this solution, if it works, is a very good one.
  • Honestly, you can't just say recompile the code. It's not practical, and it doesn't work. If it did, RISC would rule the world, but it doesn't. NT was even put onto various RISC architectures, and it didn't work. Translation is the only way to give processors a chance for legacy code base.

    If you say otherwise, you're ignoring history. RISC processors rock for most applications. Look at Transmeta: a 700 MHz Crusoe can act as, worst case, a 300 MHz Pentium III, using a lot fewer transistors and a lot less power. If MP actually worked, you could get such an advantage based on silicon space and performance/power.

    Opinions?

  • I wonder if they could run the optimizer without the translation layer (or make a ChipX-to-ChipX dummy translation), and squeak some extra performance out of code on any platform?
    This is actually done by the Dynamo Project [hp.com] by HP. From their page:
    The motivation for this project came from our observation that software and hardware technologies appear to be headed in conflicting directions, making traditional performance delivery mechanisms less effective. As a direct consequence of this, we anticipated that dynamic code modification might play an increasingly important role in future computer systems. Consider the following trends in software technology for example. The use of object-oriented languages and techniques in modern software development has resulted in a greater degree of delayed binding, limiting the program scope available to a static compiler, which in turn limits the effectiveness of static compiler optimization. Shrink-wrapped software is shipped as a collection of DLLs (dynamically linked libraries) rather than a single monolithic executable, making whole-program optimization at static compile-time virtually impossible. Even in cases where powerful static compiler optimizations can be applied, the computer system vendors have to depend on the ISV (independent software vendor) to enable these optimizations. But most ISVs are reluctant to do this for a variety of reasons. Advanced compiler optimizations generally slow down compile-times significantly, thus lengthening the software development cycle. Furthermore, a highly optimized binary cannot be debugged using standard debugging tools, making it difficult to fix any bugs that might be reported in the field. The reluctance by ISVs to enable advanced machine specific optimizations puts computer system vendors in a difficult position, because they do not control the keys to unlock the performance potential of their own systems!
  • The more deeply the optimiser is run the BIGGER the percentage speedup Dynamo gives. The reason is that the speedups are different speedups than the ones found by the compiler, and the less time spent in the rest of the code, the more significant the speedups are.

    e.g. the compiler can't optimise into a DLL, but Dynamo can

    e.g. Dynamo can profile the code and optimise virtual function calls in ways that the compiler can't
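
    A toy sketch of that virtual-call point (the classes and call site are invented; this is not Dynamo's actual machinery): once profiling shows one receiver type dominating a call site, the optimizer guards on that type and calls the target directly, falling back to ordinary dynamic dispatch only if the guess turns out wrong.

      class Circle:
          def area(self):
              return 3.14159

      class Square:
          def area(self):
              return 1.0

      def specialize_call(expected_type):
          """What a dynamic optimizer emits once the hot receiver type is known."""
          fast_path = expected_type.area              # resolved once, not per call
          def call(obj):
              if type(obj) is expected_type:          # guard: is the profile still valid?
                  return fast_path(obj)               # direct call, no dynamic lookup
              return obj.area()                       # fallback: full dynamic dispatch
          return call

      # "Profiling" saw only Circle at this call site, so specialize for it.
      area_call = specialize_call(Circle)
      print(area_call(Circle()))    # fast path
      print(area_call(Square()))    # guard fails, falls back gracefully
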
  • "Our claim is that we can run 1:1 or [even] better than native speeds"

    Bullshit.

    Wake me when these guys go out of business. Been here, seen this. The x86 emulator guys made the same claims for their Mac-based emulators, almost word for word. (I won't even get into Transmeta's claims that have turned out to be similar bullshit).

    This is just a special case of an optimizing compiler, which Java run-time optimizers also fall into.

    These claims, as well as the claims for the "magic compiler" that can produce code better than humans, will never happen until we have real human-level AI that can "understand" the purpose of code. You can only get so far with narrow-vision algorithmic optimization, as proven by the failure of 40 years of research. (Failure, only as defined as producing code as good as a human can).



  • this is not either an evolution or a revolution. it is a quick fix for the legacy problems of "modern" computers.

    but still...

    a good idea. for example, the article states that a CISC to RISC translation would still be inefficient, but how much so?? would a 1.4GHz athlon be equivalent to a 500MHz PPC or would it be better? could this allow a much more useful form of emulation, as in i can't afford a g5 mac for MacOS X so i'll just use my athlon?

    also, with the claim of possible speed improvements across RISC to RISC translation, this may light a bit of a fire under the arses of some of the big players (Intel, IBM) to build a new architecture with these optimizations in hardware.

    this could be used as a tool for competition with transmeta with some good hardware backing it up. a CPU could be made as a base and the translation hardware could be pre-programmed to emulate multiple platforms. people would no longer have to worry about which architecture their WinCE apps are compiled for because their chip would run MIPS or ARM at native-like speeds.
  • Look, you're all missing the point here. What I was saying is that using the term DAG is an incredibly poor way of explaining the technology to a layperson.

    Speculating on the basis of what I know of optimization in compiler theory, the data structures in question probably consist of a root node for the start of the program, with mutually independent paths of execution as branches on the graph. They join up eventually, but if you showed the general topology of the graph to a non-specialist, they'll go "Oh, that's a tree."

    Now, at this point, you have two choices. You can say, "Well, actually, it's not a tree per se but a type of data structure known as a 'directed acyclic graph', or DAG for short, because...", with the result that you'll lose your audience completely, or you can say, "Yeah, it does, doesn't it", and get on with your explanation of code translation.

    Unfortunately, when you're explaining a difficult concept to a non-specialist audience, you make a few sacrifices in accuracy in order to try and convey some sense of understanding. The trick is determining how much distortion of the truth through economy is acceptable. Science educators make this kind of compromise all the time, particularly in the physical sciences.

    I think worrying about the subtle (to a layperson) differences between trees and DAGs is unnecessary and an impediment to explaining the concept (code translation) in question.

    And in response to the AC who started this thread, I got a GPA of 6 in Data Structures.

  • No, the issue here is: why bother using all the syllables in 'directed acyclic graph' when just 'tree' (with maybe the qualifier 'dependency' in front) will do nicely, thank you. Nice, ordinary language that the common man has a slightly better chance of understanding, particularly when it comes with a quick explanation of the dependency issues involved (e.g. preventing pipeline stalls by putting an instruction that reads a register as far downstream from the last instruction that wrote the register as possible).

    I fully support /. posters making fun of people mindlessly using jargon to impress clueless journalists. Death to jargon as a means of maintaining the technocratic order. Or something.

  • In most Australian universities, 7 is the highest possible GPA.
  • Just table-based translation with a few ifs ingrained in the code is not good justification for hype; however, some companies survive just that way. Anyhow, they managed to emulate other chips in hardware. That's like a carb in a car that can run on the same fuels; alas, the fittings are not the same, so it cannot be integrated into an engine environment without some heavy modifications.

    Being able to run code from a Pentium on your chips that just modifies registers and address ranges is an interesting challenge, but it's just that. Drivers written for 'common' environments surrounding chips would not work on new platforms, and if they do, that will mean the new platform is just an old one with a new processor that, to externals, is just like a plain Pentium chip. A feat like VMWare is more admirable, thanks to those CISC commands that allow for multiputer-based technologies.

    New statements like the one Apple made a while ago, and as Sun did with their hardware, are more forward-thinking than that mere table lookup embedded in hardware.

    Remember, some companies survive on hype, hyping old or new technology. Transmeta has firmly placed itself in that market share, so it will be tough for this company in the near term.
  • I appreciate the claim made here, and in fact am excited by the possibility of a "renaissance" that is spoken of. But much like Transmeta, how much of this is true? Are there third parties doing the testing on this yet? If so, where are their results and conclusions?

    1. is this.....is this for REAL? [mikegallay.com]
  • We aren't tied to the ancient x86 architecture because of legacy, we're tied to it because of manufacturing contracts with x86's owner -- Intel. It's the industry, not software, which is holding things up for cross-platform capabilities. Bring in a company with Intel's power and reach and we might get somewhere, but another emulator or translator isn't going to make a difference.

  • I think that I shall never see
    A program lovely as a directed acyclic graph

  • IBM was doing this exact same thing with DAISY, although the scope seems a bit narrower: http://www.research.ibm.com/daisy/ It's very interesting that we're just now talking about this stuff. It may get to a point where PC architectures will be able to do something similar to what an AS/400 does....the application is insulated from the hardware completely, and when transported to a new architecture, it automatically translates to run on the new architecture, fully able to exploit the abilities of that architecture.

  • I believe that some of you guys are missing the point. The quote was:

    ...the subject program in the form of what Transitive calls "directed acyclic graphs."

    The author of the article makes it sound as though Transitive has invented DAGs. That's what is funny. Durinia is not a weenie making fun of their technology, and it's not necessarily an attempt on Transitive's part to dazzle.

  • Ars Technica did an article on this topic a year ago. Check this link [arstechnica.com] for the article.
  • by um... Lucas ( 13147 ) on Monday June 11, 2001 @12:16PM (#159758) Journal
    We've had processor and machine emulators and processor independence for so long now...

    SoftPC, Soft Windows, Virtual PC, XF86, Virtual Playstation, Java, WINE, Wabi, MAME, and so many others...

    Why should this one be the news?

    While Java was basically the only one that's tried to dislodge x86, they've all shown that while it's feasible to run another architecture's binaries on top of a CPU, it's not the preferred way of doing things.

    YAE (yet another emulator)

    And big deal if it only translates a program from one binary arch. to another... Without an equivalent OS, the calls have nothing to be translated into...

    And i could lead into the slashdot mantra of if all programs were opensource, we wouldn't need something as sloppy as an emulator anyhow...

    Or am i missing something about the significance?
  • by jilles ( 20976 ) on Monday June 11, 2001 @11:07PM (#159759) Homepage
    Well, cell phones are no longer the limited machines they used to be. They have quite a lot of processing power and can be equipped with several MB of memory. Once you have that, coding C is a waste of time, and time is the difference between profit and loss in the mobile market. A two-month delay can literally make the difference. A company like Nokia produces dozens of different mobile phone types each year. That's why they love cross-platform and couldn't care less that they would have to spend a few dollars more on the hardware. Besides, Moore's law also applies to the mobile market. Mobile hardware is doubling in speed just as fast as desktop and server hardware. Current mobile architectures such as those in PDAs, mobile phones etc. have plenty of horsepower, and most of these architectures already have JVMs running on top of them.

    Java programs are crossplatform. In the mobile market this means that once you have a JVM ported to your phone, you can run a rapidly growing number of programs without any change. That cuts back development time dramatically. C doesn't give you the same advantages because you have to recompile, test and debug before you can expect even the most portable C code to run without a hitch.


  • by WNight ( 23683 ) on Monday June 11, 2001 @06:48PM (#159760) Homepage
    Exactly.

    This is what that company that was mentioned a few months back was doing...

    We all know that if you zip a file again, it gets smaller again, but it takes exponentially more time.

    Couple that with the sort of exponential speed increase you get with repeated recompilation, and you get almost zero-sized files in a fixed amount of time.

    The exponential increase I speak of is that if you get a 22% increase each time, you've got a 48% increase after the second pass, and 81% after the third. It just keeps getting better.

    The drawback with this is that because each recompilation of the program is a different binary (or it wouldn't be faster) it takes a new memory block. This means that the ram requirements approach infinity as well. Kinda nasty.

    But, the patented part of this was that the company was going to use a Ram Doubler(tm?) type technology to compress the program in RAM, as well as the file. This then gets nearly infinite compression in a little over twice the time taken for single compression (there's some overhead) and about three to five times the RAM (there's more overhead in storage) required for just a standard 1-pass 30% compression algorithm.

    The neat thing is it doesn't require quantum computing or anything, it's all off-the-shelf stuff, just linked in a neat way.

    This'll revolutionize the market when they release it... we think MP3s are small! An 80GB HD will offer nearly endless storage.
  • by TheTomcat ( 53158 ) on Monday June 11, 2001 @12:27PM (#159761) Homepage
    if everyone wrote assembler, and didn't ever depend on anyone else's libraries (including OS, and BIOS), this would work.

    But this won't work, for the same reasons that DOS software won't run natively on Linux. There's too much dependence on general-use code (like OS-based interrupts (21h, f'rinstance)). (Not that that's a bad thing; just in this circumstance, it makes straight-up translation impossible.)
  • Every PCI PowerMac has a 68K (CISC) to PPC (RISC) dynamic recompilation emulator in it that it uses for executing 68K code. And MHz for MHz, the execution speed of the 68K code when dynamically recompiled as PPC code, is roughly comparable (plus or minus 50%?) to the speed of the original 68K code on a 68K processor.

    The very first PowerMacs (NuBus based) used instruction-by-instruction emulation to run all the old 68K Mac code, including some parts of the OS that were still 68K.

    The second generation PowerMacs (PCI based) included a new 68K emulator that did "dynamic recompilation" of chunks of code from 68K to PowerPC, and then executed the PPC code; this resulted in significantly faster overall system performance.

    Connectix later sold a dynamic recompilation emulator ("Speed Doubler") for Nubus PowerMacs, that did, in fact, double the speed of those machines for many operations, mainly because so much of the OS and ROM on the first-gen PowerMacs was still 68K code.

    I think that dynamic recompilation has a bright future; x86 may eventually be just another "virtual machine" language that gets dynamically recompiled to something faster/more compatible/etc at the last moment.

    -Mark
  • by El ( 94934 ) on Monday June 11, 2001 @12:54PM (#159763)
    As opposed to BASIC, which is an extremely stupid solution to the same problem?

    I don't see how the problem that I'd like to send you dynamic content via email without requiring you to be running the same CPU as I am is caused by closed standards. On the contrary, it seems to be an inevitable side effect of competition in the processor market. Yes, it is an obvious solution: given that I can do on-demand translation to the Java Virtual Machine, how much harder is it to do on-demand translation to the instruction set of a real CPU?

  • by zpengo ( 99887 ) on Monday June 11, 2001 @12:07PM (#159764) Homepage
    It's funny how things are heading these days. Java, .NET, and dynamically-translating processors are all "brilliant solutions" to a problem that was caused by closed standards in the first place.
  • You're right in the case of the desktop and applications world. However, in the embedded world, such as cellphones and 802.11, this is VERY useful. The problem of multiple proprietary platforms is the current bane of the telecom industry, which this company is clearly targeting.
  • by CraigoFL ( 201165 ) <slashdot@kanook. n e t> on Monday June 11, 2001 @12:16PM (#159766)
    From the article:
    "Translating CISC to RISC is bit like pushing uphill, but we can get close to parity in performance assuming the same clock speed," he said. "That's because we work the 90:10 rule on the fly. The software spends 90 percent of its time in 10 percent of the lines of code. That means for RISC-to-RISC and CISC-to-CISC translations, we are able to make improvements. We have seen accelerations of code of 25 percent."
    I wonder if they could run the optimizer without the translation layer (or make a ChipX-to-ChipX dummy translation), and squeak some extra performance out of code on any platform?
  • by Spy Hunter ( 317220 ) on Monday June 11, 2001 @02:19PM (#159767) Journal
    GCC might know a lot about processor architecture but it can't know about what tasks you will be asking the compiled application to do, so it can't optimize for that.

    Hmmmm, I just had a crazy idea. What if you could compile your GCC application in a special way, then run it under simulated normal working conditions and have it log performance data on itself, just the kind of data that these run-time optimizers gather. Then, you could feed GCC this collected data along with your application's source and recompile it and GCC would be able to turbo-optimize your app for actual usage conditions! If it can be done on-the-fly at run-time, it can be done even better at compile time with practically unlimited processor time to think about it.

    Even if the end-user used the application in a nonstandard way it might still provide a performance benefit because there are lots of things that a program does the same way even when it is used in a different way.

    Would this be feasible? Would it provide a tangible performance benefit? (like HP's Dynamo?) Comments please!
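
    GCC can, in fact, do roughly this; it's usually called profile-guided (or feedback-directed) optimization, driven by flags such as -fprofile-generate and -fprofile-use (older releases spell it -fprofile-arcs / -fbranch-probabilities). A minimal sketch of the three-step loop as a Python driver script, where app.c and the --typical-workload argument are placeholders for whatever you are actually building and measuring:

      import subprocess

      def run(cmd):
          print("+", " ".join(cmd))
          subprocess.run(cmd, check=True)

      # 1. Build an instrumented binary that logs its own branch/call behaviour.
      run(["gcc", "-O2", "-fprofile-generate", "app.c", "-o", "app-instrumented"])

      # 2. Run it under representative, simulated working conditions.
      run(["./app-instrumented", "--typical-workload"])

      # 3. Rebuild, letting the compiler optimize for the recorded behaviour.
      run(["gcc", "-O2", "-fprofile-use", "app.c", "-o", "app-optimized"])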

  • by Anonymous Coward on Monday June 11, 2001 @12:22PM (#159768)
    Yeah, that's why StarOffice runs so well on SGIs. It's been open-sourced for almost a year now, and still hasn't been compiled successfully on MIPS.
  • by landley ( 9786 ) on Monday June 11, 2001 @01:01PM (#159769) Homepage
    MetroWerks here in Austin did the emulation layer for Apple's M68K->power switchover. They did a really clever thing of identifying long "runs" of code that nobody ever jumped into the middle of, then they treated them as one big instruction with a lot of side effects, and optimized them as a block. (Not one instruction at a time, but the whole mess into the most optimal set of new platform instructions they could.)

    It was quite clever. It's also quite patented, and has been since before the Power PC came out. (And in a sane world those patents would have expired by now, but with patent lengths going the way of copyright...)

    Eventually, when the patents expire, this sort of dynamic translation will be one big science with Java JITs, code morphing, and emulation all subsets of the same technology. And somebody will do a GPL implementation with pluggable front and back ends, and there will be much rejoicing.

    And transmeta will STILL be better than iTanium because sucking VLIW instructions in from main memory across the memory bus (your real bottleneck) is just stupid. CISC has smaller instructions (since you can increment a register in 8 bits rather than 64), and you expand them INSIDE the chip where you clock multiply the sucker by a factor of twelve already, and you give it a big cache, and you live happily ever after. Intel's de-optimizing for what the real bottleneck is, and THAT is why iTanic isn't going to ship in our lifetimes.

    Rob
  • by philj ( 13777 ) on Monday June 11, 2001 @12:23PM (#159770)

    Here's the homepage for the company - Transitive Software [transitives.com]
    (Apologies for the Karma whoring)
  • by dutky ( 20510 ) on Monday June 11, 2001 @12:49PM (#159771) Homepage Journal
    I'm not saying that their product doesn't work (though I seriously doubt that they can get an improvement in speed from anything other than their hand-picked benchmarks) but that they are probably just trying to spin an established (but under-reported) technology in order to attract venture capital.

    There is no renaissance in computing that will be ushered in by this product. We have already seen its like with DEC's FX!32 (Intel to Alpha) and Apple's synthetic68k (M68k to PowerPC), as well as a number of predecessors (wasn't there something like this on one or another set of IBM mainframes?) and current open source and commercial products (Plex86, VMware, Bochs, SoftPC, VirtualPC, VirtualPlaystation, etc.), all of which use some amount of dynamic binary translation, and none of which have set the world on fire. They are mildly useful for some purposes, but the cost of actual hardware is low enough to kill their usefulness in most applications.

    I wish these guys luck, but I doubt anyone will be too enthusiastic about this product. They might have stood a chance if they'd pitched this thing a year or two earlier (when there was lots of dumb money looking to be spent) but they are probably toast today.

  • by saurik ( 37804 ) on Monday June 11, 2001 @12:19PM (#159772) Homepage
    This was being worked on a few years ago by some people at The University of Queensland. Unfortunately, they got tired of the project (and, if I remember correctly, weren't getting much popular support).

    Their website is at :
    http://www.csee.uq.edu.au/~csmweb/uqbt.html [uq.edu.au]

    "UQBT - A Resourceable and Retargetable Binary Translator"

    To note, they mention that they got some funding from Sun for a few years. (Likely either causing or due to their work on writing a gcc compiler back-end that emits Java byte-codes.)
  • by El ( 94934 ) on Monday June 11, 2001 @01:58PM (#159773)
    Why bother? Suppose I come up with a neat program on my SparcStation, and I want to email it to all my friends to show it off. Now, maybe recompiling from source isn't a problem for you and your small circle of friends, but truth be told, some of my friends and relatives (gasp!) DON'T EVEN HAVE A COMPILER INSTALLED ON THEIR COMPUTER!!! My only choice is to send them an executable. Again, maybe you have such a small circle of friends that you can keep track of what kind of computer each of them is running. But quite frankly, some of my relatives, when asked "do you have an x86, PowerPC, 68000, or Sparc chip in that there puppy" can only respond with "huh?!?"

    Your arguments regarding optimization also apply to distributing files as Java byte code, but the simple fact is, for most applications, nobody gives a damn about optimization anymore anyway! Let's see, even if your favorite text editor were 100, or even 1000 times slower, would you be able to type faster than it can buffer input? I don't think so! For the few cases in which cycles are that critical, shouldn't the code be written in hand-optimized assembly and made available in system libraries anyway?

    Your argument that straight binary translation is useless, and that you also need to re-create the entire run-time environment, is a good point. This, however, is an argument in favor of using Java (or some equivalent), and is an argument AGAINST distributing everything as source. Have you tried lately to write a non-trivial application where the same source compiles on both Linux and Windows? (It can be done, but it is EVEN LESS FUN THAN HERDING CATS!) Fact is, this whole discussion is fairly pointless because run-time environment compatibility is both much more important and much harder to achieve than mechanical translation of one machine's opcodes to another machine's opcodes.

  • by egomaniac ( 105476 ) on Monday June 11, 2001 @02:58PM (#159774) Homepage
    Yes, and look how well that's worked for the Unix camp.

    I'm not dissing open source -- I'm just pointing out the realistic view that it doesn't instantly solve all your problems. I realize that just about everybody on Slashdot will freak out about this, but I actually don't like Linux. I don't use it. I think it's just another Unix, and much as I dislike Windows I don't have to spend nearly as much time struggling with it just to (for example) upgrade my video card. (Cue collective gasps from audience). Unix has its place, and I think that place is firmly in the server room at this point in time.

    (Disclaimer: This is not intended to be a troll. Please don't interpret "I don't like Linux" as "I think Windows is better than Linux", because I don't like Windows either. I think they're both half-assed solutions to a really difficult problem, and I think we can do better. What I mean is more along the lines of "If you think the open source community has already created the Holy Grail of operating systems, you've got to get your heads out of your asses and join the real world")

    So, if your thinking is that Linux should be the only platform because it represents the One True Way -- I answer that by saying that you sound an awful lot like a particular group in Redmond, WA that also thinks their platform is the One True Way.

    This industry cannot exist without competition, open *or* closed. Saying that these problems exist because your platform is not the only one in existence is incredibly childish.
  • by poot_rootbeer ( 188613 ) on Monday June 11, 2001 @12:18PM (#159775)
    Let's say my source architecture uses interrupt-based I/O. My target uses memory-mapped. Will this translator be able to handle that?

    To be honest, translating one CPU's version of 'CMP R1, R2' to another's doesn't sound like it will usher in a renaissance of anything.

    -Poot
  • by BlowCat ( 216402 ) on Monday June 11, 2001 @12:30PM (#159776)
    If we are talking about open source:
    How many people would want to run a "translated" web server? Database? Scientific application? How reliable can it be? Why not recompile it natively?

    If we are talking about closed source:
    The same questions except the last one, plus the lack of technical support for non-native architectures from at least some vendors (e.g. Apple).

  • by The Monster ( 227884 ) on Monday June 11, 2001 @02:00PM (#159777) Homepage
    I wonder if they could run the optimizer without the translation layer (or make a ChipX-to-ChipX dummy translation), and squeak some extra performance out of code on any platform?
    I had exactly the same thought. The article says [as always emphasis mine]
    The Dynamite architecture is based around a translation kernel, with a front end that takes code aimed at a source processor and a back end that aims the translation at a new target. The front end acts as an instruction decoder, building an abstract, intermediate representation of the subject program in the form of what Transitive calls "directed acyclic graphs." The kernel can then perform abstract, machine-independent optimizations on this representation.
    So, x86 => DAG => x86 should work just fine. In fact, x86 => DAG => x86 => DAG => x86 should produce exactly the same code on the second iteration; I wouldn't be surprised if Transitive is doing exactly this to test whether the optimizer is working correctly. At this point, Dynamite sounds conspicuously like Dynamo.
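
    Purely as an illustration (nothing below is Transitive's actual representation), a node in such a DAG-style IR might look like the sketch that follows. Operands are edges to shared nodes rather than copies, which is what makes it a graph instead of a tree, and is what lets the kernel do machine-independent rewrites such as constant folding:

        /* Illustrative DAG node for a machine-independent IR (not Transitive's format). */
        #include <stddef.h>

        typedef enum { OP_CONST, OP_REG, OP_LOAD, OP_ADD, OP_MUL, OP_STORE } Op;

        typedef struct DagNode {
            Op op;
            long value;               /* immediate for OP_CONST, register number for OP_REG */
            struct DagNode *src[2];   /* operand edges; shared nodes = common subexpressions */
        } DagNode;

        /* The simplest machine-independent optimization on such a graph:
         * fold an addition of two constant nodes into a new constant node. */
        DagNode fold_add(DagNode *a, DagNode *b)
        {
            DagNode n;
            n.op = OP_ADD;
            n.value = 0;
            n.src[0] = a;
            n.src[1] = b;
            if (a->op == OP_CONST && b->op == OP_CONST) {
                n.op = OP_CONST;
                n.value = a->value + b->value;
                n.src[0] = n.src[1] = NULL;
            }
            return n;
        }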

    I wonder if the specs for DAG will be open so that code can be compiled directly to it, optimized, and then distributed, saving the first two steps in the process. I can see commercial software vendors being all over this idea.

  • by Anonymous Coward on Monday June 11, 2001 @12:30PM (#159778)
    This sounds an awful lot like the dynamic recompilation of MIPS to x86 done in many emulators (such as UltraHLE [ultrahle.com], Nemu [nemu.com], Daedalus [boob.co.uk] and PJ64 [pj64.net]).

    I've been working on the dynarec for Daedalus for about 2 years now, and currently a 500MHz PIII is just about fast enough to emulate a 90MHz R4300 (part of this speed is attributable to scanning the ROM for parts of the OS and emulating these functions at a higher-level). Of course, optimisations are always being made.
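
    The heart of a dynarec like this is usually a dispatch loop over a translation cache, roughly as in the sketch below (a generic illustration, not Daedalus source; translate_block() and the guest register layout are invented for the example):

        /* Generic dynarec dispatch loop (illustration only, not Daedalus code).
         * Each guest block is translated once, cached by guest PC, and re-run natively. */
        #include <stddef.h>
        #include <stdint.h>

        #define CACHE_SLOTS 4096                    /* direct-mapped cache, for simplicity */

        typedef struct { uint32_t pc; uint32_t gpr[32]; } GuestCpu;
        typedef uint32_t (*HostBlock)(GuestCpu *);  /* translated code returns the next guest PC */

        /* Hypothetical back end: decode guest code at pc, emit host code, return a pointer to it. */
        extern HostBlock translate_block(uint32_t pc);

        static HostBlock cached_code[CACHE_SLOTS];
        static uint32_t  cached_pc[CACHE_SLOTS];

        void run(GuestCpu *cpu)
        {
            for (;;) {
                unsigned slot = (cpu->pc >> 2) % CACHE_SLOTS;
                if (cached_code[slot] == NULL || cached_pc[slot] != cpu->pc) {
                    cached_code[slot] = translate_block(cpu->pc);   /* translate once... */
                    cached_pc[slot] = cpu->pc;
                }
                cpu->pc = cached_code[slot](cpu);                   /* ...then execute natively */
            }
        }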

    After reading the article, I'd be very interested to see if they can consistently achieve the 25% or so speedups that they claim (even between RISC architectures).

    For those interested, the source for Daedalus is released under the GPL.
  • by woggo ( 11781 ) on Monday June 11, 2001 @12:39PM (#159779) Journal
    whoops! that's supposed to say
    Dynamo dynamically optimizes binaries; an equivalent in the Java world is IBM's Jalapeno VM. Unfortunately, the Dynamo approach is only feasible on HP-PA, because the PA-RISC chip has an absurdly large i-cache (greater than 1 MB). Look at HP's Dynamo site for more information, but IIRC the problem is Dynamo's extremely aggressive branch prediction.

    damn slashcode...

  • by lostguy ( 35444 ) on Monday June 11, 2001 @01:02PM (#159780) Homepage
    ahem [hp.com]. Ignorance does not equal proof.

    To quote:

    The performance results of Dynamo were startling. For example, Dynamo 1.0 could take a native PA-8000 SpecInt95 benchmark binary, created by the production HP PA-8000 C compiler using various optimization levels, and sometimes speed it up more than 20% over the original binary running standalone on the PA-8000 machine.

    That's binary translation from/to the same machine.

    This is basically run-time instruction block reorganization and optimization, which can definitely improve a given binary on a given machine, over compile-time optimizations. Admittedly, a native binary, run through this kind of profile-based optimizer, will probably be faster than a translated-then-optimized binary, but neither you or I can state that with any authority.
  • by Durinia ( 72612 ) on Monday June 11, 2001 @12:23PM (#159781)
    ...the subject program in the form of what Transitive calls "directed acyclic graphs."

    Wow! What innovative technology! I wonder when they will patent these so-called "directed acyclic graphs". And they picked such a cool name! It sounds so mathematical!

    Okay, enough laughing at the expense of clueless reporters...

  • by dmoen ( 88623 ) on Monday June 11, 2001 @12:19PM (#159782) Homepage
    This sort of technology has been around a long time. HP's Dynamo [hp.com] project has been running since 1995. When Dynamo is run on an HP PA-RISC and is used to emulate HP PA-RISC instructions, speedups of up to 20% are seen. That's pretty astonishing: you would think that emulating a processor on that processor would be slower, not faster.

    Doug Moen.

  • by Chairboy ( 88841 ) on Monday June 11, 2001 @12:23PM (#159783) Homepage
    While this is fascinating-sounding technology, it sounds more like a solution in search of a problem. There are already software solutions for emulation (SoftPC, VMWare, etc). There are already cross-platform language solutions (Java, etc) and so on. Despite this, the market for massively cross-platform applications has not really developed. It isn't as if a 25% performance increase is what's holding back the 'renaissance' the author speaks of.
  • by egomaniac ( 105476 ) on Monday June 11, 2001 @01:00PM (#159784) Homepage
    I realize that Slashdotters love to trumpet the Open Source horn, but this comment is absurd. "Open Source" != "runs on all platforms".

    The amount of work necessary to get a complicated X app running on many different flavors of Unix is certainly non-trivial, and that's just *one* family of operating systems. And it either requires distributing umpteen different binaries or requires end users to actually compile the whole damned program. All well and good for people whose lives are Unix, but do you *seriously* expect Joe Computer User to have to compile all his applications just to use a computer? ("how hard is it to type 'gmake all'?" I hear from the audience... as if you'd expect your grandma to do it, and you've *never once* had that result in forty-six different errors that you had to fix by modifying the makefile. make is not the answer)

    The problem of having a program run on multiple platforms is not "caused by closed standards in the first place" as you state. It is caused simply by having multiple standards -- closed or open makes no difference. SomeRandomOpenSourceOS (TM) running on SomeRandomOpenSourceProcessor (TM) would have just as much trouble running Unix programs as Windows does. This is a great solution to a real problem; don't knock it just because you have a hardon for Linux.
