
ARM64 Vs ARM32 -- What's Different For Linux Programmers? (edn.com) 102

New submitter DebugN writes: When ARM introduced 64-bit support to its architecture, it aimed for Linux application compatibility with prior 32-bit software on its architecture. But for Linux programmers, there remain some significant differences that can affect code behavior. If you are a Linux programmer working with — or will soon be working with — 64-bit, you might want to know what those differences are, and this useful EDN article says it all.
This discussion has been archived. No new comments can be posted.

  • I'm not using ARM until we get something over ARM9000.
    • Re: (Score:2, Funny)

      by Anonymous Coward

      Vegeta, what does the scouter say about his ARM level?

    • Re: (Score:3, Funny)

      by Anonymous Coward

      I'm wrestling with how to answer all these ARM questions.

  • by Anonymous Coward

    But if the answer isn't "32bit exactly", I will be really confused...

    • by OrangeTide ( 124937 ) on Monday October 26, 2015 @09:12PM (#50807023) Homepage Journal

      The instruction set for ARM64 is a bit more like MIPS than ARM. I don't think it would matter to an end user, but this new 64-bit mode has made for a lot of extra work for compiler developers, as many of the optimizations developed over the years are no longer effective for 64-bit code. And the bits of operating system and library code that were written for 32-bit or Thumb have to be rewritten as if ARM64 were a totally alien architecture.

      It came as a big surprise really, because years ago when ARM Thumb was added as an extension it was somewhat compatible with the old instruction set, at least in the assembler syntax. That made it relatively easy to port between the two modes. Thumb mode instructions are half as big but still operate on 32-bit values; they are, however, more limited in terms of what they can do and what arguments they will accept. On the Game Boy Advance, the Thumb operations could run twice as fast as the non-Thumb operations because the 16-bit bus for cartridges was relatively slow compared to the CPU performance.

      ps - I prefer to call it aarch64 or ARMv8.

      • The instruction set for ARM64 is a bit more like MIPS than ARM

        That's not really true. MIPS has no condition codes, has branch delay slots, has no PC-relative addressing and no complex addressing modes. About the only thing where AArch64 is more like MIPS than AArch32 is in having a non-architectural PC, which is a decision made by most modern architectures as making the PC a possible destination register complicates a lot of things microarchitecturally. A few similarities between AArch32 and AArch64 that are not shared with MIPS:

        • They both have a set of condition
        • The instruction set for ARM64 is a bit more like MIPS than ARM

          That's not really true. MIPS has no condition codes, has branch delay slots, has no PC-relative addressing and no complex addressing modes.

          I think that OrangeTide meant that the experience of porting to ARM64 is like the experience of porting to MIPS. He mentions specifically that there are major code changes (e.g. in optimizations) that need to be done which weren't necessary when porting to, e.g., Thumb. Those code changes make the porting experience similar to porting to another architecture altogether, e.g. MIPS.

          • Right. A lot of the conditional instructions have been dropped. And a lot of the semantics of existing instructions have changed. It is true that operations still modify status flags to handle conditions, versus MIPS which has branches that perform the comparisons themselves. Obviously the new architecture is not like MIPS, SPARC or POWER, but I'd argue it's not much like ARM either. Aarch r31 is a bit like MIPS r0 though, in that it is a zero value if you read it (rsp). Although it's a bit overloaded on AR

  • by viperidaenz ( 2515578 ) on Monday October 26, 2015 @02:51PM (#50805151)

    It's mostly nothing to do with ARM and much to do with "Moving to a later Linux kernel", implying ARM 32bit CPUs don't run on the latest kernels. But they do.

    • Re:tl;dr; (Score:5, Interesting)

      by Guy Harris ( 3803 ) <guy@alum.mit.edu> on Monday October 26, 2015 @03:16PM (#50805289)

      It's mostly nothing to do with ARM and much to do with "Moving to a later Linux kernel",

      You're thinking of the third item in their list.

      The first item in their list does have to do with ARM; its register set is different, and OS APIs for debugging have platform dependencies - in particular, the Linux kernel handled A64 differently from A32 - and those particular developers happen to be using ptrace() and had to handle A64 differently.

      The second item in their list has to do with the C library doing more atomic load/store operations on A64 for some reason; they speculate that it's "to better support multiprocessor systems."

      The problem here is that the article had a misleading title; it was "ARM64 vs ARM32 -- What's different for Linux programmers" when it should have been "ARM64 vs ARM32 -- What's different for people working at a company whose core technology is a record and replay engine, which works by recording all non-deterministic input to a program and uses just-in-time compilation (JIT) to keep track of the program state". What Undo Software [undo-software.com] are doing is rather specialized and system-softwareish, and they run into issues that wouldn't affect the majority of programmers; those are the issues they're talking about.

      • Of course the title is misleading! For a user-space programmer, the ISA is completely hidden by the compiler and the system libraries (for example, synchronization). Still, the document makes interesting points, such as different behaviour of the compiler (which apparently removes locks in ARM32 but not in ARM64) which could impact performance, especially for highly concurrent applications.

        • Of course the title is misleading! For a user-space programmer, the ISA is completely hidden by the compiler and the system libraries (for example, synchronization).

          Unless you're one of the user-space programmers writing the compiler or the system libraries. :-)

          The programmers at Undo are programmers like that, not typical user-space programmers writing code for which the ISA doesn't matter.

          (And a fair bit of kernel programming is done with the ISA hidden by the compiler and lower-level kernel code, for that matter.)

        • For a user-space programmer, the ISA is completely hidden by the compiler and the system libraries

          There are a few places where it might matter. For example, AArch64 can do atomic operations on pairs of pointers. This is particularly important for some of the RCU code in the Linux kernel, but will be similarly important for some userspace atomic operations. If you're writing a userspace threading library (fairly niche) then the different structure of ucontext_t will be important. It can also be important if you're looking at the ucontext_t that's delivered as an argument to signal handlers (though m

    • I'm starting to wonder what really makes an architecture 32- or 64-bit. So far as I can tell, it's just the pointer size. 32-bit architectures can have 64-bit word sizes (double-word); the processor can internally carry 256-bit cache lines and have enormous data buses and still be an 8-bit 6502.

      Seriously, you could make that: a 6502 with pin-outs for 256-bit data buses, but 8-bit addressing. Prefetches 32 words at once through one cycle into CPU cache. It would be nonsense. The Athlon 64 architectur

      • Re: (Score:3, Interesting)

        by AReilly ( 9339 )

        Only people who don't actually use processors at the instruction set level are uncertain about whether or not a processor is "32 bit" or "64 bit". If you look at the architecture, it is usually very easily apparent. Not always, but usually.

        Does it take more than one instruction to shift a 64-bit value? It probably isn't a 64-bit processor.
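To make the shift test concrete, here is a minimal sketch in C of what a 64-bit left shift looks like when only 32-bit registers are available, next to the single native operation. The type `u64_pair` and both function names are hypothetical, introduced only for this illustration; a real compiler targeting a 32-bit ISA would typically emit a comparable multi-instruction sequence itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit value held in two 32-bit registers. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Left shift built only from 32-bit operations: several steps, not one. */
u64_pair shl64_on_32bit(u64_pair v, unsigned n) {
    u64_pair r;
    n &= 63;                       /* keep the shift count in range */
    if (n == 0) {
        r = v;                     /* avoid an undefined 32-bit shift by 32 */
    } else if (n < 32) {
        /* bits leaving the low half must be carried into the high half */
        r.hi = (v.hi << n) | (v.lo >> (32 - n));
        r.lo = v.lo << n;
    } else {
        /* the whole low half moves into the high half */
        r.hi = v.lo << (n - 32);
        r.lo = 0;
    }
    return r;
}

/* With 64-bit registers the same operation is a single shift. */
uint64_t shl64_native(uint64_t v, unsigned n) {
    return v << (n & 63);
}
```

Both functions should agree on every input; the difference is only in how many machine operations the first one needs.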

        • Now that was funny, first sentence making some kind of absolute statement about people's understanding, but then you hedge in later sentences with "usually" and then "probably".

          The i860 was marketed by Intel as a 64-bit processor because of its 64-bit graphics registers, but it had 32 general-purpose registers of 32 bits each and a 32-bit bus.

        • There are many instructions in x86 that operate on values larger than 32 bits.
          MMX comes with 8 64bit registers
          The movq instruction can move a 64bit value between memory and the MM registers.

        • by Anonymous Coward

          How would you classify a CPU like the Z80?
          It has an 8-bit databus and 8/16-bit registers but internally it is performing all instructions through a 4-bit ALU.
          That's why even a register-register addition takes 4 cycles, not counting the memory refresh.

          It can do 16-bit additions (or left-shift if you so wish) with a single instruction, but I sure wouldn't call it a 16-bit CPU.

        • It's only easy on most modern ISAs because they have unified integer and address registers. Usually we use the size of an address register and implicitly assume that it's the same as the size of an integer register. This gets a bit confusing in some cases: I played with a highly specialised DSP a few years ago that had 64-bit integer registers but only 256-bit address registers (it was intended for a streaming computation that had very few temporary values, though the addresses were all to 64-bit words [i
      • by mikael ( 484 )

        With 8-bit and 16-bit systems, this was the width of the data bus, while memory was mostly limited to 64 Kbytes, though there were all sorts of funky paging schemes that could swap 16K blocks in and out. With 32-bit and 64-bit systems, it's the theoretical maximum amount of uniquely addressable memory: 32-bit = 4 Gbytes, 64-bit = 16 exabytes. But CPU performance is being improved by using 128-bit and 256-bit wide data bus architectures to support SIMD instruction sets like AVX and 3DNow!
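As a quick check of the address-space arithmetic: with byte addressing, 32 address bits cover 2^32 bytes (4 GiB) and 64 address bits cover 2^64 bytes (16 EiB). The sketch below is illustrative only, and the function names are hypothetical. It counts in GiB and EiB because 2^64 itself does not fit in a 64-bit integer.

```c
#include <stdint.h>

/* Number of GiB (2^30 bytes) addressable with `bits` address bits.
   Valid for 30 <= bits <= 64; counting in GiB keeps 2^64 bytes from
   overflowing a 64-bit integer. */
uint64_t addressable_gib(unsigned bits) {
    return (uint64_t)1 << (bits - 30);
}

/* Number of EiB (2^60 bytes) addressable; valid for 60 <= bits <= 64. */
uint64_t addressable_eib(unsigned bits) {
    return (uint64_t)1 << (bits - 60);
}
```

Calling `addressable_gib(32)` gives 4 and `addressable_eib(64)` gives 16. These are theoretical limits; real processors usually implement fewer physical and virtual address bits than the register width allows.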

      • Seriously, you could make that: a 6502 with pin-outs for 256-bit data buses, but 8-bit addressing

        I'm not seeing what you're saying here. How would you be able to address that memory from within the 6502 instruction set, with 8-bit addressing?

        • You'd say, "Get the memory at 0x0c00".

          It would load the memory at 0x0c00-0x0c0F into CPU cache, and copy the 0x0c00 address into a register. If you then tried to add 0x0c0A to that register, it would read it from cache.

          • Oh, so still more than 64k wouldn't be accessible to the programmer
            • Right, just the internal architecture of an 8-bit processor is for some reason now 64-bit.
              • Oh. I guess I think of "32" or "16" bit in terms of the ISA, not so much the other stuff.
                • Which also doesn't make much sense, since there's not much advantage and a lot of disadvantage to 64-bit ISA. The x86-64 ISA adds extra registers and other instructions which would be useful to 16-bit processors, but it also makes word-sized values 64-bits wide by default.
                  • Which also doesn't make much sense, since there's not much advantage and a lot of disadvantage to 64-bit ISA.

                    The primary advantage for consumer devices is that you can access more RAM.

                    The x86-64 ISA adds extra registers and other instructions which would be useful to 16-bit processors

                    That's true, they should have been added years ago.

                    but it also makes word-sized values 64-bits wide by default.

                    Kind of......for a lot of instructions, the default is actually 16 bits. For a mov command, the default is to move 64bits, but you can also specify 8/16/32 bits. I'm not sure it matters too much practically what the 'default' is, though.

                    The x86-64 is indeed a hybrid monstrosity, whose only redeeming feature is backwards compatibility.

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      Will people *please* stop saying that hardware "runs on" software: it's the other way round!

      (If you want to say that a piece of hardware "runs" some software, that's perfectly acceptable -- but not "runs on".)

  • That's easy (Score:5, Funny)

    by pushing-robot ( 1037830 ) on Monday October 26, 2015 @02:52PM (#50805157)

    One has a lot more arms.

  • So in summary (Score:4, Interesting)

    by NaCh0 ( 6124 ) on Monday October 26, 2015 @02:52PM (#50805163) Homepage

    There are no changes for programmers in general. Only the compiler writers need to care. (as usually happens with new cpu architectures)

  • by Anonymous Coward on Monday October 26, 2015 @02:54PM (#50805175)

    This will have absolutely no effect to the majority of programmers that use a higher level language such as Java or Objective-C.

    As the article shows examples, only Assembly and C have changes from the 32 bit version, which are to be expected. Not a big surprise for anyone. I am sure my Python code will run the same as it did on ARM32.

    • by Anonymous Coward

      I am sure my Python code will run the same as it did on ARM32.

      Unless the guys who write CPython get tripped by these differences

    • by Lisias ( 447563 )

      Guys that assume int is 32 bits, instead of using "sizeof(int)" explicitly, will have a bad time.

      And believe me, they're still on the loose even nowadays.

      • by Anonymous Coward

        That should be (sizeof (int) * CHAR_BIT). POSIX requires CHAR_BIT to be 8, so you can reasonably choose to also make that assumption on Unix-ish systems. But on some architectures CHAR_BIT can be 16 or greater, e.g. on a DSP which only has floating point arithmetic. It could emulate 8-bit chars, but it'd be quite slow.

        Portable C programming is actually quite easy, but you just have to stay mindful. It helps if you stick to unsigned arithmetic when practical, which is usually the case because unsigned is the
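A minimal sketch of the `sizeof (type) * CHAR_BIT` idiom in C follows. The macro `BITS_OF` and the three function names are hypothetical helpers introduced for illustration, not part of any standard header.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

/* Width of a type in bits: sizeof counts chars, CHAR_BIT is bits per char. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

size_t int_bits(void)     { return BITS_OF(int); }
size_t long_bits(void)    { return BITS_OF(long); }
size_t pointer_bits(void) { return BITS_OF(void *); }
```

On a typical Linux AArch64 system, which uses the LP64 data model, `int_bits()` should still report 32 while `long_bits()` and `pointer_bits()` report 64. On 32-bit ARM Linux all three are normally 32. Code that needs an exact width is usually better served by the fixed-width types in `<stdint.h>`, such as `int32_t` and `uint64_t`.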

    • by fnj ( 64210 )

      This will have absolutely no effect to the majority of programmers that use a higher level language such as Java or Objective-C ... only Assembly and C have changes from the 32 bit version

      OMG, what drivel. Objective C is a superset of C, and as such shares all of C's characteristics.

    • I am a FORTRAN programmer on Linux, you insensitive clod!
  • by Anonymous Coward

    ARM64 has been coming for so long one wonders if it really matters anymore.

    Yes, it has reached cell phones and tablets - but nobody is running anything but iOS or Android on those.

    ARM64 was supposed to bring us standard motherboards with a standard, documented boot system so that Linux could treat ARM64 just like AMD64 for booting purposes - one standard bootloader instead of custom stuff for every board that quickly becomes a support nightmare.

    Yet years after announcing, just like Power8's move to the mass

    • The market for mobile devices which might run ARM is far bigger than the market for laptops and desktops.

      • by Tablizer ( 95088 )

        Perhaps, but also less profitable.

      • Every modern car with lane detection, pedestrian detection, sign detection, brake assistance, collision detection (and in ten years that will be "state of the art" and become mandatory for every car by law, just like ABS and ESB(? ESS?) became a year ago) has a dozen ARMs and a few DSPs.
        So yes: the simple idea of that lady in the late 1980s is now the most sold high-level processor on the planet.
        Nevertheless there are still odd 8-bit and 16-bit processors sold as well. After all if you save $1 in a device

    • We have a few ARMv8 boxes. One from AMD has a fairly standard layout motherboard and will fit in most PC cases. One from Cavium is in a standard rack-mount case (and, with two sockets and 48 cores per socket, is serving as a very useful test bed for lock contention in a variety of parts of the kernel).
  • by Anonymous Coward

    It's not 32. You see, most blokes, you know, will be playing at 32. You're on 32 here, all the way up, all the way up, all the way up, you're on 32 on your arm. Where can you go from there? Where?

    - Nigel Tufnel

  • Details (Score:5, Funny)

    by Galaga88 ( 148206 ) on Monday October 26, 2015 @03:35PM (#50805411)

    You see, when you have ARM32 vs ARM64 you have to remember that 64 is at least twice as much as 32. So you're going to need to use larger instructions in your program or you're going to have a lot of empty space. Because your functions can go twice as far, you're going to need more data highways to get there without all the congestion. It's like moving from a crowded boulevard to an expressway.

    When it comes to mobile apps, which is where you're going to be programming the ARM, these wider highways occupy valuable space on your mobile board, but it's worth it to reduce congestion by at least a half. Also, because you have larger bits, you can get more numbers in your apps without having to stress the fixed point unit. This means fonts take up less space and as such you can use more serifed typefaces.

    This answer brought to you by That Guy Who Clearly Bullshitted Through His Interview and Got Promoted To Manager Last Week.

    • You see, when you have ARM32 vs ARM64 you have to remember that 64 is at least twice as much as 32. So you're going to need to use larger instructions in your program

      You're joking, but that's actually true. Most AArch32 code these days uses Thumb-2, where most common instructions are 16 bits, a few are 32 bits. There's no Thumb-3 (yet) for AArch64, so all instructions are 32 bits. There almost certainly will be Thumb-3 at some point, though it is likely to be a little while coming and involve profiling a lot of code to determine what the most common instructions are (and there's no point in doing that until the compiler back ends have stabilised a bit).

  • For those of you who are interested, Undo Software [http://undo-software.com] now supports 64-bit ARM. The press release says that it is particularly useful for developers porting code to new architectures http://undo-software.com/press... [undo-software.com]
