x86 Assembler JWASM Hits Stable Release 209
Odoital writes "January 2010 is an exciting month for x86 assembly language developers. Software developer Andreas Grech, better known to the x86 assembly language community and the rest of the world by his handle "japheth," has released another version of JWASM, a steadily growing fork of the Open Watcom (WASM) assembler. The main benefit of JWASM, arguably, is its nearly full support of Microsoft's Macro Assembler (MASM) syntax. As those in the assembly language community may already know, Microsoft's desire to keep supporting the development of MASM has been dwindling over the years (if only measurable by a growing lack of interest, updates and bug fixes), and thus the future of MASM remains uncertain. While Intel-style syntax x86 assemblers such as NASM have been around for a while, JWASM opens up a new possibility for those familiar with MASM-style syntax to develop in the domains (i.e. other than Windows) in which assemblers such as NASM currently thrive. JWASM is a welcome tool that supplements the entire x86 assembly language community and will hopefully, in time, generate new low-level interests and solutions."
Japheth's Other Projects! (Score:2, Informative)
Japheth has a number of rather interesting projects that extend the functionality of DOS.
JEMM, which is his EMM386 replacement: http://www.japheth.de/Jemm.html
HX DOS Extender, which adds Win32 PE & basic API support to DOS to allow the execution of a whole array of apps: http://www.japheth.de/HX.html
Programming from the Ground Up (Score:4, Informative)
Programming from the Ground Up [igsobe.com]
Re:Just for some perspective... (Score:4, Informative)
I'm also a big YASM fan. YASM can generate object files for Windows, OS X, and Linux. That, combined with its macro features, lets you write a single x86 file that can be used on all three platforms.
I'll certainly take a look at JWASM, though!
Re:xor my heart (Score:2, Informative)
No, that snippet compares AL and DX, leaving AX with 1's everywhere except the first different bit, which will hold a 0.
Ex:
AL= 00001101
DX= 00000010 00011001
CWD just puts a 0 in AH
AX= 00000000 00001101
DX= 00000010 00011001
XOR ax,dx computes ax ^= dx
AX=00000010 00010100
DX unchanged
SUB ax,dx computes ax -= dx
AX=1111111111111011
DX unchanged
The XOR swap algorithm to swap ax and dx is:
xor ax,dx
xor dx,ax
xor ax,dx
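For anyone who wants to check the three-XOR swap without firing up an assembler, here is a minimal sketch that models the registers as 16-bit integers. The function name xor_swap and the MASK16 constant are made up for illustration; they are not part of any assembler or library.

```python
MASK16 = 0xFFFF  # AX and DX are 16-bit registers


def xor_swap(ax, dx):
    """Swap two 16-bit values with three XORs, one per instruction."""
    ax = (ax ^ dx) & MASK16  # xor ax,dx : ax = a ^ d
    dx = (dx ^ ax) & MASK16  # xor dx,ax : dx = d ^ (a ^ d) = a
    ax = (ax ^ dx) & MASK16  # xor ax,dx : ax = (a ^ d) ^ a = d
    return ax, dx
```

Calling xor_swap(0x000D, 0x0219) hands back the two values in the opposite order. Swapping a value with an equal value also works, since the intermediate zero is XORed back out.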
Re:xor my heart (Score:4, Informative)
I believe it sets dx to the MSb of ax and ends up leaving ax unchanged.
oops! I guess I'm getting my AT&T syntax and my Intel syntax confused. If it's Intel syntax, then:
cwd        ;; copy MSb of ax to all bits of dx
xor ax, dx ;; if MSb of ax was 1 then flip bits of ax, otherwise, no effect
sub ax, dx ;; if MSb was originally 1, this will add 1 to the flipped bits. otherwise, no effect
So, assuming Intel syntax, this computes the absolute value of ax and sets all the bits of dx to be the sign bit
Re:xor my heart (Score:1, Informative)
Assuming that it is Intel syntax (as opposed to AT&T syntax) then this computes the absolute value of the signed value in ax and stores it in ax, while clobbering dx:
// Note that with respect to 16 bit arithmetic, -x == ~x+1 for all x; the arithmetic negation of x is equal to one plus the bitwise negation of x for all 16 bit x.
// Splats (repeats) the sign bit of ax to every bit of dx.
cwd
// Do nothing if the sign bit of ax was zero; bitwise negate every bit of ax if the sign bit was one.
xor ax, dx
// Do nothing if the sign bit of ax was zero; add one to ax if the bit was one.
sub ax, dx
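The cwd / xor / sub sequence can be traced step by step in a few lines of Python. This is only a sketch under the assumption that the registers are plain 16-bit two's complement values; cwd_dx and abs16 are hypothetical helper names, not real instructions or APIs.

```python
MASK16 = 0xFFFF  # 16-bit register width


def cwd_dx(ax):
    """Model CWD: DX becomes all ones if AX is negative, else all zeros."""
    return MASK16 if ax & 0x8000 else 0x0000


def abs16(ax):
    """Model cwd / xor ax,dx / sub ax,dx and return the final (AX, DX)."""
    ax &= MASK16
    dx = cwd_dx(ax)          # cwd        : dx = 0xFFFF or 0x0000
    ax = (ax ^ dx) & MASK16  # xor ax, dx : flips every bit only if negative
    ax = (ax - dx) & MASK16  # sub ax, dx : subtracting 0xFFFF (-1) adds one
    return ax, dx
```

With AX = 0xFFFB (that is, -5) the result is AX = 0x0005 and DX = 0xFFFF; a positive AX comes back unchanged with DX = 0. One edge case worth knowing: 0x8000 (-32768) has no positive counterpart in 16 bits, so it maps to itself.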
Re:And how does it differ ? (Score:5, Informative)
And how does its syntax differ from NASM and AT&T?
Intel syntax doesn't feel like it was designed by a sadist.
More seriously, this site [imada.sdu.dk] covers some differences. Among the things I like much more about Intel syntax: there's no need to add a ton of visual noise with what-should-be-extraneous $ and % symbols, and things like memory indirection are much easier to learn. Compare "[ebx+ecx*4h-20h]" to "-0x20(%ebx,%ecx,0x4)"; the former almost tells you what it does even if you're not at all familiar with the syntax, the latter definitely doesn't.
The main benefit that AT&T syntax has is that they "hungarian notation" their instructions: movb works on 1 byte, movw on 2 bytes, movl on 4. Most of the time this is extra visual noise (I don't need the 'l' to tell me that 'mov eax, ebx' works on 4 bytes), but it does make memory dereferences more concise. With Intel syntax you'll get a lot of 'dword ptr' stuff lying around to tell how much should be brought in from memory.
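Both spellings of the memory operand compared above, "[ebx+ecx*4h-20h]" and "-0x20(%ebx,%ecx,0x4)", name the same address: base + index*scale + displacement. A small sketch of that arithmetic, assuming 32-bit wraparound; effective_address is a made-up name used only for illustration.

```python
def effective_address(base, index, scale, disp):
    """Compute base + index*scale + disp, wrapped to 32 bits.

    Corresponds to Intel "[base+index*scale+disp]" and
    AT&T "disp(%base,%index,scale)".
    """
    if scale not in (1, 2, 4, 8):
        # x86 addressing only encodes these four scale factors
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF
```

For example, with ebx = 0x1000 and ecx = 3, effective_address(0x1000, 3, 4, -0x20) gives 0xFEC.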
Re:why? (Score:4, Informative)
Not quite. There are always situations when writing an operating system where you need assembly. For example, implementing the actual 'guts' of a context switch requires fine-grained control over what is in each register.
(C programs tend to assume the stack is available. But in the middle of a context switch, it might not. Assembly gives that level of control).
Re:And how does it differ ? (Score:2, Informative)
To answer your other question about benefits, most of the benefit comes from your toolchain. If you're using a toolchain that is designed to work with AT&T syntax, like GCC, then no, there's no benefit. If you want to interoperate with MSVC, there's a ton of benefit. (In particular, if you want to use inline asm in a MSVC program, it uses Intel syntax.)
Re:I'll ask it (Score:5, Informative)
Wikiwars (Score:5, Informative)
Be warned -- JWASM's Wikipedia article was nominated for deletion [wikipedia.org], as it was thought that notability was not sufficiently asserted. The flame war there might spill over here as well. :-(
Re:xor my heart (Score:3, Informative)
The modern equivalent in 32-bit mode is cdq
These are absolutely necessary instructions when performing signed division, because on the x86 line division takes a double-wide numerator from dx:ax in 16-bit, or edx:eax in 32-bit.
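To see why the sign extension matters before a signed divide, here is a rough model of 16-bit IDIV in Python. It assumes the usual behaviour (quotient truncated toward zero, quotient left in AX, remainder in DX); idiv16, cwd_dx and to_signed are hypothetical helper names, and the OverflowError merely stands in for the divide fault a real CPU would raise.

```python
MASK16 = 0xFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def cwd_dx(ax):
    """Model CWD: DX is filled with copies of AX's sign bit."""
    return MASK16 if ax & 0x8000 else 0x0000


def idiv16(dx, ax, divisor):
    """Divide signed DX:AX by a signed 16-bit divisor; return (AX, DX)."""
    dividend = to_signed(((dx & MASK16) << 16) | (ax & MASK16), 32)
    d = to_signed(divisor, 16)
    if d == 0:
        raise ZeroDivisionError("divide by zero")
    quotient = abs(dividend) // abs(d)  # truncate toward zero
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -0x8000 <= quotient <= 0x7FFF:
        raise OverflowError("quotient does not fit in 16 bits")
    return quotient & MASK16, remainder & MASK16
```

Dividing AX = 0xFFF9 (-7) by 2 after CWD, idiv16(cwd_dx(0xFFF9), 0xFFF9, 2), yields quotient 0xFFFD (-3) and remainder 0xFFFF (-1). Zeroing DX instead makes the dividend 65529, and the quotient comes out as 0x7FFC, which is not what a signed -7 / 2 should give.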
Re:xor my heart (Score:2, Informative)
Incorrect. It converts AX to the absolute value of AX.
Re:xor my heart (Score:3, Informative)
Don't think that's quite right. As I said in my other reply, I had to look up CWD in the Instruction Set Reference:
So this means if AX was originally positive, nothing happens, and if AX was originally negative the XOR flips the bits of AX, then the SUB subtracts minus one from it (which is the same as adding one). This is the same as the two's complement unary minus operation. So the snippet computes the absolute value of AX, and stores the result in AX.
Re:why? (Score:5, Informative)
LLVM (Score:3, Informative)
Frankly, optimizing assembly code is a PITA, since there are so many different flavors.
For example, AMD and Intel processors have different types of optimization.
If I were to code in assembly nowadays, I'd prefer to use something like LLVM: http://llvm.org/ [llvm.org] which should be able to generate well-optimized code for any kind of processor, without the hassle of maintaining one routine per processor.
In some very extreme cases (like coding a RC5 decoder or multiprecision routines), it's still useful to use assembler, but in most other cases, I'm sure that LLVM is able to generate code much better than you could achieve manually in the same amount of time.
Re:xor my heart (Score:3, Informative)
the absolute value of your original AX
Come to think of it, that’s just in AX. DX contains the sign of AX.
I.e. it’s this, in terms of pseudo-code:
DX = sign(AX)
AX = abs(AX)
Re:xor my heart (Score:1, Informative)
CWD is Convert Word to Doubleword, and the 16 bit era used DX:AX as the doubleword. GP was thinking of CBW, Convert Byte to Word, which sign extends AL to AX. But yeah, you got it, and are almost ready to start learning how to code for the 386.
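The difference between the two instructions is easy to mix up, so here is a small side-by-side sketch, again treating registers as plain integers. The function names cbw and cwd simply mirror the mnemonics; they are illustrative helpers, not bindings to anything real.

```python
def cbw(ax):
    """Model CBW: sign-extend AL into AX and return the new AX."""
    al = ax & 0xFF
    return al | 0xFF00 if al & 0x80 else al


def cwd(ax):
    """Model CWD: sign-extend AX into DX:AX and return (DX, AX)."""
    ax &= 0xFFFF
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax
```

So cbw(0x12F3) gives 0xFFF3 (the old AH is overwritten), while cwd(0x800D) leaves AX alone and returns DX = 0xFFFF.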
Re:why? (Score:2, Informative)
And this is precisely why Facebook requires 30,000 servers.
They might need 30,000 servers, but at least they don't need 30,000 programmers and another 30,000 testers.