# -------------------------------------------------------------------
# INSTRUCTION SET                     (c) Copyright 1996 Nat! & KKP
# -------------------------------------------------------------------
# These are some of the results/guesses that Klaus and Nat! found
# out about the Jaguar. Since we are not under NDA or anything from
# Atari we feel free to give this to you for educational purposes
# only.
# Thanks to NEUROMANCER for most of the information contained in
# here.
#
# Please note, that this is not official documentation from Atari
# or derived work thereof (both of us have never seen the Atari docs)
# and Atari isn't connected with this in any way.
#
# Please use this informationphile as a starting point for your own
# exploration and not as a reference. If you find anything inaccurate,
# missing, needing more explanation etc. by all means please write
# to us:
#    nat@zumdick.rhein-main.de
# or
#    kkp@gamma.dou.dk
#
# If you could do us a small favor, don't use this information for
# those lame flamewars on r.g.v.a or the mailing list.
#
# HTML soon ?
# -------------------------------------------------------------------
# 1997/03/30 02:27:14
# -------------------------------------------------------------------

This is not complete!!

Assume for the following code that n, c, z are the GPU flags, which
we'll just give the type 'flag', which is an unsigned kind of integer
(1 bit long). Rn is of type 'slword', assumed to be 32 bits long.
The instruction operations are given in "C-code" (hopefully correct).

For people who don't know C:

   x & y        x bitwise and y
   x | y        x bitwise or y
   x ^ y        x bitwise eor y
   x != y       x not equal to y     Result: 0 for equal, 1 for not equal
   x == y       x equal to y
   x << y       shift x left y times
   x >> y       shift x right y times
   x ? y : z    if x then y else z
   (lword) x    x is treated as an lword

the rest should be straightforward.

   slword:  signed 32 bit
   lword:   unsigned 32 bit
   flag:    unsigned 1 bit

   flag     z, n, c;    /* three status flags              */
   slword   Rn, Rm;     /* two general purpose registers   */
   slword   acc;        /* internal accumulator register   */

------------------------------------------------------------------------
Instructions

ABS Rn:
~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010110|00000|nnnnn|
+------+-----+-----+
(22)

   n = 0
   c = Rn < 0
   Rn = (lword) Rn > 0x80000000 ? -Rn : Rn;
   z = Rn == 0

Takes the absolute value of the 32 bit twos-complement value. There's
a bug: 0x80000000 is not handled correctly (it stays 0x80000000).

Examples:
   0xFFFFFFFF -> 0x00000001
   0x7FFFFFFF -> 0x7FFFFFFF
   0x80000000 -> 0x80000000 (special case)


ADD Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000000|mmmmm|nnnnn|
+------+-----+-----+
(0)

   c = (lword) (Rn + Rm) < (lword) Rn
   Rn += Rm
   z = Rn == 0
   n = Rn < 0

Just adds both registers.


ADDC Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000001|mmmmm|nnnnn|
+------+-----+-----+
(1)

   c = (lword) (Rn + Rm + c) < (lword) Rn
   Rn += Rm + c
   z = Rn == 0
   n = Rn < 0

Just adds both registers plus the carry flag, which might be left
over from a previous addition.

Example:
   ; 64 bit add:  R0/R1  LSL/MSL of x
   ;              R2/R3  LSL/MSL of y
   ; x += y
   add   r2,r0
   addc  r3,r1


ADDQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000010|iiiii|nnnnn|
+------+-----+-----+
(2)

   c = (lword) (immediate ? immediate : 32) + Rn < (lword) Rn
   Rn += immediate ? immediate : 32
   z = Rn == 0
   n = Rn < 0

Add with immediate data value contained in the instruction. The
immediate value can be in the range from 1 to 32 (an encoded value
of 0 means 32).


ADDQMOD #immediate,Rn:                 DSP ONLY
~~~~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111111|iiiii|nnnnn|
+------+-----+-----+
(63)

   c = (lword) (immediate ? immediate : 32) + Rn < (lword) Rn
   Rn = (Rn + (immediate ? immediate : 32)) & MODULO;
   z = Rn == 0
   n = Rn < 0

Add with immediate data value contained in the instruction. The
immediate value can be in the range from 1 to 32.
The result is finally ANDed with the contents of the modulo register.
You can easily set up a circular buffer this way:

   movei   #D_MODULO,r10      ;; address of MODULO register
   movei   #buffer,r11        ;; address of our circular buffer
   movei   #buffer_len,r12    ;; size in bytes of our buffer
   subq    #1,r12             ;; prepare for register
   not     r12
   store   r12,(r10)          ;; set it up
loop:
   addqmod #2,r11             ;; go round in circles
   jr      t,loop


ADDQT #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000011|iiiii|nnnnn|
+------+-----+-----+
(3)

   Rn += immediate ? immediate : 32

Like ADDQ except that the status flags aren't affected.


AND Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001001|mmmmm|nnnnn|
+------+-----+-----+
(9)

   Rn &= Rm
   z = Rn == 0
   n = Rn < 0

Bitwise AND of the two registers.

Examples:
   0xAACC3355 & 0xFFFFFFFF -> 0xAACC3355
   0xAACC3355 & 0x00000000 -> 0x00000000
   0xAACC3355 & 0xFF00FF00 -> 0xAA003300


BCLR #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001111|iiiii|nnnnn|
+------+-----+-----+
(15)

   Rn &= ~(1UL << immediate)
   z = Rn == 0
   n = Rn < 0
   c = ?

Clears the specified bit in the designated register. Bits are
numbered from least significant bit to most significant bit.

Example:
   MOVEI #$FFFFFFFF,r0    ; R0: 0xFFFFFFFF
   BCLR  #0,r0            ; R0: 0xFFFFFFFE
   BCLR  #31,r0           ; R0: 0x7FFFFFFE


BSET #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001110|iiiii|nnnnn|
+------+-----+-----+
(14)

   Rn |= (1UL << immediate)
   z = Rn == 0
   n = Rn < 0
   c = ?

Sets the specified bit in the designated register. Bits are
numbered from least significant bit to most significant bit.

Example:
   SUB   r0,r0            ; R0: 0x00000000
   BSET  #0,r0            ; R0: 0x00000001
   BSET  #31,r0           ; R0: 0x80000001


BTST #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001101|iiiii|nnnnn|
+------+-----+-----+
(13)

   z = ! (Rn & (1UL << immediate))
   n = Rn < 0
   c = ?

Sets the status register flags according to the state of the
specified bit in the register.
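
The three bit instructions above (BCLR, BSET, BTST) can be tried out on a host machine with a small C model. This is only a sketch of the pseudocode as given in this file, not a description of the real hardware; the struct and function names are made up for illustration, and the carry flag is left out because its behaviour is marked unknown above.

```c
#include <stdint.h>

/* Hypothetical result type: new register value plus the z and n flags. */
typedef struct {
    uint32_t rn;
    unsigned z;
    unsigned n;
} bitop_result;

/* Bundle a value with its flags; n is simply bit 31 of the value. */
bitop_result make_result(uint32_t rn, unsigned z)
{
    bitop_result r;
    r.rn = rn;
    r.z  = z;
    r.n  = (rn >> 31) & 1u;
    return r;
}

/* BCLR: clear one bit, z reflects the whole result. */
bitop_result bclr(uint32_t rn, unsigned bit)
{
    uint32_t v = rn & ~(UINT32_C(1) << (bit & 31u));
    return make_result(v, v == 0);
}

/* BSET: set one bit, z reflects the whole result. */
bitop_result bset(uint32_t rn, unsigned bit)
{
    uint32_t v = rn | (UINT32_C(1) << (bit & 31u));
    return make_result(v, v == 0);
}

/* BTST: register unchanged, z is 1 when the tested bit is clear. */
bitop_result btst(uint32_t rn, unsigned bit)
{
    unsigned z = (rn & (UINT32_C(1) << (bit & 31u))) == 0;
    return make_result(rn, z);
}
```

With this model, bclr(0xFFFFFFFF, 0) gives 0xFFFFFFFE and bset(0x00000001, 31) gives 0x80000001, matching the assembler examples.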


CMP Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011110|mmmmm|nnnnn|
+------+-----+-----+
(30)

   c = (lword) (Rn - Rm) > (lword) Rn
   z = (Rn - Rm) == 0
   n = (Rn - Rm) < 0

Compares Rm to Rn. This is just the same as a subtraction except
that the registers aren't modified.


CMPQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011111|iiiii|nnnnn|
+------+-----+-----+
(31)

The immediate is sign-extended!

   simm = immediate >= 16 ? 0xFFFFFFE0 + immediate : immediate
   c = (lword) (Rn - simm) > (lword) Rn
   z = (Rn - simm) == 0
   n = (Rn - simm) < 0

Compares with an immediate value in the range of -16 to +15.


DIV Rm,Rn:
~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010101|mmmmm|nnnnn|
+------+-----+-----+
(21)

   REMAIN = Rn - ((Rn / Rm) * Rm)
   Rn /= Rm

Divide two 32 bit registers. The remainder will be available in the
REMAIN register.


IMACN Rm,Rn:
~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010100|mmmmm|nnnnn|
+------+-----+-----+
(20)

   sext16( x) = (x & 0x8000) ? (x | 0xFFFF0000) : (x & 0xFFFF)

   ACC += sext16( Rm) * sext16( Rn)
   z = Rn == 0
   n = Rn < 0

Only the bottom 16 bits of both registers are used in the
multiplication. This is a signed multiply and add. The result of the
multiplication is added to an internal register, which is only
accessible via RESMAC.


IMULT Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010001|mmmmm|nnnnn|
+------+-----+-----+
(17)

   Rn = sext16( Rm) * sext16( Rn)      (sext16 as defined under IMACN)
   z = Rn == 0
   n = Rn < 0

Only the bottom 16 bits of both registers are used in the
multiplication. This is a signed multiply.


IMULTN Rm,Rn:
~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010010|mmmmm|nnnnn|
+------+-----+-----+
(18)

   ACC = sext16( Rm) * sext16( Rn)     (sext16 as defined under IMACN)
   z = Rn == 0
   n = Rn < 0

Only the bottom 16 bits of both registers are used in the
multiplication. This is a signed multiply. The result is stored in
an internal register.
(see RESMAC)


JR cc,relative:
~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110101|ccccc|nnnnn|
+------+-----+-----+
(53)

   c0: z == 0
   c1: z == 1
   c2: c4 ? (n == 0) : (c == 0)
   c3: c4 ? (n == 1) : (c == 1)

   if( (! c0 || z == 0) &&
       (! c1 || z == 1) &&
       (! c2 || (c4 ? (n == 0) : (c == 0))) &&
       (! c3 || (c4 ? (n == 1) : (c == 1))))
   {
      PC = PC + 2 + (immediate >= 16 ? 0xFFFFFFE0 + immediate : immediate) * 2;
   }

Because of the pipelined architecture the CPU will execute the
instruction following the jump instruction before the jump has an
effect. Therefore:

   sub   r0,r0      ;; clear r0
   jr    t,foo
   addqt #1,r0
foo:                ;; r0 will be 1

Branch relative to the current program counter. There are a few
condition code patterns that are of more use than others, namely:

   %00000:  T    always
   %00100:  CC   carry clear (greater or equal)
   %01000:  CS   carry set (less than)
   %00010:  EQ   zero set (equal)
   %00001:  NE   zero clear (not equal)
   %11000:  MI   negative set
   %10100:  PL   negative clear
   %00101:  HI   greater than


JUMP cc,(Rn):
~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110100|ccccc|nnnnn|
+------+-----+-----+
(52)

   c0: z == 0
   c1: z == 1
   c2: c4 ? (n == 0) : (c == 0)
   c3: c4 ? (n == 1) : (c == 1)

   if( (! c0 || z == 0) &&
       (! c1 || z == 1) &&
       (! c2 || (c4 ? (n == 0) : (c == 0))) &&
       (! c3 || (c4 ? (n == 1) : (c == 1))))
   {
      PC = Rn
   }

Jumps to the address contained in the register. See JR for more
details.

Example:
   ;; endless loop
   movei #routine,r0
routine:
   nop
   jump  t,(r0)
   nop


LOAD (Rm),Rn:
~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101001|mmmmm|nnnnn|
+------+-----+-----+
(41)

   if( Rm & 0x3)
      error( "bug");
   Rn = MEMORY[ Rm]

Just fetches a long word from memory. The address must be long word
aligned.


LOAD (R14+m),Rn:
~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101011|mmmmm|nnnnn|
+------+-----+-----+
(43)

   if( R14 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   Rn = MEMORY[ R14 + ((m ?
                        m : 32) << 2)]


LOAD (R15+m),Rn:
~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101100|mmmmm|nnnnn|
+------+-----+-----+
(44)

   if( R15 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   Rn = MEMORY[ R15 + ((m ? m : 32) << 2)]


LOAD (R14+Rm),Rn:
~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111010|mmmmm|nnnnn|
+------+-----+-----+
(58)

   if( R14 + Rm & 0x3)
      error( "bug");
   Rn = MEMORY[ R14 + Rm]


LOAD (R15+Rm),Rn:
~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111011|mmmmm|nnnnn|
+------+-----+-----+
(59)

   if( R15 + Rm & 0x3)
      error( "bug");
   Rn = MEMORY[ R15 + Rm]


LOADB (Rm),Rn:
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100111|mmmmm|nnnnn|
+------+-----+-----+
(39)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      Rn = MEMORY[ Rm];
   else
      Rn = ((byte *) MEMORY)[ Rm];


LOADW (Rm),Rn:
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101000|mmmmm|nnnnn|
+------+-----+-----+
(40)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      Rn = MEMORY[ Rm];
   else
      Rn = ((word *) MEMORY)[ Rm];


LOADP (Rm),Rn:                         GPU ONLY
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101010|mmmmm|nnnnn|
+------+-----+-----+
(42)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      Rn = MEMORY[ Rm];
   else
   {
      Rn     = MEMORY[ Rm];
      HIDATA = MEMORY[ Rm + 4];
   }


MIRROR Rn:                             DSP ONLY
~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|??????|00000|nnnnn|
+------+-----+-----+

   z = Rn == 0
   n = Rn < 0
   c = ?

Just flips all the bits around. Bit 0 goes to bit 31, bit 1 to bit 30
etc. I am too lazy to figure out the C code at the moment.
Supposedly not only useful for simple graphic tricks, but also for
doing FFT operations (butterfly addressing I believe).
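
For readers who want the missing C code, the bit flip described above can be sketched with a plain 32-step loop. This is a host-side illustration of the described behaviour only (bit 0 to bit 31, bit 1 to bit 30, and so on); the function name mirror32 is made up here and says nothing about how the DSP implements the instruction.

```c
#include <stdint.h>

/* Reverse the bit order of a 32 bit value: bit i moves to bit 31 - i. */
uint32_t mirror32(uint32_t x)
{
    uint32_t r = 0;
    int i;

    for (i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);   /* append the current lowest bit of x */
        x >>= 1;                   /* and move on to the next one        */
    }
    return r;
}
```

For instance, mirror32(0xA0000010) yields 0x08000005, and applying mirror32 twice returns the original value.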

Example:
   movei  #$A0000010,r0
   movei  #$08000005,r1
   mirror r0
   sub    r1,r0           ; result will be zero


MMULT Rm,Rn:                           GPU ONLY
~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110110|mmmmm|nnnnn|
+------+-----+-----+
(54)

Matrix multiplication


MOVE Rm,Rn:
~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100010|mmmmm|nnnnn|
+------+-----+-----+
(34)

   Rn = Rm


MOVE PC,Rn:
~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110011|00000|nnnnn|
+------+-----+-----+
(51)

This supposedly takes prefetching and pipelining into account to
give the 'true' PC for this instruction.

   Rn = PC


MOVEFA Rm,Rn:
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100101|mmmmm|nnnnn|
+------+-----+-----+
(37)

   Rn = OTHERBANK[ Rm]


MOVEI #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
16     10    5     0   16                  0   16                  0
+------+-----+-----+   +-------------------+   +-------------------+
|100110|00000|nnnnn|   |        LSW        |   |        MSW        |
+------+-----+-----+   +-------------------+   +-------------------+
(38)

   Rn = LSW + ((lword) MSW << 16)


MOVEQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100011|iiiii|nnnnn|
+------+-----+-----+
(35)

   Rn = immediate


MOVETA Rm,Rn:
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100100|mmmmm|nnnnn|
+------+-----+-----+
(36)

   OTHERBANK[ Rn] = Rm


MTOI Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110111|mmmmm|nnnnn|
+------+-----+-----+
(55)

   ???
   z = Rn == 0
   n = Rn < 0
   c = ?


MULT Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010000|mmmmm|nnnnn|
+------+-----+-----+
(16)

   Rn = (lword) (Rm & 0xFFFF) * (lword) (Rn & 0xFFFF)
   z = Rn == 0
   n = (Rn & 0x80000000) != 0


NEG Rn:
~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001000|mmmmm|nnnnn|
+------+-----+-----+
(8)

   Rn = -Rn
   z = Rn == 0
   n = Rn < 0


NOP:
~~~
16     10    5     0
+------+-----+-----+
|111001|00000|00000|
+------+-----+-----+
(57)


NORMI Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111000|mmmmm|nnnnn|
+------+-----+-----+
(56)

   Rn = 0;
   for( i = 31; i >= 0; i--)
      if( Rm & (1UL << i))
      {
         Rn = i;
         break;
      }
   z = Rn == 0
   n = Rn < 0

This works by returning a number whose value is the position of the
most significant set bit. Apparently useful for handling 32bit IEEE
FP numbers. (hmm)


NOT Rn:
~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001100|mmmmm|nnnnn|
+------+-----+-----+
(12)

   Rn = ~Rn
   z = Rn == 0
   n = Rn < 0


OR Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001010|mmmmm|nnnnn|
+------+-----+-----+
(10)

   Rn |= Rm
   z = Rn == 0
   n = Rn < 0


PACK Rn:                               GPU ONLY
~~~~~~~~
16     10    5     0
+------+-----+-----+
|111111|00000|nnnnn|
+------+-----+-----+
(63) (0)

   Rn = ((Rn & 0x03C00000) >> 10) |
        ((Rn & 0x0001E000) >> 5)  |
        ((Rn & 0x000000FF))

The idea behind this instruction and its companion UNPACK seems to be
that you can do something like this:

   movei  #bitmap-2,r0        ; get a 256x256 CrY pixmap
   movei  #destination-2,r1   ; reduce it to 128x128
loop:
   addqt  #2,r0
   loadw  (r0),r2             ; fetch a pixel
   addqt  #2,r0
   loadw  (r0),r3             ; and a second pixel
   unpack r2                  ; get into "addable" form
   unpack r3                  ; both
   add    r2,r3
   lsr    #1,r3               ; adjust back
   pack   r3                  ; back into CrY form
   addqt  #2,r1
   storew r3,(r1)

Also I am not quite sure what this will do to your colors. Probably
this will work out nicely though. See UNPACK for some more details...
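
To see what the averaging idea does to a pair of pixels, the PACK and UNPACK formulas from this file can be modelled in C on a host machine. This is a sketch under the assumption that the masks and shifts given here are right; the function names cry_pack, cry_unpack and cry_average are made up for illustration.

```c
#include <stdint.h>

/* UNPACK as given in this file: spread the two colour nibbles apart. */
uint32_t cry_unpack(uint32_t rn)
{
    return ((rn & 0x0000F000u) << 10) |
           ((rn & 0x00000F00u) << 5)  |
            (rn & 0x000000FFu);
}

/* PACK as given in this file: squeeze the fields back into 16 bits. */
uint32_t cry_pack(uint32_t rn)
{
    return ((rn & 0x03C00000u) >> 10) |
           ((rn & 0x0001E000u) >> 5)  |
            (rn & 0x000000FFu);
}

/* Average two CrY pixels field by field: unpack, add, halve, pack.
   The zero gaps between the unpacked fields absorb the carries, and
   PACK masks off the bits that the shift pushes into the gaps. */
uint32_t cry_average(uint32_t a, uint32_t b)
{
    uint32_t sum = cry_unpack(a) + cry_unpack(b);
    return cry_pack(sum >> 1);
}
```

As a worked case, cry_unpack(0xABCD) gives 0x028160CD, and cry_average(0x1234, 0x3256) gives 0x2245: each colour nibble and the luminance byte are averaged separately, rounding down.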


RESMAC Rn:
~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010011|00000|nnnnn|
+------+-----+-----+
(19)

   Rn = ACC


ROR Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011100|mmmmm|nnnnn|
+------+-----+-----+
(28)

   c = (Rn & 0x80000000) != 0
   Rn = ((lword) Rn >> (Rm & 0x1F)) | ((lword) Rn << (32 - (Rm & 0x1F)))
   z = Rn == 0
   n = Rn < 0


RORQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011101|iiiii|nnnnn|
+------+-----+-----+
(29)

   c = (Rn & 0x80000000) != 0
   Rn = ((lword) Rn >> immediate) | ((lword) Rn << (32 - immediate))
   z = Rn == 0
   n = Rn < 0


SAT8 Rn:                               GPU ONLY
~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100000|00000|nnnnn|
+------+-----+-----+
(32)

   Rn = Rn < 0 ? 0 : (Rn > 0xFF ? 0xFF : Rn)
   z = Rn == 0
   n = 0
   c = ?


SAT16 Rn:                              GPU ONLY
~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100001|00000|nnnnn|
+------+-----+-----+
(33)

   Rn = (Rn < 0) ? 0 : (Rn > 0xFFFF ? 0xFFFF : Rn)
   z = Rn == 0
   n = 0
   c = ?


SAT24 Rn:                              GPU ONLY
~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111110|00000|nnnnn|
+------+-----+-----+
(62)

   Rn = (Rn < 0) ? 0 : (Rn > 0xFFFFFF ? 0xFFFFFF : Rn)
   z = Rn == 0
   n = 0
   c = ?


SAT16S Rn:                             DSP ONLY
~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100001|00000|nnnnn|
+------+-----+-----+
(33)

   Rn = (Rn < -0x7FFF) ? -0x7FFF : (Rn > 0x7FFF ? 0x7FFF : Rn)
   z = Rn == 0
   n = 0
   c = ?


SAT32S Rn:                             DSP ONLY
~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100001|00000|nnnnn|
+------+-----+-----+
(33)

   if( HIDATA != 0 && HIDATA != 0xFF)
   {
      if( HIDATA == 0x80)
         Rn = 0x80000000;
      else
         Rn = 0x7FFFFFFF;
   }
   z = Rn == 0
   n = 0
   c = ?


SH Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|010111|mmmmm|nnnnn|
+------+-----+-----+
(23)

   c = Rm < 0 ? ((Rn & 0x80000000) != 0) : ((Rn & 0x1) != 0)
   Rn = Rm < 0 ? ((lword) Rn << -Rm) : ((lword) Rn >> Rm)
   z = Rn == 0
   n = Rn < 0


SHA Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011010|mmmmm|nnnnn|
+------+-----+-----+
(26)

   c = Rm < 0 ? ((Rn & 0x80000000) != 0) : ((Rn & 0x1) != 0)
   Rn = Rm < 0 ?
        (Rn << -Rm) : (Rn >> Rm)
   z = Rn == 0
   n = Rn < 0


SHARQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011011|iiiii|nnnnn|
+------+-----+-----+
(27)

   c = (Rn & 0x1) != 0
   Rn = Rn >> (immediate ? immediate : 32)
   z = Rn == 0
   n = Rn < 0


SHLQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011000|iiiii|nnnnn|
+------+-----+-----+
(24)

   c = (Rn & 0x80000000) != 0
   Rn = Rn << (32-immediate)
   z = Rn == 0
   n = Rn < 0


SHRQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|011001|iiiii|nnnnn|
+------+-----+-----+
(25)

   c = (Rn & 0x1) != 0
   Rn = (lword) Rn >> (immediate ? immediate : 32)
   z = Rn == 0
   n = Rn < 0


STORE Rn,(Rm):
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101111|mmmmm|nnnnn|
+------+-----+-----+
(47)

   if( Rm & 0x3)
      error( "bug");
   MEMORY[ Rm] = Rn


STORE Rn,(R14+m):
~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110001|mmmmm|nnnnn|
+------+-----+-----+
(49)

   if( R14 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   MEMORY[ R14 + ((m ? m : 32) << 2)] = Rn


STORE Rn,(R15+m):
~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110010|mmmmm|nnnnn|
+------+-----+-----+
(50)

   if( R15 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   MEMORY[ R15 + ((m ?
                   m : 32) << 2)] = Rn


STORE Rn,(R14+Rm):
~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111100|mmmmm|nnnnn|
+------+-----+-----+
(60)

   if( R14 + Rm & 0x3)
      error( "bug");
   MEMORY[ R14 + Rm] = Rn


STORE Rn,(R15+Rm):
~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111101|mmmmm|nnnnn|
+------+-----+-----+
(61)

   if( R15 + Rm & 0x3)
      error( "bug");
   MEMORY[ R15 + Rm] = Rn


STOREB Rn,(Rm):
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101101|mmmmm|nnnnn|
+------+-----+-----+
(45)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      MEMORY[ Rm] = Rn
   else
      ((byte *) MEMORY)[ Rm] = Rn


STOREW Rn,(Rm):
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|110000|mmmmm|nnnnn|
+------+-----+-----+
(48)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      MEMORY[ Rm] = Rn
   else
      ((word *) MEMORY)[ Rm] = Rn


STOREP Rn,(Rm):                        GPU ONLY
~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|101110|mmmmm|nnnnn|
+------+-----+-----+
(46)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      MEMORY[ Rm] = Rn;
   else
   {
      MEMORY[ Rm]     = Rn;
      MEMORY[ Rm + 4] = HIDATA;
   }


SUB Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000100|mmmmm|nnnnn|
+------+-----+-----+
(4)

   c = (lword) (Rn - Rm) > (lword) Rn
   Rn -= Rm
   z = Rn == 0
   n = Rn < 0


SUBC Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000101|mmmmm|nnnnn|
+------+-----+-----+
(5)

   c = (lword) (Rn - Rm - c) > (lword) Rn
   Rn -= Rm + c
   z = Rn == 0
   n = Rn < 0


SUBQ #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000110|iiiii|nnnnn|
+------+-----+-----+
(6)

   c = (lword) (Rn - (immediate ? immediate : 32)) > (lword) Rn
   Rn -= immediate ? immediate : 32
   z = Rn == 0
   n = Rn < 0


SUBQMOD #immediate,Rn:                 DSP ONLY
~~~~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|100000|iiiii|nnnnn|
+------+-----+-----+
(32)

   c = (lword) (Rn - (immediate ? immediate : 32)) > (lword) Rn
   Rn = (Rn - (immediate ?
                immediate : 32)) & MODULO;
   z = Rn == 0
   n = Rn < 0


SUBQT #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|000111|iiiii|nnnnn|
+------+-----+-----+
(7)

   Rn -= immediate ? immediate : 32


UNPACK Rn:                             GPU ONLY
~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|111111|00001|nnnnn|
+------+-----+-----+
(63) (1)

   Rn = ((Rn & 0x0000F000) << 10) |
        ((Rn & 0x00000F00) << 5)  |
        ((Rn & 0x000000FF))

Unpacks CrY pixels.

32       28        24        20       16       12       8        4        0
+--------^---------^---------^--------^--------^--------^--------^--------+
|               unused                | ColMSB | ColLSN |    luminance    |
+-------------------------------------+--------+--------+-----------------+
 31................................16  15...12  11...8    7.............0

unpacks to

32       28        24        20       16       12       8        4        0
+--------^---+-----^----+----^-----+--^------+--^-------^--------^--------+
| 0 0 0 0 0 0|  ColMSB  | 0 0 0 0 0|  minor  | 0 0 0 0 0|    luminance    |
+------------+----------+----------+---------+----------+-----------------+
 31.......26  25....22   21....17   16...13   12.....8    7.............0

See PACK for some usage info for this instruction. Look into CrY for
some general info about CrY pixels.


XOR Rm,Rn:
~~~~~~~~~~~~
16     10    5     0
+------+-----+-----+
|001011|mmmmm|nnnnn|
+------+-----+-----+
(11)

   Rn ^= Rm
   z = Rn == 0
   n = Rn < 0

------------------------------------------------------------------------
Nat!  (nat@zumdick.rhein-main.de)
Klaus (kkp@gamma.dou.dk)
1997/03/30 02:27:14