# -------------------------------------------------------------------
# 68K                           (c) Copyright 1996 Nat! & KKP
# -------------------------------------------------------------------
# These are some of the results/guesses that Klaus and Nat! found
# out about the Jaguar with a few helpful hints by other people,
# who'd prefer to remain anonymous.
#
# Since we are not under NDA or anything from Atari we feel free to
# give this to you for educational purposes only.
#
# Please note, that this is not official documentation from Atari
# or derived work thereof (both of us have never seen the Atari docs)
# and Atari isn't connected with this in any way.
#
# Please use this informationphile as a starting point for your own
# exploration and not as a reference. If you find anything inaccurate,
# missing, needing more explanation etc. by all means please write
# to us:
#       nat@zumdick.rhein-main.de
# or
#       kkp@gamma.dou.dk
#
# If you could do us a small favor, don't use this information for
# those lame flamewars on r.g.v.a or the mailing list.
#
# HTML soon ?
# -------------------------------------------------------------------
# 68k.html,v 1.11 1997/03/30 02:27:11
# -------------------------------------------------------------------

Preface:

There isn't much we need to tell you about the 68K. First, you have
probably known the chip for ten years already, and secondly there are
enough reference books available in case your memory is failing you.
Let's just look at the way the processor is bound into the system and
at some things to watch out for.

IRQs:
=-=-=

IPL      Name            Vector          Control
---------+---------------+---------------+---------------
 2       VBLANK IRQ      $100            INT1 bit #0
 2       GPU IRQ         $100            INT1 bit #1
 2       HBLANK IRQ      $100            INT1 bit #2
 2       Timer IRQ       $100            INT1 bit #3

Note: Both timer interrupts (JPIT && PIT) are on the same INT1 bit
      and are therefore indistinguishable.

A typical way to install a LEVEL2 handler for the 68000 would be
something like this; you gotta supply "last_line" and "handler".
Note that the interrupt is auto vectored through $100 (not $68).

V_AUTO       = $100
VI           = $F0004E
INT1         = $F000E0
INT2         = $F000E2
IRQS_HANDLED = $909                     ;; VBLANK and TIMER
                                        ;; low byte: enable bits
                                        ;; high byte: latch clear bits

        move.w  #$2700,sr               ;; no IRQs please
        move.l  #handler,V_AUTO         ;; install our routine
        move.w  #last_line,VI           ;; scanline where IRQ should occur
                                        ;; should be 'odd' BTW
        move.w  #IRQS_HANDLED&$FF,INT1  ;; enable VBLANK + TIMER
        move.w  #$2100,sr               ;; enable IRQs on the 68K
        ...

handler:
        move.w  d0,-(a7)
        move.w  INT1,d0                 ;; low byte tells which IRQ fired
        btst    #0,d0                   ;; VBLANK pending ?
        beq.b   .no_blank
        ...
.no_blank:
        btst    #3,d0                   ;; timer pending ?
        beq.b   .no_timer
        ...
.no_timer:
        move.w  #IRQS_HANDLED,INT1      ; clear latch, keep IRQ alive
        move.w  #0,INT2                 ; let GPU run again
        move.w  (a7)+,d0
        rte

As you can see, if you have multiple INT1 interrupts coming in, you
need to check the lower byte of INT1 to see which interrupt happened.

Superstitions / Things to watch out for:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

It looks like word/byte accesses to ROM space don't work. Looking at
some code in the Jaguar Server indicates that the MEMCON registers
come into play here.

I have a hunch that RMW cycles (like CLR.W (a0)) on TOM registers
aren't 100% safe.

NEUROMANCER adds: NEVER do a clr.l (a0) into GPU/DSP memory; you must
do a move.l #0,(a0) or a move.l d0,(a0).

The special thing about a CLR (on the 68000, fixed in the 68010 and
onwards I believe) is that the processor does a dummy read of the
destination before writing to it. It could be that this buggy read is
done in a slightly incompatible fashion compared to the other RMW
instructions like TAS, BCLR, ASL etc. If not, you must refrain from
using any RMW instruction on GPU/DSP memory.

If the 68K isn't just soaking up leftover cycles but is eating
valuable bus resources, it's best to put it to sleep with STOP #$2000,
so it will sleep until the next IRQ wakes it up again.

ADDENDUM:
=========

Timing:
=-=-=-=

A few timing sessions got us the following results. Note that the
timing was done with the video system, the GPU and the DSP shut down.
See the code section below for part of the timing routine.
[ This could be all bullshit of course ]

                                        total          instr
I R W                                min  max  avg   min max avg  sus
--------------------------------------------------+------------------
1      8x moveq #0,d0                 28  132   81     4  17  10
1      8x move.w d0,d0                28  132   81     4  17  10
1      8x move.l d0,d0                28  132   81     4  17  10
2      8x move.w #$FFF0,d0           108  212  162    14  27  20
1 1    8x move.w (a0),d0             172  276  223    22  35  28
1   1  8x move.w d0,(a0) (+/-)       188  292  243    24  37  30   34
3      8x move.l #$3FFF0,d0          188  292  243    24  37  30
2 1    8x move.w $3FF0,d0            252  356  308    32  45  39
1 2    8x move.l (a0),d0             252  356  309    32  45  39   42
2   1  8x move.w d0,$3FF0            268  372  324    34  47  41
1   2  8x move.l d0,(a0) (+/-)       284  388  341    36  49  43   46
3 1    8x move.w $3FFF0,d0           332  436  390    42  55  49
3   1  8x move.w d0,$3FFF0           348  453  406    44  57  51
3 2    8x move.l $3FFF0,d0           412  516  471    52  65  59
3   2  8x move.l d0,$3FFF0           444  548  503    56  69  63
3 1 1  8x move.w $1000,$1004         492  596  552    62  75  69
5 1 1  8x move.w $30000,$30004       652  756  716    82  95  90
3 2 2  8x move.l $1000,$1004         668  772  732    84  97  92
       8x mulu.w d1,d0               700  784  754    88  98  94
5 2 2  8x move.l $30000,$30004       828  932  894   104 117 112

1 2    4x move.l (a0),d0             100  204  154    25  51  39
1 2    8x move.l (a0),d0             252  356  309    32  45  39
1 2    32x move.l (a0),d0           1164 1268 1236    36  40  39
--------------------------------------------------+------------------
I:     instruction words
R:     data words read
W:     data words written
total: cycles for the whole instruction sequence
instr: cycles per single instruction
avg:   average
min:   minimum encountered
max:   maximum encountered
sus:   approx. sustained average (doing 16 mio accesses)

cycle times in 26.591 MHz cycles
---------------------------------------------------------------------

A minimum of 4 cycles per instruction for 8x move.l d0,d0 looks weird
at first. This result can happen if the 'reference value' was off.
The maximum number could happen if the 'reference value' is OK and the
timing 'value' is off. If one looks closely, the difference between
min and max is 104 cycles on a per-measurement basis; therefore the
average value should be about right.
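The relation between the two halves of the table can be checked with a
few lines of Python. This is only a sketch: the helper names are made
up, and the round-half-up rule is our assumption (it happens to
reproduce the rows tried here), not something the measurements state.
The row values are copied from the table.

```python
# Cross-check of the timing table: "instr" cycles are assumed to be the
# "total" cycles divided by the repeat count, rounded half up.
# Function and variable names are hypothetical, for illustration only.

def per_instruction(total_cycles, repeats):
    """Integer division with round-half-up, e.g. 28 / 8 = 3.5 -> 4."""
    return (2 * total_cycles + repeats) // (2 * repeats)

# name: (repeat count, total min, total max, total avg), from the table
ROWS = {
    "8x moveq #0,d0":     (8, 28, 132, 81),
    "8x move.w (a0),d0":  (8, 172, 276, 223),
    "8x move.l (a0),d0":  (8, 252, 356, 309),
    "32x move.l (a0),d0": (32, 1164, 1268, 1236),
}

def summarize(rows):
    """Return the min/max spread and per-instruction cycles per row."""
    out = {}
    for name, (repeats, tmin, tmax, tavg) in rows.items():
        out[name] = {
            "spread": tmax - tmin,
            "instr": tuple(per_instruction(t, repeats)
                           for t in (tmin, tmax, tavg)),
        }
    return out

if __name__ == "__main__":
    for name, info in summarize(ROWS).items():
        print(name, info)
```

For these rows the spread stays at 104 cycles regardless of the repeat
count, which would explain why the 32x run shows a much narrower
per-instruction range than the 4x and 8x runs.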
Due to the apparent preference for immediate data, it would appear
that the I/O Latch also acts as a small read cache (64 bit probably)
for the 68000. Technically though, this sounds like a riscy idea for a
multiprocessor system, because there's no bus snooping to be expected.

Data writes on the average are a bit slower than data reads. This is a
bit strange, because the timings suggest that for every write of the
68K an indivisible read-modify-write cycle is done, effectively using
two bus cycles for a write. Of course architecturally this would be
very stupid.

It would seem that the memory interface acknowledges to the 68000 only
when the data has indeed been written (it doesn't buffer). The 2 cycles
slower average in the timings suggests that this is happening.

The sustained measurement was done with a simple C program, doing 16
times move d0,(a0)+ or move (a0)+,d0, and this for 1 million
iterations (not very accurate, because the loop code was not filtered
out). The results with VIDEO OFF:

access          time     mio bytes/s  cycles/move
---------------+--------+------------+-------------
16 bit writes   20.6s    1.6          34
32 bit writes   27.8s    2.3          46
32 bit reads    25.3s    2.5          42

The code:
=-=-=-=-=

;; can't use D6+D7
.macro  TESTCODE
        .rept   8
        move.l  (a0),d0
        .endr
.endm

code:
        movem.l d1-a6,-(a7)
        lea     $3FFF0,a0
        moveq   #23,d1
        moveq   #7,d0
        moveq   #-1,d5

.punt:
        ;; reference run: 16 NOPs between two timer reads
        move.w  d5,PITLO
        move.w  PITLO,d6
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        move.w  PITLO,d7
        sub.w   d6,d7
        bcc.b   .ok
        neg.w   d7
.ok:
        move.w  d7,-(a7)        ; reference

        ;; timed run: 8 NOPs, the test code, 8 NOPs
        lea     $3FFF0,a0
        move.w  d5,PITLO
        move.w  PITLO,d6
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        TESTCODE
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        move.w  PITLO,d7
        sub.w   d6,d7
        bcc.b   .ok2
        neg.w   d7
.ok2:
        sub.w   (a7)+,d7        ; subtract the reference
        bcs     .punt           ; came out negative, measure again
        moveq   #0,d0
        move.w  d7,d0           ; result in d0
        movem.l (a7)+,d1-a6
        rts

------------------------------------------------------------------------
Nat! (nat@zumdick.rhein-main.de)
Klaus (kkp@gamma.dou.dk)

1997/03/30 02:27:11