Chapter 4
Assembly Language Applied To Game Design

To the novice BASIC programmer, the words Machine language and Assembly language evoke visions of indecipherable code. The code looks unfamiliar, but so did BASIC when you were first learning it. While BASIC has its roots in the English language and algebraic expressions, Assembly language consists of unfamiliar opcodes, or mnemonics, used in conjunction with an unfamiliar base 16 number system called hexadecimal.

It is our intent in this chapter to teach you the fundamentals of Assembly language programming by comparing it to similar code written in BASIC. Rather than teach you all aspects of the language, we will concentrate only on the operations needed to do simple game graphics.

A good Assembler is needed to write Assembly language programs. An assembler merely translates mnemonics like JMP, which is equivalent to a GOTO, into hexadecimal opcodes that the computer understands. Most Assembler packages combine an editor, an assembler, and a debugger. The editor allows you to enter Assembly language code, usually by line number, and later edit, delete, or insert particular lines. The Assembler portion converts your source listing into Machine Code in a two-pass operation. Since any line of code can have a label in its first field, the Assembler will automatically calculate the branches or GOTOs to lines referenced with these labels. Also, if you want to store a variable called ZAP, the Assembler assigns a memory storage location for the variable and automatically furnishes the correct memory address for any subsequent store or load operations using that variable. Last, there is a Machine language monitor or debugger that helps locate errors. It allows you to examine and change both memory and internal registers. It also includes step and trace features that allow you to step through your code one instruction at a time.

Readers who already own assemblers may use the one they have. We have provided a translation table in the Appendix in the back of this book to aid you in converting our SYNASSEMBLER source code to that used in your assembler. We chose SYNASSEMBLER when we began this book in the Spring of 1983 because it was co-resident (screen editor, assembler, and debugger are in memory simultaneously) and was available in cartridge form.

For those of you who are new programmers, or who are unhappy with your present assembler, we recommend either the F-S Macro Assembler 40/80 from Stanton Products (see coupon in back of book), an enhanced disk version of the now discontinued SYNASSEMBLER, or MAC 65 from Optimized Systems Software. Both of these assemblers are fast (2000 lines/minute), are co-resident, allow source files to be chained, and offer a choice of assembling to either disk or memory. Both are professional packages and are used as development tools in various software houses. The F-S Macro Assembler 40/80 and the discontinued SYNASSEMBLER are both derived from the S-C family of assemblers on the Apple II computer. Whereas the new F-S Macro Assembler 40/80 is compatible with the new XL series of computers, unpatched versions of SYNASSEMBLER are not. The F-S Macro Assembler 40/80 is completely compatible with SYNASSEMBLER source files with the exception of the way it handles ATASCII string data. A simple global replace will suffice. (See the note in the Appendix on assembler differences.)

Our readers will certainly want to know why we don't use the more popular Atari Editor Assembler cartridge. First, it is very slow, often taking ten minutes to assemble a 1000 line program. Second, it allows neither chaining of files nor assembly to disk. Third, it is full of bugs. It remains popular mostly with beginning programmers who want to write a very short Assembly language subroutine that will interface to their BASIC programs.

Basic Assembly Language

The Atari computers contain a central processing unit (CPU), a 6502A microprocessor that operates at 1.8 MHz. It accepts instructions to perform various operations, like taking a value and storing it somewhere in memory, adding a number to another number located in one of its internal registers, or comparing two values. What makes programming in Assembly language rather difficult (or at least tedious) is that the computer can only execute one tiny instruction at a time, and only perform its operations in three internal registers. These three addressable registers are known as the X register, Y register, and Accumulator. Each can hold eight binary digits called bits, which are individually valued at 0 or 1. The eight bits, collectively called a byte, have values ranging from 0 to 255 decimal ($00 to $FF in hexadecimal notation).

Essentially, the computer, which is an eight-bit microprocessor, can manipulate data whose values range from all eight bits off (00000000) to all eight bits on (11111111). The average person has great difficulty in thinking of values represented by 0's and 1's. Fortunately, someone invented a number system called hexadecimal, which is base 16 instead of binary or base 2.

Hexadecimal Numbers

Since 16 is 2x2x2x2, we can divide our eight bits into two four-bit groups. If you determine each of the decimal equivalents of all the combinations of base two representations, you obtain the following table. These values range from 0 to 15 decimal. In the hexadecimal numbering system, values above 9 are represented by the letters A-F. In order to prevent confusion between decimal and hexadecimal numbers, hexadecimal numbers are preceded by a "$".

        BINARY    DECIMAL    HEXADECIMAL
         0000        0           $0
         0001        1           $1
         0010        2           $2
         0011        3           $3
         0100        4           $4
         0101        5           $5
         0110        6           $6
         0111        7           $7
         1000        8           $8
         1001        9           $9
         1010       10           $A
         1011       11           $B
         1100       12           $C
         1101       13           $D
         1110       14           $E
         1111       15           $F

Hexadecimal numbers are very much like decimal numbers. They can be added and subtracted in like manner. The only difference is that instead of having units, tens, hundreds, etc., the hexadecimal numbers have units, sixteens, 256's, and so forth. Each successive digit position is worth sixteen times the position to its right, instead of ten times as in our decimal system.

      DECIMAL                HEXADECIMAL
      1 6 5                  $ 1 3 A

      1 HUNDRED              1 256
        6 TENS                 3 SIXTEENS
          5 ONES                 A ONES

        1 x (100) = 100        1 x (256) = 256
      + 6 x ( 10) =  60      + 3 x ( 16) =  48
      + 5 x (  1) =   5      + A x (  1) =  10
                    ---                    ---
            165 DECIMAL            $13A = 314 DECIMAL
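
The place-value sums above can be checked mechanically. The following is a minimal Python sketch added for illustration; it is not one of the book's listings, and the name hex_to_decimal is hypothetical.

```python
# Illustrative sketch: expand a hexadecimal string by place value.
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_decimal(text):
    """Convert a string such as '$13A' to its decimal value, digit by digit."""
    value = 0
    for digit in text.lstrip("$").upper():
        # Move the running total one hex place to the left, then add the digit.
        value = value * 16 + HEX_DIGITS.index(digit)
    return value

print(hex_to_decimal("$13A"))   # 1 x 256 + 3 x 16 + 10 x 1
print(hex_to_decimal("$FF"))    # largest value that fits in one byte
```

Each pass through the loop multiplies by sixteen, which is the same as sliding every earlier digit one position to the left.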

Hexadecimal numbers are used to address the Atari's more than 48,000 memory locations. Each group of 256 bytes ($00 - $FF) is called a page, starting with page zero. In 48K Atari computers, memory is directly addressable from locations $0000 to $BFFF (0 - 49151). Locations above $BFFF are also addressable, but these locations don't contain RAM. The area from $D000 to $D7FF contains custom hardware chips such as the GTIA, POKEY, PIA, and ANTIC microprocessor. Some of these hardware locations can be read and some written to. The area above that, $D800 to $FFFF, contains the 10K operating system ROMs.

Memory Considerations in Assembly Language

The bottom of RAM, pages 0 through 5 ($0000 - $05FF), is generally off limits for program storage. Zero page ($00 - $FF) is a very special area. There are a number of zero page addressing instructions that execute faster because they require only two bytes instead of the usual three. This is because they only need to address a memory location from $00 to $FF instead of $0000 to $FFFF. These locations are used extensively by the Operating System.

Only the last few bytes of zero page are available to the user. In fact, if you are using Synassembler only locations $F0 - $FF are totally free. You can also use $D6 - $EF, if you don't mind if your data is altered by the floating point package each time arithmetic operations are performed by BASIC. And if you are writing a subroutine to be accessed from BASIC, only locations $CB - $D1 (203-209) are available. $D4 and $D5 can be used to send variables back to BASIC via the USR function.

Page one of memory ($100 - $1FF) is reserved for the stack. It is used by a special purpose register in the 6502A microprocessor for keeping track of return addresses when calling subroutines. This scratch area for the Stack Pointer is sometimes used for temporary register storage.

Pages two and three are used for various I/O operations and operating system shadow registers, page four for the cassette buffer, page five for the keyboard buffer, and pages seven through twenty-eight for DOS 2.0's file management system. Essentially the area below 7420 ($1CFC), with the exception of page six, is off limits to programmers using DOS. However, if DOS isn't resident, you can begin storing at 1792 ($700) safely. A pointer to the low end of memory, MEMLO, can be read at locations 743,744 ($2E7,$2E8).
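
Because a page is 256 bytes, the page number of any address is simply its high byte. As a hedged illustration (a Python sketch with a hypothetical function name, not part of the book's code), the decimal addresses quoted above can be split into page and offset like this:

```python
def page_and_offset(address):
    """Split a 16-bit address into (page number, offset within the page)."""
    return address >> 8, address & 0xFF

for decimal_address in (1792, 7420, 743):
    page, offset = page_and_offset(decimal_address)
    print(f"{decimal_address} = ${decimal_address:04X}: page {page}, offset ${offset:02X}")
```

Running it shows that 1792 is the first byte of page seven, that 7420 lies in page twenty-eight ($1C), and that 743 lies in page two at offset $E7.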

Program Counter & Program Status Word

When a microprocessor processes a Machine language program, it keeps track of which instruction it is executing with an internal 16-bit register called the program counter. The program counter contains the address of the instruction that is being processed. When the computer finishes an instruction, it may set or clear a flag in another register, the Program Status Word, which holds seven flag bits. For example, if you want to test whether a value in the Accumulator is equal to zero, you compare the value in the Accumulator to zero. If the value is equal to zero, the zero flag will be set, and a following Branch Equal Zero (BEQ) instruction will take its branch. Other flags that can be set include the carry flag and the negative flag. A diagram of the Program Status Word is shown below.

(figure)

OP Codes

The 6502A microprocessor accepts only Machine language instructions. These are called opcodes. When the computer encounters a $4C, it performs the equivalent of a GOTO in BASIC. The Machine language instruction $4C 00 08 tells the computer to jump to memory location $800. (Remember, addresses require two bytes. The low order byte in this case contains $00 and the high order byte, $08--in effect, the reverse order of the actual values.) Unfortunately, Machine language is difficult to remember, so programmers invented a substitute called Assembly language, wherein each opcode is assigned a mnemonic such as JMP, BRK, or LDA. The above example looks like this: JMP $0800.
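
The byte order described above is easy to get backwards, so here is a small Python sketch of how a JMP instruction is laid out in memory. It is only an illustration; encode_jmp is a hypothetical name, not something an assembler provides.

```python
JMP_ABSOLUTE = 0x4C  # opcode for JMP with a two-byte address

def encode_jmp(address):
    """Return the three bytes of JMP address: opcode, low byte, high byte."""
    low = address & 0xFF
    high = (address >> 8) & 0xFF
    return bytes([JMP_ABSOLUTE, low, high])

print(" ".join(f"{byte:02X}" for byte in encode_jmp(0x0800)))
```

For JMP $0800 the three bytes come out as 4C 00 08, with the low order byte of the address stored first.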

Type the following Machine Code into the monitor in your Assembler, and you can see how the monitor's disassembler interprets it:

4000: A9 30 8D 00 41 CE 00 41 AD 00 41 C9 00 D0 F6 60 [CR]

If you enter a 4000L from the Synassembler monitor you will see the following:

4000: A9 30     00030    LDA #$30
4002: 8D 00 41  00040    STA $4100
4005: CE 00 41  00050    DEC $4100
4008: AD 00 41  00060    LDA $4100
400B: C9 00     00070    CMP #$00
400D: D0 F6     00080    BNE $4005
400F: 60        00090    RTS

The disassembler translates the Machine Code to more easily understood mnemonics. In the first line of code, LDA is the mnemonic for Load Accumulator. It is the instruction for the 6502 to load the Accumulator with an immediate value--in this case, $30. The # sign signifies that it is an "immediate" instruction; the ($30) is the data portion of the instruction. The STA in line two is an "absolute" instruction. It specifies the address in memory for storing the byte of data that is in the Accumulator.

The difference between "immediate" and "absolute" instructions is an important point. Let us take the example LDA #$30. In this "immediate" instruction, the computer takes the operand ($30) as a value and places it in the Accumulator. However, LDA $30 is an "absolute" instruction, so the computer takes the operand as an address from which to load data into the Accumulator. In both cases, we get a value in the Accumulator. You can tell the modes apart because "immediate" instructions have a # sign before the operand.

You might wonder, what does this code do? It is a time delay subroutine. Line one loads a decimal 48 ($30) into the Accumulator, and line two stores it in memory location $4100. The value stored at that memory location is then decremented by one in line three. It is then reloaded into the Accumulator to be compared against the value zero. If it is zero, the code falls through to the return-from-subroutine instruction and ends; but if it isn't zero, it branches back to memory location $4005. That location tells the computer to decrement the value in $4100 once again. The code will perform this small loop until the value in $4100 becomes zero. At that time, the test for a zero becomes true and the program returns to the line after the JSR in the program that called it.
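
If no monitor is at hand, the behavior just described can be followed with a toy interpreter. The Python sketch below is an illustration under stated assumptions: it understands only the seven opcodes that appear in this routine, tracks only the zero flag, and the name run_delay is hypothetical. It is not a general 6502 emulator.

```python
# Illustrative sketch: interpret only the opcodes used by the delay routine.
CODE = bytes.fromhex("A9 30 8D 00 41 CE 00 41 AD 00 41 C9 00 D0 F6 60")

def run_delay(code=CODE, origin=0x4000):
    """Run the routine; return (memory, number of instructions executed)."""
    memory = bytearray(0x10000)
    memory[origin:origin + len(code)] = code
    pc, a, zero, executed = origin, 0, False, 0

    def absolute_operand():
        # Addresses are stored low byte first, then high byte.
        return memory[pc + 1] | (memory[pc + 2] << 8)

    while True:
        opcode = memory[pc]
        executed += 1
        if opcode == 0xA9:                     # LDA #immediate
            a = memory[pc + 1]
            zero = a == 0
            pc += 2
        elif opcode == 0x8D:                   # STA absolute
            memory[absolute_operand()] = a
            pc += 3
        elif opcode == 0xCE:                   # DEC absolute
            address = absolute_operand()
            memory[address] = (memory[address] - 1) & 0xFF
            zero = memory[address] == 0
            pc += 3
        elif opcode == 0xAD:                   # LDA absolute
            a = memory[absolute_operand()]
            zero = a == 0
            pc += 3
        elif opcode == 0xC9:                   # CMP #immediate
            zero = a == memory[pc + 1]
            pc += 2
        elif opcode == 0xD0:                   # BNE relative
            offset = memory[pc + 1]
            if offset >= 0x80:                 # one-byte two's complement
                offset -= 0x100
            pc += 2                            # offset counts from the next instruction
            if not zero:
                pc += offset
        elif opcode == 0x60:                   # RTS ends the routine
            return memory, executed
        else:
            raise ValueError(f"unsupported opcode ${opcode:02X}")

memory, executed = run_delay()
print(memory[0x4100], executed)
```

The loop body (DEC, LDA, CMP, BNE) runs 48 times, so together with the two setup instructions and the RTS the sketch executes 195 instructions and leaves zero in location $4100.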

Does it work? First type 400F:00 to change the RTS to a BRK. This will return us to the monitor when we are finished. Then type 4100:AA to place something in that memory location so that if you look at it later you will believe the program did something. Finally, type 4000G to start the routine. The code returns you to the monitor when it finishes a split second later. Now type 4100 and a 00 is returned. This is the value in memory location $4100. You can do a 4000S and an S each time to watch the code single step, or you can trace the entire operation by typing a 4000T. The strange numbers that appear below each line of code are the values in the internal registers. A is for Accumulator, X for X register, Y for Y register, P is the Program Status Word, and S is the Stack Pointer.

This program has a direct analogy to the following BASIC program:

10 X=48 
20 X=X-1 
30 IF X<>0 THEN 20 
40 RETURN 

The major differences between the two programs are that in Assembly language no line numbers are used within the code (line numbers are used only by the editor to place your text in order), and that you have to take care of every minute detail. BASIC automatically assigns the storage locations of all variables and the location of each instruction in memory. In Assembly language programming, we have to assign the X variable to memory location $4100, and we have to calculate the relative branch, or GOTO, so that it references memory location $4005. This is done by branching back $F6 bytes, or -10 bytes, counted from the instruction that follows the branch ($400F). Yet many of these details can be greatly simplified if we use an Assembler to do our programming.
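
The branch arithmetic is the kind of detail an Assembler does for you. As a rough Python sketch of that calculation (the function names are hypothetical and this is not how any particular assembler is written), the operand byte is the distance from the instruction after the branch, stored as a one-byte two's complement number:

```python
def branch_operand(branch_address, target_address):
    """Return the one-byte operand for a relative branch instruction."""
    # The offset is measured from the instruction after the two-byte branch.
    offset = target_address - (branch_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("branch out of range")
    return offset & 0xFF            # two's complement in one byte

def branch_target(branch_address, operand):
    """Recover the destination address from a branch operand byte."""
    offset = operand - 0x100 if operand >= 0x80 else operand
    return branch_address + 2 + offset

print(f"${branch_operand(0x400D, 0x4005):02X}")
print(f"${branch_target(0x400D, 0xF6):04X}")
```

For the BNE at $400D the offset is $4005 minus $400F, or -10, which is stored as $F6.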

The same program using an Assembler looks like the following:

LINE  LABEL  INSTRUCTION  COMMENT
      FIELD  FIELD        FIELD

00010        .OR $4000    ;ASSEMBLE CODE AT $4000
00020 X      .EQ $4100    ;X IS STORED AT $4100
00030        LDA #$30
00040        STA X
00050 LOOP   DEC X        ;X=X-1
00060        LDA X
00070        CMP #$00     ;DONE?
00080        BNE LOOP
00090        RTS

The Assembler generates identical Machine Code, but many of the tedious details are simplified. Once X is equated to the memory location in line 2, references to that variable in lines 4 through 6 are handled automatically. If X were assigned to a different memory location because we lengthened our program, you would only have to change line 2. Also, labels act like line numbers in BASIC. Since the Assembler assigns the line of code labeled LOOP to a particular memory location, it can calculate the correct branch automatically when it encounters line 8 during assembly. The .OR in line 1 is a pseudo-op, understood only by the Assembler. This does not generate code but tells the Assembler where the code is to be run and stored. The pseudo-op .TF causes the generated code to be stored to the disk rather than to memory.

Addressing Modes

Now that you have had a taste of Assembly language programming and have seen that it isn't as bad as you thought, there are a number of fundamental operations that must be learned. The most important operation is to move numbers from one memory location to another. This can be accomplished by loading a value into any one of three internal 6502 registers--the Accumulator, X, or Y registers--and storing that number somewhere in memory. A LDA (Load Accumulator) instruction can be carried out in several different ways depending on its addressing mode. First we can load the Accumulator with a real hexadecimal value (LDA #$05). This is called Immediate Mode Addressing. Sometimes we need to be able to load the Accumulator with a variable stored in a memory location (LDA $4100). This is called Absolute Addressing.

The only other addressing method that we will discuss for the time being is the Indexed Addressing mode. It takes the form of LDA $4100,X or LDA $4100,Y depending on whether the X or Y register is used as an index. If, for example, the X register contains $05, then the instruction above loads the value from location $4100 + $05 or $4105. This addressing mode is used primarily for indexing into tables stored at particular memory locations. There is no problem with the tables crossing page boundaries. For example, if your table began at $4080 and the X register contained $90, then the instruction LDA $4080,X would fetch the value in memory location $4080 + $90 or $4110.

EFFECTIVE ADDRESS = ABSOLUTE ADDRESS + X

EFFECTIVE ADDRESS = ABSOLUTE ADDRESS + Y
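
The two formulas can be tried out with a few lines of Python. This is only a sketch to make the arithmetic concrete; effective_address is a hypothetical name, and the index is masked to one byte because the X and Y registers hold eight bits.

```python
def effective_address(absolute_address, index):
    """Address reached by an indexed instruction such as LDA $4100,X."""
    return (absolute_address + (index & 0xFF)) & 0xFFFF

print(f"${effective_address(0x4100, 0x05):04X}")   # LDA $4100,X with X = $05
print(f"${effective_address(0x4080, 0x90):04X}")   # the table crosses a page boundary
```

The second call reproduces the page-crossing case from the text: $4080 + $90 gives $4110.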

Store operations are similar to load operations. You can store a value into an "absolute" memory location, or you can store into an indexed memory location, offset by the value contained in either the X or Y register.

In summary, the table below shows the various load and store operations.


        ACCUMULATOR     X REGISTER      Y REGISTER

LOAD    LDA #$05        LDX #$05        LDY #$05
        LDA $4100       LDX $4100       LDY $4100
        LDA $4100,X                     LDY $4100,X
        LDA $4100,Y     LDX $4100,Y

STORE   STA $4100       STX $4100       STY $4100
        STA $4100,X                     STY $4100,X *
        STA $4100,Y     STX $4100,Y *

*Both of these indexed store operations involve zero page addressing only.

Incrementing & Decrementing

Sometimes, when counting cycles or looping through code, it is necessary to increment or decrement a value directly, much as a FOR-NEXT loop does in BASIC. In Assembly language, the X and Y registers and any memory location can be incremented or decremented. If the X register contained a $FE, then it would contain $FF when incremented. But if it contained a $FF, it would wrap around to become $00. The computer informs you of this by setting the zero flag in its Program Status Register.

                ACCUMULATOR	X-REG   Y-REG   MEMORY LOCATION

INC BY 1	NOT AVAILABLE	INX	INY	INC $4100
DEC BY 1	NOT AVAILABLE	DEX	DEY 	DEC $4100
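
The wraparound is easy to picture with a short Python sketch. The functions below are illustrative stand-ins for INX and DEX (the names and the returned tuple are assumptions made for this example); each returns the new one-byte value together with the state of the zero flag.

```python
def inx(x):
    """Increment a one-byte register; return (new value, zero flag)."""
    result = (x + 1) & 0xFF
    return result, result == 0

def dex(x):
    """Decrement a one-byte register; return (new value, zero flag)."""
    result = (x - 1) & 0xFF
    return result, result == 0

print(inx(0xFE))   # no wraparound yet
print(inx(0xFF))   # wraps to zero and sets the zero flag
print(dex(0x00))   # wraps the other way, to $FF
```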

Stack Instructions

There is a special area in the computer ($100 - $1FF) that is used quite frequently by an internal register called the Stack Pointer. The computer uses this area to save return addresses when handling either interrupts or subroutines. The stack is like a dish dispenser. Bytes are pushed on the stack in order, and pulled off in reverse order. The first byte stored is the last byte to be pulled off. The Stack Pointer always points to the next free byte in the stack. Since the stack is only 256 bytes long, only 128 address pairs can be stored at any one time.

Normally the stack would be of little interest to programmers except that it can also be used to temporarily store data. If you were worried about your three registers being altered in a subroutine, you could push all three values onto the stack before calling the subroutine, and then pull them back off when you return from the subroutine. BASIC also uses the stack to transfer data in the USR function when calling a Machine language subroutine. The top byte in the stack contains the number of variables being passed. The values follow in two byte pairs in hi byte low byte order.

Two basic Machine language instructions provide key tools for using the stack. PHA pushes the value in the Accumulator on the Stack. PLA pulls the top value of the stack and places it in the Accumulator. Since these instructions only involve the Accumulator, you would need to transfer the value in the X register to the Accumulator (TXA) in order to save the X register on the stack. Similarly you would transfer the Y register to the Accumulator (TYA) first before a PHA to the stack. Be careful when working with the stack. For instance, if you push data onto the stack while in a subroutine and don't pull it back off, when the subroutine reaches the RTS instruction it will return to the main program at the wrong address.
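
The last-in, first-out order matters when several registers are saved. The Python sketch below models page one with a one-byte stack pointer; the class and method names are hypothetical, and the comments name the 6502 instructions each step stands for (TAY and TAX, which copy the Accumulator back into Y and X, are assumed here for the restore step).

```python
class Stack:
    """Page one ($0100-$01FF) addressed through a one-byte stack pointer."""

    def __init__(self):
        self.page = [0] * 256
        self.sp = 0xFF                      # points to the next free byte

    def push(self, value):                  # PHA
        self.page[self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self):                         # PLA
        self.sp = (self.sp + 1) & 0xFF
        return self.page[self.sp]

stack = Stack()
a, x, y = 0x11, 0x22, 0x33
stack.push(a)                               # PHA
stack.push(x)                               # TXA, PHA
stack.push(y)                               # TYA, PHA
# ... the subroutine would run here and may change all three registers ...
y = stack.pull()                            # PLA, TAY
x = stack.pull()                            # PLA, TAX
a = stack.pull()                            # PLA
print(hex(a), hex(x), hex(y), hex(stack.sp))
```

Values come back in the reverse of the order in which they were pushed, and the stack pointer ends where it started, which is the condition for a later RTS to find the right return address.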

Altering Program Flow

Program flow can be altered, as in BASIC, with instructions that resemble GOTO, GOSUB, and IF ... THEN statements. The JMP instruction is equivalent to a GOTO statement; it can transfer control to any location in the machine to continue executing code. JMP $8D6C instructs the computer to continue executing code beginning at address $8D6C. The GOSUB statement is equivalent to a JSR (Jump to SubRoutine) in Machine language. When the computer reaches an instruction such as JSR $5A83, it pushes the two-byte memory address of the last byte of that instruction onto the stack, so that when it returns from the subroutine via an RTS (ReTurn from Subroutine), it will know the address where it will continue the program. When it returns, it pulls the return address off the stack and increments it by one so that it points to the next executable instruction.
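
The bookkeeping behind JSR and RTS can be sketched in a few lines of Python. This is an illustration only: the stack is modeled as a plain list, the function names are hypothetical, and the JSR is assumed to sit at $4000.

```python
def jsr(stack, jsr_address, target):
    """Push the address of the JSR's last byte, then go to the target."""
    return_minus_one = jsr_address + 2          # JSR is three bytes long
    stack.append(return_minus_one >> 8)         # high byte is pushed first
    stack.append(return_minus_one & 0xFF)       # then the low byte
    return target

def rts(stack):
    """Pull the saved address and add one to reach the next instruction."""
    low = stack.pop()
    high = stack.pop()
    return ((high << 8) | low) + 1

stack = []
pc = jsr(stack, 0x4000, 0x5A83)
print(f"${pc:04X}")                              # now inside the subroutine
print(f"${rts(stack):04X}")                      # back at the instruction after the JSR
```

If the subroutine had pushed an extra byte and not pulled it, rts would combine that stray byte with half of the real address and return to the wrong place.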

The IF ... THEN statement is analogous to a number of branch instructions which test the Program Status Register to see which flags are set. Usually, you use compare operations to set flags. You can compare a value against the value stored in either the Accumulator, the X or the Y Registers. The mnemonics are CMP, CPX and CPY, respectively. For example,

        LDA $4100 ;LOAD ACCUMULATOR WITH VALUE AT $4100 
        CMP #$05

Different flags are set depending on the result.

Branch instructions are very similar to a JMP instruction (which is an unconditional branch), except that only under certain circumstances will they cause program flow to continue at a different location. For example, if we were to test for that wraparound case when we incremented the X-register that contained $FF, we would want to test the Zero Flag with a Branch Equal Zero (BEQ) instruction, and go to some label if the condition is true.

           LDX $4100    ;LOAD X REGISTER WITH VALUE IN MEMORY
           INX          ;INCREMENT X - REGISTER
           BEQ SKIP     ;TEST IF 0, AND IF TRUE GOTO SKIP
           RTS          ;RETURN TO MAIN PROGRAM
SKIP       LDA #$04
            .   .
            .   .

This short example loads a value from the memory location into the X register, then increments it. If wraparound occurs, the test for a zero flag causes the program to jump to a label called SKIP, and the code does not return to the program that called it via the RTS. There are numerous tests on each of the flags in the Program Status Register. A summary is shown below.

BCS	-	Branch if the carry flag is set.	      C = 1 
BCC	-	Branch if the carry flag is clear.	      C = 0 
BEQ	-	Branch if the zero flag is set.	              Z = 1 
BNE	-	Branch if the zero flag is clear.	      Z = 0 
BMI	-	Branch if minus.	                      N = 1 
BPL	-	Branch if plus.	                              N = 0 
BVS	-	Branch if overflow is set.	              V = 1 
BVC	-	Branch if overflow is clear.	              V = 0

Most Assemblers offer alternative mnemonics for BCC and BCS. Since, during comparisons, the carry flag is set when the value in the appropriate register is greater than or equal to the value compared, BCS might be called BGE (Branch Greater or Equal). Likewise, BCC is equivalent to BLT (Branch Less Than). Why use these alternatives? Because they are easier to remember and visualize, and they make it clear that you are doing logical comparisons, rather than testing the results of an addition or subtraction.
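
To see which flags a comparison leaves behind, consider the following Python sketch. It is an illustration for unsigned one-byte values, with a hypothetical function name; the returned dictionary simply labels the carry, zero, and negative flags.

```python
def compare(register, value):
    """Flags left by CMP, CPX, or CPY for unsigned one-byte operands."""
    difference = (register - value) & 0xFF
    return {
        "C": register >= value,             # set when register >= value (BCS / BGE)
        "Z": register == value,             # set when the two are equal (BEQ)
        "N": bool(difference & 0x80),       # bit 7 of the difference (BMI)
    }

print(compare(0x05, 0x05))
print(compare(0x03, 0x05))
print(compare(0x09, 0x05))
```

With the Accumulator holding $03, CMP #$05 clears the carry, so a following BCC (or BLT) would branch.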

There is one other important concept that should be understood when doing comparisons. We implied that the subsequent branch was like a GOTO in BASIC or like a JMP in Assembly language. This is not entirely true, since the range of the branch, measured from the address of the branch instruction itself, cannot exceed -126 to +129 bytes. This is because the branch instruction is only two bytes long. The first byte is the instruction code and the second is a one-byte relative address. It takes a two-byte address to branch to any place in memory (except page zero). The JMP instruction has the advantage that it is three bytes long. In most cases, this limitation will not cause problems. But if a "branch out of range error" occurs, you must reverse the test so that it will reach the required destination via a JMP instruction.

Example: If BEQ SKIP is out of range then substitute the following:

      BNE *+$5        or        BNE B
      JMP SKIP                  JMP SKIP
      .                     B   NOP
      .                         .
This change causes the program to drop through to the JMP instruction if the zero flag is set, and then jump to location SKIP. However, if the zero flag is not set, it will advance ahead five bytes to the instruction following the JMP. All other branch instructions work in a similar manner. This gives the equivalent of a Long Branch.
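
Whether the long form is needed depends only on the distance between the branch and its destination. As a hedged Python sketch (needs_long_branch is a hypothetical helper, not an assembler feature), the range quoted earlier can be tested directly:

```python
def needs_long_branch(branch_address, target_address):
    """True when the target is outside the reach of a two-byte branch."""
    distance = target_address - branch_address   # measured from the branch itself
    return not -126 <= distance <= 129

print(needs_long_branch(0x400D, 0x4005))   # the short loop from earlier
print(needs_long_branch(0x4000, 0x4100))   # 256 bytes ahead
```

The first case fits comfortably in a normal branch; the second would have to be written as a reversed test followed by a JMP.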

Addition & Subtraction

Simple addition and subtraction of unsigned numbers is easily accomplished in Machine language. All additions and subtractions must be performed one byte at a time. Thus, large numbers or multi-byte numbers (those that exceed $FF), must be added or subtracted one byte at a time, and the carry flag must be accounted for. It's actually not much different from addition of two multi-digit decimal numbers. Those numbers have a digit in the ones column, another in the tens, etc. If you add 65 to 78, you add the ones column first. Five plus eight equals 13. The value in the ones column is 3; you then carry the one "ten" into the tens digit column before you add the two numbers in the tens column. Hexadecimal addition is similar. You clear the carry before you add. If the sum of the two values exceeds $FF, the carry is set. Since you don't clear the carry when adding the next higher byte, the resultant answer will be the sum plus the previously computed carry, as in the following example:

EXAMPLE:    +CARRY
              63     F4
             +02    +16
             ---    ---
              66     0A   ;SETS CARRY

The code for addition and subtraction is as follows:

ADDITIONS

 
CLC	         ;CLEAR CARRY
LDA #$F4	 ;LOAD LOW ORDER BYTE
ADC #$16	 ;ADD WITH CARRY
STA LOW	         ;STORE LOW BYTE
LDA #$63	 ;LOAD HIGH ORDER BYTE
ADC #$02	 ;ADD WITH CARRY (NOTE DON'T CLEAR CARRY)
STA HIGH	 ;STORE HIGH BYTE

SUBTRACTIONS

SEC	         ;SET CARRY FLAG
LDA #$F4	 ;LOAD VALUE
SBC #$16	 ;SUBTRACT WITH CARRY
STA VALUE        ;STORE RESULT

You should be aware that the rules for subtraction are different from the ones for addition. The carry must be set first. A clear carry is equivalent to a borrow in subtraction. After the subtraction operation, the carry will be clear if an underflow (borrow) occurred. The carry will be set otherwise. Setting the carry is very important, a step that many beginners forget. The results are unreliable if this step is skipped--and can even appear "random," since the carry flag may happen to be either on or off when the subtraction operation is performed. This can make debugging difficult.
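
The role of the carry in both operations can be followed in a short Python sketch. It is an illustration that assumes ordinary binary arithmetic (not the 6502's decimal mode), and the function names adc, sbc, and add16 are hypothetical stand-ins for the instruction sequences listed above.

```python
def adc(a, value, carry):
    """Add with carry; return (one-byte result, carry out)."""
    total = a + value + carry
    return total & 0xFF, 1 if total > 0xFF else 0

def sbc(a, value, carry):
    """Subtract with carry; a clear carry on entry means a borrow is pending."""
    total = a - value - (1 - carry)
    return total & 0xFF, 1 if total >= 0 else 0

def add16(first, second):
    """Add two 16-bit numbers one byte at a time, low byte first."""
    low, carry = adc(first & 0xFF, second & 0xFF, 0)    # CLC before the low bytes
    high, carry = adc(first >> 8, second >> 8, carry)   # carry is NOT cleared here
    return (high << 8) | low, carry

result, carry = add16(0x63F4, 0x0216)
print(f"${result:04X}")
print([f"${r:02X}" for r, _ in (sbc(0xF4, 0x16, 1), sbc(0xF4, 0x16, 0))])
```

The addition gives $660A, matching the worked example. The two subtractions show what forgetting SEC costs: with the carry set the answer is $DE, while with the carry clear it is one less, $DD.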

Breakout Game (BASIC)

The "Breakout" game involves the simplest animation technique available on the Atari, moving individual pixels from one position to a new position. We have a Graphics 5 pixel-sized ball that bounces around the screen. It will ricochet off a movable paddle, the walls, or any of the 2 pixel-high by 5 pixel-wide colored bricks. Movement is accomplished by erasing the ball at its old position and redrawing it at its new position. The ball is very predictable. It changes direction only upon collision, and in all cases (except contact with the paddle) simply reverses direction. The point of contact with the joystick-controlled paddle determines the ball's direction. Balls striking the left end travel upwards and to the left at a 45 degree angle, while balls striking the inside left travel in the same direction but at a 60 degree angle. Balls striking the paddle's right side travel at similar angles, but to the right.

Once you have the design description, in this case a game that is an old classic, the next step is to translate it into a logical sequence of events and their consequences. This can best be accomplished by drawing a flow chart that shows the possible pathways for each module in the program. Each of these modules can be as small as a single statement, or can consist of entire subroutines. No matter how detailed or general you make it, the flowchart must accurately represent the game's logic. While it is a good tool for learning to think logically, a flowchart isn't necessary or required in all cases. Many good programmers have never drawn one. They obviously have the ability to flowchart unconsciously in their minds.

The game should be programmed in small steps rather than as a complete entity. This way you get to see results early. Besides, it is easier to debug a small section, such as the ball bouncing off the paddle and moving around the screen, than to attempt to debug a complete program that is full of errors. The most successful programmer will be one who can debug by watching what goes wrong on the screen.

(figure)

Paddle Position

Determining where the ball strikes the paddle is easy in our "Breakout" game. The paddle is always drawn two-pixels wide at row 36 decimal or $24, and the first pixel begins at PX, a variable controlled indirectly by the joystick. Actually the new paddle position is P = P + D, where D depends on the direction of movement and whether the button is being pressed. If the joystick is pushed to the left, D=-1, while if it is pushed to the right, D=1. When the button is pushed, D=D*3, and the paddle moves at triple speed. The Boolean logic in line 230, ((P+D)>0 AND (P+D)<76), gives a value of true=1 or false=0 depending on whether the paddle would still be within the screen bounds after movement. If it would, the result is P=P+D*(1), and there is a new paddle position. If it would not, the result is P=P+D*(0) and the paddle remains stationary.
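
The trick of multiplying by a Boolean result carries over to other languages. The Python sketch below restates the logic described for line 230; it is not the book's listing, and the function and parameter names are hypothetical.

```python
def move_paddle(p, stick, button_pressed):
    """Return the new paddle position; stick is -1 (left), 0, or +1 (right)."""
    d = stick
    if button_pressed:
        d = d * 3                                    # triple speed
    # 1 when the move stays on screen, 0 when it would leave it.
    in_bounds = 1 if (p + d) > 0 and (p + d) < 76 else 0
    return p + d * in_bounds

print(move_paddle(10, 1, False))    # ordinary move to the right
print(move_paddle(10, -1, True))    # fast move to the left
print(move_paddle(75, 1, False))    # blocked at the right edge
```

When the move would carry the paddle off the screen, the multiplier is zero and the position is returned unchanged.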

Ball's Position & Velocity

It is easy to compare the ball's new horizontal position NX to that of the paddle's leftmost position PX. The difference NX-PX is C. You can use this value to index into a table to obtain the new horizontal velocity; DX = C(C). These values vary with position. The two outside blocks give a DX of +1 or -1, and the two inside blocks give a DX of +1/2 or -1/2. The vertical velocity, DY, is equal to -1 since the ball is always travelling upwards after striking the paddle.

(figure)

In order to update the ball's position, we take the old ball's position and add the change in position or its directional velocity. The format is:

NEW POSITION = OLD POSITION + CHANGE IN POSITION

NX = BX + DX
NY = BY + DY

Incrementing or decrementing the ball's position by 1/2 in the X direction is not physically possible since screen positions are whole numbers. The ball's position is truncated to an integer value with the INT function. The result is that the ball remains stationary in the X direction during one frame, then moves one whole pixel position during the next frame or cycle.
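
The every-other-frame movement can be seen by stepping a fractional position forward and truncating it. The Python sketch below is an illustration only; screen_columns is a hypothetical name, and math.floor plays the role of BASIC's INT.

```python
import math

def screen_columns(bx, dx, frames):
    """Return the screen column the ball occupies on each successive frame."""
    columns = []
    for _ in range(frames):
        bx = bx + dx                       # NX = BX + DX
        columns.append(math.floor(bx))     # INT(NX) picks the pixel column
    return columns

print(screen_columns(10, 0.5, 4))
print(screen_columns(10, 1, 4))
```

With DX = 1/2 the column changes only on alternate frames, while with DX = 1 it changes on every frame.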

Collisions with Bricks

As the ball bounces around the screen it will soon collide with one of the colored 2 by 5 pixel-sized bricks at the top of the screen. It is possible to test for a collision by using the LOCATE function. This function, which returns the color register at the ball's position, works only in BASIC Graphics modes 3-8. Non-zero values in this example indicate a collision with one of the three colored bricks (Playfields #1-3).

If there is a collision, the correct block needs to be removed. This is quite simple to calculate for the X direction:

C = INT(NX/5)*5

You still need to determine if the ball hit the brick in an even or odd pixel row. It might appear that the ball would always collide with the bottom or odd row of pixels first, but if there are gaps between bricks as occurs later in the game, the ball can approach from the side and strike the brick along the top or even row of pixels. If the ball strikes the bottom row, you will need to adjust the position to the brick's top row in order to erase one complete brick. The test is a very simple Boolean function in line 320. For example if the ball's new vertical position, NY=9, then NY/2 <> INT(NY/2) would reduce to 9/2 <> 4 which is true. We would then decrement NY to an even number in order to erase the complete block. The top left corner of the 2 pixel by 5 pixel brick is C,NY. Five pixels are erased from C,NY to C+4,NY in each of its two rows.
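
The brick calculation can be sketched in Python as follows. The function names brick_corner and brick_pixels are hypothetical. The sketch assumes, as described above, that bricks start on columns that are multiples of five and on even pixel rows.

```python
def brick_corner(nx, ny):
    """Return the top left corner (C, NY) of the brick containing pixel (nx, ny)."""
    c = (nx // 5) * 5       # C = INT(NX/5)*5
    if ny % 2 != 0:         # NY/2 <> INT(NY/2): the ball hit the odd (bottom) row
        ny -= 1             # step back to the brick's even (top) row
    return c, ny

def brick_pixels(nx, ny):
    """Return the ten pixels to erase: five columns in each of two rows."""
    c, top = brick_corner(nx, ny)
    return [(x, y) for y in (top, top + 1) for x in range(c, c + 5)]
```

For a ball at NX=13, NY=9, the corner is 10,8 and the erased area runs from 10,8 to 14,9.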

The brick's score depends on its playfield color. SCORE = SCORE + SCORE(C), where C is the value returned by the LOCATE function. The yellow (playfield #1) bricks at the top are worth ten points, the green (playfield #2) bricks in the middle are worth five points, and the blue (playfield #3) bricks at the bottom are worth only three points.

The ball's vertical direction of travel reverses upon collision with a brick. It continues in the horizontal direction until it reaches either the left or right playfield boundary at BX=0 or BX=79. It reverses direction there so that DX = -DX. If the ball reaches the top of the playfield at BY = 0, it will reverse its vertical direction. But if the ball reaches the bottom it is lost and we begin again with a new ball. The game will end when we have run out of either bricks or balls.
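
The wall rules reduce to two sign changes. The Python sketch below is an illustration only. The function name bounce is ours, and the comparisons use <= and >= rather than strict equality as a defensive assumption. The lost-ball test at the bottom of the screen is left out because the text does not give its row.

```python
def bounce(bx, by, dx, dy):
    """Return (dx, dy) after applying the side-wall and top-wall rules."""
    if bx <= 0 or bx >= 79:   # left or right playfield boundary
        dx = -dx
    if by <= 0:               # top of the playfield
        dy = -dy
    return dx, dy
```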


(figure)

Breakout Game (Assembly Language)

The "Breakout" game is quite easy to translate into Assembly language once you understand how BASIC handles its graphics commands. The Operating System (OS) implements each of these commands through the CIO (Central Input/Output) subroutine located at $E456. When a program calls the OS through this location, the OS expects to be given the address of a properly formatted IOCB (Input Output Control Block). There are eight of these, each sixteen bytes long. These are located from $340 to $3BF. The appropriate IOCB number times 16 is passed to the subroutine in the X-register. The full details of how the internals actually work are really not important, especially to the beginning Assembly language programmer. Let's just say that we developed a set of graphics subroutines that mirror their BASIC language counterparts. We have commented on each of these in the listing for anyone who would like to study them.
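
The IOCB arithmetic is simple enough to check by hand, and the Python sketch below does the same sums. The constant and function names are our own labels. Only the addresses $340 to $3BF, the sixteen-byte size, and the "number times 16" rule come from the text.

```python
IOCB_BASE = 0x340   # address of IOCB #0
IOCB_SIZE = 16      # each IOCB is sixteen bytes long

def iocb_x_register(n):
    """Return the value passed in the X-register for IOCB number n (0-7)."""
    if not 0 <= n <= 7:
        raise ValueError("there are only eight IOCBs, numbered 0 to 7")
    return n * IOCB_SIZE

def iocb_address(n):
    """Return the address of the first byte of IOCB number n."""
    return IOCB_BASE + iocb_x_register(n)
```

IOCB #7 starts at $3B0, so its last byte falls at $3BF, the end of the range given above.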

Graphics Commands

The five graphics commands that we need for our game are: GRAPHICS #; POSITION H,V; PLOT H,V; DRAWTO H,V; and LOCATE H,V,Color. We set up each by inputting certain parameters into the Accumulator, X-register, and Y-register. Once you've set up the registers you need only JSR to that subroutine. The table below shows what you need to input into each of the registers.

Function        Accumulator     X-register      Y-register

GRAPHICS        Mode #          -------         -------
POSITION        Vertical        Horizontal      Horizontal
                                High byte       Low byte
PLOT            Vertical        Horizontal      Horizontal
                                High byte       Low byte
DRAWTO          Vertical        Horizontal      Horizontal
                                High byte       Low byte
LOCATE          Vertical        Horizontal      Horizontal
                (has color      High byte       Low byte
                value on
                return)

For example, if we wish to set up a Graphics 5 screen and draw a blue (playfield #3 default color) line from 10, 15 to 30,15 our program would be as follows:

LDA #$05	;GRAPHICS 5 SCREEN
JSR GRAPHICS
LDA #$03	;PLAYFIELD #3
STA COLOR
LDA #$0F	;VERTICAL=15
LDX #$00	;HORIZONTAL HIGH BYTE
LDY #$0A	;HORIZONTAL LOW BYTE
JSR PLOT	;PLOT PIXEL
LDA #$0F	;VERTICAL=15
LDX #$00	;HORIZONTAL HIGH BYTE
LDY #$1E	;HORIZONTAL LOW BYTE (30 DECIMAL)
JSR DRAWTO 	;DRAW LINE

Breakout Game

Once you understand the simplicity of duplicating the BASIC graphics statements in Assembly language you can proceed with developing the game.

The "Breakout" game is a very close translation of the BASIC version with a few subtle differences. One of the problems in working with Assembly language is that all numbers are whole numbers. In the BASIC version the ball's horizontal direction (DX) became +1/2 or -1/2 when it hit the inner portion of the paddle. Since incrementing the ball's position by +1/2 would be impossible in Assembly language, DX and BALLX, a temporary value for the ball's horizontal position, are doubled in value. If we then divide BALLX by two before plotting the ball's true position, TX, the fractional part will vanish. In essence the ball will move horizontally every other frame.

BALLX = BALLX + DX (doubled values)
TX = BALLX / 2

ASL and LSR Instructions

Multiplication and division by powers of two are easy in Machine language. The mnemonic ASL is used for multiplication by two. The Arithmetic Shift Left (ASL) instruction shifts all of the bits in the Accumulator one position to the left. Thus, bit 0 is shifted into bit 1, bit 1 into bit 2, etc. Bit 7 is shifted into the carry bit so that you can use the BCC and BCS instructions to test for overflows. For example, if only bit 2 was on (4 decimal) and we did an ASL, the bit would be shifted to bit 3 (8 decimal). Thus, it is easy to multiply by powers of two by performing repeated ASL instructions.

Conversely, division is performed by the Logical Shift Right (LSR) instruction. Bits are shifted to the right and bit 0 is shifted into the carry. This is equivalent to dividing by two with loss of the fractional part.

(figure)

       LDA #$05    ;LOAD ACCUMULATOR WITH 5
       LSR         ;DIVIDE BY 2
       STA $4000   ;VALUE STORED IN $4000 IS 2
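
For readers who would like to experiment away from the Atari, the two shifts can be modeled on an eight-bit value in Python. This is a sketch, not 6502 code. The functions asl and lsr are our own stand-ins for the instructions, and each returns the new Accumulator value together with the carry bit.

```python
def asl(a):
    """Model ASL on an 8-bit value: return (result, carry)."""
    carry = (a >> 7) & 1            # bit 7 falls into the carry
    return (a << 1) & 0xFF, carry   # keep only the low eight bits

def lsr(a):
    """Model LSR on an 8-bit value: return (result, carry)."""
    carry = a & 1                   # bit 0 falls into the carry
    return a >> 1, carry
```

Here lsr(0x05) gives 2 with the carry set, matching the example above, and asl(0x80) gives 0 with the carry set, which is the overflow case that BCS would catch.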

Ball's Direction After Paddle Collision

The table of directional values for the four possible collision positions with the paddle is stored in VX. The two negative values in the table are stored in their two's complement form because it is easier to add two positive numbers than to test for a negative number and subtract.

                                  0th   1st   2nd    3rd
                             VX   $FE   $FF   $01    $02

For example, #$FE (-2) + #$03 = #$01. The offset position from the paddle's left edge is placed in the X register to get the new horizontal velocity.

SEC          ;SET CARRY BEFORE SUBTRACTING
LDA TX       ;COMPARE PADDLE HORIZ. WITH BALL HORIZ.
SBC PX       ;DIFFERENCE
TAX          ;USE DIFFERENCE AS TABLE INDEX
LDA VX,X     ;FETCH VELOCITY VALUE FROM TABLE
STA DX       ;THIS IS DOUBLED VALUE

We calculate the ball's new position as follows:

CLC
LDA BALLX    ;OLD BALL POSITION (DOUBLED)
ADC DX       ;NEW HORIZ. VELOCITY DOUBLED
STA BALLX
LSR          ;DIVIDE BY 2
STA TX       ;BALL'S TRUE HORIZ. POSITION
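
The two routines above can be followed step by step in a Python model of the eight-bit arithmetic. This is only a sketch. The function names paddle_bounce and step_ball are ours, and the model assumes the ball really is over the paddle, so that the offset is 0 to 3. Masking with 0xFF imitates the way an eight-bit add discards the carry, which is why adding $FE behaves like subtracting 2.

```python
VX = [0xFE, 0xFF, 0x01, 0x02]   # doubled velocities: -2, -1, +1, +2

def paddle_bounce(tx, px):
    """Return the doubled horizontal velocity for a ball at tx on a paddle at px."""
    offset = (tx - px) & 0xFF   # SEC / LDA TX / SBC PX / TAX
    return VX[offset]           # LDA VX,X

def step_ball(ballx, dx):
    """Return (new BALLX, TX) after one frame of horizontal movement."""
    ballx = (ballx + dx) & 0xFF  # CLC / ADC DX, keeping only eight bits
    tx = ballx >> 1              # LSR: divide the doubled value by two
    return ballx, tx
```

With BALLX = 40 and DX = $FF, two successive frames give TX = 19 and then TX = 19 again, which is the half-speed movement described earlier.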

Scorekeeping

The scorekeeping routine also deserves an explanation. It differs substantially from the routines used in the other Machine language games in this book. It takes advantage of the 6502's ability to work in a numbering system called Binary Coded Decimal, or BCD. This system uses the lower four bits or low-order nibble to represent the low-order decimal digit, and the high-order nibble to represent the high-order decimal digit. The advantage is that the numbering system resembles decimal. The disadvantage is that it requires some advanced programming technique to isolate the digits in order to print them to the screen.

  DECIMAL    BINARY (BCD)    HEX

    07       0000 0111       $07
    10       0001 0000       $10
    16       0001 0110       $16
    42       0100 0010       $42
To get to this mode you must set the decimal flag with a SED (Set Decimal Mode) command. It remains in effect until it is cleared by a CLD (Clear Decimal Mode) command.

A pair of bytes, SCORE and SCORE+1, are used to store the four score digits. These are updated by adding POINTS,X to SCORE+1 each time a brick is removed. The X-register contains the color value of the block hit so that we need only index into a table of point values. We then add #$00 to SCORE (high byte) without clearing the carry first. If there was an overflow in SCORE+1 (low byte) during the first addition, the carry is therefore included in the resulting value in SCORE.

Each of the four nibbles must be separated, translated into an internal character number, and finally placed into the appropriate position in the text window. The byte's high nibble is first shifted to the low nibble by four successive LSR instructions and then translated into an internal character number. Digits in the internal character set begin at #$10: internal character 16 decimal = 0, 17 decimal = 1, etc. The ORA #$10 instruction, which combines the individual bits in its operand with those in the Accumulator, is just a fancy way of adding $10 to the value of our digit. The value of the low nibble is isolated by ANDing the byte with #$0F. It is then ORed with #$10 to obtain the internal character and stored in the next screen position. We have effectively stored the thousands and hundreds digits in the screen window. The code loops back again to obtain the value for the two nibbles in SCORE+1. These contain the tens and units digits.

All of the store operations are done using indirect indexed addressing of the form STA (WINDOW),Y. We will discuss this at greater length in later chapters. Meanwhile, it allows us to index rapidly into a memory area whose two-byte address is stored in zero page.
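
The score arithmetic can be traced in Python. The sketch below is not taken from the listing: the function names are hypothetical, and bcd_add_byte only imitates what ADC does in decimal mode when both operands are valid BCD bytes.

```python
def bcd_add_byte(a, b, carry):
    """Add two valid BCD bytes plus a carry; return (BCD result, carry out)."""
    lo = (a & 0x0F) + (b & 0x0F) + carry
    half = 0
    if lo > 9:                  # low digit overflowed past 9
        lo -= 10
        half = 1
    hi = (a >> 4) + (b >> 4) + half
    carry_out = 0
    if hi > 9:                  # high digit overflowed past 9
        hi -= 10
        carry_out = 1
    return (hi << 4) | lo, carry_out

def add_points(score_hi, score_lo, points):
    """Add a BCD point value to the two-byte score (SCORE, SCORE+1)."""
    score_lo, carry = bcd_add_byte(score_lo, points, 0)   # carry cleared first
    score_hi, _ = bcd_add_byte(score_hi, 0x00, carry)     # add #$00 plus the carry
    return score_hi, score_lo

def score_to_screen(score_hi, score_lo):
    """Return the four internal character numbers for the score digits."""
    chars = []
    for byte in (score_hi, score_lo):
        chars.append((byte >> 4) | 0x10)    # four LSRs, then ORA #$10
        chars.append((byte & 0x0F) | 0x10)  # AND #$0F, then ORA #$10
    return chars
```

Adding a ten-point brick ($10) to a score of 0095 carries into the high byte and gives 0105, which is displayed as internal characters $10, $11, $10, $15.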

If you're confused or lost at this point, don't worry. Just read on. Our intention was merely to show how a simple game like "Breakout" could be translated into Assembly language using graphics subroutines. It is not necessary to understand all of the details, only to be able to roughly follow the code as it pertains to the game's flow chart. We will discuss many of the subtle tricks mentioned here in much greater detail in subsequent chapters.


