Post on 23-Dec-2015

Assembler

• When the source program is in assembly language and the target program is in numerical machine language, the translator is called an assembler.

• Assembly language is basically a symbolic representation for a numerical machine language

• Assembly language is a convenient language using mnemonics (symbolic names and symbolic addresses) for coding machine instructions

• The assembly programmer has access to all the features and instructions available on the target machine and thus can execute every instruction in the instruction set of the target machine

Problems with ALP

• Assembly language programming is difficult.

• Takes a longer time to write.

• Takes a longer time to debug.

• Difficult to maintain.

Why go for assembly language programming?

1. Performance issues – For some applications, the speed and size of the code are critical. An expert assembly language programmer can often produce code that is much smaller and much faster than a high-level language programmer can. Example – embedded applications such as the code on a smart card, the code in a cellular telephone, BIOS routines, inner loops of performance-critical applications, etc.

2. Access to the machine – Some procedures need complete access to the hardware, which is usually impossible in high-level languages. Example – low-level interrupt and trap handlers in an operating system.

Skeleton of ALP

Header …………….. Start of ALP

Body …………….. ALP statements

Footer …………….. End of ALP

The body of ALP consists of Assembler Directives and Mnemonics

[label:] mnemonic [operands] [;comments]

Assembly language format

• The format of a typical assembly language program consists of

Label field – provides a symbolic name for a memory address. Executable statements need labels so that they can be jumped to, and a label on a data location lets the data stored there be accessed by its symbolic name. Example: TEMP, FORMUL, etc.

Operation field – contains a symbolic abbreviation for the opcode or a pseudo-instruction. Example: MOVE, ADD, etc.

Operands field – specifies the addresses or registers used as operands of the machine instruction. Example: D0, R1, R2, etc.

Comment field – is used for documentation purposes.
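The four fields above can be separated mechanically. The following is a minimal sketch in Python, assuming the statement format `[label:] mnemonic [operands] [;comments]` with a colon after every label and commas between operands. The names `Statement` and `parse_statement` are made up for this illustration; declarative statements such as `ZERO CONST 0`, whose label has no colon, would need an extra table lookup that is not shown here.

```python
from typing import List, NamedTuple, Optional


class Statement(NamedTuple):
    """One parsed ALP statement (hypothetical helper type)."""
    label: Optional[str]
    mnemonic: Optional[str]
    operands: List[str]
    comment: Optional[str]


def parse_statement(line: str) -> Statement:
    """Split one source line into label, operation, operand and comment fields."""
    # Everything after the first ';' is the comment field.
    code, sep, comment_text = line.partition(";")
    comment = comment_text.strip() if sep else None

    # A label, when present, is the text before the first ':'.
    label = None
    code = code.strip()
    if ":" in code:
        label_text, code = code.split(":", 1)
        label = label_text.strip()
        code = code.strip()

    # The first remaining token is the mnemonic; the rest are comma-separated operands.
    parts = code.split(None, 1)
    mnemonic = parts[0] if parts else None
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return Statement(label, mnemonic, operands, comment)


print(parse_statement("LOOP: READ X ; read the next value"))
print(parse_statement("MOV A,B"))
```

Handling the comment first keeps a ':' or ',' inside a comment from being mistaken for part of the statement.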

Types of ALP Statements

• ALP statements can be broadly categorized into the following 3 types:

– Assembler Directive statements. Ex: START 1000

– Declarative statements. Ex: ABC DB 1

– Imperative statements. Ex: MOV A,B

Features of ALP

• Use of mnemonics to specify opcodes makes the assembly language program much more readable, and debugging is also easier.

• Use of symbols to specify operands means that the program can be modified with little overhead. That is, if the address at which a symbol is defined changes, the change in the ALP is made only at the place where the symbol is declared; the places where the symbol is used need not be updated.

• Separation of Code and Data Segments allows the programmer to keep aside some portion of memory for the data to be used by the program.

Basic data Structures of an ALP

• The assembler needs the following basic data structures (also called databases) as input:

• ALP source code, for example:

START 4000 ; Start storing the program from location 4000

LOAD 1000 ; Load data at location 1000 into register A

MOV B,A ; Copy contents of Register A into register B

LOAD 2000 ; Load data at location 2000 into register A

ADD B ; Add content of register A with that of B, store the sum in A

STORE 3000 ; Store the sum from register A at location 3000.

END ; End of Program

Assembler design for Hypothetical machine

• MOT (machine op-code table) and POT (pseudo op-code table) for the hypothetical machine (supporting tables for assembler design)

• ALP (input for design of assembler)

MOT (LOI = length of instruction)

Mnemonic   Opcode   No. of Operands   LOI
ADD        1        1                 2
SUB        2        1                 2
MULT       3        1                 2
JMP        4        1                 2
JNEG       5        1                 2
JPOS       6        1                 2
JZ         7        1                 2
LOAD       8        1                 2
STORE      9        1                 2
READ       10       1                 2
WRITE      11       1                 2
STOP       12       0                 1

POT

Pseudo opcode   No. of Operands
DB              2
DW              2
EQU             2
CONST           2
START           1
ORG             1
LTORG           1
ENDP            0
END             0
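The two tables above map naturally onto lookup tables. The sketch below stores them as Python dictionaries and classifies a token by searching them. The function names echo the SearchMot/SearchPot routines of this design, but the code is only an illustration. Treating DB, DW and CONST as declarative and every other POT entry as an assembler directive is an assumption of this sketch.

```python
# MOT: mnemonic -> (opcode, number of operands, length of instruction)
MOT = {
    "ADD": (1, 1, 2), "SUB": (2, 1, 2), "MULT": (3, 1, 2),
    "JMP": (4, 1, 2), "JNEG": (5, 1, 2), "JPOS": (6, 1, 2),
    "JZ": (7, 1, 2), "LOAD": (8, 1, 2), "STORE": (9, 1, 2),
    "READ": (10, 1, 2), "WRITE": (11, 1, 2), "STOP": (12, 0, 1),
}

# POT: pseudo-opcode -> number of operands
POT = {
    "DB": 2, "DW": 2, "EQU": 2, "CONST": 2,
    "START": 1, "ORG": 1, "LTORG": 1, "ENDP": 0, "END": 0,
}

# Assumption of this sketch: these pseudo-ops reserve or initialise storage.
DECLARATIVE = {"DB", "DW", "CONST"}


def search_mot(symbol):
    """Return the MOT entry for a mnemonic, or None if it is not a machine instruction."""
    return MOT.get(symbol.upper())


def search_pot(symbol):
    """Return the POT entry for a pseudo-opcode, or None if it is not one."""
    return POT.get(symbol.upper())


def classify(token):
    """Classify a token as imperative, declarative, directive, or a user symbol."""
    if search_mot(token) is not None:
        return "imperative"
    if search_pot(token) is not None:
        return "declarative" if token.upper() in DECLARATIVE else "directive"
    return "symbol"


for token in ("LOAD", "START", "DB", "SUM"):
    print(token, "->", classify(token))
```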

ALP

Line No.   Label    Pseudo code/Mnemonic   Operand
                    START                  2000
1                   READ                   N
2                   LOAD                   ZERO
3                   STORE                  COUNT
4                   STORE                  SUM
5          LOOP:    READ                   X
6                   LOAD                   X
7                   ADD                    SUM
8                   STORE                  SUM
9                   LOAD                   COUNT
10                  ADD                    ONE
11                  STORE                  COUNT
12                  SUB                    N
13                  JZ                     OUTER
14                  JMP                    LOOP
15         OUTER:   WRITE                  SUM
16                  STOP
17                  ENDP                             ; END OF CODE SEG
18         ZERO     CONST                  0
19         ONE      CONST                  1
20         SUM      DB                     ?
21         COUNT    DB                     ?
22         N        DB                     ?
23         X        DB                     ?
24                  END                              ; END OF PROGRAM

Forward reference problem

• One important point in the above example is worth noting: how does the assembler know the address of N when it meets READ N on line 1, given that N is not defined until line 22?

• A forward reference is a reference to a symbol that is used before it is defined.

• It can be handled in two ways:

– Two pass assembler

– One pass assembler

• Each reading of the source program is called a pass. An assembler that reads the source program once is called a one-pass assembler; one that reads it twice is called a two-pass assembler.

Two-pass assembler:

1. In pass one of a two-pass assembler, the definitions of symbols, statement labels, etc. are collected and stored in a table known as the symbol table.

2. In pass two, each statement can be read, assembled and output, as the values of all symbols are known.

This approach is thus quite simple, though it requires an additional pass.
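The two passes can be sketched in Python on the ALP listing of the hypothetical machine. This is an illustration rather than the algorithm referred to elsewhere in these slides. It assumes that every DB or CONST reserves one location, that data is placed directly after the code, and that the program is already split into (label, operation, operand) tuples. The names `pass_one` and `pass_two` are made up for this sketch.

```python
# mnemonic -> (opcode, length of instruction), taken from the MOT
MOT = {"ADD": (1, 2), "SUB": (2, 2), "JMP": (4, 2), "JZ": (7, 2), "LOAD": (8, 2),
       "STORE": (9, 2), "READ": (10, 2), "WRITE": (11, 2), "STOP": (12, 1)}
DATA_OPS = {"DB", "CONST"}  # assumption: each reserves exactly one location

PROGRAM = [
    (None, "START", "2000"),
    (None, "READ", "N"), (None, "LOAD", "ZERO"), (None, "STORE", "COUNT"),
    (None, "STORE", "SUM"), ("LOOP", "READ", "X"), (None, "LOAD", "X"),
    (None, "ADD", "SUM"), (None, "STORE", "SUM"), (None, "LOAD", "COUNT"),
    (None, "ADD", "ONE"), (None, "STORE", "COUNT"), (None, "SUB", "N"),
    (None, "JZ", "OUTER"), (None, "JMP", "LOOP"), ("OUTER", "WRITE", "SUM"),
    (None, "STOP", None), (None, "ENDP", None),
    ("ZERO", "CONST", "0"), ("ONE", "CONST", "1"), ("SUM", "DB", "?"),
    ("COUNT", "DB", "?"), ("N", "DB", "?"), ("X", "DB", "?"),
    (None, "END", None),
]


def step(op, operand, lc):
    """Return the location counter after processing one statement."""
    if op in ("START", "ORG"):
        return int(operand)
    if op in MOT:
        return lc + MOT[op][1]
    if op in DATA_OPS:
        return lc + 1
    return lc  # ENDP, END: no storage


def pass_one(program):
    """Collect every label and data symbol with its address."""
    symtab, lc = {}, 0
    for label, op, operand in program:
        if label is not None:
            if label in symtab:
                raise ValueError("Duplicate Symbol: " + label)
            symtab[label] = lc
        lc = step(op, operand, lc)
    return symtab


def pass_two(program, symtab):
    """Emit (address, opcode, operand address) now that all symbols are known."""
    code, lc = [], 0
    for label, op, operand in program:
        if op in MOT:
            address = symtab[operand] if operand is not None else None
            code.append((lc, MOT[op][0], address))
        lc = step(op, operand, lc)
    return code


symtab = pass_one(PROGRAM)
for row in pass_two(PROGRAM, symtab):
    print(row)
```

Because pass one has already recorded N, the forward reference in READ N is resolved in pass two by a plain table lookup.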

One-pass assembler:

The assembly program is read once, converted to an intermediate form and then stored in a table in memory.

It uses the FRT (forward reference table) data structure to keep track of symbols that are used before they are defined.

Symbol Table (ST)

Symbol   Type   Address of Operands

OBJECT CODE

LOCATION COUNTER   OUTPUT

Data structures – pass 1

• MOT

• POT

• ST

• Potptr, motptr, stptr

• Input_file, inter_file, location counter

• Code flag

Data structures – pass 2

• MOT

• POT

• ST

• Potptr, motptr, stptr

• Input_file, inter_file, location counter

• Code flag

• Output file

Functions – pass 1

• Readline(inputfile)

• Writeline(interfile)

• SearchMot(symbol)

• SearchPot(symbol)

• Searchst(symbol)

• InsertSymST(symbol)

• InsertattrSymST(symbol)

• Initlc(loi)

• GetattrfromST(symbol)

Functions – pass 2

• Readline(inputfile)

• Writeline(outfile)

• SearchMot(symbol)

• SearchPot(symbol)

• Searchst(symbol)

• InsertSymST(symbol)

• InsertattrSymST(symbol)

• Initlc(loi)

• GetattrfromST(symbol)

Selection of Data structure for Assembler design

• File

• Array of Structure

• Structures

• Class

Algorithm: 2 Pass Assembler Design

• Refer to algo.doc

Example 2: Translate the ALP to M/C code and generate Symbol Table

ORG 1000

LOAD A

ADD ONE

STORE A

STOP

ENDP

A DB ?

ONE CONST 1

END

Example 3: Translate the ALP to M/C code and generate Symbol Table

LOAD A

BACK: ADD ONE

JNZ A

STORE A

JMP BACK

A: SUB ONE

STOP

ENDP

A DB ?

ONE CONST 1

END

Literal Handling by Assembler

• A literal is an operand with the syntax: Literal name = ‘<value>’

Ex: ADD =‘4’

• It differs from a constant because its location cannot be specified in the ALP, which ensures that its value cannot be changed during execution of the program.

• It is nothing but a “pure constant”.

Example : Literal Handling

LOAD A

ADD =‘4’

STORE A

SUB =‘4’

STORE B

ADD =‘5’

STOP

ENDP

A DB ?

B DB ?

END

Generate the object code for the above program along with the ST and LT.

Symbol Table (After First Pass)

Name   Type   Address of Symbol
A      Var    0012
B      Var    0013

Literal Table

Name of the Literal   Value of the Literal   Address of Usage of Symbol   Address of Definition of Symbol
‘4’                   4                      0003, 0007                   0014
‘5’                   5                      0011                         0015

OBJECT CODE

LOCATION COUNTER   OUTPUT
0000               08 -----
0002               12 ----
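How the usage and definition columns of the Literal Table get filled can be sketched in Python. The code below assumes that the operand field of an instruction sits one location after its opcode, and that the literal pool starts at location 0014, the value shown in the table above. The function names and the table layout are made up for this illustration.

```python
# Instruction lengths for the mnemonics used in the example (from the MOT).
LENGTH = {"LOAD": 2, "ADD": 2, "SUB": 2, "STORE": 2, "STOP": 1}

# The code segment of the literal-handling example as (mnemonic, operand) pairs.
PROGRAM = [
    ("LOAD", "A"), ("ADD", "='4'"), ("STORE", "A"), ("SUB", "='4'"),
    ("STORE", "B"), ("ADD", "='5'"), ("STOP", None),
]


def collect_literals(program, origin=0):
    """Record every literal with its value and the addresses where it is used."""
    table = {}
    lc = origin
    for op, operand in program:
        if operand is not None and operand.startswith("="):
            name = operand[1:]                      # e.g. "'4'"
            entry = table.setdefault(
                name, {"value": int(name.strip("'")), "used_at": [], "defined_at": None}
            )
            entry["used_at"].append(lc + 1)         # operand field follows the opcode
        lc += LENGTH[op]
    return table


def assign_pool(table, pool_start):
    """Give each distinct literal one location, in order of first use."""
    address = pool_start
    for entry in table.values():
        entry["defined_at"] = address
        address += 1
    return table


literals = assign_pool(collect_literals(PROGRAM), pool_start=14)
for name, entry in literals.items():
    print(name, entry)
```

Both uses of ‘4’ share one table entry, so the value is stored only once in the pool.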

Procedure Handling

PROC ADD2 ; PROC HEADER

ORG 1000

LOAD A

ADD B ; PROC BODY

RET

ENDP ; PROC FOOTER

LOAD A

CALL ADD2 ; CALL TO A PROCEDURE

STORE B

ENDP

A DB ?

B DB ?

END

FEASIBILITY OF ONE PASS ASSEMBLER

Find the object code and symbol table for the following ALP:

DATA SEGMENT

X DB ?

Y DB ?

CODE SEGMENT

ORG 0000

LOAD X

BACK: ADD Y

JNEG FORWARD ; INSTANCE OF FORWARD REFERENCE

SUB X

JZ FORWARD ; INSTANCE OF FORWARD REFERENCE

ADD X

JMP BACK

FORWARD: STORE Y

STOP

END

Forward Reference Table / TII

Name of the Symbol   Attribute   Address of Usage of Symbol   Address of Definition of Symbol
FORWARD              Label       0005, 0009                   0014

OBJECT CODE IN ARRAY (AT THE END OF FIRST PASS)

LOCATION COUNTER   OUTPUT
0000               08 0000
0002               01 0001
0004               05 ----
0006               02 0000
0008               07 ----
0010               01 0000
0012               04 0002
0014               09 0001
0016               12

FINAL OBJECT CODE (WITH THE HELP OF FRT)

LOCATION COUNTER   OUTPUT
0000               08 0000
0002               01 0001
0004               05 0014
0006               02 0000
0008               07 0014
0010               01 0000
0012               04 0002
0014               09 0001
0016               12
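The step from the first-pass array to the final object code is a back-patching operation. The Python sketch below assembles the same code segment in a single reading. It assumes that X and Y sit at data-segment addresses 0000 and 0001, as the operand columns above suggest, and that an operand occupies the location right after its opcode. The name `assemble_one_pass` and the in-memory layout are made up for this illustration.

```python
# mnemonic -> (opcode, length of instruction), taken from the MOT
MOT = {"ADD": (1, 2), "SUB": (2, 2), "JMP": (4, 2), "JNEG": (5, 2),
       "JZ": (7, 2), "LOAD": (8, 2), "STORE": (9, 2), "STOP": (12, 1)}

DATA = {"X": 0, "Y": 1}  # assumed data-segment addresses

PROGRAM = [
    (None, "LOAD", "X"), ("BACK", "ADD", "Y"), (None, "JNEG", "FORWARD"),
    (None, "SUB", "X"), (None, "JZ", "FORWARD"), (None, "ADD", "X"),
    (None, "JMP", "BACK"), ("FORWARD", "STORE", "Y"), (None, "STOP", None),
]


def assemble_one_pass(program, data_symbols, origin=0):
    """Assemble in one reading; unresolved operands are patched when their label appears."""
    symtab = dict(data_symbols)
    memory = {}    # address -> opcode or operand address
    pending = {}   # symbol -> operand addresses still waiting for a definition
    frt = {}       # symbol -> (usage addresses, definition address), once resolved
    lc = origin
    for label, op, operand in program:
        if label is not None:
            symtab[label] = lc
            uses = pending.pop(label, [])
            for address in uses:          # back-patch every earlier use
                memory[address] = lc
            if uses:
                frt[label] = (uses, lc)
        opcode, length = MOT[op]
        memory[lc] = opcode
        if operand is not None:
            if operand in symtab:
                memory[lc + 1] = symtab[operand]
            else:                          # forward reference: leave a hole
                memory[lc + 1] = None
                pending.setdefault(operand, []).append(lc + 1)
        lc += length
    if pending:
        raise ValueError("UNDEFINED Label: " + ", ".join(sorted(pending)))
    return memory, frt


memory, frt = assemble_one_pass(PROGRAM, DATA)
print(frt)
print(sorted(memory.items()))
```

Under these assumptions the recorded entry for FORWARD has usage addresses 5 and 9 and definition address 14, which agrees with the Forward Reference Table above.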

Sample Program: To multiply N1 * N2 by the successive addition method

Input ALP Code: (memory location)

START 1000

LOAD N2 ; Load value of N2 into register A

STORE COUNT ; Store value of N2 into COUNT variable

LOAD ‘=0’ ; Load literal value 0 into register A

repeat: ADD N1 ; Add value of N1 to register A

DEC COUNT ; Decrement COUNT variable by 1

JNZ repeat ; Jump if-not-zero to ‘repeat’

STORE SUM ; Store sum value into SUM variable

JNC down ; Jump if-not-carry to ‘down’

INC SUM ; Increment SUM if there is carry

down: STOP ; Program execution ends

ENDP ; Code segment ends

N1 DB 07 ; Define a byte for variable ‘N1’ with value 07

N2 DB 05 ; Define a byte for variable ‘N2’ with value 05

COUNT DB ? ; Define a byte for variable ‘COUNT’ with null value

SUM DW ?? ; Define a word for variable ‘SUM’ with null value

END ; End of program

Forward Reference Table (FRT)

Symbol Name   Type    Usage Address   Definition Address
N2            VAR     1002            1022
COUNT         VAR     1004, 1011      1023
N1            VAR     1009            1021
repeat        LABEL   1013            1008
SUM           VAR     1015, 1019      1024
down          LABEL   1017            1020

Literal Table (LT)

Literal Notation   Value   Usage Address   Definition Address
‘=0’               0       1006            1026

Output File:

Address   Opcode   Operand
1000      03       1022
1002      04       1023
1004      03       1026
1006      01       1021
1008      06       1023
1010      08       1008
1012      04
1014      09
1016      05
1018      10       -

Error Handling Mechanism of Assembler

1) “Unexpected END of FILE”

2) “END of COMMENT not Found”

3) “UNDEFINED Symbol”

4) “UNDEFINED Label”

5) “Duplicate Label”

6) “Duplicate Symbol”

7) “Missing EOF”

8) “END of PROC not Found”

9) “RESERVED word used as a SYMBOL”

10) “Number of OPERANDS mismatch”

11) “INCORRECT Instruction”
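A few of the messages listed above can be produced by simple table checks. The Python sketch below covers only machine instructions, and it assumes that every operand is a label name and that the source is already split into (label, mnemonic, operands) tuples. The name `check_program` is made up for this illustration; the remaining messages, such as those about comments, procedures and end of file, are not covered.

```python
# Number of operands expected by each machine instruction (from the MOT).
OPERAND_COUNT = {"ADD": 1, "SUB": 1, "MULT": 1, "JMP": 1, "JNEG": 1, "JPOS": 1,
                 "JZ": 1, "LOAD": 1, "STORE": 1, "READ": 1, "WRITE": 1, "STOP": 0}
PSEUDO_OPS = {"DB", "DW", "EQU", "CONST", "START", "ORG", "LTORG", "ENDP", "END"}
RESERVED = set(OPERAND_COUNT) | PSEUDO_OPS


def check_program(program):
    """Return a list of (line number, message) pairs for the errors found."""
    errors = []
    defined = {}
    # First sweep: label definitions and per-statement checks.
    for number, (label, op, operands) in enumerate(program, start=1):
        if label is not None:
            if label in RESERVED:
                errors.append((number, "RESERVED word used as a SYMBOL"))
            elif label in defined:
                errors.append((number, "Duplicate Label"))
            else:
                defined[label] = number
        if op not in OPERAND_COUNT:
            errors.append((number, "INCORRECT Instruction"))
        elif len(operands) != OPERAND_COUNT[op]:
            errors.append((number, "Number of OPERANDS mismatch"))
    # Second sweep: every operand must name a defined symbol.
    for number, (label, op, operands) in enumerate(program, start=1):
        for name in operands:
            if name not in defined:
                errors.append((number, "UNDEFINED Symbol"))
    return errors


FAULTY = [
    ("LOOP", "JMP", ["LOOP"]),
    ("LOOP", "ADD", []),        # label repeated, operand missing
    (None, "JUMP", ["LOOP"]),   # JUMP is not in the MOT
    ("ADD", "STOP", []),        # mnemonic used as a label
    (None, "JZ", ["DONE"]),     # DONE is never defined
]
for line_no, message in check_program(FAULTY):
    print(line_no, message)
```

Undefined symbols are reported in a separate sweep so that a forward reference to a label defined later in the program is not flagged by mistake.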

SYMBOLS WHICH CAN NOT BE RESOLVED BY ASSEMBLER

• External symbols

• External procedures