8/8/2019 SLIDES_ICT444!1!17 [Compatibility Mode]
ICT 444 COMPILERS AND TRANSLATORS
Introduction to Compilers
Simply stated, a compiler is a programme that reads a programme written in one language, the source language, and translates it into an equivalent programme in another language, the target language.
[Figure: Source Programme -> Compiler -> Target Programme, with Messages as a side output]
RDAppiah
A Language Processing System
Skeletal source programme
PREPROCESSOR
Source programme
COMPILER
Target assembly programme
ASSEMBLER
Re-locatable machine code
LOADER/LINK-EDITOR (also takes library and re-locatable object files)
Absolute machine code
Analysis of a source programme
In compiling, analysis of the source programme consists of three phases:
1. Linear Analysis: This is the process in which the stream of characters making up the source programme is read from left to right and grouped into tokens, which are sequences of characters having a collective meaning.
2. Hierarchical Analysis: This is where the stream of characters or tokens is grouped hierarchically into nested collections with collective meaning.
3. Semantic Analysis: This is where certain checks are performed to ensure that the components of a programme fit together meaningfully.
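Linear analysis can be illustrated with a small scanner. The following is only a sketch: the token classes (NUMBER, IDENT, ASSIGN, OP) and the `:=` assignment symbol are assumptions chosen for illustration and do not come from the slides.

```python
import re

# Hypothetical token classes for a tiny expression language (an assumption,
# not taken from the slides).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Read the characters left to right and group them into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # white space separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("b := a + 2"))
```

Hierarchical analysis would then take this flat token list and group it into nested structures such as an expression tree; that step is not shown here.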
Symbol-Table Management
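An essential function of a compiler is to record the identifiers used in the source programme and to collect information about the various attributes of each identifier.
These attributes may provide information about the storage allocation of the identifier, its type, and its scope.
In the case of procedure names, they provide such things as the number and types of its arguments, the method of argument supply, and the type returned, if any.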
A Symbol Table is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
This data structure allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly.
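One common way to get quick lookup is a hash table keyed by identifier name. The sketch below assumes that choice; the class name `SymbolTable`, the method names `enter` and `lookup`, and the attribute fields used in the example are all hypothetical.

```python
class SymbolTable:
    """Maps each identifier to one record (a dict of attribute fields)."""

    def __init__(self):
        # A dict is a hash table, so finding a record takes constant time on average.
        self._records = {}

    def enter(self, name, **attributes):
        """Create the record on first sight, then store or update its fields."""
        record = self._records.setdefault(name, {"name": name})
        record.update(attributes)
        return record

    def lookup(self, name):
        """Return the record for name, or None if it was never entered."""
        return self._records.get(name)


table = SymbolTable()
table.enter("rate", type="real", address=8)
table.enter("area", kind="procedure", arg_types=["real", "real"], returns="real")
table.enter("rate", scope="global")  # later phases can add fields to the same record
print(table.lookup("rate"))
```

A procedure name such as `area` simply carries different fields (argument types, return type) in its record than a variable does.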
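Preprocessors: Preprocessors produce input to compilers. They may perform the following functions:
1. Macro Processing: A preprocessor may allow a user to define macros that are shorthands for longer constructs.
Macro processors deal with two kinds of statement: macro definition and macro use.
Macro definition is normally indicated by some unique character or keyword like define or macro, with formal parameters in its definition.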
The use of a macro, on the other hand, consists of naming the macro and supplying actual parameters, i.e. values for its formal parameters.
2. File Inclusion: This involves a preprocessor that includes header files in a programme text.
3. Rational processors: These are processors that augment older languages with more modern flow-of-control and data-structuring facilities.
4. Language extensions: These are processors that attempt to add capabilities to the language by what amounts to built-in macros.
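The split between macro definition and macro use can be sketched as a small line-based expander. The `define NAME(params) body` syntax and the function names below are assumptions made for this example; they are not the syntax of any particular preprocessor, and nested parentheses inside actual parameters are not handled.

```python
import re

# Hypothetical definition syntax: define NAME(formal, ...) body
DEFINE = re.compile(r"^define\s+(\w+)\(([^)]*)\)\s+(.*)$")


def expand_use(formals, body, actual_text):
    """Replace each formal parameter in the body with its actual parameter."""
    actuals = [a.strip() for a in actual_text.split(",") if a.strip()]
    if len(actuals) != len(formals):
        raise ValueError("wrong number of actual parameters")
    if not formals:
        return body
    mapping = dict(zip(formals, actuals))
    pattern = r"\b(" + "|".join(re.escape(f) for f in formals) + r")\b"
    return re.sub(pattern, lambda m: mapping[m.group(1)], body)


def expand_macros(lines):
    """Record macro definitions and expand macro uses in all other lines."""
    macros = {}  # name -> (formal parameters, body)
    output = []
    for line in lines:
        definition = DEFINE.match(line)
        if definition:
            name, formals, body = definition.groups()
            macros[name] = ([p.strip() for p in formals.split(",") if p.strip()], body)
            continue  # a definition produces no output of its own
        for name, (formals, body) in macros.items():
            use = re.compile(rf"\b{re.escape(name)}\(([^)]*)\)")
            line = use.sub(lambda m, f=formals, b=body: expand_use(f, b, m.group(1)), line)
        output.append(line)
    return output


print(expand_macros(["define SQUARE(x) ((x) * (x))", "y = SQUARE(a + 1)"]))
```

Here `x` is the formal parameter in the definition and `a + 1` is the actual parameter supplied at the point of use.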
Assemblers
Assembly code is a mnemonic version of machine code, in which names are used instead of binary codes for operations, and names are also given to memory addresses.
Some compilers produce assembly code that is passed to an assembler for further processing.
On the other hand, other compilers perform the job of the assembler themselves, producing re-locatable machine code that can be passed directly to the loader/link-editor.
A typical sequence of assembly instructions:
MOV a, R1
ADD #2, R1
MOV R1, b
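This code moves the contents of the address a into register 1, then adds the constant 2 to it, treating the contents of register 1 as a fixed-point number, and finally stores the result in the location named by b.
Thus, it computes an expression like b = a + 2; in a language like C.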
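The effect of the three instructions can be checked with a toy interpreter. This is a sketch under assumptions: memory is modelled as a dictionary keyed by name, registers are written `R` followed by digits, and only the instruction forms that appear above are handled. The function name `run` is hypothetical.

```python
import re


def is_register(operand):
    """Registers are assumed to be written as R followed by digits, e.g. R1."""
    return re.fullmatch(r"R\d+", operand) is not None


def run(program, memory):
    """Execute MOV/ADD lines against a name -> value memory and return it."""
    registers = {}
    for line in program:
        opcode, operands = line.split(None, 1)
        src, dst = [part.strip() for part in operands.split(",")]
        if opcode == "MOV" and is_register(dst):
            registers[dst] = memory[src]  # load: memory -> register
        elif opcode == "MOV" and is_register(src):
            memory[dst] = registers[src]  # store: register -> memory
        elif opcode == "ADD" and is_register(dst):
            # '#' marks a constant operand; otherwise the operand names a location
            value = int(src[1:]) if src.startswith("#") else memory[src]
            registers[dst] += value
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return memory


memory = run(["MOV a, R1", "ADD #2, R1", "MOV R1, b"], {"a": 5, "b": 0})
print(memory["b"])  # 7, i.e. a + 2 with a = 5
```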
A Pass
A pass refers to the process of reading an input file once and, in some cases, writing an output file.
Two-Pass Assembly
The simplest form of assembler makes two passes over the input.
In the first pass, all the identifiers that denote storage locations are found and stored in a symbol table, which is usually different from that of the compiler.
Here, identifiers are assigned storage locations as they are encountered for the first time.
After reading the illustration above, and on the assumption that a word consists of four bytes, the symbol table might contain the following entries:
IDENTIFIER   ADDRESS
a            0
b            4
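The first pass can be sketched as follows, assuming one four-byte word is set aside per identifier and addresses start at byte 0. The function name `first_pass` and the rule used to tell registers and constants apart from identifiers are assumptions for this example.

```python
import re

WORD_SIZE = 4  # bytes per word, as assumed in the text


def first_pass(lines):
    """Assign each storage identifier an address the first time it is seen."""
    symbols = {}
    for line in lines:
        _, operands = line.split(None, 1)
        for operand in operands.split(","):
            operand = operand.strip()
            # Skip registers (R1, R2, ...) and constants (#2); they need no storage.
            if re.fullmatch(r"R\d+", operand) or operand.startswith("#"):
                continue
            if operand not in symbols:
                symbols[operand] = len(symbols) * WORD_SIZE
    return symbols


symbols = first_pass(["MOV a, R1", "ADD #2, R1", "MOV R1, b"])
print(symbols)  # {'a': 0, 'b': 4}
```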
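In the second pass, the assembler scans the input again.
This time, it translates each operation code into the sequence of bits representing that operation in machine language.
After this, it translates each identifier representing a location into the address given for that identifier in the symbol table.
The output of the second pass is usually re-locatable machine code, which indicates that it can be loaded starting at any location in memory.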
The following is a hypothetical machine code into which the assembly instructions above might be translated:
0001 01 00 00000000
0011 01 10 00000010
0010 01 00 00000100
Here, we take it that the first four bits are the instruction code, with 0001, 0010, and 0011 standing for load, store, and add, respectively.
The next two bits designate a register, and 01 refers to register 1 in each of the three instructions above.
The two bits after that represent a tag, with 00 standing for the ordinary address mode, where the last eight bits refer to a memory address.
The tag 10 stands for the immediate mode, where the last eight bits are taken literally as the operand, as in the second instruction.
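The second pass can be sketched with the bit layout just described: four bits of instruction code, two bits of register, two bits of tag, and eight bits of address or literal. The function name `encode`, the mapping of MOV to load or store by operand position, and the symbol table passed in are assumptions for this example; only the instruction forms used in these slides are handled.

```python
import re

OPCODES = {"load": "0001", "store": "0010", "add": "0011"}
TAG_ADDRESS = "00"    # last eight bits are a memory address
TAG_IMMEDIATE = "10"  # last eight bits are the operand itself


def is_register(operand):
    return re.fullmatch(r"R\d+", operand) is not None


def encode(line, symbols):
    """Translate one assembly line into the hypothetical 16-bit format."""
    mnemonic, operands = line.split(None, 1)
    src, dst = [part.strip() for part in operands.split(",")]
    if mnemonic == "MOV" and is_register(dst):
        operation, register, operand = "load", dst, src
    elif mnemonic == "MOV" and is_register(src):
        operation, register, operand = "store", src, dst
    elif mnemonic == "ADD" and is_register(dst):
        operation, register, operand = "add", dst, src
    else:
        raise ValueError(f"unsupported instruction: {line}")
    if operand.startswith("#"):
        tag, value = TAG_IMMEDIATE, int(operand[1:])
    else:
        tag, value = TAG_ADDRESS, symbols[operand]  # address from the first pass
    return f"{OPCODES[operation]} {int(register[1:]):02b} {tag} {value:08b}"


symbols = {"a": 0, "b": 4}  # symbol table assumed to come from the first pass
for line in ["MOV a, R1", "ADD #2, R1", "MOV R1, b"]:
    print(encode(line, symbols))
```

With this symbol table, the ADD line encodes to 0011 01 10 00000010, which matches the add instruction shown in the hypothetical machine code.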