
CS153-111017

Date post: 15-Oct-2014
CS 153: Concepts of Compiler Design, October 17 Class Meeting. Department of Computer Science, San Jose State University, Fall 2011. Instructor: Ron Mak, www.cs.sjsu.edu/~mak
Page 1: CS153-111017

CS 153: Concepts of Compiler Design
October 17 Class Meeting

Department of Computer Science
San Jose State University

Fall 2011
Instructor: Ron Mak

www.cs.sjsu.edu/~mak

Page 2: CS153-111017

SJSU Dept. of Computer Science, Fall 2011: October 17

CS 153: Concepts of Compiler Design © R. Mak

History of Computing Speaker: Shomit Ghose

Venture capitalist; Partner, ONSET Ventures

“Micro-History: An Examination of the Brief but Successful Life of a Silicon Valley Start-up”

Wednesday, Oct. 19, 6:00-7:00 PM, Auditorium ENGR 189

Reception before the talk in ENGR 294 at 5:00 PM

Page 3: CS153-111017

Midterm Solution: Question 1

1. List and describe five software engineering techniques we employed to make the code manageable and understandable.

Initial framework classes
    Validate the architecture early.
Partitioning
    Language-dependent front end; language-independent middle tier and back end.
    The back end can be either an interpreter or a compiler.
Early initial end-to-end thread
    Always build on working code.
Design patterns
    Strategy, factory, etc.
    “Code to the interfaces.”
    “Closed for modification, open for extension.”
Team development tools
    Subversion source control

Page 4: CS153-111017

Midterm Solution: Question 2

2. What is the purpose of the symbol table stack and how does it achieve its purpose?

Purpose: Implement static scoping.

Push a symbol table onto the stack whenever the parser enters a scope.

Pop the symbol table off the stack when the parser leaves a scope.

Search only the local (topmost) symbol table to determine if an identifier has been declared in the local scope.

Search the entire stack from top to bottom to determine if an identifier has been declared in the local or an outer scope.
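
The four operations above can be sketched in a few lines of Java. This is only an illustration: the class name SymTabStackSketch, its method names, and the use of a plain String as the stored information are assumptions made for the example, not the classes used in the course code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a symbol table stack; all names are illustrative.
class SymTabStackSketch {
    // Each scope is a map from an identifier's name to some information about it.
    private final Deque<Map<String, String>> stack = new ArrayDeque<>();

    // The parser enters a scope: push a new symbol table.
    void enterScope() { stack.push(new HashMap<>()); }

    // The parser leaves a scope: pop the topmost symbol table.
    void leaveScope() { stack.pop(); }

    // Declare an identifier in the local (topmost) scope.
    void declare(String name, String info) { stack.peek().put(name, info); }

    // Search only the local (topmost) symbol table.
    String lookupLocal(String name) { return stack.peek().get(name); }

    // Search the entire stack from top to bottom.
    String lookup(String name) {
        for (Map<String, String> table : stack) {  // iterates from the top down
            String info = table.get(name);
            if (info != null) return info;
        }
        return null;  // not declared in the local or any outer scope
    }
}
```

An inner declaration of a name hides an outer one because lookup() stops at the first table that contains the name.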

Page 5: CS153-111017

Midterm Solution: Question 3

3. What is the purpose of the runtime stack and how does it achieve its purpose?

Purpose: To store runtime values according to the call chain.

Push an activation record onto the stack whenever the main program or a procedure or function is called.

Pop the activation record off the stack upon return.

The topmost activation record at level n contains the current values of the local variables and formal parameters of the currently active procedure or function at level n.

Use a runtime display to optimize accessing the appropriate activation record on the stack.
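
A minimal Java sketch of a runtime stack with a display follows. The class names, the fixed MAX_NESTING limit, and the savedDisplayEntry field are assumptions made for this example; the course's interpreter may organize these differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical activation record: one per active call.
class ActivationRecordSketch {
    final String routineName;
    final int nestingLevel;
    // Runtime values of the routine's local variables and formal parameters.
    final Map<String, Object> memoryMap = new HashMap<>();
    // The display entry this record replaced, restored upon return.
    ActivationRecordSketch savedDisplayEntry;

    ActivationRecordSketch(String routineName, int nestingLevel) {
        this.routineName = routineName;
        this.nestingLevel = nestingLevel;
    }
}

// Hypothetical runtime stack with a display indexed by nesting level.
class RuntimeStackSketch {
    private static final int MAX_NESTING = 16;  // arbitrary limit for this sketch
    private final Deque<ActivationRecordSketch> stack = new ArrayDeque<>();
    private final ActivationRecordSketch[] display =
        new ActivationRecordSketch[MAX_NESTING];

    // Call: push a new activation record and point the display at it.
    ActivationRecordSketch call(String routineName, int nestingLevel) {
        ActivationRecordSketch ar =
            new ActivationRecordSketch(routineName, nestingLevel);
        ar.savedDisplayEntry = display[nestingLevel];
        display[nestingLevel] = ar;
        stack.push(ar);
        return ar;
    }

    // Return: pop the topmost record and restore the display entry it replaced.
    void ret() {
        ActivationRecordSketch ar = stack.pop();
        display[ar.nestingLevel] = ar.savedDisplayEntry;
    }

    // The currently active routine's record.
    ActivationRecordSketch top() { return stack.peek(); }

    // Direct access to the current record at a nesting level, without a stack search.
    ActivationRecordSketch recordAt(int nestingLevel) {
        return display[nestingLevel];
    }
}
```

With the display, reaching a nonlocal variable at level n is an array lookup rather than a walk down the stack.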

Page 6: CS153-111017

Midterm Solution: Question 4

4. Implement the ternary conditional operator in Pascal using the keywords IF, THEN, and ELSE.

a. Modify the syntax diagrams.

[Syntax diagram "factor": a factor is a variable, a number, a string, NOT followed by a factor, an expression in parentheses, or a conditional.]

[Syntax diagram "conditional": IF <expression-1> THEN <expression-2> ELSE <expression-3>]

The result at run time of evaluating the conditional operator is a single value, the result of evaluating either <expression-2> or <expression-3>.

Therefore, a conditional expression must be a factor.

Page 7: CS153-111017

Midterm Solution: Question 4

b. What type checking operations are necessary while parsing a conditional operator?

<expression-1> must be boolean.

<expression-2> and <expression-3> must be type compatible with the surrounding operators (preferably they should be the same type) or be assignment compatible with the target variable.
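
One way to picture these checks is a small Java helper that returns the result type of a conditional, which the parser would then check against the surrounding operators or the assignment target. The class, the three-member Type enum, and the rule that mixed integer and real branches widen to real are all assumptions made for this sketch.

```java
// Hypothetical type check for IF <expression-1> THEN <expression-2> ELSE <expression-3>.
class ConditionalTypeCheck {
    enum Type { INTEGER, REAL, BOOLEAN }

    // Returns the type of the whole conditional, or throws if a check fails.
    static Type check(Type cond, Type thenType, Type elseType) {
        // <expression-1> must be boolean.
        if (cond != Type.BOOLEAN) {
            throw new IllegalArgumentException("condition must be boolean");
        }
        // Preferred case: both branches have the same type.
        if (thenType == elseType) {
            return thenType;
        }
        // Assumption: one integer and one real branch yield a real result.
        boolean bothNumeric =
            thenType != Type.BOOLEAN && elseType != Type.BOOLEAN;
        if (bothNumeric) {
            return Type.REAL;
        }
        throw new IllegalArgumentException("incompatible branch types");
    }
}
```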

Page 8: CS153-111017

Midterm Solution: Question 4

c. Draw a parse tree for the statement

k := i - j*IF m-n = 0 THEN m*n ELSE m+n

Note that the conditional does not change any precedence rules.

:=
    k
    -
        i
        *
            j
            IF
                =
                    -
                        m
                        n
                    0
                *
                    m
                    n
                +
                    m
                    n

Page 9: CS153-111017

Midterm Solution: Question 5

5. Describe the purpose of each of the following hash tables (or tree maps) and describe its keys (or give an example of a key).

a. symbol table
    Store the symbol table entries for the identifiers declared within a given scope.
    Keys: Names of the identifiers

b. symbol table entry
    Store the attributes of an identifier.
    Keys: Attribute enum constants such as ROUTINE_CODE

c. type specification object
    Store attributes about a data type.
    Keys: Attribute enum constants such as ARRAY_INDEX_TYPE

Page 10: CS153-111017

Midterm Solution: Question 5

d. parse tree node
    Store the attributes of a parse tree node.
    Keys: Attribute enum constants LINE, ID, and VALUE

e. memory map
    Store the runtime values of the local variables and formal parameters of a program, procedure, or function.
    Keys: The names of the variables and parameters
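
The two kinds of keys in Question 5 can be illustrated with a short Java sketch: a tree map keyed by identifier name for the symbol table, and a map keyed by enum constants for a symbol table entry. The class names and the CONSTANT_VALUE key are hypothetical; only ROUTINE_CODE is named on the slide.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of maps keyed by names and by attribute enum constants.
class MapKeysSketch {
    // Attribute keys for a symbol table entry. CONSTANT_VALUE is made up here.
    enum EntryKey { ROUTINE_CODE, CONSTANT_VALUE }

    // A symbol table entry stores an identifier's attributes, keyed by enum constant.
    static class SymTabEntry {
        final String name;
        final Map<EntryKey, Object> attributes = new EnumMap<>(EntryKey.class);

        SymTabEntry(String name) { this.name = name; }
    }

    // A symbol table stores the entries of one scope, keyed by identifier name.
    static class SymTab {
        final Map<String, SymTabEntry> entries = new TreeMap<>();  // sorted by name

        SymTabEntry enter(String name) {
            SymTabEntry entry = new SymTabEntry(name);
            entries.put(name, entry);
            return entry;
        }

        SymTabEntry lookup(String name) { return entries.get(name); }
    }
}
```

A tree map keeps the identifiers sorted, which is convenient when printing a cross-reference listing.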

Page 11: CS153-111017

Midterm Solution: Question 6

6. How to implement the ENDALL reserved word?

Front end
    Modify the scanner to recognize ENDALL as a reserved word.
    Modify method CompoundStatementParser.parse() to include ENDALL as a statement list terminator.
    Modify method StatementParser.parseList()
        Stop looping if the global flag endAllFlag is true.
        Set endAllFlag to true after consuming the ENDALL keyword.
    Modify method StatementParser.parse()
        Set endAllFlag to false after consuming the BEGIN keyword.

Middle tier
    No changes

Back end
    No changes

Page 12: CS153-111017

Midterm Solution: Question 7

7. Classic Pascal included the WITH statement.

a. What must the Pascal parser do in order to parse a WITH statement?

After parsing the record variable following the WITH keyword, the parser must:

    Determine the record type of the variable.
    Push the record type’s symbol table onto the symbol table stack.
    When parsing the nested statements of the WITH statement, look up identifiers first in the record type’s symbol table to determine whether or not they are record fields.
    At the end of the WITH statement, pop off the record type’s symbol table.

Page 13: CS153-111017

Midterm Solution: Question 7

b. What advantages would a WITH statement have at run time?

None at all, if the WITH statement is considered to be shorthand for the programmer (“syntactic sugar”). However, if the parse tree contains a WITH node, then the record variable only needs to be evaluated once. This would be a performance optimization, especially if the record variable is complicated, such as having subscripts, fields, and pointer dereferencing.

c. How would you implement a WITH statement in the interpreter’s back end?

In the syntactic sugar case, do nothing.

In the WITH node case, the interpreter must allocate an extra slot in the activation record to store the value of the record variable.

Page 14: CS153-111017

Minimum Acceptable Compiler Project

At least two data types with type checking.
Basic arithmetic operations with operator precedence.
Assignment statements.
At least one conditional control statement (e.g., IF).
At least one looping control statement.
Procedures or functions with calls and returns.
Parameters passed by value or by reference.
Basic error recovery (skip to semicolon or end of line).
Sample source programs written in the source language.
Generate Jasmin code that can be assembled.
Execute the resulting .class file standalone (preferred) or with a test harness.
No crashes (e.g., null pointer exceptions).

70 points/100

Page 15: CS153-111017

Ideas for Programming Languages

A language that works with a database such as MySQL
    Combines Pascal and SQL for writing database applications
    Not PL/SQL: use the language to write client programs
    Compiled code makes JDBC calls hidden from the programmer

A language that can access web pages
    Statements that “scrape” pages to extract information

A language for generating business reports
    A Pascal-like language that combines report writer features

A string-processing language
    Combines Pascal and Perl for writing applications that involve pattern matching and string transformations

Page 16: CS153-111017

Can We Build a Better Scanner?

Our scanner in the front end is relatively easy to understand and follow.
    Separate scanner classes for each token type.

However, it’s big and slow.
    Separate scanner classes for each token type.
    Create lots of objects and make lots of method calls.

We can write a more compact and faster scanner.
    However, it may be harder to understand and follow.

Page 17: CS153-111017

Deterministic Finite Automata (DFA)

Pascal identifier
    Regular expression: <letter> ( <letter> | <digit> )*
    Implement the regular expression with a finite automaton (AKA finite state machine):

[Diagram: state 1 (start state) goes to state 2 on a letter; state 2 loops back to itself on a letter or a digit; state 2 goes to state 3 (accepting state) on [other]. Each arrow is a transition.]

This automaton is a deterministic finite automaton (DFA).
    At each state, the next input character uniquely determines which transition to take to the next state.

Page 18: CS153-111017

State-Transition Matrix

Represent the behavior of a DFA by a state-transition matrix:

[Diagram: the identifier DFA with states 1, 2, and 3 and transitions on letter, digit, and [other].]
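
As a rough illustration, the identifier DFA can be written as a matrix in Java, with one row per state and one column per input class. The class name, the ERR value, and the match() helper are assumptions for this sketch; the state numbers follow the diagram.

```java
// Hypothetical table-driven recognizer for <letter> ( <letter> | <digit> )*.
class IdentifierDFA {
    // Input classes (matrix columns).
    static final int LETTER = 0, DIGIT = 1, OTHER = 2;

    static final int ERR = -1;    // no valid transition
    static final int ACCEPT = 3;  // accepting state

    // Rows are states 1, 2, and 3; row 0 is unused so rows match state numbers.
    static final int[][] MATRIX = {
        /*            letter digit other */
        /* unused */ { ERR,  ERR,  ERR    },
        /* 1      */ { 2,    ERR,  ERR    },
        /* 2      */ { 2,    2,    ACCEPT },
        /* 3      */ { ERR,  ERR,  ERR    },
    };

    static int typeOf(char ch) {
        return Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             :                          OTHER;
    }

    // Returns the length of the identifier at the start of text, or -1 if none.
    static int match(String text) {
        int state = 1;  // start state
        int pos = 0;
        while (state != ACCEPT && state != ERR) {
            // Treat the end of the input as an [other] character.
            char ch = pos < text.length() ? text.charAt(pos) : ' ';
            state = MATRIX[state][typeOf(ch)];
            if (state == 2) pos++;  // consume only characters of the identifier
        }
        return state == ACCEPT ? pos : -1;
    }
}
```

The [other] character that leads to the accepting state is not consumed, so it remains available as the start of the next token.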

Page 19: CS153-111017

DFA for a Pascal Number

[Diagram: DFA for a Pascal number with start state 0 and states 3 through 12. Transitions are labeled digit, +, -, ., and E; states 5, 8, and 12 are reached on [other].]

Page 20: CS153-111017

DFA for a Pascal Identifier or Number

[Diagram: the number DFA (states 3 through 12) combined with the identifier DFA (states 1 and 2), sharing start state 0.]

private static final int matrix[][] = {
    /*        letter digit     +     -     .     E other */
    /*  0 */ {    1,    4,    3,    3,  ERR,    1,  ERR },
    /*  1 */ {    1,    1,   -2,   -2,   -2,    1,   -2 },
    /*  2 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  3 */ {  ERR,    4,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  4 */ {   -5,    4,   -5,   -5,    6,    9,   -5 },
    /*  5 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  6 */ {  ERR,    7,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  7 */ {   -8,    7,   -8,   -8,   -8,    9,   -8 },
    /*  8 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  9 */ {  ERR,   11,   10,   10,  ERR,  ERR,  ERR },
    /* 10 */ {  ERR,   11,  ERR,  ERR,  ERR,  ERR,  ERR },
    /* 11 */ {  -12,   11,  -12,  -12,  -12,  -12,  -12 },
    /* 12 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
};

Negative numbers in the matrix are the accepting states.

Notice how the letter ‘E’ is handled!

Page 21: CS153-111017

A Simple DFA Scanner

public class SimpleDFAScanner
{
    // Input characters.
    private static final int LETTER = 0;
    private static final int DIGIT  = 1;
    private static final int PLUS   = 2;
    private static final int MINUS  = 3;
    private static final int DOT    = 4;
    private static final int E      = 5;
    private static final int OTHER  = 6;

    private static final int ERR = -99999;  // error state

    private static final int matrix[][] = { ... };

    private char ch;    // current input character
    private int state;  // current state

    ...
}

Page 22: CS153-111017

A Simple DFA Scanner, cont’d

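// Map an input character to its column in the matrix.
// 'E' is tested first because it is also a letter.
int typeOf(char ch)
{
    return (ch == 'E')            ? E
         : Character.isLetter(ch) ? LETTER
         : Character.isDigit(ch)  ? DIGIT
         : (ch == '+')            ? PLUS
         : (ch == '-')            ? MINUS
         : (ch == '.')            ? DOT
         :                          OTHER;
}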

Page 23: CS153-111017

A Simple DFA Scanner, cont’d

private String nextToken()
    throws IOException
{
    while (Character.isWhitespace(ch)) nextChar();
    if (ch == 0) return null;  // EOF?

    state = 0;  // start state
    StringBuilder buffer = new StringBuilder();

    while (state >= 0) {  // not accepting state
        state = matrix[state][typeOf(ch)];  // transit

        if ((state >= 0) || (state == ERR)) {
            buffer.append(ch);  // build token string
            nextChar();
        }
    }

    return buffer.toString();
}

This is the heart of the scanner.

Table-driven scanners can be very fast!

Page 24: CS153-111017

Simple DFA Scanner, cont’d

private void scan()
    throws IOException
{
    nextChar();

    while (ch != 0) {  // EOF?
        String token = nextToken();

        if (token != null) {
            System.out.print("=====> \"" + token + "\" ");

            // The accepting state identifies the token type.
            String tokenType =
                  (state ==  -2) ? "IDENTIFIER"
                : (state ==  -5) ? "INTEGER"
                : (state ==  -8) ? "REAL (fraction only)"
                : (state == -12) ? "REAL"
                :                  "*** ERROR ***";
            System.out.println(tokenType);
        }
    }
}

How do we know which token we just got?

Demo
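
To try the idea without the rest of the class, here is a self-contained sketch that uses the same matrix but scans a String instead of a file. The class name StringDFAScanner, the scanAll() method, and the "text:TYPE" result format are assumptions made for this example and are not part of SimpleDFAScanner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical String-based variant of the table-driven scanner.
class StringDFAScanner {
    private static final int LETTER = 0, DIGIT = 1, PLUS = 2, MINUS = 3,
                             DOT = 4, E = 5, OTHER = 6;
    private static final int ERR = -99999;  // error state

    // Same matrix as on the slide; negative entries are accepting states.
    private static final int[][] MATRIX = {
        /*        letter digit     +     -     .     E other */
        /*  0 */ {    1,    4,    3,    3,  ERR,    1,  ERR },
        /*  1 */ {    1,    1,   -2,   -2,   -2,    1,   -2 },
        /*  2 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  3 */ {  ERR,    4,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  4 */ {   -5,    4,   -5,   -5,    6,    9,   -5 },
        /*  5 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  6 */ {  ERR,    7,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  7 */ {   -8,    7,   -8,   -8,   -8,    9,   -8 },
        /*  8 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  9 */ {  ERR,   11,   10,   10,  ERR,  ERR,  ERR },
        /* 10 */ {  ERR,   11,  ERR,  ERR,  ERR,  ERR,  ERR },
        /* 11 */ {  -12,   11,  -12,  -12,  -12,  -12,  -12 },
        /* 12 */ {  ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    };

    private final String source;
    private int pos = 0;

    StringDFAScanner(String source) { this.source = source; }

    // Current character, or 0 at the end of the input.
    private char ch() {
        return pos < source.length() ? source.charAt(pos) : '\0';
    }

    private static int typeOf(char ch) {
        return (ch == 'E')            ? E
             : Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             : (ch == '+')            ? PLUS
             : (ch == '-')            ? MINUS
             : (ch == '.')            ? DOT
             :                          OTHER;
    }

    private static String nameOf(int state) {
        switch (state) {
            case -2:  return "IDENTIFIER";
            case -5:  return "INTEGER";
            case -8:  return "REAL (fraction only)";
            case -12: return "REAL";
            default:  return "*** ERROR ***";
        }
    }

    // Scans the whole input and returns entries of the form "text:TYPE".
    List<String> scanAll() {
        List<String> tokens = new ArrayList<>();
        while (true) {
            while (Character.isWhitespace(ch())) pos++;
            if (ch() == '\0') return tokens;  // end of input

            int state = 0;  // start state
            StringBuilder buffer = new StringBuilder();
            while (state >= 0) {  // not yet accepting or in error
                state = MATRIX[state][typeOf(ch())];  // transit
                boolean consume = (state >= 0) || (state == ERR);
                if (consume && ch() != '\0') {
                    buffer.append(ch());  // build token string
                    pos++;
                }
            }
            tokens.add(buffer + ":" + nameOf(state));
        }
    }

    public static void main(String[] args) {
        for (String token : new StringDFAScanner("abc 123 3.14 1E10").scanAll()) {
            System.out.println(token);
        }
    }
}
```

The final state answers the question on the slide: each accepting state is reached by only one kind of token, so the state alone identifies the token type.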

Page 25: CS153-111017

Backus Naur Form (BNF)

A text-based way to describe source language syntax.
    Named after John Backus and Peter Naur.
    Text-based means it can be read by a program, such as a compiler-compiler that can automatically generate a parser for a source language after reading (and parsing) the language’s syntax rules written in BNF.

Uses certain meta-symbols.
    Symbols that are part of BNF itself but are not necessarily part of the syntax of the source language.

    ::=    “is defined as”
    |      “or”
    < >    Surround names of nonterminal (not literal) items

Page 26: CS153-111017

BNF Example: U.S. Postal Address

<postal-address>   ::= <name-part> <street-part> <city-state-part>
<name-part>        ::= <first-part> <last-name>
                     | <first-part> <last-name> <suffix>
<first-part>       ::= <first-name> | <capital-letter> .
<street-part>      ::= <house-number> <street-name>
                     | <house-number> <street-name> <apartment-number>
<city-state-part>  ::= <city-name> , <state-code> <ZIP-code>
<suffix>           ::= Sr. | Jr. | <roman-numeral>
<first-name>       ::= <name>
<last-name>        ::= <name>
<street-name>      ::= <name>
<city-name>        ::= <name>
<house-number>     ::= <number>
<apartment-number> ::= <number>
<state-code>       ::= <capital-letter> <capital-letter>
<capital-letter>   ::= A|B|C|D|E|F|G|H|I|J|K|L|M
                     |N|O|P|Q|R|S|T|U|V|W|X|Y|Z
<name>             ::= …
<number>           ::= …
etc.

Page 27: CS153-111017

BNF: Optional and Repeated Items

To show optional items in BNF, use the vertical bar |.
    “An expression is a simple expression optionally followed by a relational operator and another simple expression.”

    <expression> ::= <simple expression>
                   | <simple expression> <rel op> <simple expression>

BNF uses recursion for repeated items.
    “A digit sequence is a digit followed by zero or more digits.”

    Right recursive:
    <digit sequence> ::= <digit>
                       | <digit> <digit sequence>

    Left recursive:
    <digit sequence> ::= <digit>
                       | <digit sequence> <digit>

Page 28: CS153-111017

BNF Example: Pascal Number

<digit sequence>   ::= <digit> | <digit> <digit sequence>
<unsigned integer> ::= <digit sequence>
<unsigned real>    ::= <unsigned integer>.<digit sequence>
                     | <unsigned integer>.<digit sequence> <e> <scale factor>
                     | <unsigned integer> <e> <scale factor>
<unsigned number>  ::= <unsigned integer> | <unsigned real>
<scale factor>     ::= <unsigned integer> | <sign> <unsigned integer>
<e>                ::= E | e
<sign>             ::= + | -

Repetition via recursion: see <digit sequence>.

The sign is optional: see <scale factor>.
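
The number rules translate fairly directly into a recognizer. The Java sketch below is one possible reading of the BNF: the class and method names are made up for the example, and the three <unsigned real> alternatives are folded into two optional parts after the leading <unsigned integer>.

```java
// Hypothetical recognizer for <unsigned number> as defined by the BNF above.
class PascalNumberRecognizer {
    private final String text;
    private int pos = 0;

    private PascalNumberRecognizer(String text) { this.text = text; }

    // True if the entire string is an <unsigned number>.
    static boolean isUnsignedNumber(String text) {
        PascalNumberRecognizer r = new PascalNumberRecognizer(text);
        return r.unsignedNumber() && r.pos == text.length();
    }

    // True if the next character is one of the given characters.
    private boolean peekIs(String chars) {
        return pos < text.length() && chars.indexOf(text.charAt(pos)) >= 0;
    }

    // <digit sequence> ::= <digit> | <digit> <digit sequence>
    private boolean digitSequence() {
        if (!peekIs("0123456789")) return false;
        pos++;            // <digit>
        digitSequence();  // optional right-recursive tail
        return true;
    }

    // <unsigned number> ::= <unsigned integer> | <unsigned real>
    private boolean unsignedNumber() {
        if (!digitSequence()) return false;  // <unsigned integer>

        if (peekIs(".")) {                   // fraction part
            pos++;
            if (!digitSequence()) return false;
        }
        if (peekIs("Ee")) {                  // <e> <scale factor>
            pos++;
            if (peekIs("+-")) pos++;         // the sign is optional
            if (!digitSequence()) return false;
        }
        return true;
    }
}
```

For example, isUnsignedNumber("2.5e-3") accepts, while isUnsignedNumber("3.") rejects because a <digit sequence> must follow the decimal point.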

Page 29: CS153-111017

BNF Example: Pascal IF Statement

<if statement> ::= IF <expression> THEN <statement>
                 | IF <expression> THEN <statement> ELSE <statement>

It should be straightforward to write a parsing method from either the syntax diagram or the BNF.
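
As a rough sketch of such a parsing method, the Java class below follows the IF statement rule token by token. It makes strong simplifying assumptions: tokens are separated by whitespace, an <expression> is a single non-keyword token, and a <statement> is either an IF statement or a single non-keyword token. The class name and the parenthesized output format are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical recursive-descent parser for the <if statement> rule.
class IfStatementParser {
    private final List<String> tokens;
    private int pos = 0;

    IfStatementParser(String source) {
        tokens = Arrays.asList(source.trim().split("\\s+"));
    }

    // Current token, or "" at the end of the input.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "";
    }

    // Consume the expected keyword or report a syntax error.
    private void expect(String keyword) {
        if (!peek().equals(keyword)) {
            throw new IllegalStateException(
                "expected " + keyword + " at token " + pos);
        }
        pos++;
    }

    // Stand-in for <expression> and simple statements: one non-keyword token.
    private String parseOperand() {
        String t = peek();
        if (t.isEmpty() || t.equals("IF") || t.equals("THEN") || t.equals("ELSE")) {
            throw new IllegalStateException("unexpected token at " + pos);
        }
        pos++;
        return t;
    }

    // <statement>: an IF statement or a single token in this sketch.
    String parseStatement() {
        return peek().equals("IF") ? parseIf() : parseOperand();
    }

    // <if statement> ::= IF <expression> THEN <statement>
    //                  | IF <expression> THEN <statement> ELSE <statement>
    String parseIf() {
        expect("IF");
        String condition = parseOperand();
        expect("THEN");
        String thenPart = parseStatement();
        if (peek().equals("ELSE")) {  // the ELSE part is optional
            pos++;
            String elsePart = parseStatement();
            return "(IF " + condition + " " + thenPart + " " + elsePart + ")";
        }
        return "(IF " + condition + " " + thenPart + ")";
    }
}
```

Because the method checks for ELSE immediately after parsing the THEN statement, an ELSE in a nested IF attaches to the nearest IF.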

