
CSC 310 assignment 3 - Kutztown University of...

CSC 310 assignment 3 - Kutztown University of Pennsylvania, faculty.kutztown.edu/parson/spring2009/csc310assign3.pdf

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
#     option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. The entry describes the proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, constant, variable, array or
# table object. The first field of the parse subtree holds the type of the
# symbol (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.

#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME,) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {}    # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = []  # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = []  # element 0 is the innermost loop-being-compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1]     # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))

    else:
        entry = [subtree, scope]
        if (key in __symboltable__):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (symbol in __symboltable__):
        myscope = deepcopy(scope)       # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope is not None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope:     # it is non-empty
                myscope = myscope[1:]           # pop out a level of scope
            else:
                myscope = None                  # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding):       # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0]    # the annotated subtree
        if (subtree[0] != 'constant'):  # constants must resolve to constants

            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]       # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass    # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # r quotes a *raw string*
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
        STEP_PROGRAM END
        {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL] }}
        ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
        ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
        ( STATEMNT {{ proc.append(STATEMNT) }}

        ) *
        ENDPROC
        {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
        ( STATEMNT {{ iftree[1].append(STATEMNT) }}
        ) *
        THEN
        ( STATEMNT {{ iftree[2].append(STATEMNT) }}
        ) *
        OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
        {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
        ( STATEMNT {{ elsetree.append(STATEMNT) }}
        ) *
        ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
        ( STATEMNT {{ wtree[1].append(STATEMNT) }}
        ) *
        DO {{ __loopstack__.insert(0,DO) }}
        ( STATEMNT {{ wtree[2].append(STATEMNT) }}
        ) *
        ENDWHILE {{ del __loopstack__[0] }}
        {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
        ( STATEMNT {{ dtree[1].append(STATEMNT) }}
        ) *
        UNTIL
        ( STATEMNT {{ dtree[2].append(STATEMNT) }}
        ) *
        ENDUNTIL {{ return dtree }}

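# Illustration only, not part of the assignment source: a minimal standalone
# Python sketch of the loop-stack idea that STEP_WHILE uses and STEP_DOUNTIL
# is asked to use. The names LoopTracker, enter_loop, exit_loop and check_jump
# are hypothetical and exist only for this example; the grammar itself works
# on the module-level __loopstack__ and __errorlist__.

```python
class LoopTracker(object):
    """Track open loops so that break/continue can be validated."""

    def __init__(self):
        self.loopstack = []   # element 0 is the innermost open loop
        self.errors = []      # semantic errors, in order of detection

    def enter_loop(self, token):
        # Mirrors __loopstack__.insert(0, DO): push at the front.
        self.loopstack.insert(0, token)

    def exit_loop(self):
        # Mirrors del __loopstack__[0]: pop the innermost loop.
        del self.loopstack[0]

    def check_jump(self, keyword):
        # Mirrors __verify__loop__: a jump is legal only inside a loop.
        if not self.loopstack:
            self.errors.append("invalid " + keyword + " outside of a loop")


tracker = LoopTracker()
tracker.check_jump("break")       # no loop is open: an error is recorded
tracker.enter_loop("do")
tracker.check_jump("continue")    # inside a loop: nothing is recorded
tracker.exit_loop()
print(tracker.errors)
```

# Pushing and popping at index 0 keeps the innermost loop at the front, the
# same convention the comments above give for __procstack__.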

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
        {{ __add_to_symbol_table__(ctree) }}
        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
        {{ __add_to_symbol_table__(vtree) }}
        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
        {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message

    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
        {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
        {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not s.strip(): break     # an empty line ends the session
        print parse('goal', s)
    print 'Bye.'
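# Illustration only, not part of the assignment source: a standalone Python
# sketch of how a scoped symbol table like the one documented above can be
# searched, and how STEP integer lexemes map to ints. The names find_binding,
# to_int and the sample table are hypothetical; the table is passed in as a
# parameter instead of using the module-level __symboltable__, and the
# regular expressions are anchored with $, which the originals are not.

```python
import re


def find_binding(table, symbol, scope, single_scope=False):
    """Return the innermost [subtree, scope] entry for symbol, or None."""
    current = tuple(scope)
    while True:
        for entry in table.get(symbol, []):
            if entry[1] == current:
                return entry
        if not current or single_scope:
            return None             # global scope reached, or one scope only
        current = current[1:]       # drop the innermost proc name


def to_int(lexeme):
    """Convert a STEP integer lexeme (hex, octal or decimal) to an int."""
    if re.match(r'0[Xx][0-9a-fA-F]+$', lexeme):
        return int(lexeme, 16)
    if re.match(r'0[0-7]+$', lexeme):
        return int(lexeme, 8)
    return int(lexeme)


# A global constant 'size' and a variable 'count' local to proc 'main'.
table = {
    'size': [[['constant', ('symbol', 'size'), ('integer', '0x10'), 16], ()]],
    'count': [[['variable', ('symbol', 'count')], ('main',)]],
}

inside_main = find_binding(table, 'size', ('main',))      # found via global scope
only_main = find_binding(table, 'size', ('main',), True)  # None: not in 'main' itself
from_global = find_binding(table, 'count', ())            # None: local to 'main'
```

# The lookup walks outward from the innermost scope, so a symbol defined in a
# proc is invisible at global scope, while a global symbol stays visible
# inside a proc unless single_scope restricts the search.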

Page 9: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 10: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 11: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 12: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 13: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
    rule goal: {{ __initglobals__() }}
            STEP_PROGRAM END
            {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
            ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
            ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
            | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
            ( STATEMNT {{ proc.append(STATEMNT) }}
            ) *
            ENDPROC
            {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
            | INTEGER {{ return ('integer', INTEGER) }}
            | LITERALSTRING {{ return ('string', LITERALSTRING) }}
            | STEP_IF {{ return STEP_IF }}
            | STEP_WHILE {{ return STEP_WHILE }}
            | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
            | STEP_SWITCH {{ return STEP_SWITCH }}
            | STEP_DATA {{ return STEP_DATA }}
            | BREAK {{ return BREAK }}
            | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
            ( STATEMNT {{ iftree[1].append(STATEMNT) }}
            ) *
            THEN
            ( STATEMNT {{ iftree[2].append(STATEMNT) }}
            ) *
            OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
            {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
            ( STATEMNT {{ elsetree.append(STATEMNT) }}
            ) *
            ENDIF {{ return elsetree }}
            | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
            ( STATEMNT {{ wtree[1].append(STATEMNT) }}
            )*
            DO {{ __loopstack__.insert(0,DO) }}
            ( STATEMNT {{ wtree[2].append(STATEMNT) }}
            ) *
            ENDWHILE {{ del __loopstack__[0] }}
            {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
            ( STATEMNT {{ dtree[1].append(STATEMNT) }}
            ) *
            UNTIL
            ( STATEMNT {{ dtree[2].append(STATEMNT) }}
            ) *
            ENDUNTIL {{ return dtree }}

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
            ( STATEMNT {{ stree[1].append(STATEMNT) }}
            ) *
            CASE
            ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
                {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
            ) +
            ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
            | STEP_VAR {{ return STEP_VAR }}
            | STEP_ARRAY {{ return STEP_ARRAY }}
            | STEP_TABLE {{ return STEP_TABLE }}

    # THIS STEP IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
            {{ ctree = ['constant',(('symbol',SYMBOL))] }}
            SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
            {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
            {{ __add_to_symbol_table__(ctree) }}
            {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
            {{ vtree = ['variable',(('symbol',SYMBOL))] }}
            {{ __add_to_symbol_table__(vtree) }}
            {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
            {{ atree = ['array', (('symbol',SYMBOL))] }}
            SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
            {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
            {{ ttree = ['table',(('symbol',SYMBOL))] }}
            SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
            ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
                {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                {{ ttree[-1] = (ttree[-1], i) }}
            ) *
            ENDTABLE {{ __add_to_symbol_table__(ttree) }}
            {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
            | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
            | HEXINTEGER {{ return HEXINTEGER }}
            | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
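
The scoped lookup that __verify_proc__ is specified to perform can be sketched outside the grammar as follows. This is only an illustration under stated assumptions, not the assignment's reference solution: it is written in Python 3 rather than the Python 2 of the listing, the names symboltable, procstack, errorlist, find_binding and verify_proc are stand-ins for the double-underscore globals and helpers above, and the error message wording is a guess modeled on __find_constant__ (the wording the test driver expects is not given in the source).

```python
# Stand-ins (assumed names) for __symboltable__, __procstack__, __errorlist__.
symboltable = {}   # symbol name -> list of [subtree, scope] entries
procstack = []     # element 0 is the innermost proc; [] means global scope
errorlist = []     # semantic error messages, oldest first


def find_binding(symbol, scope):
    """Return the innermost [subtree, scope] entry visible from scope, else None."""
    myscope = tuple(scope)
    while True:
        for binding in symboltable.get(symbol, []):
            if binding[1] == myscope:
                return binding
        if not myscope:          # global scope already searched
            return None
        myscope = myscope[1:]    # pop out one level of scope


def verify_proc(type_lexeme_pair, client_type, client_name):
    """Return the proc's parse subtree, or [] after logging an error."""
    errorsuffix = (client_type + " " + client_name).strip()
    if type_lexeme_pair[0] != "symbol":
        errorlist.append("invalid " + type_lexeme_pair[0] + " "
                         + type_lexeme_pair[1] + " in " + errorsuffix)
        return []
    binding = find_binding(type_lexeme_pair[1], procstack)
    if binding is None:
        errorlist.append("undefined symbol " + type_lexeme_pair[1]
                         + " in " + errorsuffix)
        return []
    subtree = binding[0]
    if subtree[0] != "proc":     # bound, but to a data symbol
        errorlist.append("invalid " + subtree[0] + " " + subtree[1][1]
                         + " in " + errorsuffix)
        return []
    return subtree


# Small demonstration with one global proc and one global variable.
symboltable["helper"] = [[["proc", ("symbol", "helper")], ()]]
symboltable["count"] = [[["variable", ("symbol", "count")], ()]]
procstack.insert(0, "main")      # pretend we are parsing inside proc main
print(verify_proc(("symbol", "helper"), "switch", ""))
print(verify_proc(("symbol", "count"), "switch", ""))
print(verify_proc(("symbol", "nope"), "switch", ""))
print(errorlist)
```

The lookup walks from the innermost scope outward, so a switch inside proc main still sees procs defined at global scope, while a name bound only to a variable or left undefined produces one error message and an empty list.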

Page 17: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 18: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 19: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 20: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 21: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 22: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 23: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 7 of 8

323 # Assignment 3: Verify that each statement in SSTATEMNT is a name 324 # of a proc by invoking __verify_proc__(). 325 # THIS IS ALREADY DONE. It calls your __verify_proc__()!!! 326 rule STEP_SWITCH: SWITCH {{ stree = [‘switch’,[],[]] }} 327 ( STATEMNT {{ stree[1].append(STATEMNT) }} 328 ) * 329 CASE 330 ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }} 331 {{ __verify_proc__(SSTATEMNT,stree[0],””) }} 332 ) + 333 ENDCASE {{ return stree }} 334 335 rule SSTATEMNT: SYMBOL {{ return ((‘symbol’,SYMBOL)) }} 336 337 rule STEP_DATA: STEP_CONST {{ return STEP_CONST }} 338 | STEP_VAR {{ return STEP_VAR }} 339 | STEP_ARRAY {{ return STEP_ARRAY }} 340 | STEP_TABLE {{ return STEP_TABLE }} 341 342 # THIS DONE DONE! 343 # Assignment 3: Add a fourth element to the ‘constant’ subtree consisting 344 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 345 # resolve to an int, store an error message in __errorlist__ and pretend 346 # it was a 0, in order to avoid cascading error messages. 347 # Also add this constant to the the symbol table. 348 rule STEP_CONST: CONST SYMBOL 349 {{ ctree = [‘constant’,((‘symbol’,SYMBOL))] }} 350 SYM_OR_INT {{ ctree.append(SYM_OR_INT) }} 351 {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }} 352 {{ __add_to_symbol_table__(ctree) }} 353 {{ return ctree }} 354 355 # Assignment 3: Add this variable to the the symbol table. 356 rule STEP_VAR: VAR SYMBOL 357 {{ vtree = [‘variable’,((‘symbol’,SYMBOL))] }} 358 {{ __add_to_symbol_table__(vtree) }} 359 {{ return vtree }} 360 361 # STUDENT: Use STEP_CONST above as a guide for how to do this part. 362 # Assignment 3: Add a fourth element to the ‘array’ subtree consisting 363 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 364 # resolve to an int, store an error message in __errorlist__ and pretend 365 # it was a 0, in order to avoid cascading error messages. 366 # Also add this array to the the symbol table. 
367 rule STEP_ARRAY: ARRAY SYMBOL 368 {{ atree = [‘array’, ((‘symbol’,SYMBOL))] }} 369 SYM_OR_INT {{ atree.append(SYM_OR_INT) }} 370 {{ return atree }} 371 372 # THIS STEP IS DONE. 373 # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a 374 # 2-tuple consisting of the original SYM_OR_INT, followed by its int 375 # value. If the symbol does not resolve to an int, store an error message

Page 24: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 8 of 8

376 # in __errorlist__ and pretend it was a 0, in order to avoid cascading 377 # error messages. 378 # Also add this table to the the symbol table. 379 rule STEP_TABLE: TABLE SYMBOL 380 {{ ttree = [‘table’,((‘symbol’,SYMBOL))] }} 381 SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 382 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 383 {{ ttree[-1] = (ttree[-1], i) }} 384 ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 385 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 386 {{ ttree[-1] = (ttree[-1], i) }} 387 ) * 388 ENDTABLE {{ __add_to_symbol_table__(ttree) }} 389 {{ return ttree }} 390 391 392 rule SYM_OR_INT: SYMBOL {{ return ((‘symbol’, SYMBOL)) }} 393 | INTEGER {{ return ((‘integer’, INTEGER)) }} 394 395 rule INTEGER: DECINTEGER {{ return DECINTEGER }} 396 | HEXINTEGER {{ return HEXINTEGER }} 397 | OCTINTEGER {{ return OCTINTEGER }} 398 399 %% 400 401 if __name__==’__main__’: 402 print ‘Test driver for STEP program parser.’ 403 while 1: 404 try: s = raw_input(‘>>> ‘) 405 except EOFError: break 406 if not strip(s): break 407 print parse(‘goal’, s) 408 print ‘Bye.’ 409

Page 25: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 26: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 27: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope)   # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:]   # pop out a level of scope
            else:
                myscope = None  # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding):   # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0]    # the annotated subtree
        if (subtree[0] != 'constant'):  # constants must resolve to constants

            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]   # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass    # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
            # The r prefix quotes a *raw* string.
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3: it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal:  {{ __initglobals__() }}
                STEP_PROGRAM END
                {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM:  STEP_GLOBAL     {{ program = ['program', STEP_GLOBAL] }}
                        ( STEP_GLOBAL   {{ program.append(STEP_GLOBAL) }}
                        ) *             {{ return program }}

    rule STEP_GLOBAL:   STEP_PROC       {{ return STEP_PROC }}
                    |   STEP_DATA       {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL     {{ proc = ['proc',('symbol',SYMBOL)] }}
                    ( STATEMNT      {{ proc.append(STATEMNT) }}

                    ) *
                    ENDPROC
                                    {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT:  SYMBOL          {{ return ('symbol', SYMBOL) }}
                |   INTEGER         {{ return ('integer', INTEGER) }}
                |   LITERALSTRING   {{ return ('string', LITERALSTRING) }}
                |   STEP_IF         {{ return STEP_IF }}
                |   STEP_WHILE      {{ return STEP_WHILE }}
                |   STEP_DOUNTIL    {{ return STEP_DOUNTIL }}
                |   STEP_SWITCH     {{ return STEP_SWITCH }}
                |   STEP_DATA       {{ return STEP_DATA }}
                |   BREAK           {{ return BREAK }}
                |   CONTINUE        {{ return CONTINUE }}

    rule STEP_IF:   IF              {{ iftree = ['if', [], [], []] }}
                    ( STATEMNT      {{ iftree[1].append(STATEMNT) }}
                    ) *
                    THEN
                    ( STATEMNT      {{ iftree[2].append(STATEMNT) }}
                    ) *
                    OPTIONAL_ELSE   {{ iftree[3] = OPTIONAL_ELSE }}
                                    {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE        {{ elsetree = [] }}
                    ( STATEMNT      {{ elsetree.append(STATEMNT) }}
                    ) *
                    ENDIF           {{ return elsetree }}
                |   ENDIF           {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE:    WHILE       {{ wtree = ['while', [], []] }}
                    ( STATEMNT      {{ wtree[1].append(STATEMNT) }}
                    ) *
                    DO              {{ __loopstack__.insert(0,DO) }}
                    ( STATEMNT      {{ wtree[2].append(STATEMNT) }}
                    ) *
                    ENDWHILE        {{ del __loopstack__[0] }}
                                    {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL:  DOUNTIL     {{ dtree = ['dountil', [], []] }}
                    ( STATEMNT      {{ dtree[1].append(STATEMNT) }}
                    ) *
                    UNTIL
                    ( STATEMNT      {{ dtree[2].append(STATEMNT) }}
                    ) *
                    ENDUNTIL        {{ return dtree }}

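The loop-stack discipline used by STEP_WHILE above (push at index 0 when a loop opens, delete index 0 when it closes, and report break or continue when the stack is empty) can be tried outside the grammar. The following is only a stand-alone sketch: the names loopstack, errorlist, enter_loop, leave_loop and verify_loop are hypothetical stand-ins, not part of parsestep.g, and the code is written so that it also runs under Python 3.

```python
# Hypothetical stand-alone model of the loop-stack check. These names are
# illustrative only and do not appear in parsestep.g.
loopstack = []   # element 0 is the innermost open loop
errorlist = []   # semantic errors, in order of detection

def enter_loop(token):
    """Push a loop-opening token (e.g. 'do' or 'dountil') at index 0."""
    loopstack.insert(0, token)

def leave_loop():
    """Pop the innermost loop once its closing token has been parsed."""
    del loopstack[0]

def verify_loop(break_or_continue):
    """Record an error when break/continue appears outside every loop."""
    if len(loopstack) == 0:
        errorlist.append("invalid " + break_or_continue + " outside of a loop")

verify_loop("break")      # no loop is open, so an error is recorded
enter_loop("dountil")
verify_loop("continue")   # a loop is open, so nothing is recorded
leave_loop()
print(errorlist)
```

Only the first check adds a message, because the second one runs while a loop is still open.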

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH:   SWITCH      {{ stree = ['switch',[],[]] }}
                    ( STATEMNT      {{ stree[1].append(STATEMNT) }}
                    ) *
                    CASE
                    ( SSTATEMNT     {{ stree[2].append(SSTATEMNT) }}
                                    {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
                    ) +
                    ENDCASE         {{ return stree }}

    rule SSTATEMNT: SYMBOL          {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST      {{ return STEP_CONST }}
                |   STEP_VAR        {{ return STEP_VAR }}
                |   STEP_ARRAY      {{ return STEP_ARRAY }}
                |   STEP_TABLE      {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST:    CONST SYMBOL
                    {{ ctree = ['constant',(('symbol',SYMBOL))] }}
                    SYM_OR_INT  {{ ctree.append(SYM_OR_INT) }}
                    {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
                    {{ __add_to_symbol_table__(ctree) }}
                    {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR:  VAR SYMBOL
                    {{ vtree = ['variable',(('symbol',SYMBOL))] }}
                    {{ __add_to_symbol_table__(vtree) }}
                    {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY:    ARRAY SYMBOL
                    {{ atree = ['array', (('symbol',SYMBOL))] }}
                    SYM_OR_INT  {{ atree.append(SYM_OR_INT) }}
                    {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message

    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE:    TABLE SYMBOL
                    {{ ttree = ['table',(('symbol',SYMBOL))] }}
                    SYM_OR_INT  {{ ttree.append(SYM_OR_INT) }}
                    {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                    {{ ttree[-1] = (ttree[-1], i) }}
                    ( SYM_OR_INT    {{ ttree.append(SYM_OR_INT) }}
                    {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                    {{ ttree[-1] = (ttree[-1], i) }}
                    ) *
                    ENDTABLE    {{ __add_to_symbol_table__(ttree) }}
                                {{ return ttree }}


    rule SYM_OR_INT:    SYMBOL      {{ return (('symbol', SYMBOL)) }}
                    |   INTEGER     {{ return (('integer', INTEGER)) }}

    rule INTEGER:   DECINTEGER      {{ return DECINTEGER }}
                |   HEXINTEGER      {{ return HEXINTEGER }}
                |   OCTINTEGER      {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
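
The scope rule stated in the comments (a lookup starts at the innermost scope and moves outward, and symbols in non-enclosing scopes are invisible) can be illustrated with a small stand-alone model of the symbol table. This is a sketch under assumptions, not the assignment's code: symboltable, add_binding, find_binding and the sample symbols size and tmp are made up for illustration, each entry mirrors the [subtree, scope] layout described in the listing, and dict.get replaces has_key so the sketch also runs under Python 3.

```python
# Illustrative model only: the table, helpers and symbol names are made up.
symboltable = {}   # NAME -> list of [subtree, scope] entries

def add_binding(name, kind, scope):
    """Store a minimal subtree [kind, ('symbol', name)] with its scope tuple."""
    entry = [[kind, ('symbol', name)], tuple(scope)]
    symboltable.setdefault(name, []).append(entry)

def find_binding(symbol, scope, single_scope=False):
    """Search from the innermost scope outward; return the entry or None."""
    myscope = tuple(scope)
    while myscope is not None:
        for binding in symboltable.get(symbol, []):
            if binding[1] == myscope:
                return binding
        if myscope and not single_scope:
            myscope = myscope[1:]   # step out one level of scope
        else:
            myscope = None          # nothing left to search
    return None

add_binding('size', 'constant', ())          # global definition
add_binding('size', 'variable', ('main',))   # shadows it inside proc main
add_binding('tmp', 'variable', ('helper',))  # local to proc helper

print(find_binding('size', ['main'])[0][0])     # the local entry wins
print(find_binding('size', ['helper'])[0][0])   # falls back to global scope
print(find_binding('tmp', ['main']))            # not visible from main
print(find_binding('size', ['helper'], single_scope=True))
```

The last call shows why the redefinition check passes singleScope=True: a global size does not count as a duplicate of a symbol about to be defined inside helper.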

Page 33: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 34: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 35: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 36: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 37: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 38: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 39: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 7 of 8

323 # Assignment 3: Verify that each statement in SSTATEMNT is a name 324 # of a proc by invoking __verify_proc__(). 325 # THIS IS ALREADY DONE. It calls your __verify_proc__()!!! 326 rule STEP_SWITCH: SWITCH {{ stree = [‘switch’,[],[]] }} 327 ( STATEMNT {{ stree[1].append(STATEMNT) }} 328 ) * 329 CASE 330 ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }} 331 {{ __verify_proc__(SSTATEMNT,stree[0],””) }} 332 ) + 333 ENDCASE {{ return stree }} 334 335 rule SSTATEMNT: SYMBOL {{ return ((‘symbol’,SYMBOL)) }} 336 337 rule STEP_DATA: STEP_CONST {{ return STEP_CONST }} 338 | STEP_VAR {{ return STEP_VAR }} 339 | STEP_ARRAY {{ return STEP_ARRAY }} 340 | STEP_TABLE {{ return STEP_TABLE }} 341 342 # THIS DONE DONE! 343 # Assignment 3: Add a fourth element to the ‘constant’ subtree consisting 344 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 345 # resolve to an int, store an error message in __errorlist__ and pretend 346 # it was a 0, in order to avoid cascading error messages. 347 # Also add this constant to the the symbol table. 348 rule STEP_CONST: CONST SYMBOL 349 {{ ctree = [‘constant’,((‘symbol’,SYMBOL))] }} 350 SYM_OR_INT {{ ctree.append(SYM_OR_INT) }} 351 {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }} 352 {{ __add_to_symbol_table__(ctree) }} 353 {{ return ctree }} 354 355 # Assignment 3: Add this variable to the the symbol table. 356 rule STEP_VAR: VAR SYMBOL 357 {{ vtree = [‘variable’,((‘symbol’,SYMBOL))] }} 358 {{ __add_to_symbol_table__(vtree) }} 359 {{ return vtree }} 360 361 # STUDENT: Use STEP_CONST above as a guide for how to do this part. 362 # Assignment 3: Add a fourth element to the ‘array’ subtree consisting 363 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 364 # resolve to an int, store an error message in __errorlist__ and pretend 365 # it was a 0, in order to avoid cascading error messages. 366 # Also add this array to the the symbol table. 
367 rule STEP_ARRAY: ARRAY SYMBOL 368 {{ atree = [‘array’, ((‘symbol’,SYMBOL))] }} 369 SYM_OR_INT {{ atree.append(SYM_OR_INT) }} 370 {{ return atree }} 371 372 # THIS STEP IS DONE. 373 # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a 374 # 2-tuple consisting of the original SYM_OR_INT, followed by its int 375 # value. If the symbol does not resolve to an int, store an error message

Page 40: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 8 of 8

376 # in __errorlist__ and pretend it was a 0, in order to avoid cascading 377 # error messages. 378 # Also add this table to the the symbol table. 379 rule STEP_TABLE: TABLE SYMBOL 380 {{ ttree = [‘table’,((‘symbol’,SYMBOL))] }} 381 SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 382 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 383 {{ ttree[-1] = (ttree[-1], i) }} 384 ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 385 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 386 {{ ttree[-1] = (ttree[-1], i) }} 387 ) * 388 ENDTABLE {{ __add_to_symbol_table__(ttree) }} 389 {{ return ttree }} 390 391 392 rule SYM_OR_INT: SYMBOL {{ return ((‘symbol’, SYMBOL)) }} 393 | INTEGER {{ return ((‘integer’, INTEGER)) }} 394 395 rule INTEGER: DECINTEGER {{ return DECINTEGER }} 396 | HEXINTEGER {{ return HEXINTEGER }} 397 | OCTINTEGER {{ return OCTINTEGER }} 398 399 %% 400 401 if __name__==’__main__’: 402 print ‘Test driver for STEP program parser.’ 403 while 1: 404 try: s = raw_input(‘>>> ‘) 405 except EOFError: break 406 if not strip(s): break 407 print parse(‘goal’, s) 408 print ‘Bye.’ 409

Page 41: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
# option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. Each entry describes the proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, constant, variable, array or
# table object. The first field of the parse subtree holds the type of the
# symbol (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.

#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME,) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {} # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = [] # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = [] # element 0 is the innermost loop being compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1] # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))

    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope) # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:] # pop out a level of scope
            else:
                myscope = None # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding): # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0] # the annotated subtree
        if (subtree[0] != 'constant'): # constants must resolve to constants

            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3] # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # r'...' quotes a *raw* string
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree.
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
            STEP_PROGRAM END
            {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
            ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
            ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
            | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
            ( STATEMNT {{ proc.append(STATEMNT) }}

            ) *
            ENDPROC
            {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
            | INTEGER {{ return ('integer', INTEGER) }}
            | LITERALSTRING {{ return ('string', LITERALSTRING) }}
            | STEP_IF {{ return STEP_IF }}
            | STEP_WHILE {{ return STEP_WHILE }}
            | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
            | STEP_SWITCH {{ return STEP_SWITCH }}
            | STEP_DATA {{ return STEP_DATA }}
            | BREAK {{ return BREAK }}
            | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
            ( STATEMNT {{ iftree[1].append(STATEMNT) }}
            ) *
            THEN
            ( STATEMNT {{ iftree[2].append(STATEMNT) }}
            ) *
            OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
            {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
            ( STATEMNT {{ elsetree.append(STATEMNT) }}
            ) *
            ENDIF {{ return elsetree }}
            | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
            ( STATEMNT {{ wtree[1].append(STATEMNT) }}
            )*
            DO {{ __loopstack__.insert(0,DO) }}
            ( STATEMNT {{ wtree[2].append(STATEMNT) }}
            ) *
            ENDWHILE {{ del __loopstack__[0] }}
            {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
            ( STATEMNT {{ dtree[1].append(STATEMNT) }}
            ) *
            UNTIL
            ( STATEMNT {{ dtree[2].append(STATEMNT) }}
            ) *
            ENDUNTIL {{ return dtree }}


    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
            ( STATEMNT {{ stree[1].append(STATEMNT) }}
            ) *
            CASE
            ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
            ) +
            ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
            | STEP_VAR {{ return STEP_VAR }}
            | STEP_ARRAY {{ return STEP_ARRAY }}
            | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
            {{ ctree = ['constant',(('symbol',SYMBOL))] }}
            SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
            {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
            {{ __add_to_symbol_table__(ctree) }}
            {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
            {{ vtree = ['variable',(('symbol',SYMBOL))] }}
            {{ __add_to_symbol_table__(vtree) }}
            {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
            {{ atree = ['array', (('symbol',SYMBOL))] }}
            SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
            {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message

    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
            {{ ttree = ['table',(('symbol',SYMBOL))] }}
            SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
            ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
            ) *
            ENDTABLE {{ __add_to_symbol_table__(ttree) }}
            {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
            | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
            | HEXINTEGER {{ return HEXINTEGER }}
            | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
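
The scoped lookup that the symbol-table comments describe can be tried outside the grammar. The following is only a rough, standalone sketch and is not part of parsestep.g: the names find_binding and add_symbol are hypothetical, the table and error list are passed as arguments where the handout uses module globals, and dict.get / dict.setdefault stand in for the has_key calls so that the sketch also runs under Python 3. It assumes the same entry layout as the handout, namely a [subtree, scope] pair with scope stored as a tuple whose element 0 is the innermost proc.

```python
# Illustrative sketch; function and variable names are hypothetical.

def find_binding(table, symbol, scope, single_scope=False):
    """Return the innermost [subtree, scope] binding visible from scope, else None."""
    bindings = table.get(symbol, [])
    current = tuple(scope)
    while True:
        for binding in bindings:
            if binding[1] == current:
                return binding
        if not current or single_scope:
            return None        # global scope already checked, or outer scopes excluded
        current = current[1:]  # drop the innermost proc name and retry one level out


def add_symbol(table, errors, subtree, procstack):
    """Record subtree under its name, or log an error if the scope already defines it."""
    name = subtree[1][1]
    scope = tuple(procstack)
    if find_binding(table, name, scope, single_scope=True):
        errors.append("symbol " + name
                      + " is multiply defined at scope " + str(scope))
    else:
        table.setdefault(name, []).append([subtree, scope])


table, errors = {}, []
# A global constant, then a variable of the same name inside proc 'main'.
add_symbol(table, errors, ['constant', ('symbol', 'size'), ('integer', '8'), 8], [])
add_symbol(table, errors, ['variable', ('symbol', 'size')], ['main'])
# Defining it a second time inside 'main' is the error case.
add_symbol(table, errors, ['variable', ('symbol', 'size')], ['main'])

inner = find_binding(table, 'size', ('main',))   # sees the proc-local variable
outer = find_binding(table, 'size', ('other',))  # falls back to the global constant
print(inner[0][0], outer[0][0], len(errors))
```

Passing the table and error list explicitly keeps the sketch easy to test in isolation; the handout's own functions use the module-level __symboltable__ and __errorlist__, which __initglobals__() resets before each parse.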

Page 49: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 50: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 51: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 52: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 53: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 54: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 55: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
# option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. The entry defines a proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, variable, array or table
# object. The first field of the parse subtree holds the type of the symbol
# (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.
#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {}    # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = []  # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = []  # element 0 is the innermost loop-being-compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1]  # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))
    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope)   # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope:     # it is non-empty
                myscope = myscope[1:]           # pop out a level of scope
            else:
                myscope = None      # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding):   # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0]    # the annotated subtree
        if (subtree[0] != 'constant'):  # constants must resolve to constants
            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]   # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'
    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
        # 'r' quotes a *raw* string
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree.
    # element [1] It now also returns a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values,
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
        STEP_PROGRAM END
            {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
        ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
        ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
        ( STATEMNT {{ proc.append(STATEMNT) }}
        ) *
        ENDPROC
            {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
        ( STATEMNT {{ iftree[1].append(STATEMNT) }}
        ) *
        THEN
        ( STATEMNT {{ iftree[2].append(STATEMNT) }}
        ) *
        OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
            {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
        ( STATEMNT {{ elsetree.append(STATEMNT) }}
        ) *
        ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
        ( STATEMNT {{ wtree[1].append(STATEMNT) }}
        ) *
        DO {{ __loopstack__.insert(0,DO) }}
        ( STATEMNT {{ wtree[2].append(STATEMNT) }}
        ) *
        ENDWHILE {{ del __loopstack__[0] }}
            {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
        ( STATEMNT {{ dtree[1].append(STATEMNT) }}
        ) *
        UNTIL
        ( STATEMNT {{ dtree[2].append(STATEMNT) }}
        ) *
        ENDUNTIL {{ return dtree }}
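The loop-stack discipline used by STEP_WHILE above (push at index 0 once DO is parsed, delete index 0 once ENDWHILE is parsed, and report a BREAK or CONTINUE seen while the stack is empty) can be tried outside the grammar. What follows is only a minimal standalone sketch, not part of parsestep.g: the names loopstack, errorlist, verify_loop and check_tokens are hypothetical, and a flat token list stands in for the YAPPS2 parser.

```python
# Standalone sketch of the loop-stack check. All names are illustrative
# assumptions and do not appear in parsestep.g.
loopstack = []   # element 0 is the innermost open loop
errorlist = []   # semantic errors, in order of detection

def verify_loop(break_or_continue):
    """Record an error if no loop is currently open."""
    if len(loopstack) == 0:
        errorlist.append("invalid " + break_or_continue
                         + " outside of a loop")

def check_tokens(tokens):
    """Walk a flat token list, mimicking the grammar's push/pop actions."""
    del loopstack[:]
    del errorlist[:]
    for tok in tokens:
        if tok == 'do':
            loopstack.insert(0, tok)      # push at the front
        elif tok == 'endwhile':
            if loopstack:
                del loopstack[0]          # pop the innermost loop
        elif tok in ('break', 'continue'):
            verify_loop(tok)
    return list(errorlist)
```

For example, check_tokens(['while', 'do', 'break', 'endwhile', 'continue']) flags only the trailing continue, because the stack is empty again by then. Unlike the real parser, this sketch does not reject an unmatched endwhile; it only models the stack.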

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
            {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
            {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
            {{ __add_to_symbol_table__(ctree) }}
            {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
            {{ vtree = ['variable',(('symbol',SYMBOL))] }}
            {{ __add_to_symbol_table__(vtree) }}
            {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
            {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
            {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
            {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
            {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
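The scoped lookup described in the symbol table comments and implemented by __find_binding__ can be exercised on its own. The following is a minimal standalone sketch under stated assumptions, not the assignment's code: the names symboltable, add_symbol and find_binding are hypothetical stand-ins, add_symbol omits the multiple-definition check that __add_to_symbol_table__ performs, and the procs 'p' and 'q' exist only for the example.

```python
# Standalone sketch of scoped symbol lookup. All names are illustrative
# assumptions and do not appear in parsestep.g.
symboltable = {}   # NAME -> list of [subtree, scope] entries

def add_symbol(subtree, scope):
    """Store subtree under its symbol name for the given scope tuple."""
    key = subtree[1][1]   # ('symbol', NAME) is the second field
    symboltable.setdefault(key, []).append([subtree, tuple(scope)])

def find_binding(symbol, scope, single_scope=False):
    """Return the innermost binding visible from scope, else None."""
    myscope = tuple(scope)
    while myscope is not None:
        for binding in symboltable.get(symbol, []):
            if binding[1] == myscope:
                return binding
        if myscope and not single_scope:
            myscope = myscope[1:]    # drop the innermost proc, look outward
        else:
            myscope = None           # nothing left to search
    return None

# A global constant and a same-named variable local to a proc 'p'.
add_symbol(['constant', ('symbol', 'n'), ('integer', '0x10'), 16], ())
add_symbol(['variable', ('symbol', 'n')], ('p',))
```

Looking up 'n' from scope ('p',) finds the local variable, while a lookup from an unrelated proc scope such as ('q',) falls back to the global constant. With single_scope=True that second lookup returns None, which is the mode the redefinition check relies on.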

Page 65: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 66: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 67: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 68: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 69: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
        # The r prefix quotes a *raw* string.
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3: it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of the semantic errors detected,
    # in order of creation (a deepcopy of __errorlist__).
    # At present the possible errors are A) multiple definitions of a
    # proc/data symbol within a scope, B) break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL block,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants, and
    # D) SSTATEMNTs between CASE and ENDCASE that are not previously
    # defined proc names available within the scope of the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
        STEP_PROGRAM END
        {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
        ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
        ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
        ( STATEMNT {{ proc.append(STATEMNT) }}
        ) *
        ENDPROC
        {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
        ( STATEMNT {{ iftree[1].append(STATEMNT) }}
        ) *
        THEN
        ( STATEMNT {{ iftree[2].append(STATEMNT) }}
        ) *
        OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
        {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
        ( STATEMNT {{ elsetree.append(STATEMNT) }}
        ) *
        ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
        ( STATEMNT {{ wtree[1].append(STATEMNT) }}
        ) *
        DO {{ __loopstack__.insert(0,DO) }}
        ( STATEMNT {{ wtree[2].append(STATEMNT) }}
        ) *
        ENDWHILE {{ del __loopstack__[0] }}
        {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
        ( STATEMNT {{ dtree[1].append(STATEMNT) }}
        ) *
        UNTIL
        ( STATEMNT {{ dtree[2].append(STATEMNT) }}
        ) *
        ENDUNTIL {{ return dtree }}
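# The loop-stack bookkeeping described in the comments above can be modelled
# outside the grammar. The sketch below is only an illustration in plain
# Python: check_loops, OPENERS and CLOSERS are hypothetical names that do not
# appear in parsestep.g, and the input is assumed to be a flat list of
# lexemes rather than a YAPPS2 token stream. The error text mirrors the
# message built by __verify__loop__.

```python
# Hypothetical stand-alone model of the loop stack; not part of parsestep.g.
OPENERS = {"do", "dountil"}        # tokens pushed when a loop body opens
CLOSERS = {"endwhile", "until"}    # tokens the comments say pop the stack


def check_loops(tokens):
    """Return the errors produced by break/continue outside any open loop."""
    loopstack = []   # element 0 is the innermost open loop
    errors = []
    for tok in tokens:
        if tok in OPENERS:
            loopstack.insert(0, tok)       # push at the front of the list
        elif tok in CLOSERS:
            if loopstack:
                del loopstack[0]           # pop the innermost loop
        elif tok in ("break", "continue"):
            if len(loopstack) == 0:
                errors.append("invalid " + tok + " outside of a loop")
    return errors


if __name__ == "__main__":
    print(check_loops("while x do break endwhile continue".split()))
```

# In this model a break between DO and ENDWHILE is accepted, while one that
# follows ENDWHILE is reported. Because the stack is popped at UNTIL, a break
# placed between UNTIL and ENDUNTIL is also reported.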

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
                    {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
        {{ __add_to_symbol_table__(ctree) }}
        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
        {{ __add_to_symbol_table__(vtree) }}
        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
        {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
        {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
        {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
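# The INTEGER rule above returns the lexeme as a string, and STEP_TABLE pairs
# each element with its int value, substituting 0 when a symbol does not
# resolve. The sketch below illustrates that annotation step in stand-alone
# Python 3. It is an approximation under stated assumptions: to_int and
# annotate_table are hypothetical helpers, the constants argument is a plain
# dict standing in for the symbol table, and scope is ignored entirely.
# Only the two regular expressions are taken from the token definitions.

```python
import re

# Patterns copied from the HEXINTEGER and OCTINTEGER tokens of the grammar.
HEXINTEGER = r'0[Xx][0-9a-fA-F]+'
OCTINTEGER = r'0[0-7][0-7]*'


def to_int(lexeme):
    """Convert an INTEGER lexeme to int, choosing the base from its prefix."""
    if re.match(HEXINTEGER, lexeme):
        return int(lexeme, 16)
    if re.match(OCTINTEGER, lexeme):
        return int(lexeme, 8)
    return int(lexeme)


def annotate_table(elements, constants, errors):
    """Pair each (kind, lexeme) element with its int value, using 0 on error.

    constants is a hypothetical {name: int} dict, not the scoped
    __symboltable__ of the assignment.
    """
    annotated = []
    for kind, lexeme in elements:
        if kind == 'integer':
            value = to_int(lexeme)
        elif lexeme in constants:
            value = constants[lexeme]
        else:
            errors.append("undefined symbol " + lexeme)
            value = 0   # placeholder value avoids cascading error messages
        annotated.append(((kind, lexeme), value))
    return annotated


if __name__ == "__main__":
    errs = []
    table = [('integer', '0x10'), ('symbol', 'N'), ('symbol', 'missing')]
    print(annotate_table(table, {'N': 3}, errs))
    print(errs)
```

# Recording the error once and continuing with 0 lets the remaining table
# elements be checked in the same pass.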

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 82: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 83: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants


            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]   # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'


    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # 'r' quotes a *raw* string
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree.
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values,
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal:  {{ __initglobals__() }}
            STEP_PROGRAM END
            {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL  {{ program = ['program', STEP_GLOBAL]}}
            ( STEP_GLOBAL           {{ program.append(STEP_GLOBAL) }}
            ) *                     {{ return program }}

    rule STEP_GLOBAL: STEP_PROC     {{ return STEP_PROC }}
            | STEP_DATA             {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL     {{ proc = ['proc',('symbol',SYMBOL)] }}
            ( STATEMNT              {{ proc.append(STATEMNT) }}


            ) *
            ENDPROC
            {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL           {{ return ('symbol', SYMBOL) }}
            | INTEGER               {{ return ('integer', INTEGER) }}
            | LITERALSTRING         {{ return ('string', LITERALSTRING) }}
            | STEP_IF               {{ return STEP_IF }}
            | STEP_WHILE            {{ return STEP_WHILE }}
            | STEP_DOUNTIL          {{ return STEP_DOUNTIL }}
            | STEP_SWITCH           {{ return STEP_SWITCH }}
            | STEP_DATA             {{ return STEP_DATA }}
            | BREAK                 {{ return BREAK }}
            | CONTINUE              {{ return CONTINUE }}

    rule STEP_IF: IF                {{ iftree = ['if', [], [], []] }}
            ( STATEMNT              {{ iftree[1].append(STATEMNT) }}
            ) *
            THEN
            ( STATEMNT              {{ iftree[2].append(STATEMNT) }}
            ) *
            OPTIONAL_ELSE           {{ iftree[3] = OPTIONAL_ELSE }}
            {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE        {{ elsetree = [] }}
            ( STATEMNT              {{ elsetree.append(STATEMNT) }}
            ) *
            ENDIF                   {{ return elsetree }}
            | ENDIF                 {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE          {{ wtree = ['while', [], []] }}
            ( STATEMNT              {{ wtree[1].append(STATEMNT) }}
            )*
            DO                      {{ __loopstack__.insert(0,DO) }}
            ( STATEMNT              {{ wtree[2].append(STATEMNT) }}
            ) *
            ENDWHILE                {{ del __loopstack__[0] }}
            {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL      {{ dtree = ['dountil', [], []] }}
            ( STATEMNT              {{ dtree[1].append(STATEMNT) }}
            ) *
            UNTIL
            ( STATEMNT              {{ dtree[2].append(STATEMNT) }}
            ) *
            ENDUNTIL                {{ return dtree }}



    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH        {{ stree = ['switch',[],[]] }}
            ( STATEMNT              {{ stree[1].append(STATEMNT) }}
            ) *
            CASE
            ( SSTATEMNT             {{ stree[2].append(SSTATEMNT) }}
                                    {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
            ) +
            ENDCASE                 {{ return stree }}

    rule SSTATEMNT: SYMBOL          {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST      {{ return STEP_CONST }}
            | STEP_VAR              {{ return STEP_VAR }}
            | STEP_ARRAY            {{ return STEP_ARRAY }}
            | STEP_TABLE            {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
            {{ ctree = ['constant',(('symbol',SYMBOL))] }}
            SYM_OR_INT              {{ ctree.append(SYM_OR_INT) }}
            {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
            {{ __add_to_symbol_table__(ctree) }}
            {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
            {{ vtree = ['variable',(('symbol',SYMBOL))] }}
            {{ __add_to_symbol_table__(vtree) }}
            {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
            {{ atree = ['array', (('symbol',SYMBOL))] }}
            SYM_OR_INT              {{ atree.append(SYM_OR_INT) }}
            {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message


    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
            {{ ttree = ['table',(('symbol',SYMBOL))] }}
            SYM_OR_INT              {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
            ( SYM_OR_INT            {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
            ) *
            ENDTABLE                {{ __add_to_symbol_table__(ttree) }}
            {{ return ttree }}


    rule SYM_OR_INT: SYMBOL         {{ return (('symbol', SYMBOL)) }}
            | INTEGER               {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER        {{ return DECINTEGER }}
            | HEXINTEGER            {{ return HEXINTEGER }}
            | OCTINTEGER            {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
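
The scope-chain lookup that __find_binding__ and __add_to_symbol_table__ perform can be tried outside the grammar. What follows is only a minimal stand-alone sketch in Python 3, not part of parsestep.g and not the course's code. The names add_symbol and find_binding, the (kind, scope) entry layout, and the sample symbols are hypothetical. It assumes, as the comments in the listing describe, that a scope is a tuple of proc names with the innermost proc first and () standing for global scope.

```python
# Hypothetical sketch of scoped symbol lookup (Python 3, illustration only).
# A scope is a tuple of proc names, innermost first; () is global scope.

def add_symbol(table, errors, name, kind, scope):
    """Record (kind, scope) under name, or log a duplicate in the same scope."""
    scope = tuple(scope)
    bindings = table.setdefault(name, [])
    if any(bound_scope == scope for _, bound_scope in bindings):
        errors.append("symbol " + name
                      + " is multiply defined at scope " + str(scope))
    else:
        bindings.append((kind, scope))

def find_binding(table, name, scope, single_scope=False):
    """Return the innermost (kind, scope) binding visible from scope, else None."""
    scope = tuple(scope)
    while True:
        for binding in table.get(name, []):
            if binding[1] == scope:
                return binding
        if not scope or single_scope:
            # Global scope was just checked, or only one scope was requested.
            return None
        scope = scope[1:]   # drop the innermost proc and retry one level out

table, errors = {}, []
add_symbol(table, errors, "limit", "constant", ())          # global constant
add_symbol(table, errors, "limit", "variable", ("main",))   # shadows it in main
add_symbol(table, errors, "limit", "array", ("main",))      # duplicate in main
```

A lookup from inside main finds the local variable first, a lookup from any other proc falls back to the global constant, and a single-scope lookup (the form used for the duplicate check) never falls back. The third add_symbol call leaves one message in errors instead of adding a second binding.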


Page 91: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 92: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 93: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 94: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 95: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 7 of 8

323 # Assignment 3: Verify that each statement in SSTATEMNT is a name 324 # of a proc by invoking __verify_proc__(). 325 # THIS IS ALREADY DONE. It calls your __verify_proc__()!!! 326 rule STEP_SWITCH: SWITCH {{ stree = [‘switch’,[],[]] }} 327 ( STATEMNT {{ stree[1].append(STATEMNT) }} 328 ) * 329 CASE 330 ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }} 331 {{ __verify_proc__(SSTATEMNT,stree[0],””) }} 332 ) + 333 ENDCASE {{ return stree }} 334 335 rule SSTATEMNT: SYMBOL {{ return ((‘symbol’,SYMBOL)) }} 336 337 rule STEP_DATA: STEP_CONST {{ return STEP_CONST }} 338 | STEP_VAR {{ return STEP_VAR }} 339 | STEP_ARRAY {{ return STEP_ARRAY }} 340 | STEP_TABLE {{ return STEP_TABLE }} 341 342 # THIS DONE DONE! 343 # Assignment 3: Add a fourth element to the ‘constant’ subtree consisting 344 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 345 # resolve to an int, store an error message in __errorlist__ and pretend 346 # it was a 0, in order to avoid cascading error messages. 347 # Also add this constant to the the symbol table. 348 rule STEP_CONST: CONST SYMBOL 349 {{ ctree = [‘constant’,((‘symbol’,SYMBOL))] }} 350 SYM_OR_INT {{ ctree.append(SYM_OR_INT) }} 351 {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }} 352 {{ __add_to_symbol_table__(ctree) }} 353 {{ return ctree }} 354 355 # Assignment 3: Add this variable to the the symbol table. 356 rule STEP_VAR: VAR SYMBOL 357 {{ vtree = [‘variable’,((‘symbol’,SYMBOL))] }} 358 {{ __add_to_symbol_table__(vtree) }} 359 {{ return vtree }} 360 361 # STUDENT: Use STEP_CONST above as a guide for how to do this part. 362 # Assignment 3: Add a fourth element to the ‘array’ subtree consisting 363 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 364 # resolve to an int, store an error message in __errorlist__ and pretend 365 # it was a 0, in order to avoid cascading error messages. 366 # Also add this array to the the symbol table. 
367 rule STEP_ARRAY: ARRAY SYMBOL 368 {{ atree = [‘array’, ((‘symbol’,SYMBOL))] }} 369 SYM_OR_INT {{ atree.append(SYM_OR_INT) }} 370 {{ return atree }} 371 372 # THIS STEP IS DONE. 373 # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a 374 # 2-tuple consisting of the original SYM_OR_INT, followed by its int 375 # value. If the symbol does not resolve to an int, store an error message

Page 96: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 8 of 8

376 # in __errorlist__ and pretend it was a 0, in order to avoid cascading 377 # error messages. 378 # Also add this table to the the symbol table. 379 rule STEP_TABLE: TABLE SYMBOL 380 {{ ttree = [‘table’,((‘symbol’,SYMBOL))] }} 381 SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 382 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 383 {{ ttree[-1] = (ttree[-1], i) }} 384 ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 385 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 386 {{ ttree[-1] = (ttree[-1], i) }} 387 ) * 388 ENDTABLE {{ __add_to_symbol_table__(ttree) }} 389 {{ return ttree }} 390 391 392 rule SYM_OR_INT: SYMBOL {{ return ((‘symbol’, SYMBOL)) }} 393 | INTEGER {{ return ((‘integer’, INTEGER)) }} 394 395 rule INTEGER: DECINTEGER {{ return DECINTEGER }} 396 | HEXINTEGER {{ return HEXINTEGER }} 397 | OCTINTEGER {{ return OCTINTEGER }} 398 399 %% 400 401 if __name__==’__main__’: 402 print ‘Test driver for STEP program parser.’ 403 while 1: 404 try: s = raw_input(‘>>> ‘) 405 except EOFError: break 406 if not strip(s): break 407 print parse(‘goal’, s) 408 print ‘Bye.’ 409

Page 97: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {} # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = [] # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = [] # element 0 is the innermost loop-being-compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1] # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))

    else:
        entry = [subtree, scope]
        if (key in __symboltable__):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (symbol in __symboltable__):
        myscope = deepcopy(scope) # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:] # pop out a level of scope
            else:
                myscope = None # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding): # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0] # the annotated subtree
        if (subtree[0] != 'constant'): # constants must resolve to constants

            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3] # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # The r prefix quotes a *raw* string.
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree.
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values,
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
        STEP_PROGRAM END
        {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
        ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
        ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
        ( STATEMNT {{ proc.append(STATEMNT) }}

        ) *
        ENDPROC
        {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
        ( STATEMNT {{ iftree[1].append(STATEMNT) }}
        ) *
        THEN
        ( STATEMNT {{ iftree[2].append(STATEMNT) }}
        ) *
        OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
        {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
        ( STATEMNT {{ elsetree.append(STATEMNT) }}
        ) *
        ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
        ( STATEMNT {{ wtree[1].append(STATEMNT) }}
        )*
        DO {{ __loopstack__.insert(0,DO) }}
        ( STATEMNT {{ wtree[2].append(STATEMNT) }}
        ) *
        ENDWHILE {{ del __loopstack__[0] }}
        {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
        ( STATEMNT {{ dtree[1].append(STATEMNT) }}
        ) *
        UNTIL
        ( STATEMNT {{ dtree[2].append(STATEMNT) }}
        ) *
        ENDUNTIL {{ return dtree }}


    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
        {{ __add_to_symbol_table__(ctree) }}
        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
        {{ __add_to_symbol_table__(vtree) }}
        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
        {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message

    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
        {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
        {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
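
The scoped lookup described in the comments of the listing (a symbol maps to a list of per-scope entries, the scope is a tuple whose element 0 is the innermost proc, and () is global scope) can be tried outside the grammar. The following is only a minimal standalone sketch in Python 3, not part of parsestep.g: the names define and lookup, the simplified (kind, scope) entries, and the sample symbols are assumptions made for illustration, and it is not a solution to any STUDENT section.

```python
# Hypothetical standalone sketch (Python 3); not part of parsestep.g.
# An entry here is a simplified (kind, scope) pair, whereas the listing
# stores [parse subtree, scope].

def define(table, errors, name, kind, scope):
    """Record name in scope, or log an error if that scope already has it."""
    bindings = table.setdefault(name, [])
    if any(entry_scope == scope for _, entry_scope in bindings):
        errors.append("symbol " + name
                      + " is multiply defined at scope " + str(scope))
        return
    bindings.append((kind, scope))

def lookup(table, name, scope):
    """Return the innermost binding visible from scope, else None."""
    bindings = table.get(name, [])
    current = tuple(scope)
    while True:
        for entry in bindings:
            if entry[1] == current:
                return entry
        if not current:          # global scope already checked
            return None
        current = current[1:]    # drop the innermost proc name

table, errors = {}, []
define(table, errors, "size", "constant", ())
define(table, errors, "size", "variable", ("main",))
define(table, errors, "size", "array", ("main",))   # duplicate inside main

print(lookup(table, "size", ("main",)))    # binding local to proc main
print(lookup(table, "size", ("other",)))   # falls back to the global binding
print(lookup(table, "missing", ()))        # no binding at all
print(errors)
```

The same name may be bound once per scope, so the second definition of size inside main is rejected while the global definition stays visible from any other proc.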

Page 105: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 106: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 107: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 108: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 109: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 110: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 111: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 7 of 8

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH:   SWITCH          {{ stree = ['switch',[],[]] }}
                        ( STATEMNT      {{ stree[1].append(STATEMNT) }}
                        ) *
                        CASE
                        ( SSTATEMNT     {{ stree[2].append(SSTATEMNT) }}
                            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
                        ) +
                        ENDCASE         {{ return stree }}

    rule SSTATEMNT:     SYMBOL          {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA:     STEP_CONST      {{ return STEP_CONST }}
                    |   STEP_VAR        {{ return STEP_VAR }}
                    |   STEP_ARRAY      {{ return STEP_ARRAY }}
                    |   STEP_TABLE      {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST:    CONST SYMBOL
                        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
                        SYM_OR_INT      {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
                        {{ __add_to_symbol_table__(ctree) }}
                        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR:      VAR SYMBOL
                        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
                        {{ __add_to_symbol_table__(vtree) }}
                        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY:    ARRAY SYMBOL
                        {{ atree = ['array', (('symbol',SYMBOL))] }}
                        SYM_OR_INT      {{ atree.append(SYM_OR_INT) }}
                        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE:    TABLE SYMBOL
                        {{ ttree = ['table',(('symbol',SYMBOL))] }}
                        SYM_OR_INT      {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                        {{ ttree[-1] = (ttree[-1], i) }}
                        ( SYM_OR_INT    {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                        {{ ttree[-1] = (ttree[-1], i) }}
                        ) *
                        ENDTABLE        {{ __add_to_symbol_table__(ttree) }}
                        {{ return ttree }}


    rule SYM_OR_INT:    SYMBOL          {{ return (('symbol', SYMBOL)) }}
                    |   INTEGER         {{ return (('integer', INTEGER)) }}

    rule INTEGER:       DECINTEGER      {{ return DECINTEGER }}
                    |   HEXINTEGER      {{ return HEXINTEGER }}
                    |   OCTINTEGER      {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
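
The STEP_TABLE rule above pairs each SYM_OR_INT with its resolved int value and substitutes 0 when a symbol cannot be resolved. The following is a minimal standalone sketch of that annotation step in Python 3, not the assignment's own code. The names `lexeme_to_int` and `annotate_table` are hypothetical, and the plain dict `constants` is an assumed stand-in for the scoped symbol table lookup that `__find_constant__` performs.

```python
import re

# Same patterns as __HEXINTEGER__ and __OCTINTEGER__ in the grammar file.
HEXINTEGER = r'0[Xx][0-9a-fA-F]+'
OCTINTEGER = r'0[0-7][0-7]*'


def lexeme_to_int(lexeme):
    """Convert an INTEGER lexeme to an int (hypothetical helper)."""
    if re.match(HEXINTEGER, lexeme):
        return int(lexeme, 16)
    if re.match(OCTINTEGER, lexeme):
        return int(lexeme, 8)
    return int(lexeme)


def annotate_table(name, elements, constants):
    """
    Build a 'table' subtree whose elements are (SYM_OR_INT, int) pairs.
    'constants' maps constant names to ints; it is a simplified,
    assumed replacement for the scoped symbol table. Unresolved
    symbols are recorded as errors and treated as 0.
    """
    errors = []
    ttree = ['table', ('symbol', name)]
    for pair in elements:
        kind, lexeme = pair
        if kind == 'integer':
            value = lexeme_to_int(lexeme)
        elif lexeme in constants:
            value = constants[lexeme]
        else:
            errors.append("undefined symbol " + lexeme
                          + " in table " + name)
            value = 0   # pretend it was 0 to avoid cascading errors
        ttree.append((pair, value))
    return ttree, errors


if __name__ == '__main__':
    tree, errs = annotate_table(
        't', [('integer', '0x1F'), ('symbol', 'N'), ('symbol', 'missing')],
        {'N': 4})
    print(tree)
    print(errs)
```

Returning the error list alongside the tree keeps the sketch free of module-level state; the grammar file itself appends to the global `__errorlist__` instead.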

CSC 310 assignment 3

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
# option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. The entry defines a proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, variable, array or table
# object. The first field of the parse subtree holds the type of the symbol
# (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.
#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME,) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {}    # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = []  # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = []  # element 0 is the innermost loop-being-compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1] # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))
    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope)   # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:]       # pop out a level of scope
            else:
                myscope = None  # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding):   # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0]    # the annotated subtree
        if (subtree[0] != 'constant'):  # constants must resolve to constants
            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]   # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'
    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # 'r' quotes a *raw* string
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree.
    # element [1] It now also returns a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values,
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal:          {{ __initglobals__() }}
                        STEP_PROGRAM END
        {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM:  STEP_GLOBAL     {{ program = ['program', STEP_GLOBAL]}}
                        ( STEP_GLOBAL   {{ program.append(STEP_GLOBAL) }}
                        ) *             {{ return program }}

    rule STEP_GLOBAL:   STEP_PROC       {{ return STEP_PROC }}
                    |   STEP_DATA       {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC:     PROC SYMBOL     {{ proc = ['proc',('symbol',SYMBOL)] }}
                        ( STATEMNT      {{ proc.append(STATEMNT) }}
                        ) *
                        ENDPROC
                        {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT:      SYMBOL          {{ return ('symbol', SYMBOL) }}
                    |   INTEGER         {{ return ('integer', INTEGER) }}
                    |   LITERALSTRING   {{ return ('string', LITERALSTRING) }}
                    |   STEP_IF         {{ return STEP_IF }}
                    |   STEP_WHILE      {{ return STEP_WHILE }}
                    |   STEP_DOUNTIL    {{ return STEP_DOUNTIL }}
                    |   STEP_SWITCH     {{ return STEP_SWITCH }}
                    |   STEP_DATA       {{ return STEP_DATA }}
                    |   BREAK           {{ return BREAK }}
                    |   CONTINUE        {{ return CONTINUE }}

    rule STEP_IF:       IF              {{ iftree = ['if', [], [], []] }}
                        ( STATEMNT      {{ iftree[1].append(STATEMNT) }}
                        ) *
                        THEN
                        ( STATEMNT      {{ iftree[2].append(STATEMNT) }}
                        ) *
                        OPTIONAL_ELSE   {{ iftree[3] = OPTIONAL_ELSE }}
                        {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE            {{ elsetree = [] }}
                        ( STATEMNT      {{ elsetree.append(STATEMNT) }}
                        ) *
                        ENDIF           {{ return elsetree }}
                    |   ENDIF           {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE:    WHILE           {{ wtree = ['while', [], []] }}
                        ( STATEMNT      {{ wtree[1].append(STATEMNT) }}
                        )*
                        DO              {{ __loopstack__.insert(0,DO) }}
                        ( STATEMNT      {{ wtree[2].append(STATEMNT) }}
                        ) *
                        ENDWHILE        {{ del __loopstack__[0] }}
                        {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL:  DOUNTIL         {{ dtree = ['dountil', [], []] }}
                        ( STATEMNT      {{ dtree[1].append(STATEMNT) }}
                        ) *
                        UNTIL
                        ( STATEMNT      {{ dtree[2].append(STATEMNT) }}
                        ) *
                        ENDUNTIL        {{ return dtree }}

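
The loop-stack bookkeeping described in the comments on STEP_WHILE and STEP_DOUNTIL above can be tried outside the grammar. The sketch below is written in Python 3 and is only an illustration under stated assumptions: `check_loops` is a hypothetical function, it walks a flat list of keyword strings rather than a YAPPS2 token stream, and it assumes the input is already well nested (a real parser would reject mismatched keywords before this check matters).

```python
def check_loops(tokens):
    """
    Walk a flat list of STEP keywords and report each 'break' or
    'continue' that appears while no loop body is open.
    Hypothetical helper; it mirrors the push-at-index-0 convention
    of __loopstack__ but keeps its state local.
    """
    loopstack = []  # element 0 is the innermost open loop
    errors = []     # messages in order of detection
    for tok in tokens:
        if tok in ('do', 'dountil'):
            # A loop body opens after DO or DOUNTIL.
            loopstack.insert(0, tok)
        elif tok in ('endwhile', 'until'):
            # The loop body closes at ENDWHILE or UNTIL.
            if loopstack:
                del loopstack[0]
        elif tok in ('break', 'continue'):
            if len(loopstack) == 0:
                errors.append("invalid " + tok + " outside of a loop")
    return errors


if __name__ == '__main__':
    print(check_loops(['while', 'do', 'break', 'endwhile', 'continue']))
    print(check_loops(['dountil', 'continue', 'until', 'break', 'enduntil']))
```

Because the stack is popped at UNTIL, a `break` placed in the condition between UNTIL and ENDUNTIL is reported as outside a loop, which follows the wording of the STUDENT comment rather than any independent statement of STEP semantics.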

Page 119: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 7 of 8

323 # Assignment 3: Verify that each statement in SSTATEMNT is a name 324 # of a proc by invoking __verify_proc__(). 325 # THIS IS ALREADY DONE. It calls your __verify_proc__()!!! 326 rule STEP_SWITCH: SWITCH {{ stree = [‘switch’,[],[]] }} 327 ( STATEMNT {{ stree[1].append(STATEMNT) }} 328 ) * 329 CASE 330 ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }} 331 {{ __verify_proc__(SSTATEMNT,stree[0],””) }} 332 ) + 333 ENDCASE {{ return stree }} 334 335 rule SSTATEMNT: SYMBOL {{ return ((‘symbol’,SYMBOL)) }} 336 337 rule STEP_DATA: STEP_CONST {{ return STEP_CONST }} 338 | STEP_VAR {{ return STEP_VAR }} 339 | STEP_ARRAY {{ return STEP_ARRAY }} 340 | STEP_TABLE {{ return STEP_TABLE }} 341 342 # THIS DONE DONE! 343 # Assignment 3: Add a fourth element to the ‘constant’ subtree consisting 344 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 345 # resolve to an int, store an error message in __errorlist__ and pretend 346 # it was a 0, in order to avoid cascading error messages. 347 # Also add this constant to the the symbol table. 348 rule STEP_CONST: CONST SYMBOL 349 {{ ctree = [‘constant’,((‘symbol’,SYMBOL))] }} 350 SYM_OR_INT {{ ctree.append(SYM_OR_INT) }} 351 {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }} 352 {{ __add_to_symbol_table__(ctree) }} 353 {{ return ctree }} 354 355 # Assignment 3: Add this variable to the the symbol table. 356 rule STEP_VAR: VAR SYMBOL 357 {{ vtree = [‘variable’,((‘symbol’,SYMBOL))] }} 358 {{ __add_to_symbol_table__(vtree) }} 359 {{ return vtree }} 360 361 # STUDENT: Use STEP_CONST above as a guide for how to do this part. 362 # Assignment 3: Add a fourth element to the ‘array’ subtree consisting 363 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 364 # resolve to an int, store an error message in __errorlist__ and pretend 365 # it was a 0, in order to avoid cascading error messages. 366 # Also add this array to the the symbol table. 
367 rule STEP_ARRAY: ARRAY SYMBOL 368 {{ atree = [‘array’, ((‘symbol’,SYMBOL))] }} 369 SYM_OR_INT {{ atree.append(SYM_OR_INT) }} 370 {{ return atree }} 371 372 # THIS STEP IS DONE. 373 # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a 374 # 2-tuple consisting of the original SYM_OR_INT, followed by its int 375 # value. If the symbol does not resolve to an int, store an error message

Page 120: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 8 of 8

376 # in __errorlist__ and pretend it was a 0, in order to avoid cascading 377 # error messages. 378 # Also add this table to the the symbol table. 379 rule STEP_TABLE: TABLE SYMBOL 380 {{ ttree = [‘table’,((‘symbol’,SYMBOL))] }} 381 SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 382 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 383 {{ ttree[-1] = (ttree[-1], i) }} 384 ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 385 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 386 {{ ttree[-1] = (ttree[-1], i) }} 387 ) * 388 ENDTABLE {{ __add_to_symbol_table__(ttree) }} 389 {{ return ttree }} 390 391 392 rule SYM_OR_INT: SYMBOL {{ return ((‘symbol’, SYMBOL)) }} 393 | INTEGER {{ return ((‘integer’, INTEGER)) }} 394 395 rule INTEGER: DECINTEGER {{ return DECINTEGER }} 396 | HEXINTEGER {{ return HEXINTEGER }} 397 | OCTINTEGER {{ return OCTINTEGER }} 398 399 %% 400 401 if __name__==’__main__’: 402 print ‘Test driver for STEP program parser.’ 403 while 1: 404 try: s = raw_input(‘>>> ‘) 405 except EOFError: break 406 if not strip(s): break 407 print parse(‘goal’, s) 408 print ‘Bye.’ 409

Page 121: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 122: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 123: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 124: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 125: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # r'...' quotes a *raw* string
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3: it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
            STEP_PROGRAM END
            {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
            ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
            ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
            ( STATEMNT {{ proc.append(STATEMNT) }}
            ) *
            ENDPROC
            {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
            ( STATEMNT {{ iftree[1].append(STATEMNT) }}
            ) *
            THEN
            ( STATEMNT {{ iftree[2].append(STATEMNT) }}
            ) *
            OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
            {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
            ( STATEMNT {{ elsetree.append(STATEMNT) }}
            ) *
            ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
            ( STATEMNT {{ wtree[1].append(STATEMNT) }}
            ) *
            DO {{ __loopstack__.insert(0,DO) }}
            ( STATEMNT {{ wtree[2].append(STATEMNT) }}
            ) *
            ENDWHILE {{ del __loopstack__[0] }}
            {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
            ( STATEMNT {{ dtree[1].append(STATEMNT) }}
            ) *
            UNTIL
            ( STATEMNT {{ dtree[2].append(STATEMNT) }}
            ) *
            ENDUNTIL {{ return dtree }}
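The loop-stack discipline described in the comments above can be tried outside of YAPPS2. What follows is a minimal standalone sketch, not the assignment's solution: the function `check_loop_usage` and its flat list of keyword strings are hypothetical stand-ins for the parser's actions. Only the push-at-index-0 and pop-at-index-0 convention, the choice of DO/ENDWHILE and DOUNTIL/UNTIL as the push and pop points, and the error wording of `__verify__loop__` are taken from this file.

```python
def check_loop_usage(tokens):
    """Return error messages for break/continue keywords outside a loop body.

    `tokens` is a hypothetical flat list of lowercase STEP keywords; the real
    parser performs the same bookkeeping inside grammar actions.
    """
    loopstack = []  # element 0 is the innermost open loop body
    errors = []
    for tok in tokens:
        if tok in ('do', 'dountil'):
            # Entering a loop body: push at the front of the stack.
            loopstack.insert(0, tok)
        elif tok in ('endwhile', 'until'):
            # Leaving a loop body: pop the innermost entry if there is one.
            if loopstack:
                del loopstack[0]
        elif tok in ('break', 'continue'):
            # Same test and message shape as __verify__loop__ in this file.
            if len(loopstack) == 0:
                errors.append("invalid " + tok + " outside of a loop")
    return errors
```

Under these assumptions a `break` between `do` and `endwhile` is accepted, while a `continue` that follows `until` is reported, because the DOUNTIL entry has already been popped by then.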

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
            ( STATEMNT {{ stree[1].append(STATEMNT) }}
            ) *
            CASE
            ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
                {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
            ) +
            ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
            {{ ctree = ['constant',(('symbol',SYMBOL))] }}
            SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
            {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
            {{ __add_to_symbol_table__(ctree) }}
            {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
            {{ vtree = ['variable',(('symbol',SYMBOL))] }}
            {{ __add_to_symbol_table__(vtree) }}
            {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
            {{ atree = ['array', (('symbol',SYMBOL))] }}
            SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
            {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
            {{ ttree = ['table',(('symbol',SYMBOL))] }}
            SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
            ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
                {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                {{ ttree[-1] = (ttree[-1], i) }}
            ) *
            ENDTABLE {{ __add_to_symbol_table__(ttree) }}
            {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not s.strip(): break
        print parse('goal', s)
    print 'Bye.'
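The INTEGER rule returns the matched lexeme as a string, so a semantic action that needs the numeric value still has to pick a base. The sketch below shows one way to do that with the three token patterns from this grammar, checking hex first, then octal, then decimal. The names `lexeme_to_int` and `_matches` are hypothetical helpers introduced only for illustration. Unlike a plain `re.match`, the sketch requires the whole lexeme to match, and raising `ValueError` for anything else is an assumption of this sketch, not behavior taken from the assignment.

```python
import re

# Patterns copied from the DECINTEGER, HEXINTEGER and OCTINTEGER tokens.
DECINTEGER = r'(-?[1-9][0-9]*)|0'
HEXINTEGER = r'0[Xx][0-9a-fA-F]+'
OCTINTEGER = r'0[0-7][0-7]*'


def _matches(pattern, lexeme):
    # Wrap the pattern and anchor it at the end so the entire lexeme
    # must match, not merely a prefix of it.
    return re.match('(?:' + pattern + r')\Z', lexeme) is not None


def lexeme_to_int(lexeme):
    """Hypothetical helper: convert a STEP integer lexeme to a Python int."""
    if _matches(HEXINTEGER, lexeme):
        return int(lexeme, 16)   # int() accepts the 0x prefix with base 16
    if _matches(OCTINTEGER, lexeme):
        return int(lexeme, 8)    # a leading 0 plus octal digits
    if _matches(DECINTEGER, lexeme):
        return int(lexeme)
    raise ValueError("not a STEP integer literal: " + lexeme)
```

With these patterns a lone `0` is decimal, since the octal pattern needs at least one digit after the leading zero, and a lexeme such as `08` matches none of the three.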


CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 142: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 143: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 7 of 8

323 # Assignment 3: Verify that each statement in SSTATEMNT is a name 324 # of a proc by invoking __verify_proc__(). 325 # THIS IS ALREADY DONE. It calls your __verify_proc__()!!! 326 rule STEP_SWITCH: SWITCH {{ stree = [‘switch’,[],[]] }} 327 ( STATEMNT {{ stree[1].append(STATEMNT) }} 328 ) * 329 CASE 330 ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }} 331 {{ __verify_proc__(SSTATEMNT,stree[0],””) }} 332 ) + 333 ENDCASE {{ return stree }} 334 335 rule SSTATEMNT: SYMBOL {{ return ((‘symbol’,SYMBOL)) }} 336 337 rule STEP_DATA: STEP_CONST {{ return STEP_CONST }} 338 | STEP_VAR {{ return STEP_VAR }} 339 | STEP_ARRAY {{ return STEP_ARRAY }} 340 | STEP_TABLE {{ return STEP_TABLE }} 341 342 # THIS DONE DONE! 343 # Assignment 3: Add a fourth element to the ‘constant’ subtree consisting 344 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 345 # resolve to an int, store an error message in __errorlist__ and pretend 346 # it was a 0, in order to avoid cascading error messages. 347 # Also add this constant to the the symbol table. 348 rule STEP_CONST: CONST SYMBOL 349 {{ ctree = [‘constant’,((‘symbol’,SYMBOL))] }} 350 SYM_OR_INT {{ ctree.append(SYM_OR_INT) }} 351 {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }} 352 {{ __add_to_symbol_table__(ctree) }} 353 {{ return ctree }} 354 355 # Assignment 3: Add this variable to the the symbol table. 356 rule STEP_VAR: VAR SYMBOL 357 {{ vtree = [‘variable’,((‘symbol’,SYMBOL))] }} 358 {{ __add_to_symbol_table__(vtree) }} 359 {{ return vtree }} 360 361 # STUDENT: Use STEP_CONST above as a guide for how to do this part. 362 # Assignment 3: Add a fourth element to the ‘array’ subtree consisting 363 # of the actual int value of a constant SYM_OR_INT. If the symbol does not 364 # resolve to an int, store an error message in __errorlist__ and pretend 365 # it was a 0, in order to avoid cascading error messages. 366 # Also add this array to the the symbol table. 
367 rule STEP_ARRAY: ARRAY SYMBOL 368 {{ atree = [‘array’, ((‘symbol’,SYMBOL))] }} 369 SYM_OR_INT {{ atree.append(SYM_OR_INT) }} 370 {{ return atree }} 371 372 # THIS STEP IS DONE. 373 # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a 374 # 2-tuple consisting of the original SYM_OR_INT, followed by its int 375 # value. If the symbol does not resolve to an int, store an error message

Page 144: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 8 of 8

376 # in __errorlist__ and pretend it was a 0, in order to avoid cascading 377 # error messages. 378 # Also add this table to the the symbol table. 379 rule STEP_TABLE: TABLE SYMBOL 380 {{ ttree = [‘table’,((‘symbol’,SYMBOL))] }} 381 SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 382 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 383 {{ ttree[-1] = (ttree[-1], i) }} 384 ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }} 385 {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }} 386 {{ ttree[-1] = (ttree[-1], i) }} 387 ) * 388 ENDTABLE {{ __add_to_symbol_table__(ttree) }} 389 {{ return ttree }} 390 391 392 rule SYM_OR_INT: SYMBOL {{ return ((‘symbol’, SYMBOL)) }} 393 | INTEGER {{ return ((‘integer’, INTEGER)) }} 394 395 rule INTEGER: DECINTEGER {{ return DECINTEGER }} 396 | HEXINTEGER {{ return HEXINTEGER }} 397 | OCTINTEGER {{ return OCTINTEGER }} 398 399 %% 400 401 if __name__==’__main__’: 402 print ‘Test driver for STEP program parser.’ 403 while 1: 404 try: s = raw_input(‘>>> ‘) 405 except EOFError: break 406 if not strip(s): break 407 print parse(‘goal’, s) 408 print ‘Bye.’ 409

Page 145: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
# option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. The entry defines a proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, constant, variable, array or
# table object. The first field of the parse subtree holds the type of the
# symbol (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.

#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME,) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {}    # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = []  # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = []  # element 0 is the innermost loop being compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1]  # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))

    else:
        entry = [subtree, scope]
        if key in __symboltable__:
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if symbol in __symboltable__:
        myscope = deepcopy(scope)   # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope is not None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope:     # it is non-empty
                myscope = myscope[1:]   # pop out a level of scope
            else:
                myscope = None  # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding):   # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0]    # the annotated subtree
        if (subtree[0] != 'constant'):  # constants must resolve to constants

            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]   # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass    # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'

    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # The r prefix quotes a *raw* string.
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree.
    # element [1] a *deepcopy* of the __symboltable__.
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
        STEP_PROGRAM END
        {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL]}}
        ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
        ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
        ( STATEMNT {{ proc.append(STATEMNT) }}

        ) *
        ENDPROC
        {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
        ( STATEMNT {{ iftree[1].append(STATEMNT) }}
        ) *
        THEN
        ( STATEMNT {{ iftree[2].append(STATEMNT) }}
        ) *
        OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
        {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
        ( STATEMNT {{ elsetree.append(STATEMNT) }}
        ) *
        ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
        ( STATEMNT {{ wtree[1].append(STATEMNT) }}
        ) *
        DO {{ __loopstack__.insert(0,DO) }}
        ( STATEMNT {{ wtree[2].append(STATEMNT) }}
        ) *
        ENDWHILE {{ del __loopstack__[0] }}
        {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
        ( STATEMNT {{ dtree[1].append(STATEMNT) }}
        ) *
        UNTIL
        ( STATEMNT {{ dtree[2].append(STATEMNT) }}
        ) *
        ENDUNTIL {{ return dtree }}


    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
        {{ __add_to_symbol_table__(ctree) }}
        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
        {{ __add_to_symbol_table__(vtree) }}
        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
        {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message

    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
        {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
        {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not s.strip(): break
        print parse('goal', s)
    print 'Bye.'
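The scoped lookup described in the comments above (search the innermost scope first, then drop one proc name at a time until global scope `()` is reached) can be illustrated outside the grammar file. The following is only a sketch under stated assumptions: it is written for Python 3 rather than the Python 2 of the listing, the names `find_binding` and `verify_proc` are hypothetical stand-ins that take the table, proc stack and error list as explicit arguments instead of using the module globals, and the error wording simply mirrors `__find_constant__`. It is not the course's reference solution, and the message format the test driver expects is not known from this listing.

```python
# Hypothetical, self-contained illustration (Python 3); not part of parsestep.g.

def find_binding(symboltable, symbol, scope, single_scope=False):
    """Return the innermost [subtree, scope] entry for symbol, or None."""
    myscope = tuple(scope)
    while True:
        for binding in symboltable.get(symbol, []):
            if binding[1] == myscope:
                return binding
        if not myscope or single_scope:
            return None             # global scope already checked, or one scope only
        myscope = myscope[1:]       # drop the innermost proc and widen the search


def verify_proc(symboltable, procstack, errorlist,
                type_lexeme_pair, client_type, client_name):
    """Return the proc's parse subtree, or [] after recording an error."""
    # Assumption: strip the suffix so an empty client name leaves no trailing blank.
    errorsuffix = (client_type + " " + client_name).strip()
    kind, lexeme = type_lexeme_pair
    if kind != 'symbol':
        errorlist.append("invalid " + kind + " " + lexeme + " in " + errorsuffix)
        return []
    binding = find_binding(symboltable, lexeme, procstack)
    if binding is None:
        errorlist.append("undefined symbol " + lexeme + " in " + errorsuffix)
        return []
    subtree = binding[0]
    if subtree[0] != 'proc':        # bound, but to a data object rather than a proc
        errorlist.append("invalid " + subtree[0] + " " + lexeme
                         + " in " + errorsuffix)
        return []
    return subtree


# One global proc and one global variable, looked up from inside proc 'main'.
table = {
    'helper': [[['proc', ('symbol', 'helper')], ()]],
    'count': [[['variable', ('symbol', 'count')], ()]],
}
errors = []
found = verify_proc(table, ['main'], errors, ('symbol', 'helper'), 'switch', '')
bad = verify_proc(table, ['main'], errors, ('symbol', 'count'), 'switch', '')
missing = verify_proc(table, ['main'], errors, ('symbol', 'nosuch'), 'switch', '')
```

Passing the state explicitly keeps the sketch testable in isolation; inside the grammar file the same logic would presumably read `__symboltable__`, `__procstack__` and `__errorlist__` directly, as `__find_constant__` does.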

Page 153: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 154: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME,) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {} # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = [] # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = [] # element 0 is the innermost loop-being-compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1] # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))


    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope) # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:] # pop out a level of scope
            else:
                myscope = None # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding): # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0] # the annotated subtree
        if (subtree[0] != 'constant'): # constants must resolve to constants


            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3] # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'


    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # r'...' quotes a *raw* string
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3: it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
               STEP_PROGRAM END
               {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL] }}
                       ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
                       ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
                    | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
                    ( STATEMNT {{ proc.append(STATEMNT) }}


                    ) *
                    ENDPROC
                    {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
                 | INTEGER {{ return ('integer', INTEGER) }}
                 | LITERALSTRING {{ return ('string', LITERALSTRING) }}
                 | STEP_IF {{ return STEP_IF }}
                 | STEP_WHILE {{ return STEP_WHILE }}
                 | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
                 | STEP_SWITCH {{ return STEP_SWITCH }}
                 | STEP_DATA {{ return STEP_DATA }}
                 | BREAK {{ return BREAK }}
                 | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
                  ( STATEMNT {{ iftree[1].append(STATEMNT) }}
                  ) *
                  THEN
                  ( STATEMNT {{ iftree[2].append(STATEMNT) }}
                  ) *
                  OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
                  {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
                        ( STATEMNT {{ elsetree.append(STATEMNT) }}
                        ) *
                        ENDIF {{ return elsetree }}
                      | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
                     ( STATEMNT {{ wtree[1].append(STATEMNT) }}
                     ) *
                     DO {{ __loopstack__.insert(0,DO) }}
                     ( STATEMNT {{ wtree[2].append(STATEMNT) }}
                     ) *
                     ENDWHILE {{ del __loopstack__[0] }}
                     {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
                       ( STATEMNT {{ dtree[1].append(STATEMNT) }}
                       ) *
                       UNTIL
                       ( STATEMNT {{ dtree[2].append(STATEMNT) }}
                       ) *
                       ENDUNTIL {{ return dtree }}



    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
                      ( STATEMNT {{ stree[1].append(STATEMNT) }}
                      ) *
                      CASE
                      ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
                        {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
                      ) +
                      ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
                  | STEP_VAR {{ return STEP_VAR }}
                  | STEP_ARRAY {{ return STEP_ARRAY }}
                  | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
                     {{ ctree = ['constant',(('symbol',SYMBOL))] }}
                     SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
                     {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
                     {{ __add_to_symbol_table__(ctree) }}
                     {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
                   {{ vtree = ['variable',(('symbol',SYMBOL))] }}
                   {{ __add_to_symbol_table__(vtree) }}
                   {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
                     {{ atree = ['array', (('symbol',SYMBOL))] }}
                     SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
                     {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message


    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
                     {{ ttree = ['table',(('symbol',SYMBOL))] }}
                     SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
                     {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                     {{ ttree[-1] = (ttree[-1], i) }}
                     ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
                       {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                       {{ ttree[-1] = (ttree[-1], i) }}
                     ) *
                     ENDTABLE {{ __add_to_symbol_table__(ttree) }}
                     {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
                   | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
                | HEXINTEGER {{ return HEXINTEGER }}
                | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not s.strip(): break # a blank line ends the session
        print parse('goal', s)
    print 'Bye.'
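
The scope rules described in the symbol table comments (one entry per scope, with () for global scope and the innermost proc first) can be tried outside the parser. The following is a minimal sketch in Python 3, whereas the assignment file itself is Python 2 run through YAPPS2. The names add_symbol and find_binding, their explicit table and errors parameters, and the sample symbols are hypothetical stand-ins for illustration; they are not part of parsestep.g.

```python
# Illustrative Python 3 sketch; add_symbol and find_binding are hypothetical
# stand-ins for the module-level helpers in parsestep.g, with the symbol
# table and error list passed in explicitly instead of kept as globals.

def find_binding(table, name, scope, single_scope=False):
    """Return the innermost binding of name visible from scope, else None."""
    scope = tuple(scope)
    bindings = table.get(name, [])
    while True:
        for binding in bindings:
            if binding[1] == scope:
                return binding
        if not scope or single_scope:
            return None            # global scope already checked, or no walk wanted
        scope = scope[1:]          # drop the innermost proc, retry one level out


def add_symbol(table, errors, subtree, procstack):
    """Record subtree under its name for the scope given by procstack."""
    name = subtree[1][1]           # subtree[1] is ('symbol', NAME)
    scope = tuple(procstack)
    if find_binding(table, name, scope, single_scope=True) is not None:
        errors.append("symbol " + name
                      + " is multiply defined at scope " + str(scope))
    else:
        table.setdefault(name, []).append([subtree, scope])


table, errors = {}, []
# A global constant, then a variable of the same name inside proc 'main'.
add_symbol(table, errors, ['constant', ('symbol', 'size'), ('integer', '8'), 8], [])
add_symbol(table, errors, ['variable', ('symbol', 'size')], ['main'])
# Defining it again in the same scope is the error case.
add_symbol(table, errors, ['variable', ('symbol', 'size')], ['main'])

inner = find_binding(table, 'size', ('main',))    # sees the proc-local variable
outer = find_binding(table, 'size', ('other',))   # falls back to the global constant
print(inner[0][0], outer[0][0])
print(errors)
```

Looking up 'size' from inside 'main' finds the local variable first, while the same lookup from an unrelated proc walks out to the global constant. This is why a constant reference inside a proc can be rejected when a local non-constant shadows a global constant of the same name.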

Page 161: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 162: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 163: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants

Page 164: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 4 of 8

161 # e.g., invalid variable foo in errorsuffix 162 __errorlist__.append(“invalid “ + subtree[0] + “ “ 163 + subtree[1][1] + “ in “ + errorsuffix) 164 return 0 165 return subtree[3] # return the int previously bound there 166 else: 167 __errorlist__.append(“invalid “ + typeLexemePair[0] + “ “ 168 + typeLexemePair[1] + “ in “ + errorsuffix) 169 return 0 170 171 # Assignment 3 172 # STUDENT implement __verify_proc__ according to the spec below. 173 # Pattern this function after __find_constant__ above. 174 def __verify_proc__(typeLexemePair, clientType, clientName): 175 “““ 176 Verify the mapping from typeLexemePair -> a proc within scope, where 177 typeLexemePair is (‘symbol’, SYMBOL) and SYMBOL is a proc name accessible 178 in the current scope. If SYMBOL is not already bound to a proc name, 179 store an error message within __errorlist__ and return []. Otherwise 180 return the parse subtree entry for this binding of the proc. 181 Parameters clientType and clientName constitute the control statement using 182 the proc name. 183 “““ 184 pass # STUDENT replace this line with the function definition 185 186 def __verify__loop__(breakORcontinue): 187 if (len(__loopstack__) == 0): 188 __errorlist__.append(“invalid “ + breakORcontinue 189 + “ outside of a loop”) 190 191 %% 192 parser PARSESTEP: 193 option: “context-insensitive-scanner” 194 ignore: “([ \r\t\n]+)|(//[^\n]*\n)” 195 token END: “$” 196 # The remaining tokens are reserved keywords. 197 token PROC: ‘proc’ 198 token ENDPROC: ‘endproc’ 199 token RETURN: ‘return’ 200 token IF: ‘if’ 201 token THEN: ‘then’ 202 token ELSE: ‘else’ 203 token ENDIF: ‘endif’ 204 token WHILE: ‘while’ 205 token DO: ‘do’ 206 token ENDWHILE: ‘endwhile’ 207 token DOUNTIL: ‘dountil’ 208 token UNTIL: ‘until’ 209 token ENDUNTIL: ‘enduntil’ 210 token BREAK: ‘break’ 211 token CONTINUE: ‘continue’ 212 token SWITCH: ‘switch’ 213 token CASE: ‘case’ 214 token ENDCASE: ‘endcase’

Page 165: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 5 of 8

215 token CONST: ‘constant’ 216 token VAR: ‘variable’ 217 token ARRAY: ‘array’ 218 token TABLE: ‘table’ 219 token ENDTABLE: ‘endtable’ 220 token LITERALSTRING: r’”[^”]*”’ 221 # ‘r quotes a *raw *string* 222 token DECINTEGER: r’(-?[1-9][0-9]*)|0’ 223 token HEXINTEGER: r’0[Xx][0-9a-fA-F]+’ 224 token OCTINTEGER: r’0[0-7][0-7]*’ 225 token SYMBOL: r’[^ \r\t\n]+’ 226 227 # The outer rule ‘goal’ returns a parse tree for the entire program. 228 # ASSIGNMENT 3 it now returns a 3-tuple as follows. 229 # element [0] the parse tree. 230 # element [1] It now also returns a *deepcopy* of the __symboltable__, 231 # element [2] a list, possibly empty, of semantic errors detected, 232 # appended in order of creation to this list (a deepcopy of __errorlist__). 233 # At present possible errors consist of A) multiple definitions of a 234 # proc/data symbol within a scope, B) invalid break or continue statements 235 # outside # of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks, 236 # C) constant symbol declarations, array length fields, and table values, 237 # that do not resolve to previously defined integer constants, 238 # D) SSTATEMNTs between CASE and ENDCASE that 239 # are not previously defined proc names available within the scope of 240 # the switch statement. 241 # 242 # All error checks must consider the scope of their symbol. For example, 243 # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing 244 # scopes. 245 246 # Assignment 3: Add call to __initglobals__() before any parsing. 247 # Return a triplet contain the abstract syntax tree, a deep copy 248 # of the symbol table, and a deep copy of the error message FIFO. 249 # This part is working. 
250 rule goal: {{ __initglobals__() }} 251 STEP_PROGRAM END 252 {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }} 253 254 # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE 255 rule STEP_PROGRAM: STEP_GLOBAL {{ program = [‘program’, STEP_GLOBAL]}} 256 ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }} 257 ) * {{ return program }} 258 259 rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }} 260 | STEP_DATA {{ return STEP_DATA }} 261 262 # STUDENT 263 # Assignment 3: Push the proc name to the front of __procstack__ before 264 # parsing any statements, and pop it before adding the proc name to the 265 # symbol table. 266 # Also add this proc to the the symbol table before returning. 267 rule STEP_PROC: PROC SYMBOL {{ proc = [‘proc’,(‘symbol’,SYMBOL)] }} 268 ( STATEMNT {{ proc.append(STATEMNT) }}

Page 166: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 6 of 8

269 ) * 270 ENDPROC 271 {{ return proc }} 272 273 # Assignment 3 (STUDENT) add a call to __verify__loop__ before 274 # returning BREAK or CONTINUE. 275 rule STATEMNT: SYMBOL {{ return (‘symbol’, SYMBOL) }} 276 | INTEGER {{ return (‘integer’, INTEGER) }} 277 | LITERALSTRING {{ return (‘string’, LITERALSTRING) }} 278 | STEP_IF {{ return STEP_IF }} 279 | STEP_WHILE {{ return STEP_WHILE }} 280 | STEP_DOUNTIL {{ return STEP_DOUNTIL }} 281 | STEP_SWITCH {{ return STEP_SWITCH }} 282 | STEP_DATA {{ return STEP_DATA }} 283 | BREAK {{ return BREAK }} 284 | CONTINUE {{ return CONTINUE }} 285 286 rule STEP_IF: IF {{ iftree = [‘if’, [], [], []] }} 287 ( STATEMNT {{ iftree[1].append(STATEMNT) }} 288 ) * 289 THEN 290 ( STATEMNT {{ iftree[2].append(STATEMNT) }} 291 ) * 292 OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }} 293 {{ return iftree }} 294 295 rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }} 296 ( STATEMNT {{ elsetree.append(STATEMNT) }} 297 ) * 298 ENDIF {{ return elsetree }} 299 | ENDIF {{ return [] }} 300 301 # Assignment 3 (done) push DO to the __loopstack__ after parsing it, 302 # and pop it after parsing ENDWHILE. Push at __loopstack__[0]. 303 rule STEP_WHILE: WHILE {{ wtree = [‘while’, [], []] }} 304 ( STATEMNT {{ wtree[1].append(STATEMNT) }} 305 )* 306 DO {{ __loopstack__.insert(0,DO) }} 307 ( STATEMNT {{ wtree[2].append(STATEMNT) }} 308 ) * 309 ENDWHILE {{ del __loopstack__[0] }} 310 {{ return wtree }} 311 312 # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it, 313 # and pop it after parsing UNTIL. Push at __loopstack__[0]. 314 rule STEP_DOUNTIL: DOUNTIL {{ dtree = [‘dountil’, [], []] }} 315 ( STATEMNT {{ dtree[1].append(STATEMNT) }} 316 ) * 317 UNTIL 318 ( STATEMNT {{ dtree[2].append(STATEMNT) }} 319 ) * 320 ENDUNTIL {{ return dtree }} 321 322

Page 167: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310 assignment 3

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
#     option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. The entry defines a proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, variable, array or table
# object. The first field of the parse subtree holds the type of the symbol
# (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.
#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {} # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = [] # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = [] # element 0 is the innermost loop being compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1] # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))
    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope) # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:] # pop out a level of scope
            else:
                myscope = None # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding): # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0] # the annotated subtree
        if (subtree[0] != 'constant'): # constants must resolve to constants
            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3] # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'
    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    token LITERALSTRING: r'"[^"]*"'
    # The r prefix quotes a *raw* string.
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3 it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    # appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal: {{ __initglobals__() }}
        STEP_PROGRAM END
        {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL] }}
        ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
        ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
        | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
        ( STATEMNT {{ proc.append(STATEMNT) }}
        ) *
        ENDPROC
        {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
        | INTEGER {{ return ('integer', INTEGER) }}
        | LITERALSTRING {{ return ('string', LITERALSTRING) }}
        | STEP_IF {{ return STEP_IF }}
        | STEP_WHILE {{ return STEP_WHILE }}
        | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
        | STEP_SWITCH {{ return STEP_SWITCH }}
        | STEP_DATA {{ return STEP_DATA }}
        | BREAK {{ return BREAK }}
        | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
        ( STATEMNT {{ iftree[1].append(STATEMNT) }}
        ) *
        THEN
        ( STATEMNT {{ iftree[2].append(STATEMNT) }}
        ) *
        OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
        {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
        ( STATEMNT {{ elsetree.append(STATEMNT) }}
        ) *
        ENDIF {{ return elsetree }}
        | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
        ( STATEMNT {{ wtree[1].append(STATEMNT) }}
        ) *
        DO {{ __loopstack__.insert(0,DO) }}
        ( STATEMNT {{ wtree[2].append(STATEMNT) }}
        ) *
        ENDWHILE {{ del __loopstack__[0] }}
        {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
        ( STATEMNT {{ dtree[1].append(STATEMNT) }}
        ) *
        UNTIL
        ( STATEMNT {{ dtree[2].append(STATEMNT) }}
        ) *
        ENDUNTIL {{ return dtree }}


    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
        {{ __add_to_symbol_table__(ctree) }}
        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
        {{ __add_to_symbol_table__(vtree) }}
        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
        {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
        {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
        {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
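
The scoped lookup that __find_binding__ performs, and the proc check that the __verify_proc__ docstring asks for, can be tried outside YAPPS2 with a small standalone sketch. The version below is written for Python 3 rather than the Python 2 of the grammar file. It passes the symbol table, error list and scope as explicit arguments instead of using the module globals. The names find_binding and verify_proc, the sample table and the exact error wording are assumptions made for illustration only; this is one possible reading of the docstring, not the instructor's solution.

```python
# Standalone Python 3 sketch of scoped symbol lookup. All names here are
# illustrative assumptions, not part of the assignment's parsestep.g.

def find_binding(symboltable, symbol, scope, single_scope=False):
    """Return the innermost [subtree, scope] entry visible from scope, or None."""
    current = tuple(scope)
    while True:
        for entry in symboltable.get(symbol, []):
            if entry[1] == current:
                return entry
        if single_scope or not current:
            return None          # nothing found, and no outer scope left to try
        current = current[1:]    # drop the innermost proc and widen the search


def verify_proc(symboltable, errors, pair, scope, client_type, client_name):
    """Return the proc subtree bound to pair, or [] after recording an error."""
    kind, name = pair
    suffix = (client_type + " " + client_name).strip()
    binding = find_binding(symboltable, name, scope) if kind == "symbol" else None
    if binding is None:
        errors.append("undefined symbol " + name + " in " + suffix)
        return []
    subtree = binding[0]
    if subtree[0] != "proc":     # e.g. a variable used where a proc is required
        errors.append("invalid " + subtree[0] + " " + name + " in " + suffix)
        return []
    return subtree


# Sample table: a global proc 'helper' and a variable 'count' local to proc 'main'.
table = {
    "helper": [[["proc", ("symbol", "helper")], ()]],
    "count": [[["variable", ("symbol", "count")], ("main",)]],
}
errors = []
found = verify_proc(table, errors, ("symbol", "helper"), ["main"], "switch", "")
missing = verify_proc(table, errors, ("symbol", "nosuch"), ["main"], "switch", "")
wrong = verify_proc(table, errors, ("symbol", "count"), ["main"], "switch", "")
print(found)
print(errors)
```

Passing the table and error list as parameters keeps the sketch testable on its own; inside the grammar file the same logic would read the __symboltable__, __procstack__ and __errorlist__ globals instead.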


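The loop-stack discipline described in the comments above (push at index 0 when a loop body opens, pop when it closes, and flag BREAK or CONTINUE when the stack is empty) can be sketched outside the grammar as follows. This is a minimal illustration written for Python 3, whereas the assignment file itself uses Python 2 syntax; the names `loopstack`, `errorlist`, `enter_loop`, `exit_loop`, and `verify_loop` are hypothetical stand-ins and not the code the assignment asks for.

```python
# Hypothetical stand-ins for __loopstack__ and __errorlist__.
loopstack = []   # element 0 is the innermost open loop
errorlist = []   # semantic errors, in order of detection


def enter_loop(token):
    """Push a loop-opening token (e.g. 'do' or 'dountil') at index 0."""
    loopstack.insert(0, token)


def exit_loop():
    """Pop the innermost loop once its closing keyword has been parsed."""
    del loopstack[0]


def verify_loop(keyword):
    """Record an error if 'break' or 'continue' appears outside any loop."""
    if len(loopstack) == 0:
        errorlist.append("invalid " + keyword + " outside of a loop")


verify_loop("break")      # no loop is open, so one error is recorded
enter_loop("do")
verify_loop("continue")   # inside a loop, so no error is recorded
exit_loop()
```

Because the check only asks whether the stack is non-empty, the pushed token's value does not matter for error detection; it is kept so that the innermost loop kind remains available at index 0.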

    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
        ( STATEMNT {{ stree[1].append(STATEMNT) }}
        ) *
        CASE
        ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
            {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
        ) +
        ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
        | STEP_VAR {{ return STEP_VAR }}
        | STEP_ARRAY {{ return STEP_ARRAY }}
        | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE!
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
        {{ ctree = ['constant',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
        {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
        {{ __add_to_symbol_table__(ctree) }}
        {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
        {{ vtree = ['variable',(('symbol',SYMBOL))] }}
        {{ __add_to_symbol_table__(vtree) }}
        {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
        {{ atree = ['array', (('symbol',SYMBOL))] }}
        SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
        {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message
    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
        {{ ttree = ['table',(('symbol',SYMBOL))] }}
        SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
        {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
        {{ ttree[-1] = (ttree[-1], i) }}
        ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
            {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
            {{ ttree[-1] = (ttree[-1], i) }}
        ) *
        ENDTABLE {{ __add_to_symbol_table__(ttree) }}
        {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
        | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
        | HEXINTEGER {{ return HEXINTEGER }}
        | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'

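The STEP_TABLE rule above replaces each SYM_OR_INT element with a 2-tuple of the original pair and its int value. For integer lexemes, that conversion can be sketched as below. This is an illustrative Python 3 sketch, not the assignment's code: `lexeme_to_int` and `annotate_table` are hypothetical helper names, only the `('integer', ...)` case is handled, and symbol lookup and error reporting are deliberately left out.

```python
import re

# Same patterns as the HEXINTEGER and OCTINTEGER tokens in the grammar.
HEXINTEGER = r'0[Xx][0-9a-fA-F]+'
OCTINTEGER = r'0[0-7][0-7]*'


def lexeme_to_int(lexeme):
    """Convert a decimal, hexadecimal, or octal lexeme to an int.

    Hexadecimal is tested first; a lone '0' matches neither pattern and
    falls through to the decimal case.
    """
    if re.match(HEXINTEGER, lexeme):
        return int(lexeme, 16)
    elif re.match(OCTINTEGER, lexeme):
        return int(lexeme, 8)
    return int(lexeme)


def annotate_table(elements):
    """Pair each ('integer', lexeme) element with its int value."""
    return [(pair, lexeme_to_int(pair[1])) for pair in elements]


table = annotate_table([('integer', '10'),
                        ('integer', '0x1F'),
                        ('integer', '017'),
                        ('integer', '0')])
values = [value for _, value in table]
```

Keeping the original pair next to the resolved value means later passes can still see how the programmer wrote the literal while working with the plain int.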

# THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON

# cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang
# cd ~/ProcLang/symtable_assignment3
# Modify file 'parsestep.g' according to STUDENT comments and
# discussion in class. Please attend class.
#
# DEADLINE: 11:59 PM April 3, 2009
#
# Run "gmake test" to verify that it tests correctly.
#
# Run "gmake turnitin" by end of April 3 to turn it in.
#
# The main parts added to assignment 2 are a symbol table,
# a proc stack to keep track of the scope of symbol definitions, a
# loopstack to make sure that 'break' and 'continue' appear only
# in loop bodies, and an error message buffer for reporting semantic
# errors to the test driver.
#
# Search "Assignment 3" to find the changes.
# Search "STUDENT" to find your work.

# parsestep.g -- parser for the Forth-like STEP language.
# The grammar below is a YAPPS2 LL(1) grammar for the STEP language.
# See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2.
# UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__
# and __procstack__ and __errorlist__ as documented below.
#
# CSC 310, Dr. D. Parson, Spring, 2009

# ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR'S
# ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION.

# NOTE ON THE SCANNER'S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD
# PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET
# option: "context-insensitive-scanner"
# SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS.

from copy import deepcopy
import re

# Each entry in the symbol table is a mapping from a symbol NAME to an
# unordered list of tuples, one tuple per scope for that symbol, with each
# tuple having the following fields. The entry defines a proc, constant,
# variable, array or table being defined.
#
# [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA.
# There is a symbol table entry for each proc, constant, variable, array or
# table object. The first field of the parse subtree holds the type of the
# symbol (e.g., 'variable'), and the second field holds its ('symbol',SYMBOL)
# name. This SYMBOL name is the key to the symtable list of scoped entries
# for this SYMBOL.


#
# [1] The scope of the symbol as a tuple, with tuple () representing global
# scope, and (PROCNAME,) representing nested scope within a proc of that name.
# Procs do not nest in STEP. If they ever do, this tuple will contain
# (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc.
# Symbols must be unique *within a given scope*, so there must be an error
# check for redefined symbols within a scope when the symbol is defined.

__symboltable__ = {} # Track program-defined symbols.

# The __procstack__ keeps track of the proc currently being defined. It is
# empty when none is being defined. If nested procs are ever supported,
# element[0] is the top of stack (innermost proc being defined).

__procstack__ = [] # element 0 is innermost proc, list is [] for global scope

# The __loopstack__ currently tracks only DO tokens that have not
# been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by
# UNTIL. A BREAK or CONTINUE token can be parsed only when this stack's depth
# is > 0.

__loopstack__ = [] # element 0 is the innermost loop being compiled

# __errorlist__ collects semantic errors, [0] being the first.

__errorlist__ = []

# Both the YAPPS2 scanner and the semantic tests need these reg. expressions.

__DECINTEGER__ = r'(-?[1-9][0-9]*)|0'
__HEXINTEGER__ = r'0[Xx][0-9a-fA-F]+'
__OCTINTEGER__ = r'0[0-7][0-7]*'

# Assignment 3 functions for manipulating above data structures and
# for searching the symbol table for symbols.
def __initglobals__():
    global __symboltable__
    global __loopstack__
    global __procstack__
    global __errorlist__
    __symboltable__ = {}
    __loopstack__ = []
    __procstack__ = []
    __errorlist__ = []

def __add_to_symbol_table__(subtree):
    """
    Add the subtree's symbol and its current proc scope to symbol table.
    """
    key = subtree[1][1] # This is the symbolic name of the object.
    scope = tuple(__procstack__)
    if (__find_binding__(key, scope, singleScope=True)):
        __errorlist__.append("symbol " + key
            + " is multiply defined at scope " + str(scope))
    else:
        entry = [subtree, scope]
        if (__symboltable__.has_key(key)):
            __symboltable__[key].append(entry)
        else:
            __symboltable__[key] = [entry]


def __find_binding__(symbol, scope, singleScope=False):
    """
    Return the innermost __symboltable__ binding for symbol within scope
    if there is one, else return None. When singleScope evaluates to True,
    then only scope is checked (i.e., no outer scopes).
    """
    if (__symboltable__.has_key(symbol)):
        myscope = deepcopy(scope) # we will modify it here.
        allbindings = __symboltable__[symbol]
        while (myscope != None):
            for nextbinding in allbindings:
                if nextbinding[1] == tuple(myscope):
                    return nextbinding
            if myscope and not singleScope: # it is non-empty
                myscope = myscope[1:] # pop out a level of scope
            else:
                myscope = None # we could not find a binding
    return None

def __find_constant__(typeLexemePair, clientType, clientName):
    """
    Locate the mapping from typeLexemePair -> INTEGER within scope, where
    typeLexemePair is either ('integer', INTEGER) or ('symbol', SYMBOL) and
    SYMBOL is a constant already resolved to an INTEGER. If SYMBOL is not
    already bound to a constant INTEGER, store an error message within
    __errorlist__ and return 0. Otherwise return the bound INTEGER as an int.
    Parameters clientType and clientName constitute the data construct using
    the symbolic INTEGER (e.g., constant coo or array aoo or table too).
    They appear in any error message stored within __errorlist__.
    """
    errorsuffix = clientType + " " + clientName
    if (typeLexemePair[0] == 'integer'):
        if (re.match(__HEXINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 16)
        elif (re.match(__OCTINTEGER__, typeLexemePair[1])):
            return int(typeLexemePair[1], 8)
        else:
            return int(typeLexemePair[1])
    if (typeLexemePair[0] == 'symbol'):
        binding = __find_binding__(typeLexemePair[1], __procstack__)
        if (not binding): # presumably None, an invalid symbol
            __errorlist__.append("undefined symbol " + typeLexemePair[1]
                + " in " + errorsuffix)
            return 0
        subtree = binding[0] # the annotated subtree
        if (subtree[0] != 'constant'): # constants must resolve to constants
            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3] # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'

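The scoped lookup that `__find_binding__` performs in the helper section above (try the given scope first, then drop the innermost proc name and retry until global scope `()` has been checked) can be sketched in isolation as follows. This is a simplified Python 3 illustration under stated assumptions: `symboltable`, `add_binding`, and `find_binding` are hypothetical names, the duplicate-definition check is omitted, and the sample proc names `main` and `other` are invented for the example.

```python
# Hypothetical stand-in for __symboltable__:
# maps a symbol name to a list of [subtree, scope] entries.
symboltable = {}


def add_binding(name, subtree, scope):
    """Record subtree as the definition of name in the given scope tuple."""
    symboltable.setdefault(name, []).append([subtree, tuple(scope)])


def find_binding(name, scope, single_scope=False):
    """Return the innermost [subtree, scope] entry visible from scope.

    The search starts at scope and widens one level at a time until the
    global scope () has been tried. With single_scope=True only the given
    scope is checked. Returns None when nothing matches.
    """
    bindings = symboltable.get(name)
    if bindings is None:
        return None
    myscope = tuple(scope)
    while True:
        for binding in bindings:
            if binding[1] == myscope:
                return binding
        if myscope and not single_scope:
            myscope = myscope[1:]   # pop out one level of scope
        else:
            return None


# A global constant and a variable of the same name local to proc 'main'.
add_binding('size', ['constant', ('symbol', 'size'), ('integer', '8'), 8], ())
add_binding('size', ['variable', ('symbol', 'size')], ('main',))

inner = find_binding('size', ('main',))        # finds the local variable
outer = find_binding('size', ('other',))       # falls back to the global
strict = find_binding('size', ('other',), single_scope=True)  # no fallback
```

The single-scope mode is what a redefinition check needs: a symbol may shadow an outer definition, but it may not appear twice at the same scope.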

Page 193: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 1 of 8

CSC 310 assignment 3 1 # THIS SOURCE FILE IS ASSIGNMENT 3 FOR CSC 310, SPRING, 2009, D. PARSON 2 3 # cp -pr ~parson/ProcLang/symtable_assignment3 ~/ProcLang 4 # cd ~/ProcLang/symtable_assignment3 5 # Modify file ‘parsestep.g’ according to STUDENT comments and 6 # discussion in class. Please attend class. 7 # 8 # DEADLINE: 11:59 PM April 3, 2009 9 # 10 # Run “gmake test” to verify that it tests correctly. 11 # 12 # Run “gmake turnitin” by end of April 3 to turn it in. 13 # 14 # The main parts added to assignment 2 are a symbol table, 15 # a proc stack to keep track of the scope of symbol definitions, a 16 # loopstack to make sure that ‘braek’ and ‘continue’ appear only 17 # in loop bodies, and an error message buffer for reporting semantic 18 # errors to the test driver. 19 # 20 # Search “Assignment 3” to find the changes. 21 # Search “STUDENT” to find your work. 22 23 # parsestep.g -- parser for the Forth-like STEP language. 24 # The grammar below is a YAPPS2 LL(1) grammar for the STEP language. 25 # See http://theory.stanford.edu/~amitp/yapps/ for YAPPS2. 26 # UPDATE 1 for assignment 3 -- add __symboltable__ and __loopstack__ 27 # and __procstack__ and __errorlist__ as documented below. 28 # 29 # CSC 310, Dr. D. Parson, Spring, 2009 30 31 # ANY PYTHON HELPER FUNCTIONS THAT YOU NEED TO CALL FROM THE GRAMMAR’S 32 # ACTIONS SHOULD BE DEFINED IN THIS UPPER SECTION. 33 34 # NOTE ON THE SCANNER’S REGULAR EXPRESSIONS: PUT THE GENERAL, WILDCARD 35 # PATTERNS AFTER THE KEYWORDS, AND MAKE SURE TO SET 36 # option: “context-insensitive-scanner” 37 # SO THAT RESERVED KEYWORDS ARE CONSISTENTLY SCANNED AS RESERVED KEYWORDS. 38 39 from copy import deepcopy 40 import re 41 42 # Each entry in the symbol table is a mapping from a symbol NAME to an 43 # unordered list of tuples, one tuple per scope for that symbol, with each 44 # tuple having the following fields. The entry defines a proc, constant, 45 # variable, array or table being defined. 
46 # 47 # [0] A reference to the parse subtree returned by STEP_PROC or STEP_DATA. 48 # There is a symbol table entry for each proc, variable, array or table 49 # object. The first field of the parse subtree holds the type of the symbol 50 # (e.g., ‘variable’), and the second field holds its (‘symbol’,SYMBOL) 51 # name. This SYMBOL name is the key to the symtable list of scoped entries 52 # for this SYMBOL.

Page 194: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 2 of 8

53 # 54 # [1] The scope of the symbol as a tuple, with tuple () representing global 55 # scope, and (PROCNAME) representing nested scope within a proc of that name. 56 # Procs do not nest in STEP. If they ever do, this tuple with contain 57 # (PROCNAME0, PROCNAME1, ...), where PROCNAME0 is the innermost proc. 58 # Symbols must be unique *within a given scope*, so there must be an error 59 # check for redefined symbols within a scope when the symbol is defined. 60 61 __symboltable__ = {} # Track program-defined symbols. 62 63 # The __procstack__ keeps track of the proc currently being defined. It is 64 # empty when none is being defined. If nested procs are ever supported, 65 # element[0] is the top of stack (innermost proc being defined). 66 67 __procstack__ = [] # element 0 is innermost proc, list is [] for global scope 68 69 # The __loopstack__ currently tracks only DO tokens that have not 70 # been closed by ENDWHILE, and DOUNTIL tokens that have not been closed by 71 # UNTIL. A BREAK or CONTINUE token can be parsed only when this stack’s depth 72 # is > 0. 73 74 __loopstack__ = [] # element 0 is the innermost loop-being-comnpiled 75 76 # __errorlist__ collects semantic errors, [0] being the first. 77 78 __errorlist__ = [] 79 80 # Both the YAPPS2 scanner and the semantic tests need these reg. expressions. 81 82 __DECINTEGER__ = r’(-?[1-9][0-9]*)|0’ 83 __HEXINTEGER__ = r’0[Xx][0-9a-fA-F]+’ 84 __OCTINTEGER__ = r’0[0-7][0-7]*’ 85 86 # Assignment 3 functions for manipulating above data structures and 87 # for searching the symbol table for symbols. 88 def __initglobals__(): 89 global __symboltable__ 90 global __loopstack__ 91 global __procstack__ 92 global __errorlist__ 93 __symboltable__ = {} 94 __loopstack__ = [] 95 __procstack__ = [] 96 __errorlist__ = [] 97 98 def __add_to_symbol_table__(subtree): 99 “““ 100 Add the subtree’s symbol and its current proc scope to symbol table. 101 “““ 102 key = subtree[1][1]# This is the symbolic name of the object. 
103 scope = tuple(__procstack__) 104 if (__find_binding__(key, scope, singleScope=True)): 105 __errorlist__.append(“symbol “ + key 106 + “ is multiply defined at scope “ + str(scope))

Page 195: CSC 310 assignment 3 - Kutztown University of Pennsylvaniafaculty.kutztown.edu/parson/spring2009/csc310assign3.pdf · CSC 310, Spring, 2009, assignment 3, page 6 of 8 269 ) * 270

CSC 310, Spring, 2009, assignment 3, page 3 of 8

107 else: 108 entry = [subtree, scope] 109 if (__symboltable__.has_key(key)): 110 __symboltable__[key].append(entry) 111 else: 112 __symboltable__[key] = [entry] 113 114 115 def __find_binding__(symbol, scope, singleScope=False): 116 “““ 117 Return the innermost __symboltable__ binding for symbol within scope 118 if there is one, else return None. When singleScope evaluates to True, 119 then only scope is checked (i.e., no outer scopes). 120 “““ 121 if (__symboltable__.has_key(symbol)): 122 myscope = deepcopy(scope) # we will modify it here. 123 allbindings = __symboltable__[symbol] 124 while (myscope != None): 125 for nextbinding in allbindings: 126 if nextbinding[1] == tuple(myscope): 127 return nextbinding 128 if myscope and not singleScope: # it is non-empty 129 myscope = myscope[1:] # pop out a level of scope 130 else: 131 myscope = None # we could not find a binding 132 return None 133 134 def __find_constant__(typeLexemePair, clientType, clientName): 135 “““ 136 Locate the mapping from typeLexemePair -> INTEGER within scope, where 137 typeLexemePair is either (‘integer’, INTEGER) or (‘symbol’, SYMBOL) and 138 SYMBOL is a constant already resoved to an INTEGER. If SYMBOL is not 139 already bound to a constant INTEGER, store an error message within 140 __errorlist__ and return 0. Otherwise return the bound INTEGER as an int. 141 Parameters clientType and clientName constitute the data construct using 142 the symbolic INTEGER (e.g., constant coo or array aoo or table too). 143 They appear in any error message stored within __error_list__. 
144 “““ 145 errorsuffix = clientType + “ “ + clientName 146 if (typeLexemePair[0] == ‘integer’): 147 if (re.match(__HEXINTEGER__, typeLexemePair[1])): 148 return int(typeLexemePair[1], 16) 149 elif (re.match(__OCTINTEGER__, typeLexemePair[1])): 150 return int(typeLexemePair[1], 8) 151 else: 152 return int(typeLexemePair[1]) 153 if (typeLexemePair[0] == ‘symbol’): 154 binding = __find_binding__(typeLexemePair[1], __procstack__) 155 if (not binding): # presumably None, an invalid symbol 156 __errorlist__.append(“undefined symbol “ + typeLexemePair[1] 157 + “ in “ + errorsuffix) 158 return 0 159 subtree = binding[0] # the annotated subtree 160 if (subtree[0] != ‘constant’): # constants must resolve to constants


            # e.g., invalid variable foo in errorsuffix
            __errorlist__.append("invalid " + subtree[0] + " "
                + subtree[1][1] + " in " + errorsuffix)
            return 0
        return subtree[3]   # return the int previously bound there
    else:
        __errorlist__.append("invalid " + typeLexemePair[0] + " "
            + typeLexemePair[1] + " in " + errorsuffix)
        return 0

# Assignment 3
# STUDENT implement __verify_proc__ according to the spec below.
# Pattern this function after __find_constant__ above.
def __verify_proc__(typeLexemePair, clientType, clientName):
    """
    Verify the mapping from typeLexemePair -> a proc within scope, where
    typeLexemePair is ('symbol', SYMBOL) and SYMBOL is a proc name accessible
    in the current scope. If SYMBOL is not already bound to a proc name,
    store an error message within __errorlist__ and return []. Otherwise
    return the parse subtree entry for this binding of the proc.
    Parameters clientType and clientName constitute the control statement using
    the proc name.
    """
    pass    # STUDENT replace this line with the function definition

def __verify__loop__(breakORcontinue):
    if (len(__loopstack__) == 0):
        __errorlist__.append("invalid " + breakORcontinue
            + " outside of a loop")

%%
parser PARSESTEP:
    option: "context-insensitive-scanner"
    ignore: "([ \r\t\n]+)|(//[^\n]*\n)"
    token END: "$"
    # The remaining tokens are reserved keywords.
    token PROC: 'proc'
    token ENDPROC: 'endproc'
    token RETURN: 'return'
    token IF: 'if'
    token THEN: 'then'
    token ELSE: 'else'
    token ENDIF: 'endif'
    token WHILE: 'while'
    token DO: 'do'
    token ENDWHILE: 'endwhile'
    token DOUNTIL: 'dountil'
    token UNTIL: 'until'
    token ENDUNTIL: 'enduntil'
    token BREAK: 'break'
    token CONTINUE: 'continue'
    token SWITCH: 'switch'
    token CASE: 'case'
    token ENDCASE: 'endcase'


    token CONST: 'constant'
    token VAR: 'variable'
    token ARRAY: 'array'
    token TABLE: 'table'
    token ENDTABLE: 'endtable'
    # The r prefix quotes a *raw* string.
    token LITERALSTRING: r'"[^"]*"'
    token DECINTEGER: r'(-?[1-9][0-9]*)|0'
    token HEXINTEGER: r'0[Xx][0-9a-fA-F]+'
    token OCTINTEGER: r'0[0-7][0-7]*'
    token SYMBOL: r'[^ \r\t\n]+'

    # The outer rule 'goal' returns a parse tree for the entire program.
    # ASSIGNMENT 3: it now returns a 3-tuple as follows.
    # element [0] the parse tree,
    # element [1] a *deepcopy* of the __symboltable__,
    # element [2] a list, possibly empty, of semantic errors detected,
    #   appended in order of creation to this list (a deepcopy of __errorlist__).
    # At present possible errors consist of A) multiple definitions of a
    # proc/data symbol within a scope, B) invalid break or continue statements
    # outside of any enclosing DO-ENDWHILE or DOUNTIL-UNTIL blocks,
    # C) constant symbol declarations, array length fields, and table values
    # that do not resolve to previously defined integer constants,
    # D) SSTATEMNTs between CASE and ENDCASE that
    # are not previously defined proc names available within the scope of
    # the switch statement.
    #
    # All error checks must consider the scope of their symbol. For example,
    # a constant or CASE-ENDCASE cannot see symbols defined in non-enclosing
    # scopes.

    # Assignment 3: Add call to __initglobals__() before any parsing.
    # Return a triplet containing the abstract syntax tree, a deep copy
    # of the symbol table, and a deep copy of the error message FIFO.
    # This part is working.
    rule goal:  {{ __initglobals__() }}
                STEP_PROGRAM END
                {{ return(STEP_PROGRAM,deepcopy(__symboltable__),deepcopy(__errorlist__)) }}

    # NOTE: DO NOT DO ( STEP_GLOBAL ) * ON A SINGLE LINE
    rule STEP_PROGRAM: STEP_GLOBAL {{ program = ['program', STEP_GLOBAL] }}
                ( STEP_GLOBAL {{ program.append(STEP_GLOBAL) }}
                ) * {{ return program }}

    rule STEP_GLOBAL: STEP_PROC {{ return STEP_PROC }}
                | STEP_DATA {{ return STEP_DATA }}

    # STUDENT
    # Assignment 3: Push the proc name to the front of __procstack__ before
    # parsing any statements, and pop it before adding the proc name to the
    # symbol table.
    # Also add this proc to the symbol table before returning.
    rule STEP_PROC: PROC SYMBOL {{ proc = ['proc',('symbol',SYMBOL)] }}
                ( STATEMNT {{ proc.append(STATEMNT) }}


                ) *
                ENDPROC
                {{ return proc }}

    # Assignment 3 (STUDENT) add a call to __verify__loop__ before
    # returning BREAK or CONTINUE.
    rule STATEMNT: SYMBOL {{ return ('symbol', SYMBOL) }}
                | INTEGER {{ return ('integer', INTEGER) }}
                | LITERALSTRING {{ return ('string', LITERALSTRING) }}
                | STEP_IF {{ return STEP_IF }}
                | STEP_WHILE {{ return STEP_WHILE }}
                | STEP_DOUNTIL {{ return STEP_DOUNTIL }}
                | STEP_SWITCH {{ return STEP_SWITCH }}
                | STEP_DATA {{ return STEP_DATA }}
                | BREAK {{ return BREAK }}
                | CONTINUE {{ return CONTINUE }}

    rule STEP_IF: IF {{ iftree = ['if', [], [], []] }}
                ( STATEMNT {{ iftree[1].append(STATEMNT) }}
                ) *
                THEN
                ( STATEMNT {{ iftree[2].append(STATEMNT) }}
                ) *
                OPTIONAL_ELSE {{ iftree[3] = OPTIONAL_ELSE }}
                {{ return iftree }}

    rule OPTIONAL_ELSE: ELSE {{ elsetree = [] }}
                ( STATEMNT {{ elsetree.append(STATEMNT) }}
                ) *
                ENDIF {{ return elsetree }}
                | ENDIF {{ return [] }}

    # Assignment 3 (done) push DO to the __loopstack__ after parsing it,
    # and pop it after parsing ENDWHILE. Push at __loopstack__[0].
    rule STEP_WHILE: WHILE {{ wtree = ['while', [], []] }}
                ( STATEMNT {{ wtree[1].append(STATEMNT) }}
                ) *
                DO {{ __loopstack__.insert(0,DO) }}
                ( STATEMNT {{ wtree[2].append(STATEMNT) }}
                ) *
                ENDWHILE {{ del __loopstack__[0] }}
                {{ return wtree }}

    # Assignment 3 (STUDENT) push DOUNTIL to the __loopstack__ after parsing it,
    # and pop it after parsing UNTIL. Push at __loopstack__[0].
    rule STEP_DOUNTIL: DOUNTIL {{ dtree = ['dountil', [], []] }}
                ( STATEMNT {{ dtree[1].append(STATEMNT) }}
                ) *
                UNTIL
                ( STATEMNT {{ dtree[2].append(STATEMNT) }}
                ) *
                ENDUNTIL {{ return dtree }}



    # Assignment 3: Verify that each statement in SSTATEMNT is a name
    # of a proc by invoking __verify_proc__().
    # THIS IS ALREADY DONE. It calls your __verify_proc__()!!!
    rule STEP_SWITCH: SWITCH {{ stree = ['switch',[],[]] }}
                ( STATEMNT {{ stree[1].append(STATEMNT) }}
                ) *
                CASE
                ( SSTATEMNT {{ stree[2].append(SSTATEMNT) }}
                    {{ __verify_proc__(SSTATEMNT,stree[0],"") }}
                ) +
                ENDCASE {{ return stree }}

    rule SSTATEMNT: SYMBOL {{ return (('symbol',SYMBOL)) }}

    rule STEP_DATA: STEP_CONST {{ return STEP_CONST }}
                | STEP_VAR {{ return STEP_VAR }}
                | STEP_ARRAY {{ return STEP_ARRAY }}
                | STEP_TABLE {{ return STEP_TABLE }}

    # THIS IS DONE.
    # Assignment 3: Add a fourth element to the 'constant' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this constant to the symbol table.
    rule STEP_CONST: CONST SYMBOL
                {{ ctree = ['constant',(('symbol',SYMBOL))] }}
                SYM_OR_INT {{ ctree.append(SYM_OR_INT) }}
                {{ ctree.append(__find_constant__(ctree[2], ctree[0], ctree[1][1])) }}
                {{ __add_to_symbol_table__(ctree) }}
                {{ return ctree }}

    # Assignment 3: Add this variable to the symbol table.
    rule STEP_VAR: VAR SYMBOL
                {{ vtree = ['variable',(('symbol',SYMBOL))] }}
                {{ __add_to_symbol_table__(vtree) }}
                {{ return vtree }}

    # STUDENT: Use STEP_CONST above as a guide for how to do this part.
    # Assignment 3: Add a fourth element to the 'array' subtree consisting
    # of the actual int value of a constant SYM_OR_INT. If the symbol does not
    # resolve to an int, store an error message in __errorlist__ and pretend
    # it was a 0, in order to avoid cascading error messages.
    # Also add this array to the symbol table.
    rule STEP_ARRAY: ARRAY SYMBOL
                {{ atree = ['array', (('symbol',SYMBOL))] }}
                SYM_OR_INT {{ atree.append(SYM_OR_INT) }}
                {{ return atree }}

    # THIS STEP IS DONE.
    # Assignment 3: Replace the SYM_OR_INT elements of a TABLE with a
    # 2-tuple consisting of the original SYM_OR_INT, followed by its int
    # value. If the symbol does not resolve to an int, store an error message


    # in __errorlist__ and pretend it was a 0, in order to avoid cascading
    # error messages.
    # Also add this table to the symbol table.
    rule STEP_TABLE: TABLE SYMBOL
                {{ ttree = ['table',(('symbol',SYMBOL))] }}
                SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
                {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                {{ ttree[-1] = (ttree[-1], i) }}
                ( SYM_OR_INT {{ ttree.append(SYM_OR_INT) }}
                    {{ i = __find_constant__(ttree[-1],ttree[0],ttree[1][1]) }}
                    {{ ttree[-1] = (ttree[-1], i) }}
                ) *
                ENDTABLE {{ __add_to_symbol_table__(ttree) }}
                {{ return ttree }}


    rule SYM_OR_INT: SYMBOL {{ return (('symbol', SYMBOL)) }}
                | INTEGER {{ return (('integer', INTEGER)) }}

    rule INTEGER: DECINTEGER {{ return DECINTEGER }}
                | HEXINTEGER {{ return HEXINTEGER }}
                | OCTINTEGER {{ return OCTINTEGER }}

%%

if __name__=='__main__':
    print 'Test driver for STEP program parser.'
    while 1:
        try: s = raw_input('>>> ')
        except EOFError: break
        if not strip(s): break
        print parse('goal', s)
    print 'Bye.'
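The scope walk used by __find_binding__ and the integer conversion used by __find_constant__ can be tried outside the grammar. The following is a minimal standalone sketch, not part of the assignment file. It assumes that each scope is stored as a tuple with the innermost proc name first and that the global scope is the empty tuple. The names symboltable, add_binding, find_binding, and lexeme_to_int are hypothetical stand-ins invented for this illustration, and the sketch is written for Python 3 although the listing itself targets Python 2.

```python
import re

# Hypothetical stand-in for __symboltable__: NAME -> list of [subtree, scope].
symboltable = {}

# Patterns copied from the HEXINTEGER and OCTINTEGER token definitions.
HEXINTEGER = r'0[Xx][0-9a-fA-F]+'
OCTINTEGER = r'0[0-7][0-7]*'


def add_binding(subtree, scope):
    """Record subtree under its symbol name for the given scope (assumed layout)."""
    name = subtree[1][1]
    symboltable.setdefault(name, []).append([subtree, tuple(scope)])


def find_binding(symbol, scope, single_scope=False):
    """Return the innermost binding of symbol visible from scope, else None."""
    myscope = list(scope)               # work on a copy of the caller's scope
    while myscope is not None:
        for binding in symboltable.get(symbol, []):
            if binding[1] == tuple(myscope):
                return binding
        if myscope and not single_scope:
            myscope = myscope[1:]       # step out one level of scope
        else:
            myscope = None              # nothing left to search
    return None


def lexeme_to_int(lexeme):
    """Convert a STEP integer lexeme (hex, octal, or decimal) to an int."""
    if re.match(HEXINTEGER, lexeme):
        return int(lexeme, 16)
    if re.match(OCTINTEGER, lexeme):
        return int(lexeme, 8)
    return int(lexeme)


# A global constant, a variable that shadows it inside proc 'inner',
# and a constant that is local to proc 'outer'.
add_binding(['constant', ('symbol', 'size'), ('integer', '0x10'), 16], [])
add_binding(['variable', ('symbol', 'size')], ['inner'])
add_binding(['constant', ('symbol', 'limit'), ('integer', '8'), 8], ['outer'])

shadowed = find_binding('size', ['inner'])      # finds the local variable
inherited = find_binding('size', ['other'])     # falls back to the global constant
hidden = find_binding('limit', ['inner'])       # 'outer' does not enclose 'inner'
```

In this sketch a lookup from inside proc inner sees the local variable before the global constant, while a symbol defined in a non-enclosing proc is not found at all, which is the behavior the error checks in the grammar comments rely on.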

