8/19/2019 Davidson and Fraser. Automatic Generation of Peephole Optimizations
http://slidepdf.com/reader/full/davidson-and-fraser-automatic-generation-of-peephole-optimizations 1/6
Proceedings of the ACM SIGPLAN '84 Symposium on Compiler Construction,
SIGPLAN Notices Vol. 19, No. 6, June 1984
Automatic Generation of Peephole Optimizations†
Jack W. Davidson
Dept. of Applied Mathematics and Computer Science
University of Virginia
Charlottesville, VA 22901
Christopher W. Fraser
Dept. of Computer Science
University of Arizona
Tucson, AZ 85721
Abstract
This paper describes a system that automatically generates peephole
optimizations. A general peephole optimizer driven by a machine
description produces optimizations at compile-compile time for a fast,
pattern-directed, compile-time optimizer. They form part of a compiler
that simplifies retargeting by substituting peephole optimization for
case analysis.
1. Introduction
Code generators often create inefficient juxtapositions. For example,
incrementing and testing a variable can create a redundant comparison
if the code for the increment automatically sets a condition code
register. Correcting this in the code generator complicates case
analysis combinatorially, since each combination of language features
may generate a unique juxtaposition [9]. It is often cheaper to
generate code locally and then use a peephole optimizer to improve
inefficient juxtapositions. Peephole optimization typically reduces
code size by 10-50% [14, 17]. Even the new code generators driven by
machine descriptions [6] benefit from peephole optimization [2].
Classical peephole optimizers [1, 14, 15, 17] rapidly correct a few
hand-written, machine-specific
†This work was supported in part by the National Science Foundation
under Grant MCS-7802545.
Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct
commercial advantage, the ACM copyright notice and the title of the
publication and its date appear, and notice is given that copying is
by permission of the Association for Computing Machinery. To copy
otherwise, or to republish, requires a fee and/or specific permission.
©1984 ACM 0-89791-139-3/84/0600/0111 $00.75
patterns. For example, the ambitious FINAL optimizer in the BLISS-11
compiler [17] deletes unnecessary comparisons, exploits special-case
instructions and exotic addressing modes, coalesces chains of
branches, and deletes unreachable code. Unfortunately, good patterns
can be hard to identify and are language-, compiler-, and
machine-specific.
A recent alternative to classical peephole optimizers [3] uses a
machine description to simulate adjacent instructions, replacing them,
wherever possible, with an equivalent singleton. Such machine-directed
optimizers use no patterns, so they are more thorough and portable
than their classical counterparts, but they are slower. Their
thoroughness allows the use of naive, easily retargeted code
generators, but verbose code makes optimization speed even more
crucial.
This paper describes a system that automatically generates patterns
for a fast classical peephole optimizer. A modern machine-directed
optimizer is run at compile-compile time, and patterns for a fast,
classical compile-time peephole optimizer are automatically inferred
from its output. This combines the thoroughness and retargetability of
a machine-directed peephole optimizer with the speed of a classical
peephole optimizer. This has sped up the peephole optimization phase
of a retargetable compiler by a factor of five.
2. A Machine Directed Optimizer
The system uses a retargetable peephole optimizer called PO. Other
documents elaborate on PO itself [3, 4]; this paper summarizes it only
enough to introduce a new application: generating patterns for a fast,
classical peephole optimizer.
Given an assembly language program and a symbolic machine description,
PO simulates adjacent instructions and, where possible, replaces them
with an equivalent single instruction. Each machine description is a
grammar for syntax-directed translation between assembly language and
register transfers. For example, the production
    movl src, dst :=: dst = src; NZ = src ? 0;
describes the VAX movl instruction, which copies its first operand
onto its second and sets the condition code to reflect the sign of the
result. Similar productions describe addressing modes.
To improve an instruction, PO must know its effect, that is, the
register transfers that it performs. Early versions of PO computed
effects by matching assembler instructions against the assembler
syntax patterns above and instantiating the corresponding register
transfer patterns. The most recent version skips this with a compiler
that emits register transfers directly. Register transfers are no
harder to emit than assembly code.
Once PO has the effect of each instruction, it symbolically simulates
two- and three-instruction sequences to form their combined effect. PO
then searches the machine description for an instruction with this
combined effect. If it finds one, it replaces the original
instructions with the new one. For example, the effects of the VAX
instructions
    movl X, r1
    subl2 Y, r1
are
    r[1] = m[X]; NZ = m[X] ? 0;
    r[1] = r[1] - m[Y]; NZ = r[1] - m[Y] ? 0;
Symbolic simulation combines these to yield
    r[1] = m[X] - m[Y]; NZ = m[X] - m[Y] ? 0;
which is realized by the instruction
    subl3 Y, X, r1
so this instruction replaces the two above.
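The combination step above can be illustrated with a minimal Python sketch. It is not PO's implementation: it assumes that an effect is a mapping from a destination cell to an expression string, and the name `combine` is hypothetical. Composition substitutes the first instruction's definitions into the second instruction's expressions.

```python
import re

def combine(first, second):
    """Compose two effects, each a dict mapping a cell to an expression string.

    Cells read by `second` are replaced with the values `first` gave them.
    Simplification: no parentheses are inserted, which is adequate for the
    left-to-right subtraction below but not for operators in general.
    """
    if not first:
        return dict(second)
    cells = re.compile("|".join(re.escape(cell) for cell in first))
    combined = dict(first)  # cells written only by `first` keep their values
    for cell, expr in second.items():
        combined[cell] = cells.sub(lambda m: first[m.group(0)], expr)
    return combined

# Effects of "movl X, r1" and "subl2 Y, r1" as given in the text.
movl = {"r[1]": "m[X]", "NZ": "m[X] ? 0"}
subl2 = {"r[1]": "r[1] - m[Y]", "NZ": "r[1] - m[Y] ? 0"}
combined = combine(movl, subl2)
```

Searching the machine description for an instruction with the combined effect is omitted here; in this sketch `combined` holds the two register transfers that the text attributes to subl3.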
Unlike classical peephole optimizers, PO has no patterns: it combines
all possible pairs and triples. As a result, its effect can be
described formally and concisely: when it is finished, no one-, two-,
or three-instruction sequence can be replaced with a cheaper single
instruction having the same effect.
This thoroughness allows code generators to forgo case analysis and
emit only a small subset of the machine's instructions and addressing
modes (e.g., one form of add, one form of subtract). PO replaces them
with better instructions as it combines adjacencies. A compiler for
the programming language Y [8] based on this technique [4, 5] has been
retargeted to seven different architectures, some in as few as three
man-days. It emits code comparable to host-specific compilers.
This reliance on peephole optimization makes optimization speed
especially crucial, and PO is slower than classical target-specific
peephole optimizers. The Y compiler runs at a fourth the speed of the
UNIX portable C compiler [10], and PO uses almost half of its time.
Proposals to speed up optimizers like PO are already emerging¹
[7, 11, 12]. They propose to perform at compile-compile time some of
the symbolic simulation that PO performs at compile time. This entails
considering at compile-compile time all possible pairs of instructions
[12] or all that use certain rules (like eliminate redundant
instructions [7, 11]). Naturally, trade-offs appear likely: the first
approach may be costly on some machines, the second may miss
optimizations, and both may generate unused optimizations, though the
proposals certainly merit further investigation. The software
described below complements these approaches by automatically
inferring patterns from PO's behavior on sample data.
3. Automatic Generation of Patterns
To improve speed, PO is now used at compile-compile time to generate
patterns for a fast compile-time optimizer, called HOP, which may then
be used in PO's place. HOP patterns are encoded as text with embedded
pattern variables of the form $i to denote context-sensitive operands.
Thus the pattern
    r[$1] = m[$2]
    r[$1] = r[$1] - m[$3]
    r[$1] = m[$2] - m[$3]
specifies that register transfers like
    r[2] = m[X]
    r[2] = r[2] - m[Y]
should be replaced with
    r[2] = m[X] - m[Y]
Other classical peephole optimizers use similar encodings [14, 16]. An
appendix gives further examples of such optimizations and their
application.
¹ Only one of these proposals reports a prototype [12]. It is more
powerful than an early version of PO, though not the current version.
It considers O(N^2) pairs to PO's O(N), and, though it appears likely
that adaptations could run in linear time, it is too early to compare
their speed with PO's.
HOP patterns are inferred from PO's behavior on a training set. As an
option, PO can record each replacement it makes. For example, when PO
makes a replacement like the one above, it writes
    r[2] = m[X]
    r[2] = r[2] - m[Y]
    r[2] = m[X] - m[Y]
to a diagnostic file.
This output is automatically reduced to patterns by replacing each
distinct assembly-time constant with $i. For example, the diagnostic
output above would become
    r[$1] = m[$2]
    r[$1] = r[$1] - m[$3]
    r[$1] = m[$2] - m[$3]
which is the pattern at the head of this section. The syntax of
assembly-time constants is potentially target-specific. HOP is
retargeted by specifying this syntax.
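The reduction from diagnostic output to a pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not HOP's code: the regular expression `OPERAND` is a hypothetical stand-in for the target-specific syntax of assembly-time constants (here, any number or identifier that is not a cell name such as r or m followed by a bracket, and not the keyword dead), and `generalize` is a hypothetical name.

```python
import re

# Assumed constant syntax: numbers and identifiers, excluding cell names
# (an identifier immediately followed by "[") and the obituary word "dead".
OPERAND = re.compile(r"\b(?!dead\b)(?:\d+|[A-Za-z_]\w*)\b(?!\[)")

def generalize(lines):
    """Replace each distinct constant with $1, $2, ... in order of first use."""
    numbering = {}  # constant -> pattern variable, shared by all lines

    def repl(match):
        token = match.group(0)
        if token not in numbering:
            numbering[token] = f"${len(numbering) + 1}"
        return numbering[token]

    return [OPERAND.sub(repl, line) for line in lines]

diagnostic = ["r[2] = m[X]", "r[2] = r[2] - m[Y]", "r[2] = m[X] - m[Y]"]
pattern = generalize(diagnostic)
```

Retargeting, in this sketch, would amount to supplying a different `OPERAND` expression.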
PO records the last use of each register in each block, because this
allows it to make replacements that would otherwise change the effect
of the program. When this information is used, it is also recorded in
the diagnostic output:
    r[2] = i
    r[3] = m[r[2]] (r[2] dead)
    r[3] = m[i]
These obituaries are automatically reduced to patterns with the rest
of the diagnostic output. Thus the example above yields the pattern
    r[$1] = $2
    r[$3] = m[r[$1]] (r[$1] dead)
    r[$3] = m[$2]
The appendix displays several such optimizations.
A few proposed patterns are too general. For example, the
DECSystem-10 diagnostic output
    r[2] = m[X]
    r[2] = r[2] + 1
    m[X] = r[2] (r[2] dead)
    m[X] = m[X] + 1
should not yield the pattern
    r[$1] = m[$2]
    r[$1] = r[$1] + $3
    m[$2] = r[$1] (r[$1] dead)
    m[$2] = m[$2] + $3
because the replacement is only valid if the increment $3 is 1. The
validity of proposed patterns like the one above could be checked with
the machine description much as PO checks proposed combinations of
instructions. When the instruction checker determined that $3 could
only match 1, it could rewrite the pattern accordingly. At present, a
simpler expedient is used: constants like zero and one that are
special to some instructions (i.e., that appear explicitly in the
machine description) are added to an exception list and never replaced
with $i. This generates a few extra patterns when these constants
appear in contexts where they are not special (e.g., as register
indices), but the number of these is small.
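The exception-list expedient might look as follows in Python. Everything here is an assumption made for illustration: the two productions are hypothetical examples written in the style of the machine description shown earlier, and `special_constants`, `generalize`, and `OPERAND` are hypothetical names rather than parts of PO or HOP.

```python
import re

# Assumed syntax of assembly-time constants (see the text: target-specific).
OPERAND = re.compile(r"\b(?!dead\b)(?:\d+|[A-Za-z_]\w*)\b(?!\[)")

def special_constants(productions):
    """Collect numeric literals that appear explicitly in the description."""
    return {tok for text in productions for tok in re.findall(r"\b\d+\b", text)}

def generalize(lines, exceptions):
    """Replace constants with $i, leaving constants on the exception list."""
    numbering = {}

    def repl(match):
        token = match.group(0)
        if token in exceptions:
            return token  # special constant: keep it literal
        return numbering.setdefault(token, f"${len(numbering) + 1}")

    return [OPERAND.sub(repl, line) for line in lines]

# Hypothetical productions, for illustration only.
productions = [
    "incl dst :=: dst = dst + 1; NZ = dst + 1 ? 0;",
    "clrl dst :=: dst = 0; NZ = 0 ? 0;",
]
exceptions = special_constants(productions)

diagnostic = [
    "r[2] = m[X]",
    "r[2] = r[2] + 1",
    "m[X] = r[2] (r[2] dead)",
    "m[X] = m[X] + 1",
]
pattern = generalize(diagnostic, exceptions)

# The cost noted in the text: a register index that happens to be 1 also
# stays literal, which yields an extra, less general pattern.
extra = generalize(["r[1] = m[X]"], exceptions)
```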
Given the established simplicity of typical programs [13], compiling a
large, varied training testbed with PO should yield enough diagnostic
output to generate most needed patterns. At present, the testbed is
the Y compiler's front end, which compiles Y into a simple abstract
machine code, plus a few extra test cases, which exercise the few
operators seldom used in the compiler. Figure 1 plots for this testbed
the number of VAX patterns generated versus the number of actual
replacements from which the patterns are generated. The pattern file
grows rapidly at first and then levels off. The 17,138 replacements
generate only 627 distinct patterns. Using this pattern file, HOP
yields the same result as PO when compiling routines from the testbed.
When compiling other typical routines, HOP's results are only about 2%
larger than PO's, which suggests that even this small testbed is
adequate.
Ultimately, it should be possible to do without a testbed, by using an
incremental training phase. This could be implemented by the following
changes to PO. After replacing a pair or triple, PO would internally
record the pattern represented by the replacement; if the pair or
triple could not be replaced, PO would note this as well. Also, PO
would be changed to consult this record and use the fast algorithm
described below to replace or reject juxtapositions that have appeared
before; it would fall back on its original, slower algorithm only for
juxtapositions that had never appeared before. Thus PO would reach
HOP's speed after a few compilations, and it would never miss an
optimization due to insufficient training because PO's general
mechanism would be available for new juxtapositions.
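The proposed incremental training phase amounts to memoizing the slow optimizer on generalized input. The Python sketch below illustrates that idea only. `toy_po` is a hypothetical stand-in for PO's general mechanism and knows a single combination, `IncrementalOptimizer` and `abstract` are hypothetical names, and the sketch assumes that every constant in a replacement also occurs in the instructions it replaces.

```python
import re

OPERAND = re.compile(r"\b(?!dead\b)(?:\d+|[A-Za-z_]\w*)\b(?!\[)")
VAR = re.compile(r"\$\d+")

def abstract(lines, numbering):
    """Generalize lines to $i form, extending the shared numbering."""
    def repl(match):
        return numbering.setdefault(match.group(0), f"${len(numbering) + 1}")
    return tuple(OPERAND.sub(repl, line) for line in lines)

def toy_po(lines):
    """Hypothetical slow path: folds a load followed by a subtract."""
    load = re.fullmatch(r"r\[(\w+)\] = m\[(\w+)\]", lines[0])
    sub = re.fullmatch(r"r\[(\w+)\] = r\[(\w+)\] - m\[(\w+)\]", lines[1])
    if load and sub and load.group(1) == sub.group(1) == sub.group(2):
        return f"r[{load.group(1)}] = m[{load.group(2)}] - m[{sub.group(3)}]"
    return None

class IncrementalOptimizer:
    def __init__(self, slow):
        self.slow = slow
        self.memo = {}       # generalized input -> replacement template or None
        self.slow_calls = 0

    def optimize(self, lines):
        numbering = {}
        key = abstract(lines, numbering)
        if key not in self.memo:  # never seen: fall back on the slow path
            self.slow_calls += 1
            result = self.slow(lines)
            self.memo[key] = None if result is None else abstract([result], numbering)[0]
        template = self.memo[key]
        if template is None:      # a recorded rejection
            return None
        values = {var: token for token, var in numbering.items()}
        return VAR.sub(lambda m: values[m.group(0)], template)

opt = IncrementalOptimizer(toy_po)
first = opt.optimize(["r[2] = m[X]", "r[2] = r[2] - m[Y]"])
second = opt.optimize(["r[4] = m[A]", "r[4] = r[4] - m[B]"])  # served from the record
```

Recording rejections as well as replacements is what lets repeated juxtapositions avoid the slow path in both cases.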
4. A Pattern-Directed Optimizer
HOP matches patterns without actual string manipulation, by separating
each instruction's pattern or skeleton from its operands as it reads
them. This is accomplished at compile time by the same procedure used
to form patterns at compile-compile time. For example, the instruction
    r[2] = r[2] - m[Y]
is reduced to the skeleton
    r[$1] = r[$1] - m[$2]
plus the operands 2 and Y, respectively. That is, the instruction is
represented by the triple
    r[$1] = r[$1] - m[$2], 2, Y
This representation is a little like conventional assembly code. The
skeleton in the first field is determined roughly by the instruction's
opcode and mode bits. The operands in the remaining fields are
determined roughly by the instruction's address and register fields.
Hashing helps HOP match patterns and form replacements fast. HOP
stores skeletons and operands uniquely in a hash table, so an input
skeleton is compared with a line from a pattern by merely comparing
two addresses. This operation is logically similar to, and costs about
the same as, comparing two binary opcodes in a classical peephole
optimizer. If a run of input skeletons matches some complete pattern,
then inter-instruction operand consistency is checked, again by
comparing addresses. Finally, HOP forms replacements without actual
string manipulation. The skeleton for the replacement instruction is
the last line of the successful pattern, and the operands for the
replacement instruction are formed by reordering the input operands.
Thus the typical pattern is matched and, if successful, replaced, by
comparing and moving about a dozen pointers.
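The skeleton-and-operands representation, with unique storage, can be approximated in Python as follows. This is a sketch under assumptions, not HOP's data structure: Python's `sys.intern` stands in for HOP's hash table of uniquely stored strings, the `is` operator stands in for address comparison, and `to_triple` and `OPERAND` are hypothetical names.

```python
import re
import sys

# Assumed syntax of assembly-time constants.
OPERAND = re.compile(r"\b(?!dead\b)(?:\d+|[A-Za-z_]\w*)\b(?!\[)")

def to_triple(instruction):
    """Split an instruction into (skeleton, operands), numbering $i from one."""
    operands = []

    def repl(match):
        token = match.group(0)
        if token not in operands:
            operands.append(token)
        return f"${operands.index(token) + 1}"

    # Interning stores each distinct skeleton and operand exactly once, so
    # equal strings are the same object and can be compared by identity.
    skeleton = sys.intern(OPERAND.sub(repl, instruction))
    return skeleton, tuple(sys.intern(op) for op in operands)

skel_a, ops_a = to_triple("r[2] = r[2] - m[Y]")
skel_b, ops_b = to_triple("r[4] = r[4] - m[B]")
same_skeleton = skel_a is skel_b  # identity test, no character comparison
```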
One detail complicates this procedure. The $i in input skeletons are
numbered from one, so pattern-matching without string operations
requires renumbering the $i from each line of each pattern when the
pattern file is read. For example, the input
    r[4] = m[A]
    r[4] = r[4] - m[B]
is translated into the triples
    r[$1] = m[$2], 4, A
    r[$1] = r[$1] - m[$2], 4, B
as it is read. To compare such triples with the pattern
    r[$1] = m[$2]
    r[$1] = r[$1] - m[$3]
    r[$1] = m[$2] - m[$3]
without string operations, the $i of the second line of the pattern
are renumbered to yield
    r[$1] = r[$1] - m[$2]
as the pattern file is read. The two strings are now identically equal
and can be compared by comparing addresses in the hash table. A record
of the renumbering is retained for checking inter-instruction operand
consistency.
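Renumbering a pattern line, together with the record that is retained, might be sketched as below. The function name `renumber` and the shape of the record (a tuple whose k-th entry is the original name of the line-local variable numbered k+1) are assumptions chosen for illustration.

```python
import re

VAR = re.compile(r"\$\d+")

def renumber(pattern_line):
    """Renumber the $i of one pattern line from one.

    Returns the renumbered line and a record of the original variable
    names, in order of first appearance on the line.
    """
    record = []

    def repl(match):
        name = match.group(0)
        if name not in record:
            record.append(name)
        return f"${record.index(name) + 1}"

    return VAR.sub(repl, pattern_line), tuple(record)

line, record = renumber("r[$1] = r[$1] - m[$3]")
```

Here `record` says that the line's local $2 is the pattern's original $3, which is the information needed later to check operands across instructions.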
The input triples above are compared with the pattern above as
follows. First, the two input skeletons
    r[$1] = m[$2]
    r[$1] = r[$1] - m[$2]
are compared with the first two (renumbered) lines of the pattern
    r[$1] = m[$2]
    r[$1] = r[$1] - m[$2]
by comparing two pairs of pointers. Next, HOP checks that $i denotes
the same operand in both input instructions. Since $1 is the only $i
that appears more than once in the original (unrenumbered) pattern²,
this merely compares the first operand from the first instruction (the
first 4) with the first operand from the second instruction (the
second 4), again by comparing two string table addresses. Since all
comparisons have succeeded, a replacement instruction is formed. Its
skeleton is the last line of the pattern
    r[$1] = m[$2] - m[$3]
and its three operands are the 4 and A from the first instruction and
the B from the second instruction. This represents the instruction
    r[4] = m[A] - m[B]
which is the desired replacement for the two instructions above.
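The three steps just described (skeleton comparison, operand consistency, replacement) can be put together in a short Python sketch. It is illustrative only: skeletons are compared here as strings where HOP compares addresses, patterns are renumbered on each call rather than once when the pattern file is read, and `apply_pattern` and `number_from_one` are hypothetical names.

```python
import re

OPERAND = re.compile(r"\b(?!dead\b)(?:\d+|[A-Za-z_]\w*)\b(?!\[)")  # assumed syntax
VAR = re.compile(r"\$\d+")

def number_from_one(text, token_re):
    """Rewrite tokens as $1, $2, ... by first appearance; return (text, tokens)."""
    seen = []

    def repl(match):
        if match.group(0) not in seen:
            seen.append(match.group(0))
        return f"${seen.index(match.group(0)) + 1}"

    return token_re.sub(repl, text), seen

def apply_pattern(pattern, instructions):
    """Return the replacement instruction, or None if the pattern fails."""
    *inputs, replacement = pattern
    if len(inputs) != len(instructions):
        return None
    bindings = {}  # original pattern variable -> operand
    for pattern_line, instruction in zip(inputs, instructions):
        pattern_skeleton, names = number_from_one(pattern_line, VAR)
        input_skeleton, operands = number_from_one(instruction, OPERAND)
        if pattern_skeleton != input_skeleton:
            return None
        # Consistency: a variable shared by several lines must denote the
        # same operand in each of them.
        for name, operand in zip(names, operands):
            if bindings.setdefault(name, operand) != operand:
                return None
    return VAR.sub(lambda m: bindings[m.group(0)], replacement)

PATTERN = ["r[$1] = m[$2]", "r[$1] = r[$1] - m[$3]", "r[$1] = m[$2] - m[$3]"]
result = apply_pattern(PATTERN, ["r[4] = m[A]", "r[4] = r[4] - m[B]"])
mismatch = apply_pattern(PATTERN, ["r[4] = m[A]", "r[5] = r[5] - m[B]"])
```

In the second call both skeletons match but $1 would have to denote 4 and 5 at once, so the consistency check rejects the pair.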
Hashing also helps locate applicable patterns rapidly. HOP stores its
patterns in a hash table keyed by the hashed addresses of the
(uniquely stored) skeletons that each matches. Thus HOP identifies the
patterns that apply to a given input sequence by hashing the addresses
of the skeletons from the input
² $2 appears more than once in the renumbered pattern, but this is an
artifact of renumbering and so does not require consistency checking.
sequence. If this hash table is made large enough to make collisions
rare, HOP identifies any applicable patterns in nearly constant time.
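A pattern table of this kind can be approximated with a Python dictionary keyed by the tuple of renumbered input skeletons. This is a sketch, not HOP's table: a Python dict hashes the skeleton text, whereas HOP hashes the addresses of uniquely stored skeletons, and `build_index`, `applicable`, and `local_skeleton` are hypothetical names.

```python
import re
from collections import defaultdict

VAR = re.compile(r"\$\d+")

def local_skeleton(pattern_line):
    """Renumber the $i of one pattern line from one."""
    seen = []

    def repl(match):
        if match.group(0) not in seen:
            seen.append(match.group(0))
        return f"${seen.index(match.group(0)) + 1}"

    return VAR.sub(repl, pattern_line)

def build_index(patterns):
    """Map each run of input skeletons to the patterns it may satisfy."""
    index = defaultdict(list)
    for pattern in patterns:
        key = tuple(local_skeleton(line) for line in pattern[:-1])
        index[key].append(pattern)
    return index

PATTERNS = [
    ["r[$1] = m[$2]", "r[$1] = r[$1] - m[$3]", "r[$1] = m[$2] - m[$3]"],
    ["r[$1] = $2", "r[$3] = m[r[$1]] (r[$1] dead)", "r[$3] = m[$2]"],
]
INDEX = build_index(PATTERNS)

def applicable(input_skeletons):
    """Look up candidate patterns for a run of input skeletons."""
    return INDEX.get(tuple(input_skeletons), [])

hits = applicable(["r[$1] = m[$2]", "r[$1] = r[$1] - m[$2]"])
```

Candidates returned by the lookup still need the operand consistency check before a replacement is formed.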
These measures make HOP fast, about 5 times faster than PO. In a
typical application, it read 269 lines, performed 136 replacements,
and wrote out the results in 1.3 CPU seconds on a VAX-11/780. It
spends most of its time reading its input and building the structures
above. The actual matching and replacements take less than 5% of its
time. To save time, the pattern file is incorporated into HOP at
compile-compile time. For the VAX, HOP plus these incorporated
patterns take 150K bytes where PO takes 120K bytes.
HOP can also be used for code generation.
Abstract machines are often mapped onto real
machines by macros, and single-input replacement
patterns are essentially macros. A compiler can thus
be retargeted by writing a machine description and
some patterns for naive code generation. These will
be augmented by automatically generated optimiza-
tion patterns. The use of a single program for code
generation and optimization should make compilers
faster, simpler, and easier to retarget.
HOP can also be used on assembly code. The hand-written patterns for
code generation could emit assembly code, for this can be mapped to
and from register transfers for PO by translators automatically
generated from the machine description [3]. Translating assembly code
to register transfers would slow PO, but this is unimportant now that
HOP has replaced PO at compile time.
Acknowledgments
The authors thank Dave Hanson for his many
helpful comments, and Torben Nielsen for his techni-
cal assistance.
Appendix
This appendix traces the optimization of the
VAX code for
j = i + 4
The figure below gives postfix intermediate code and
corresponding naive object code for this statement.
        postfix     object code
    1.  push i      r[2] = m[i]
    2.  pushc 4     r[3] = 4
    3.  add         r[2] = r[2] + r[3] (r[3] dead)
    4.  pop j       m[j] = r[2] (r[2] dead)
Initially, the pattern
    r[$1] = $2
    r[$3] = r[$3] + r[$1] (r[$1] dead)
    r[$3] = r[$3] + $2
replaces instructions 2 and 3 with
    r[2] = r[2] + 4
Next, the pattern
    r[$1] = m[$2]
    r[$1] = r[$1] + $3
    r[$1] = m[$2] + $3
combines instruction 1 with this new instruction, yielding
    r[2] = m[i] + 4
Finally, the pattern
    r[$1] = m[$2] + $3
    m[$4] = r[$1] (r[$1] dead)
    m[$4] = m[$2] + $3
replaces this last instruction and instruction 4 with
    m[j] = m[i] + 4
which represents the VAX instruction
    addl3 4, i, j
Thus the four original instructions have been replaced with one.
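The trace above can be replayed with a small Python sketch. It deliberately uses ordinary regular-expression matching on text, which is the string manipulation that HOP avoids, because that keeps the example short. The names `compile_pattern` and `optimize` are hypothetical, operands are assumed to be single words, and every pattern is assumed to have at least two input lines so that each replacement shortens the code.

```python
import re

def compile_pattern(lines):
    """Turn a textual pattern into (regex over its input lines, replacement)."""
    *inputs, replacement = lines
    seen, parts = set(), []
    # re.split with a capturing group alternates literal text and $i numbers.
    for k, piece in enumerate(re.split(r"\$(\d+)", "\n".join(inputs))):
        if k % 2 == 0:
            parts.append(re.escape(piece))
        elif piece in seen:
            parts.append(f"(?P=v{piece})")       # repeated $i: same operand
        else:
            seen.add(piece)
            parts.append(f"(?P<v{piece}>\\w+)")  # first use of $i
    return re.compile("".join(parts)), replacement

def optimize(instructions, patterns):
    """Apply patterns to adjacent instructions until none applies."""
    code = list(instructions)
    compiled = [compile_pattern(p) + (len(p) - 1,) for p in patterns]
    changed = True
    while changed:
        changed = False
        for regex, replacement, width in compiled:
            for start in range(len(code) - width + 1):
                m = regex.fullmatch("\n".join(code[start:start + width]))
                if m:
                    new = re.sub(r"\$(\d+)", lambda v: m.group("v" + v.group(1)), replacement)
                    code[start:start + width] = [new]
                    changed = True
                    break
            if changed:
                break
    return code

NAIVE = [
    "r[2] = m[i]",
    "r[3] = 4",
    "r[2] = r[2] + r[3] (r[3] dead)",
    "m[j] = r[2] (r[2] dead)",
]
PATTERNS = [
    ["r[$1] = $2", "r[$3] = r[$3] + r[$1] (r[$1] dead)", "r[$3] = r[$3] + $2"],
    ["r[$1] = m[$2]", "r[$1] = r[$1] + $3", "r[$1] = m[$2] + $3"],
    ["r[$1] = m[$2] + $3", "m[$4] = r[$1] (r[$1] dead)", "m[$4] = m[$2] + $3"],
]
optimized = optimize(NAIVE, PATTERNS)
```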
References
1. J. T. Bagwell, Jr., Local Optimizations, SIGPLAN Notices 5, 7
   (July 1970), 52-66.
2. T. Crowley, Combining Table-driven Effect Selection and
   Description-Driven Peephole Optimization for Automatic Code
   Generation, MS thesis, MIT, September 1982.
3. J. W. Davidson and C. W. Fraser, The Design and Application of a
   Retargetable Peephole Optimizer, ACM Trans. Prog. Lang. and
   Systems 2, 2 (April 1980), 191-202.
4. J. W. Davidson, Simplifying Code Generation Through Peephole
   Optimization, PhD dissertation, University of Arizona, December
   1981.
5. J. W. Davidson and C. W. Fraser, Code Selection Through Object
   Code Optimization, ACM Trans. Prog. Lang. and Systems, to appear.
6. M. Ganapathi, C. N. Fischer and J. L. Hennessy, Retargetable
   Compiler Code Generation, Computing Surveys 14, 4 (December 1982),
   573-592.
7. R. Giegerich, A Formal Framework for the Derivation of
   Machine-Specific Optimizers, ACM Trans. Prog. Lang. and Systems
   5, 3 (July 1983), 478-498.
8. D. R. Hanson, The Y Programming Language, SIGPLAN Notices 16, 2
   (Feb. 1981), 59-68.
9. W. Harrison, A New Strategy for Code Generation - The General
   Purpose Optimizing Compiler, Conf. Rec. 4th ACM Symp. on Prin. of
   Programming Languages, January 1977, 29-37.
10. S. C. Johnson, A Portable Compiler: Theory and Practice, Conf.
    Rec. 5th ACM Symp. on Prin. of Programming Languages, Jan. 1978,
    97-104.
11. P. B. Kessler, Machine Dependencies in Retargetable Compiler
    Construction, Dissertation proposal, Department of Electrical
    Engineering and Computer Science, University of California,
    Berkeley, May 1982.
12. R. R. Kessler, Peephole Optimization in COG, Operating Note 76,
    Utah Symbolic Computation Group, Computer Science Department,
    University of Utah, June 1983.
13. D. E. Knuth, An Empirical Study of Fortran Programs, Software -
    Practice and Experience 1, 2 (April-June 1971), 105-133.
14. D. A. Lamb, Construction of a Peephole Optimizer, Software -
    Practice and Experience 11 (1981), 638-647.
15. W. M. McKeeman, Peephole Optimization, Comm. ACM 8, 7 (July
    1965), 443-444.
16. A. S. Tanenbaum, H. van Staveren and J. W. Stevenson, Using
    Peephole Optimization on Intermediate Code, ACM Trans. Prog.
    Lang. and Systems 4, 1 (January 1982), 21-36.
17. W. Wulf, R. K. Johnsson, C. B. Weinstock, S. O. Hobbs and C. M.
    Geschke, The Design of an Optimizing Compiler, North Holland,
    1975.
Figure 1. VAX Pattern File Growth (vertical axis: patterns, 0 to 700;
horizontal axis: replacements, 0 to 20000).