
Optimal Parallel Merging and Sorting Algorithms


    Parallel Computing 14 (1990) 89-97
    North-Holland

    Optimal parallel merging and sorting algorithms using $\sqrt{N}$ processors without memory contention

    Jau-Hsiung HUANG
    Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.

    Leonard KLEINROCK
    Computer Science Department, University of California, Los Angeles, Los Angeles, CA, USA

    Received August 1989
    Revised November 1989

    Abstract. A multi-way parallel merging algorithm is described to merge two sorted lists, each of size N, on a shared-memory parallel system. The structure of this algorithm is very regular and highly parallel. It is shown that, using P processors, the time complexity of this algorithm is $O(N/P)$ when $N \geq P^2$, which is known to be optimal. This approach to parallel merging leads to a multi-way parallel sorting algorithm with time complexity $O((N \log N)/P)$ when $N \geq P^2$. Clearly this is also optimal. In addition, these two algorithms do not require reading from or writing into the same memory location simultaneously; hence they can be applied on an EREW machine. In cases when $N < P^2$, we recursively apply this merging algorithm to show that for $P = N^{(2^k-1)/2^k}$, the complexities of the merging algorithm and the sorting algorithm are $O(3^k N/P)$ and $O(3^k (N \log N)/P)$, respectively.

    Keywords. Merging, sorting, shared-memory multiprocessor, complexity analysis, multi-way merging and sorting.

    1. Introduction

    The performance of a parallel algorithm is normally measured in terms of the number of processors, P, used and the time complexity, T(N), required. It is well known that a parallel merging algorithm is optimal if $O(P \cdot T(N)) = O(N)$. Similarly, a parallel sorting algorithm is optimal if $O(P \cdot T(N)) = O(N \log N)$. In this paper, log denotes the logarithm to base 2.

    A variety of parallel merging and sorting algorithms have been designed [1,4,7,9,10,12-15]. Taxonomies of parallel sorting algorithms can be found in [2,3,11].

    In a shared-memory parallel system, we assume that there are P processors sharing a global memory space. Each processor can read from or write into any memory location. Depending on whether concurrent reads from or concurrent writes into a memory location are allowed, shared-memory parallel systems are categorized into the following four groups:

    (a) CRCW (Concurrent Read Concurrent Write) machines: both concurrent read from and concurrent write into a memory location by more than one processor are allowed.

    (b) CREW (Concurrent Read Exclusive Write) machines: concurrent read from, but not concurrent write into, a memory location by more than one processor is allowed.

    0167-8191/90/ 03.50 © 1990 - Elsevier Science Publishers B.V. (North-Holland)


    J.H. Huang , L. Kleinrock / Optimal parallel merging and sorting algorithms 91

    2.1. Cases when $N = P^2$

    This algorithm is separated into four steps, each of which we explain below. In Step 1, we divide $L_1$ into P sorted sublists in such a way that each sublist contains elements which are P positions apart and each sublist contains P elements. The ith sublist ($1 \leq i \leq P$) contains the elements at locations $i, i+P, i+2P, \ldots, i+(N/P-1)P$. Similarly, $L_2$ is divided into P sublists in the same way. Therefore, the ith sublist in $L_2$ contains the elements at locations $N+i, N+i+P, N+i+2P, \ldots, N+i+(N/P-1)P$.

    In Step 2, we assign $P_i$ to merge the ith sublist from $L_1$ and the ith sublist from $L_2$ ($1 \leq i \leq P$). All processors work simultaneously. Note that each processor puts the merged result back into the memory locations which were originally occupied by the two sublists it merges. That is, $P_i$ puts its merged result into memory locations $i, i+P, i+2P, \ldots, i+(N/P-1)P$ and $N+i, N+i+P, N+i+2P, \ldots, N+i+(N/P-1)P$. All processors concurrently merge two sorted sublists, each of size N/P; hence, all processors finish at approximately the same time, in 2N/P time units. Note that every processor works only in the memory locations assigned to the two sublists it merges, which do not overlap with the memory locations in which other processors work. Therefore, no concurrent read from or write into a memory location is required. As we will later prove in Lemma 2.1, after Step 2 every element is at most a distance P from its final position. The final position of an element is defined as the position of this element in the final sorted list.
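The addressing used in Steps 1 and 2 can be checked mechanically. The following is a minimal sketch, assuming 1-based memory locations with $L_1$ at locations 1 to N and $L_2$ at locations N+1 to 2N; the function names `sublist_locations` and `is_exclusive` are hypothetical and do not come from the paper.

```python
def sublist_locations(i, n, p):
    """1-based memory locations read and written by processor P_i in Step 2."""
    from_l1 = [i + j * p for j in range(n // p)]        # i, i+P, ..., i+(N/P-1)P
    from_l2 = [n + i + j * p for j in range(n // p)]    # N+i, N+i+P, ...
    return from_l1 + from_l2


def is_exclusive(n, p):
    """True if the P location sets are pairwise disjoint and cover 1..2N."""
    seen = set()
    for i in range(1, p + 1):
        locs = sublist_locations(i, n, p)
        if seen.intersection(locs):
            return False          # two processors would touch the same location
        seen.update(locs)
    return seen == set(range(1, 2 * n + 1))


if __name__ == "__main__":
    print(sublist_locations(1, 16, 4))
    print(is_exclusive(16, 4))
```

Because the sets are disjoint, no location is read or written by two processors in Step 2, which is the property that makes the step suitable for an EREW machine.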

    After Step 2, $L_1$ and $L_2$ are mixed together to become one large list occupying locations 1 to 2N. In Step 3, we first divide this entire list into 2P non-overlapping groups with P consecutive elements in each group. We number these groups from 1 to 2P; hence, the ith group occupies the memory locations from $(i-1)P+1$ to $iP$. Note that each group is a sorted sublist, as will be proved in Lemma 2.2. We then assign $P_i$ to merge groups $2i-1$ and $2i$ ($i = 1, 2, \ldots, P$). As in Step 2, all P processors concurrently merge two sorted sublists, each of size P, and there is no overlap in memory locations between processors. Note also that each processor puts the merged result back into the memory locations originally occupied by the two groups it merges.

    In Step 4, we again divide the entire list after Step 3 into 2P non-overlapping groups with P consecutive elements in each group, as in Step 3. We then assign $P_i$ to merge groups $2i$ and $2i+1$ for i from 1 to P-1. Note that only P-1 processors are used and that the first and the last groups are not processed in this step. All P-1 processors concurrently merge two sublists, each of size P, with no overlap in memory locations between processors. As before, the merged result is put back into the corresponding memory locations. As will be proved in Theorem 2.4, the resulting list is sorted after this step. In brief, the flow of the algorithm is given below.

    Algorithm

    Step 1: Divide $L_1$ into P sublists, where the ith sublist contains the elements at locations $i, P+i, 2P+i, \ldots, N-P+i$. Divide $L_2$ into P sublists similarly.

    Step 2: Have $P_i$ merge the ith sublist from $L_1$ and the ith sublist from $L_2$ and put the result back into the locations originally occupied by these two sublists, for $1 \leq i \leq P$. All P processors work simultaneously.

    Step 3: Divide the list resulting from Step 2 into 2P groups with P consecutive elements in each group. Number these groups from 1 to 2P. Have $P_i$ merge groups $2i-1$ and $2i$ and put the result back into the locations originally occupied by these two groups, for $1 \leq i \leq P$. All P processors work simultaneously.

    Step 4: Divide the list resulting from Step 3 into 2P groups with P elements in each group. Number these groups from 1 to 2P. Have $P_i$ merge groups $2i$ and $2i+1$ and put the result back into the locations originally occupied by these two groups, for $1 \leq i \leq P-1$. All P-1 processors work simultaneously.
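The four steps can be sketched as a sequential simulation in which each loop iteration stands for the work of one processor. This is an illustration under stated assumptions, not the authors' implementation: memory is a 0-based Python list, `heapq.merge` stands in for each processor's sequential two-way merge, and the names `multiway_merge`, `FIG1_L1` and `FIG1_L2` are hypothetical. The sample data are the two lists of Fig. 1.

```python
import heapq


def multiway_merge(l1, l2, p):
    """Simulate Steps 1-4; returns the memory after Step 2, Step 3 and Step 4."""
    n = len(l1)
    assert len(l2) == n and n % p == 0
    mem = list(l1) + list(l2)            # locations 1..2N become indices 0..2N-1

    # Steps 1-2: processor i merges the stride-P sublists of L1 and L2 and
    # writes the result back to the locations those sublists occupied.
    for i in range(p):
        mem[i::p] = list(heapq.merge(mem[i:n:p], mem[n + i::p]))
    after_step2 = list(mem)

    groups = 2 * n // p                  # groups of P consecutive elements

    # Step 3: merge groups (1,2), (3,4), ... (0-based start g = 0, 2, ...).
    for g in range(0, groups - 1, 2):
        lo = g * p
        mem[lo:lo + 2 * p] = list(heapq.merge(mem[lo:lo + p], mem[lo + p:lo + 2 * p]))
    after_step3 = list(mem)

    # Step 4: merge groups (2,3), (4,5), ...; the first and last groups are skipped.
    for g in range(1, groups - 1, 2):
        lo = g * p
        mem[lo:lo + 2 * p] = list(heapq.merge(mem[lo:lo + p], mem[lo + p:lo + 2 * p]))
    return after_step2, after_step3, mem


FIG1_L1 = [3, 5, 7, 12, 13, 31, 41, 47, 52, 53, 63, 71, 81, 96, 102, 214]
FIG1_L2 = [1, 9, 14, 26, 28, 36, 45, 54, 58, 62, 75, 76, 121, 137, 190, 211]

if __name__ == "__main__":
    s2, s3, s4 = multiway_merge(FIG1_L1, FIG1_L2, p=4)
    print(s2)
    print(s3)
    print(s4)
```

Within each step the iterations touch disjoint slices of `mem`, so they could run concurrently; the sequential loops are used here only to keep the sketch short.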


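    Fig. 1. An example.

    (a) The original lists; the elements of each list are assigned to P1, P2, P3, P4 in rotation.
        L1: 3, 5, 7, 12, 13, 31, 41, 47, 52, 53, 63, 71, 81, 96, 102, 214
        L2: 1, 9, 14, 26, 28, 36, 45, 54, 58, 62, 75, 76, 121, 137, 190, 211

    (b) After Step 2, shown in groups of four; in Step 3, P1 to P4 each merge one pair of neighboring groups.
        1, 5, 7, 12 | 3, 9, 14, 26 | 13, 31, 41, 47 | 28, 36, 45, 54 | 52, 53, 63, 71 | 58, 62, 75, 76 | 81, 96, 102, 211 | 121, 137, 190, 214

    (c) After Step 3; in Step 4, P1 to P3 each merge one pair of neighboring groups, starting from the second group.
        1, 3, 5, 7 | 9, 12, 14, 26 | 13, 28, 31, 36 | 41, 45, 47, 54 | 52, 53, 58, 62 | 63, 71, 75, 76 | 81, 96, 102, 121 | 137, 190, 211, 214

    (d) After Step 4: the final sorted list.
        1, 3, 5, 7, 9, 12, 13, 14, 26, 28, 31, 36, 41, 45, 47, 52, 53, 54, 58, 62, 63, 71, 75, 76, 81, 96, 102, 121, 137, 190, 211, 214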

    Example. Figure 1(a) shows two sorted lists of equal size, with N = 16 and P = 4. After Step 2 of the algorithm we have the list shown in Fig. 1(b). After Step 3 we have the list shown in Fig. 1(c). After Step 4 we have the final sorted list shown in Fig. 1(d), and the algorithm is completed.

    Lemma 2.1. After Step 2, every element is within P positions of its final position.

    Proof. In Step 1, we define $L_{1,i}$ to be the sublist of $L_1$ assigned to $P_i$ for merging. Similarly, $L_{2,i}$ is the sublist assigned to $P_i$ from $L_2$. Without loss of generality, we examine the elements assigned to $P_i$. We assume that X is the nth element in $L_{2,i}$ (i.e., X is the $[i+(n-1)P]$th element in $L_2$), as shown in Fig. 2. Figure 2(a) shows lists $L_1$ and $L_2$ before Step 2 and Fig. 2(b) shows the result after Step 2. For X, one of the following three cases may happen. Case 1: there are elements A and B in $L_{1,i}$, where A is the mth element in $L_{1,i}$ (i.e., A is the $[i+(m-1)P]$th element in $L_1$) and B is the (m+1)st element in $L_{1,i}$ (i.e., B is the $[i+mP]$th element in $L_1$), and A < X < B. Case 2: B is the first element in $L_{1,i}$ and X < B. Case 3: A is the last element in $L_{1,i}$ and A < X. We will first prove this lemma for Case 1.

    Fig. 2. (a) The original lists $L_1$ and $L_2$, each occupying locations 1 to N, with locations $i+(m-1)P$ and $i+mP$ marked in $L_1$ and location $i+(n-1)P$ marked in $L_2$. (b) The list after Step 2, occupying locations 1 to 2N, with location $i+(m+n-1)P$ marked.

    (1) The number of elements smaller than X is at least

    $[(m-1)P+i] + [(n-1)P+i-1] = [(m+n-1)P+i] + (i-P-1)$,

    where $(m-1)P+i$ elements are from $L_1$ (i.e., the elements smaller than or equal to A) and $(n-1)P+i-1$ elements are from $L_2$ (i.e., the elements smaller than X).

    (2) The number of elements smaller than X is at most

    $[(m+1-1)P+i-1] + [(n-1)P+i-1] = [(m+n-1)P+i] + (i-2)$,

    where $(m+1-1)P+i-1$ elements are from $L_1$ (i.e., the elements smaller than B) and $(n-1)P+i-1$ elements are from $L_2$.

    From (1) and (2), the rank of X among all elements is between $[(m+n-1)P+i] + (i-P)$ and $[(m+n-1)P+i] + (i-1)$. From our algorithm, the position of X after Step 2 is $(m+n-1)P+i$. Therefore, element X is within a distance P of its final position. Using the same argument, we can easily prove this lemma for Cases 2 and 3, and the lemma is hence proved. □
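Lemma 2.1 can also be checked empirically. The sketch below is an illustration only: it assumes distinct integer keys and 0-based indexing, uses `heapq.merge` as a stand-in for each processor's sequential merge, and the names `step2`, `max_displacement` and `random_case` are hypothetical.

```python
import heapq
import random


def step2(l1, l2, p):
    """Steps 1-2: processor i merges its stride-p sublists and writes them back."""
    n = len(l1)
    mem = list(l1) + list(l2)
    for i in range(p):
        mem[i::p] = list(heapq.merge(mem[i:n:p], mem[n + i::p]))
    return mem


def max_displacement(l1, l2, p):
    """Largest |position after Step 2 - final sorted position| over all elements."""
    mem = step2(l1, l2, p)
    final = {x: rank for rank, x in enumerate(sorted(mem))}
    return max(abs(pos - final[x]) for pos, x in enumerate(mem))


def random_case(p, m, seed):
    """Two sorted lists of n = m*p distinct integers (made-up test data)."""
    rng = random.Random(seed)
    n = m * p
    values = rng.sample(range(100 * n), 2 * n)
    return sorted(values[:n]), sorted(values[n:])


if __name__ == "__main__":
    for seed in range(5):
        l1, l2 = random_case(p=8, m=8, seed=seed)
        print(seed, max_displacement(l1, l2, 8))
```

Every reported displacement should be at most P, in line with the lemma; the bounds derived in the proof suggest it never exceeds P-1 when the keys are distinct.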

    Lemma 2.2. After Step 2, every group is sorted.

    Proof. In Step 1, we denote the 2P elements assigned to $P_i$ for merging by $E_{i,1}, E_{i,2}, \ldots, E_{i,2P}$, where $E_{i,k}$ stands for the kth element assigned to $P_i$. Clearly the elements from $E_{i,1}$ to $E_{i,P}$ are from $L_1$ and the rest are from $L_2$. We further define the kth element from $P_i$ after Step 2 to be the kth element in order among all $E_{i,k}$, for $1 \leq k \leq 2P$, after $P_i$ finishes merging the 2P elements.

    To prove this lemma, we prove it for the ith group without loss of generality. Note that there are P elements in this group and the jth element (1 ≤ […] ≥ i + 2.

    Proof. We prove this lemma by showing that the largest element, say X, in the ith group is smaller than the smallest element, say Y, in the jth group, where $j \geq i+2$. From Lemma 2.2, X is the last element in the ith group and Y is the first element in the jth group. The approach is similar to the proof of Lemma 2.2.

    First note that X is assigned to $P_P$ and Y is assigned to $P_1$ in Step 1. Also note that every element but two assigned to $P_1$ in Step 1 is greater than the element whose memory address is one less than its own and which is assigned to $P_P$ (i.e., $E_{1,k} > E_{P,k-1}$ for $2 \leq k \leq P$). The two exceptions are the two first elements assigned to $P_1$ from both lists (i.e., elements $E_{1,1}$ from $L_1$ and $E_{1,P+1}$ from $L_2$). Since Y is the jth element from $P_1$ after Step 2, Y is greater than at least j-2 elements which are assigned to $P_P$. Since the ith element from $P_P$ after Step 2 is X and $j-2 \geq i$, we have X < Y. □

    Theorem 2.4. After Step 4, the entire list is sorted.

    Proof. From Lemma 2.3, to sort the entire list, an element in group i after Step 2 has to be compared only with elements in the (i-1)st group and the (i+1)st group, since all elements in group j, where $j \leq i-2$, are smaller than it and all elements in group k, where $k \geq i+2$, are greater than it. This is exactly what is done in Steps 3 and 4, where we merge group i with groups (i-1) and (i+1), respectively. Hence, after Step 4, the entire list is sorted. □

    Complexity analysis

    In Step 2, all P processors concurrently merge two sublists, each of size N/P; hence, the time required for this step is 2N/P time units. This is also true for Step 3. In Step 4, P-1 processors concurrently merge two sublists, each of size N/P; hence, the time required for this step is also 2N/P time units. This shows that the total time complexity of this parallel merging algorithm is $O(N/P)$. Further, we see that the scale constant of this time complexity is as small as 3.
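Summing the three merging steps makes the constant explicit. The derivation below uses the same cost model as the analysis above (2N/P time units per step); reading the constant 3 as the ratio to the 2N/P cost of a single step is an interpretation, not something the text states outright.

```latex
\begin{align*}
T(N) &= \underbrace{\frac{2N}{P}}_{\text{Step 2}}
      + \underbrace{\frac{2N}{P}}_{\text{Step 3}}
      + \underbrace{\frac{2N}{P}}_{\text{Step 4}}
      = \frac{6N}{P} = 3 \cdot \frac{2N}{P}, \\
P \cdot T(N) &= 6N = O(N).
\end{align*}
```

The second line is the optimality condition for merging given in the Introduction.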

    2.2. Cases when $N > P^2$

    The multi-way parallel merging algorithm can easily be applied to cases when $N > P^2$ with minor modification. The algorithm is basically the same as in the case $N = P^2$, except that Steps 3 and 4 are slightly modified. To explain the algorithm, we assume N = MP, where M > P (i.e., $N > P^2$).

    Algorithm.

    Step 1: Divide $L_1$ into P sublists, where the ith sublist contains the elements at locations $i, P+i, 2P+i, \ldots, N-P+i$. Divide $L_2$ into P sublists in the same way.

    Step 2: For all i from 1 to P, have $P_i$ merge the ith sublist from $L_1$ and the ith sublist from $L_2$ and put the result back into the locations originally occupied by these two sublists. All P processors work simultaneously.

    Step 3: Divide the list resulting from Step 2 into 2M groups with P consecutive elements in each group. Number these groups from 1 to 2M. For all i from 1 to P, assign $P_i$ to merge groups $2i-1$ and $2i$ and put the result back into the locations originally occupied by these two groups. After this is done, for all i from 1 to P, assign $P_i$ to merge groups $2P+(2i-1)$ and $2P+2i$. Repeat this procedure until every two consecutive groups are merged.

    Step 4: Divide the list resulting from Step 3 into 2M groups with P elements in each group. Number these groups from 1 to 2M. For all i from 1 to P, assign $P_i$ to merge groups $2i$ and $2i+1$ and put the result back into the locations originally occupied by these two groups. After this is done, for all i from 1 to P, assign $P_i$ to merge groups $2P+2i$ and $2P+(2i+1)$. Repeat this procedure until every two consecutive groups are merged, except the first and the last groups.

    It is easy to show that the time complexity above is still $O(N/P)$. Hence, we conclude that for cases when $N \geq P^2$ the multi-way parallel merging algorithm achieves the optimal time complexity $O(N/P)$.
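The repeated assignment in Steps 3 and 4 amounts to a round-by-round schedule. The sketch below lists which pair of groups each processor merges in each round, assuming N = MP and 1-based group and processor numbers as in the text; the function name `merge_rounds` and its return format are hypothetical.

```python
def merge_rounds(m, p, step):
    """Schedule for Step 3 or Step 4 when N = M*P.

    Returns a list of rounds; each round maps a processor number i to the
    pair of neighboring groups (g, g+1) that P_i merges in that round.
    """
    assert step in (3, 4)
    first = 1 if step == 3 else 2                  # Step 4 starts from group two
    last_group = 2 * m if step == 3 else 2 * m - 1  # Step 4 skips the last group
    rounds = []
    base = 0                                       # advances by 2P groups per round
    while base + first + 1 <= last_group:
        assignment = {}
        for i in range(1, p + 1):
            g = base + 2 * (i - 1) + first         # 2i-1 (Step 3) or 2i (Step 4), shifted by base
            if g + 1 <= last_group:
                assignment[i] = (g, g + 1)
        rounds.append(assignment)
        base += 2 * p
    return rounds


if __name__ == "__main__":
    for step in (3, 4):
        for r, assignment in enumerate(merge_rounds(m=8, p=4, step=step)):
            print("step", step, "round", r, assignment)
```

Each round merges lists of size P in 2P time units, and when P divides M there are M/P rounds per step, so each step takes 2M = 2N/P time units, consistent with the $O(N/P)$ bound stated above.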

    2.3. Cases when $N < P^2$

    In this section we modify the merging algorithm so that it can be applied to cases when $N < P^2$. As mentioned earlier, the time complexity in this case is degraded by a factor of $3^k$ using $P = N^{(2^k-1)/2^k}$ processors for $k \geq 1$.

    Recall that in the case $P = \sqrt{N}$, one processor is used to merge two sublists with a total size of $2\sqrt{N}$. If we have more processors (i.e., $P > \sqrt{N}$), we can apply more than one processor to merge the $2\sqrt{N}$ elements. We illustrate the case when $P = N^{3/4}$ as an example. The approach is shown below.

    Step 1: Divide each of $L_1$ and $L_2$ into $N^{1/2}$ sublists in such a way that each sublist contains elements which are $N^{1/2}$ positions apart. There are $N^{1/2}$ elements in each sublist.

    Step 2: We want to merge every two sublists using all the processors. Since there are only $N^{1/2}$ pairs of sublists waiting to be merged and we have more than $N^{1/2}$ processors (i.e., $P > N^{1/2}$), more than one processor can be assigned to merge two sublists. Also, since there are $N^{1/2}$ elements in each sublist, we can apply the multi-way parallel merging algorithm of Section 2.1, using $(N^{1/2})^{1/2} = N^{1/4}$ processors to merge two sublists. That is, each processor merges two sub-sublists, each with $N^{1/4}$ elements. Since there are $N^{1/2}$ independent pairs being merged concurrently, the total number of processors required is $N^{1/2} \cdot N^{1/4} = N^{3/4}$, which is exactly the total number of processors. That means all processors work simultaneously.

    Step 3: Divide the list resulting from Step 2 into $2\sqrt{N}$ groups with $N^{1/2}$ elements in each group. We want to merge every two neighboring groups as before. Again, we can use $N^{1/4}$ processors to work on each merge. As in Step 2, there are $N^{1/2}$ independent merges working concurrently, so the number of processors required is $N^{3/4}$. That means all processors work simultaneously.

    Step 4: Similar to Step 3, except that the group merging starts from group number two.

    Here we examine the time complexity of this algorithm. Since we add one more level of the multi-way parallel merging, the time required will be three times that of the original, as described in Section 2.1. By repeatedly nesting this approach when merging two sublists, it can be shown that we can use $P = N^{(2^k-1)/2^k}$ processors for the merging algorithm to achieve a time complexity of $O(3^k N/P)$.
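One way to see where both the exponent of P and the factor $3^k$ come from is to write the nesting as a pair of recurrences. The sketch below assumes that each of Steps 2 to 4 at nesting level k costs one run of the level k-1 algorithm on lists of size $\sqrt{N}$, and that level 1 is the algorithm of Section 2.1 with cost 6N/P; the symbols $P_k$ and $T_k$ are introduced here for illustration and do not appear in the paper.

```latex
\begin{align*}
P_1(N) &= N^{1/2}, &
P_k(N) &= N^{1/2} \cdot P_{k-1}\!\left(N^{1/2}\right)
        = N^{\frac{1}{2} + \frac{2^{k-1}-1}{2^{k}}}
        = N^{(2^k-1)/2^k}, \\
T_1(N) &= 6\,N^{1/2}, &
T_k(N) &= 3\,T_{k-1}\!\left(N^{1/2}\right)
        = 2 \cdot 3^{k} \cdot N^{1/2^{k}}
        = 2 \cdot 3^{k} \cdot \frac{N}{P_k(N)}.
\end{align*}
```

For k = 2 this gives $P_2 = N^{3/4}$ and $T_2 = 18\,N^{1/4}$, matching the processor count worked out in Step 2 above and the stated bound $O(3^k N/P)$.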

    3. Multi-way parallel sorting algorithm

    We construct a multi-way parallel sorting algorithm using the merging algorithm described above. In this section we only discuss the case $N = P^2$. The cases $N > P^2$ and $N < P^2$ can be handled similarly to the procedures given in Section 2. This sorting algorithm is basically a merge sort, except that we use the multi-way parallel merging algorithm to perform the merging.

    We use $P = \sqrt{N}$ processors to sort 2N elements. For ease of explanation, we assume $P = \sqrt{N} = 2^k$. There are two phases in this algorithm. In the first phase, we assign $2\sqrt{N}$ elements to each processor and have each processor sort its data using any known optimal sequential sorting algorithm, e.g., quicksort. After phase 1, we have P sorted lists.

    In the second phase, we recursively merge two sorted lists into one larger sorted list until there is only one list, which is totally sorted. This is exactly what merge sort does. In merge sort, we define a run as merging every two neighboring lists into one larger sorted list, for all lists. Hence, each time a run is performed, the number of sorted lists is reduced by half. After phase one, there are $P = 2^k$ sorted lists; therefore, we need to perform k merge runs to finish the merge sort. Hence, we divide phase two into k steps. At the beginning of the ith step ($i = 1, 2, \ldots, k$), there are $2^k/2^{i-1}$ sorted lists, each of size $2^{k+i}$. The number of pairs of sorted lists waiting to be merged in this step is $2^k/2^{i-1}$ divided by 2, which equals $2^k/2^i$. Since there are in total $P = 2^k$ processors, the number of processors used to merge every two lists is hence $2^i$. In each step, we concurrently merge pairs of sorted lists using the processors associated with every two sorted lists. After k steps of merging, only one sorted list remains and the algorithm is completed.
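The bookkeeping of the two phases can be sketched as follows. This is a sequential illustration under stated assumptions: Python's built-in `sorted` plays the role of the sequential sort in phase one, `heapq.merge` is a stand-in for the multi-way parallel merge of Section 2 (so the sketch shows the list counts and processor allocation, not the parallel merge itself), and the name `multiway_sort` and its `trace` output are hypothetical.

```python
import heapq


def multiway_sort(data, k):
    """Sort 2N elements with P = 2**k simulated processors, where N = P*P.

    Returns the sorted list and a trace with one tuple per merge step:
    (number of lists, size of each list, processors per pair of lists).
    """
    p = 2 ** k
    n = p * p
    assert len(data) == 2 * n
    chunk = 2 * p                        # 2*sqrt(N) elements per processor

    # Phase 1: each processor sorts its own chunk sequentially.
    lists = [sorted(data[j:j + chunk]) for j in range(0, 2 * n, chunk)]

    # Phase 2: k merge runs; each run halves the number of sorted lists.
    trace = []
    for _ in range(k):
        pairs = len(lists) // 2
        trace.append((len(lists), len(lists[0]), p // pairs))
        # Stand-in: heapq.merge replaces the multi-way parallel merge here.
        lists = [list(heapq.merge(lists[j], lists[j + 1]))
                 for j in range(0, len(lists), 2)]
    return lists[0], trace


if __name__ == "__main__":
    result, trace = multiway_sort(list(range(31, -1, -1)), k=2)
    print(result)
    print(trace)
```

For k = 2 the trace records 4 lists of size 8 with 2 processors per pair in the first step, then 2 lists of size 16 with 4 processors per pair, matching the counts $2^k/2^{i-1}$, $2^{k+i}$ and $2^i$ given above.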

    We define $N^{(i)}$ to be the size of each list to be merged and $P^{(i)}$ to be the number of processors used to merge two lists in step i. If we can show that $N^{(i)} \geq (P^{(i)})^2$ for all i, then all steps achieve an optimal time complexity using the multi-way parallel merging algorithm. This can easily be proved as follows:

    $N^{(i)} = 2^{k+i} \geq 2^{i+i} = (2^i)^2 = (P^{(i)})^2$.

    From the above proof, every merge in the merge sort attains an optimal time complexity; therefore, the entire sorting algorithm is optimal.

    Complexity analysis

    Since each processor in phase one sorts two lists each with size $\sqrt{N}$ and all processors work concurrently, the time complexity of phase one equals that of any optimal sequential sorting algorithm, which is $O(2\sqrt{N} \log 2\sqrt{N})$. Neglecting the constant multiplier, this equals $O((N \log \sqrt{N})/\sqrt{N})$, which in turn equals $O((N \log N)/P)$.

    In phase two, the time complexity of the merging in the $i$th step is $O(N^{(i)}/P^{(i)}) = O(2^{k+i}/2^i) = O(2^k) = O(N/P)$. Note that this time complexity is the same for all steps and independent of $i$. Since there are $k$ steps in phase two and $k = \log P = \log \sqrt{N} = \tfrac{1}{2} \log N$, the total time complexity of phase two equals $O((N/P)k) = O((N \log N)/P)$.

    Since both phases have a time complexity of $O((N \log N)/P)$, the total time complexity of this multi-way parallel sorting algorithm is $O((N \log N)/P)$, which is optimal.
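    The two-phase structure analysed above can be simulated sequentially to see how the $k$ merging steps reduce $P = 2^k$ sorted lists to one. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `two_phase_sort` is hypothetical, the "processors" are simulated by a loop, and Python's `heapq.merge` is a sequential stand-in for the multi-way parallel merging algorithm, so the sketch shows the structure of the sort and not its parallel running time.

```python
import heapq


def two_phase_sort(data, k):
    """Sequentially simulate the two-phase sort with P = 2**k simulated processors."""
    num_procs = 2 ** k
    chunk = -(-len(data) // num_procs)  # ceiling division: items per processor

    # Phase one: each simulated processor sorts its own chunk independently.
    lists = [sorted(data[j * chunk:(j + 1) * chunk]) for j in range(num_procs)]

    # Phase two: k steps, each merging adjacent pairs of sorted lists.
    # heapq.merge is an assumption standing in for the parallel merge.
    for _ in range(k):
        lists = [
            list(heapq.merge(lists[j], lists[j + 1]))
            for j in range(0, len(lists), 2)
        ]

    return lists[0]


if __name__ == "__main__":
    sample = [(7 * x) % 101 for x in range(64)]
    print(two_phase_sort(sample, 3) == sorted(sample))
```

    Each pass of the loop halves the number of lists, so exactly $k = \log P$ passes are needed, matching the factor $k$ in the phase-two bound.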

    4 Conclusion

    Multi-way parallel merging and sorting algorithms provide an optimal time complexity using $P \le \sqrt{N}$ processors. Further, these algorithms do not require reading from or writing into the same memory location concurrently; hence, they can be implemented on any kind of parallel computing system (EREW, ERCW, CREW, or CRCW). As mentioned earlier, another contribution of these algorithms is the simplicity and regularity of their structure. In addition, we show that for $P = N^{(2^k-1)/2^k}$, the complexities of the merging algorithm and the sorting algorithm are $O(3^k N/P)$ and $O(3^k (N \log N)/P)$ respectively.
