
Fortran 2008, 2018 coarrays and OpenMP

Anton Shterenlikht

Mech Eng Dept, The University of Bristol, Bristol BS8 1TR, UK, mexas@bristol.ac.uk

UK OpenMP Users' Conference 2018, Oxford, 22-MAY-2018

Anton Shterenlikht (Bristol University, UK) Fortran coarrays and OpenMP 22-MAY-2018 1 / 28


Table of Contents

1 Fortran coarrays

2 Coarrays/OpenMP spec challenge

3 F2008 DO CONCURRENT

4 Challenges for ARB and Fortran committee


Fortran support in OpenMP

OMP 4.5 (2015), 5.0 (2017) support (most of) F2003 (2004). Main exclusions: OO, IEEE features. → 11-year lag!

F2008 published in 2010. F2018 will likely be published in 2018. → Support in OMP 6? 7?

OMP ARB: alternate J3 Fortran members: → Kelvin Li (IBM), Henry Jin (NASA). http://www.openmp.org/about/members

J3 Fortran: OpenMP liaison: → Bill Long (Cray) http://j3-fortran.org/doc/standing/links/001.txt

Who cares anyway?

Fortran coarray users!


Fortran coarrays - native SPMD

F2008:

coarray data objects

allocatable coarrays

coarrays of DT with allocatable or pointer components

remote definitions and references

execution segments

image control statements

atomics

critical sections

locks

F2018:

collectives

teams

events

many more atomics

failed images

Compiler support:

Cray

Intel

GCC/OpenCoarrays

Implementation: (Challenge!)

libpgas, DMAPP (Cray)

MPI, OpenMP (Intel)

MPI, GASnet (OpenCoarrays)
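
A minimal sketch of one F2018 feature from the list above, the collective CO_SUM. The program name and the self-check are illustrative assumptions, not taken from the talk. Each image contributes its own image index and every image receives the sum.

```fortran
program co_sum_sketch
  implicit none
  integer :: total, n

  n = num_images()
  total = this_image()      ! each image contributes its own index

  ! F2018 collective: on return every image holds the sum over all images.
  call co_sum(total)

  ! The sum 1 + 2 + ... + n is n*(n+1)/2 for any number of images.
  if (total /= n*(n + 1)/2) error stop "co_sum gave an unexpected result"
  if (this_image() == 1) print *, "sum of image indices:", total
end program co_sum_sketch
```

The argument of CO_SUM need not be a coarray, so no image control statement appears in this sketch. It should also build and run as a single image, e.g. with gfortran -fcoarray=single.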


Coarrays primer - swap values between images

    integer :: i[*], n, tmp[*], mype
    mype = this_image()
    i = mype
    tmp = mype
    n = num_images()
    if (mype == 1) then
      sync images (n)   ! pair-wise barrier
      i = tmp[n]        ! remote read, single sided
    else if (mype == n) then
      sync images (1)   ! pair-wise barrier
      i = tmp[1]        ! remote read, single sided
    end if
    print *, "on image ", mype, " i = ", i
    end

    $ cafrun -np 4 ./a.out
    on image 2 i = 2
    on image 3 i = 3
    on image 1 i = 4
    on image 4 i = 1


Coarrays - weak memory consistency model

F2018 DIS, 11.6.2 Segments:

if a variable is defined or becomes undefined on an image in a segment, it shall not be referenced, defined, or become undefined in a segment on another image unless the segments are ordered

11.6.1 Image control statements:

SYNC ALL

SYNC IMAGES

SYNC MEMORY

SYNC TEAM

FORM TEAM

CHANGE TEAM / END TEAM

ALLOCATE / DEALLOCATE coarrays

CRITICAL / END CRITICAL

EVENT POST / EVENT WAIT

LOCK / UNLOCK

MOVE_ALLOC

STOP

END

Compiler cannot move operations across image control statements
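
A minimal sketch of the segment ordering rule quoted above. The program and variable names are illustrative assumptions. Image 1 defines its copy of x in one segment, SYNC ALL ends that segment on every image, and only then do the images reference x on image 1.

```fortran
program ordered_segments
  implicit none
  integer :: x[*]        ! scalar coarray: one copy of x on every image
  integer :: seen

  ! Segment 1: each image defines only its own copy of x.
  if (this_image() == 1) then
     x = 42
  else
     x = 0
  end if

  sync all               ! image control statement: ends segment 1 on every image

  ! Segment 2: ordered after every segment 1, so referencing x on image 1 is allowed.
  seen = x[1]
  if (seen /= 42) error stop "read of x[1] was not ordered after its definition"
  print *, "image", this_image(), "read x[1] =", seen
end program ordered_segments
```

Without the SYNC ALL the definition on image 1 and the references on the other images would be in unordered segments, which is what the rule prohibits.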


Coarray/OpenMP usage - 3 examples

1 Gyrokinetic particle-in-cell code

2 Numerical weather prediction: European Centre for Medium-Range Weather Forecasts, Integrated Forecasting System (ECMWF IFS)

3 Physics/engineering: Cellular Automata library for SUPercomputers, CASUP: The University of Bristol https://cgpack.sourceforge.io

H. Richardson, Coarrays from laptops to supercomputers, 2015 http://www.fortran.bcs.org/2015/BCS_FSG_2015_HR.pdf


Gyrokinetic particle-in-cell

From: R. Preissl et al, Multithreaded Global Address Space Communication Techniques for Gyrokinetic Fusion Applications on Ultra-Scale Platforms, SC11.

http://upc.lbl.gov/publications/Preissl_SC2011.pdf

Single sided calls

Excellent scaling, outperforms MPI/OpenMP

Non-standard, Cray extensions (atomics)


ECMWF IFS

[Figure: Tc1999L137 5 km (~2024) IFS model scaling on TITAN, no GPUs used. x-axis: Number of Cores (thousands), 0 to 220. y-axis: Forecast Days / Day, 0 to 500. Series: COARRAYS=T, COARRAYS=F. Higher is better.]

From: G. Mozdzynski et al, Challenges of getting ECMWF's weather forecast model (IFS) to the Exascale, ECMWF HPC in Meteorology workshop, 27-31 Oct 2014.


ECMWF IFS (cont'd) - coarray segment order rules?

    !$OMP PARALLEL DO SCHEDULE(DYNAMIC,1) PRIVATE(JM,IM,JW,IPE,ILEN,ILENS,IOFFS,IOFFR)
    DO JM=1,D%NUMP
      IM = D%MYMS(JM)
      CALL LTINV(IM,JM,KF_OUT_LT,KF_UV,KF_SCALARS,KF_SCDERS,ILEI2,IDIM1,&
       & PSPVOR,PSPDIV,PSPSCALAR ,&
       & PSPSC3A,PSPSC3B,PSPSC2 , &
       & KFLDPTRUV,KFLDPTRSC,FSPGL_PROC)
      DO JW=1,NPRTRW
        CALL SET2PE(IPE,0,0,JW,MYSETV)
        ILEN = D%NLEN_M(JW,1,JM)*IFIELD
        IF( ILEN > 0 )THEN
          IOFFS = (D%NSTAGT0B(JW)+D%NOFF_M(JW,1,JM))*IFIELD
          IOFFR = (D%NSTAGT0BW(JW,MYSETW)+D%NOFF_M(JW,1,JM))*IFIELD
          FOUBUF_C(IOFFR+1:IOFFR+ILEN)[IPE]=FOUBUF_IN(IOFFS+1:IOFFS+ILEN)
        ENDIF
        ILENS = D%NLEN_M(JW,2,JM)*IFIELD
        IF( ILENS > 0 )THEN
          IOFFS = (D%NSTAGT0B(JW)+D%NOFF_M(JW,2,JM))*IFIELD
          IOFFR = (D%NSTAGT0BW(JW,MYSETW)+D%NOFF_M(JW,2,JM))*IFIELD
          FOUBUF_C(IOFFR+1:IOFFR+ILENS)[IPE]=FOUBUF_IN(IOFFS+1:IOFFS+ILENS)
        ENDIF
      ENDDO
    ENDDO
    !$OMP END PARALLEL DO
    SYNC IMAGES(D%NMYSETW)
    FOUBUF(1:IBLEN)=FOUBUF_C(1:IBLEN)[MYPROC]

Fortran 2008 coarray (PGAS) example, from ECMWF HPC in Meteorology workshop, 27-31 Oct 2014


Coarrays/OMP problems

1 High level abstraction, multiple implementations, e.g. as MPI processes on coprocessors (ifort 16+). Clash with OMP target device specs?

2 All rules based on the image level. No "sub-image" (thread, fine grain) level control semantics?

3 Performance with full standard conformance. Fully asynchronous execution on thread level?


Coarrays inside OMP parallel regions - undefined!

Can learn from MPI/OMP: MPICH_MAX_THREAD_SAFETY

    Specifies the maximum allowable thread-safety level that
    is returned by MPI_Init_thread() in the provided argument.
    This allows the user to control the maximum level of
    threading allowed. The legal values are:
    ----------------------------------------------------------
    Value         MPI_Init_thread() returns
    ----------------------------------------------------------
    single        MPI_THREAD_SINGLE
    funneled      MPI_THREAD_FUNNELED
    serialized    MPI_THREAD_SERIALIZED
    multiple      MPI_THREAD_MULTIPLE
    ----------------------------------------------------------

from Cray intro_mpi(3)

Coarrays/OMP can work up to SERIALIZED ?


COARRAYS_THREAD_*?

[Figure panels: single, funneled/serialised, multiple]

From: R. Preissl et al, Multithreaded Global Address Space Communication Techniques for Gyrokinetic Fusion Applications on Ultra-Scale Platforms, SC11.

COARRAYS_THREAD_* - support level by OpenMP?


Coarray comms outside OMP parallel regions

    integer :: a(100,100,100)[*]
    main: do iter = 1, niter
      ! pairwise sync, e.g. sync images
      call halo_exchange(a)   ! Remote calls
      !$omp parallel do shared(a) num_threads(...)
      do i = 1, n
        ! Update/use "a" on my image. No remote calls
      end do
      !$omp end parallel do
    end do main

Each image spawns threads

Same or different number of threads per image

Similar to MPI_THREAD_SINGLE mode


COARRAYS_THREAD_SERIALIZED?

    integer :: a(0:n+1)[*], b(0:n+1), img, tmp
    !$omp parallel do private(i,tmp) shared(img,a,b)
    loop: do i = 1, n
      if (img .eq. 1 .and. i .eq. n) then
        sync images (2)     ! The thread that has i=n on img 1
        a(n+1) = a(1)[2]    ! will sync with img 2 and pull a(1).
      end if
      if (img .eq. 2 .and. i .eq. 1) then
        sync images (1)     ! The thread that has i=1 on img 2
        a(0) = a(n)[1]      ! will sync with img 1 and pull a(n)
      end if
      ! kernel function: b(i) = fun(a)
    end do loop
    !$omp end parallel do


Coarrays/OMP serialised - complete program

    integer, parameter :: n=10   ! Assume 2 images!
    integer :: a(0:n+1)[*], b(0:n+1)=0, img, iter
    img = this_image()
    if (img .eq. 2) b(n+1) = 1   ! Single non-zero element
    main: do iter = 1, 2*n
      a = b                      ! Halo defined outside OMP
      !$omp parallel do default(none) private(i,tmp) &
      !$omp shared(img,a,b)
      loop: do i = 1, n
        if (img .eq. 1 .and. i .eq. n) then   ! hx on img 1
          sync images (2)                     ! on thread which
          a(n+1) = a(1)[2]                    ! has i=n.
        end if
        if (img .eq. 2 .and. i .eq. 1) then   ! hx on img 2
          sync images (1)                     ! on thread which
          a(0) = a(n)[1]                      ! has i=1.
        end if
        b(i) = a(i+1)                         ! kernel function
      end do loop
      !$omp end parallel do
      if (img .eq. 2) b(n+1) = a(0)
    end do main
    end


Coarrays/OMP serialised - results - all ok!

Kernel: copy the value from the neighbour on the right:

b ( i ) = a ( i +1)

2 images, 4 threads/image

HX on image 1 → always thread 3. HX on image 2 → always thread 0. Expected but irrelevant - could be any thread!

Array section b(1:10) (no halos) initially:

    00000 00000 | 00000 00001
    <- img 1 -> | <- img 2 ->

b(1:10) after 10 iterations:

    00000 00000 | 11111 11111
    <- img 1 -> | <- img 2 ->

b(1:10) after 20 iterations:

    11111 11111 | 11111 11111
    <- img 1 -> | <- img 2 ->


Coarrays/OMP serialised - conclusions

Ok for 1D arrays - single thread (single, funneled, serialised) - single element copy

Will not work for 2D+ arrays


Coarrays/OMP multiple - problems!

[Figure: two diagrams of nodes N1 to N8 around regions A and B]

Naive: every thread does its own HX.

Unsafe with multiple threads - might break sync rules → deadlocks!


Coarrays/OMP multiple - SYNC IMAGES - no luck!

Executions of SYNC IMAGES statements on images M and T correspond if the number of times image M has executed a SYNC IMAGES statement in the current team with T in its image set is the same as the number of times image T has executed a SYNC IMAGES statement with M in its image set in this team.

To avoid races and deadlocks, need to ensure:

1 Segment ordering rules are not broken, and

2 SYNC IMAGES statements are corresponding
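
A minimal sketch of corresponding SYNC IMAGES executions, with one execution per pair of images and no threads involved. The program and variable names are illustrative assumptions. Image 1 synchronises once with each other image, and each other image synchronises once with image 1, so the counts in the definition above match for every pair.

```fortran
program corresponding_sync
  implicit none
  integer :: val[*]
  integer :: me, n, k, total

  me = this_image()
  n  = num_images()
  val = me                      ! defined before any synchronisation

  if (me == 1) then
     total = val
     do k = 2, n
        sync images (k)         ! executed once with image k in the image set
        total = total + val[k]  ! ordered after the definition of val on image k
     end do
     if (total /= n*(n + 1)/2) error stop "unexpected total"
     print *, "image 1 collected", total
  else
     sync images (1)            ! executed once with image 1 in the image set
  end if
end program corresponding_sync
```

If several threads on one image each issued their own SYNC IMAGES, these per-pair counts would depend on thread scheduling, which is the difficulty in multiple mode.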


Coarrays/OMP multiple - deadlocks?

image 1:

    if (this_image() .eq. 1) then
      sync images (2)   ! then HX
    end if              ! thread 1

    if (this_image() .eq. 1) then
      sync images (3)   ! then HX
    end if              ! thread 2

image 3:

    if (this_image() .eq. 3) then
      sync images (1)   ! then HX
    end if              ! thread 2

    if (this_image() .eq. 3) then
      sync images (4)   ! then HX
    end if              ! thread 1

[Figure: images 1 to 4, each running threads t1, t2, t3]

Order of SYNC IMAGES invocation unpredictable

Circular dependency

All wait → deadlocks.

Cannot use SYNC IMAGES in Multiple mode?

Need a finer grain sync mechanism


Coarrays/OMP multiple - Locks? Critical? Events? - same problem

Also unsuitable:

LOCK / UNLOCK - image control statement

CRITICAL / END CRITICAL - image control statement

EVENT POST / EVENT WAIT - image control statement

Sync too coarse. Not thread-safe! Breaks OMP rules:

All library, intrinsic and built-in routines provided by the base language must be thread-safe in a compliant implementation. In addition, the implementation of the base language must also be thread-safe.

Async thread execution is likely to break the segment ordering rules. Need "sub-image" level sync.
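
A minimal sketch of CRITICAL used as intended, at image level, with no OpenMP threads. The program and variable names are illustrative assumptions. The construct serialises images, not threads, which is why it offers no help at "sub-image" level.

```fortran
program critical_sketch
  implicit none
  integer :: total[*]
  integer :: n

  n = num_images()
  total = 0
  sync all                       ! total is zeroed on image 1 before any update

  critical                       ! at most one image at a time executes this block
     total[1] = total[1] + this_image()
  end critical

  sync all                       ! all updates are complete before image 1 checks
  if (this_image() == 1) then
     if (total /= n*(n + 1)/2) error stop "unexpected total"
     print *, "total on image 1:", total
  end if
end program critical_sketch
```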


Coarrays/OMP multiple - no image control statements!

Any thread that needs a halo element does a remote coarray read.

    real :: a(0:n+1,0:n+1)[2,*], b(0:n+1,0:n+1)
    integer :: grid(2), i, j, img, iter
    grid = this_image(a)
    main: do iter = 1, niter
      a = b
      sync all   ! multiple SYNC IMAGES in production code
      !$omp parallel do default(none) shared(n,a,b,grid)
      do j = 1, n     ! Any thread can do remote
        do i = 1, n   ! coarray reads
          if (i.eq.n .and. grid(1).ne.2) a(n+1,j) = a(1,j)[grid(1)+1,grid(2)]
          if (i.eq.1 .and. grid(1).ne.1) a(0,j)   = a(n,j)[grid(1)-1,grid(2)]
          if (j.eq.n .and. grid(2).ne.2) a(i,n+1) = a(i,1)[grid(1),grid(2)+1]
          if (j.eq.1 .and. grid(2).ne.1) a(i,0)   = a(i,n)[grid(1),grid(2)-1]
          b(i,j) = 0.25 * (a(i-1,j) + a(i+1,j) + a(i,j-1) + a(i,j+1))
        end do
      end do
      !$omp end parallel do
    end do main

Sync still needed outside OMP parallel regions → limits async execution opportunities → no performance gain?


Coarrays/OMP multiple - no image control statements!

A relaxation kernel example. 4 images, 20 × 20 coarray array per image → 40 × 40 model. Values are set to the image number initially.

[Figure panels: Start; After 1000 iterations]


Coarrays/OMP multiple - Atomics?

For any two executions in unordered segments of atomic subroutines whose ATOM argument is the same object, the effect is as if one of the executions is performed completely before the other execution begins.

But... only Integer and Logical types

Seems no solution for other data types...
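
A minimal sketch of integer atomics in unordered segments, using the F2018 ATOMIC_ADD and the F2008 ATOMIC_REF. The program and variable names are illustrative assumptions. All images update the same counter on image 1 with no ordering between them, which is allowed only because the updates are atomic and the variable is of kind ATOMIC_INT_KIND.

```fortran
program atomic_counter
  use, intrinsic :: iso_fortran_env, only: atomic_int_kind
  implicit none
  integer(atomic_int_kind) :: counter[*]
  integer(atomic_int_kind) :: val

  counter = 0
  sync all                       ! counter is zeroed on image 1 before any update

  ! Unordered segments: every image adds 1 to the counter on image 1.
  call atomic_add(counter[1], 1_atomic_int_kind)

  sync all                       ! all updates are complete before image 1 reads
  if (this_image() == 1) then
     call atomic_ref(val, counter)
     if (val /= num_images()) error stop "unexpected counter value"
     print *, "arrivals counted on image 1:", val
  end if
end program atomic_counter
```

The same pattern is not available for a REAL halo value, since the atomic subroutines accept only integer and logical ATOM arguments.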


Further considerations - DO CONCURRENT

A DO loop where the order of iterations is immaterial.

Programmer guarantees to the compiler that such loop is parallelisable.

Can be implemented via OpenMP.

Severe restrictions, e.g:

An image control statement shall not appear within a DO CONCURRENT construct.

Possible solution → "split" HX (as in the relaxation example above):

Coarray definitions outside OMP parallel regions

Coarray references inside OMP parallel regions
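
A minimal 1D sketch of the "split" HX idea with DO CONCURRENT, under stated assumptions: the program name, the array size n = 8 and the averaging kernel are all illustrative, and a single sweep is shown. The coarray is defined and synchronised outside the loop; inside the loop there are only remote reads and no image control statement.

```fortran
program split_hx_sketch
  implicit none
  integer, parameter :: n = 8
  real :: a(0:n+1)[*]      ! coarray with one halo cell at each end
  real :: b(n)
  integer :: i, me, nimg

  me   = this_image()
  nimg = num_images()

  a = real(me)             ! coarray definition, outside the loop
  sync all                 ! image control statement, outside the loop

  do concurrent (i = 1:n)
     ! Remote reads only. Each halo cell is defined and used in a single iteration.
     if (i == 1 .and. me > 1)    a(0)   = a(n)[me - 1]
     if (i == n .and. me < nimg) a(n+1) = a(1)[me + 1]
     b(i) = 0.5*(a(i-1) + a(i+1))
  end do

  ! Interior points see only local values, all equal to the image index.
  if (abs(b(2) - real(me)) > 0.0) error stop "unexpected interior value"
  print *, "image", me, "b(1) =", b(1), "b(n) =", b(n)
end program split_hx_sketch
```

A second sweep would need another synchronisation outside the loop before the coarray is redefined, so the limit on asynchronous execution noted for the relaxation example applies here too.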


Conclusions - Challenges for ARB and Fortran committee

1 OpenMP/Coarrays specs are overdue...

2 Serialised coarray comms from OMP parallel regions - ok

3 Multiple coarray comms from OMP parallel regions - unlikely to work

Challenges for...

ARB:

Coarrays/OMP ≠ MPI/OMP! New rules needed.

OMP requirements - `the base language must ... be thread-safe' - need rethinking for F2008, F2018 - parallelism is built into the language!

Fortran committee:

Richer atomics for unordered segments? For all intrinsic types? Derived types?


Acknowledgements

The author would like to thank Bill Long (Cray) for helpful suggestions, and to acknowledge financial and other support from the following organisations.

EPSRC grants EP/R013047/1, EP/P034446/1.

ARCHER UK National Supercomputing Service, http://www.archer.ac.uk

Advanced Computing Research Centre, University of Bristol, https://www.acrc.bris.ac.uk
