
Claudio F. R. Geyer II - UFRGS Inês de Castro Dutra COPPE - Sistemas - UFRJ.

Date post: 21-Dec-2015

Claudio F. R. Geyer
II - UFRGS

Inês de Castro Dutra
COPPE - Sistemas - UFRJ

Outline
- Introduction
- Sequential Implementation
- Parallel Implementation
- Performance
- Conclusions
- Future Work

Introduction
- Why logic programming?
  - Formal basis
  - Expressive power
  - Implicit parallelism
  - Suitability to some problems
- Main language: Prolog
  - Syntax
  - Declarative and operational semantics

Introduction

parent(arthur,carol).
parent(carol,john).

grandparent(X,Y) :- parent(X,Z), parent(Z,Y).

length([H|T],N) :- length(T,N1), N is N1+1.
length([],0).

Sequential Implementation
- Interpreters x Compilers
- WAM (Warren Abstract Machine)
  - structure copying
  - environments
  - choicepoints
  - heap
  - trail

Sequential Implementation

Parallel Implementation
- Control Parallelism
  - ORP: Or-parallelism
  - ANDP: And-parallelism
  - And + Or
- Data Parallelism
  - Unification
  - Path

Parallel Implementation: ORP
- Problems
  - representation of multiple bindings to the same variable
- Solutions
  - stack sharing
  - stack copying
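
To make the problem concrete, here is a minimal sketch. The predicates colour/1 and paint/1 are hypothetical and do not come from the slides. Each clause of colour/1 opens one or-branch, and every branch binds the same variable X to a different value. A sequential engine reuses a single cell for X through backtracking; or-parallel workers exploring the branches at the same time each need their own view of X, which is what stack sharing and stack copying are meant to provide.

```prolog
% Hypothetical facts: three alternative clauses, i.e. three or-branches.
colour(red).
colour(green).
colour(blue).

% X exists before the choice point is created, so every branch
% binds the very same variable.
paint(X) :- colour(X).

% Sequentially, backtracking yields the bindings one after another:
% ?- findall(X, paint(X), Xs).
% Xs = [red, green, blue].
```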


Parallel Implementation: ORP

Parallel Implementation: ORP
- Stack sharing
  - binding arrays
  - hash windows
  - version vectors
  - variable importation
  - …

Parallel Implementation: ORP
- Speculative work
- Prolog semantics?
- Side-effects and pruning
- Scheduling
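
A small sketch of why pruning makes some or-parallel work speculative. The predicates candidate/1 and first_big/1 are hypothetical, chosen only for illustration. Sequentially, the cut commits to the first candidate greater than 1, so the branch for candidate(3) is never explored. An or-parallel worker that had already started on that branch would be doing speculative work: its result must be discarded once the cut executes, otherwise the answer would differ from sequential Prolog.

```prolog
% Hypothetical candidates, tried in textual order by a sequential engine.
candidate(1).
candidate(2).
candidate(3).

% The cut prunes every alternative to the right of the first success.
first_big(X) :- candidate(X), X > 1, !.

% ?- findall(X, first_big(X), Xs).
% Xs = [2].
```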

Parallel Implementation: ANDP
- IAP: Independent and-parallelism
- DAP: Dependent and-parallelism
- DetAP: Determinate and-parallelism

Parallel Implementation: IAP
- Goals that do not share variables can proceed in parallel.
- Compiler support
- CGEs: Conditional Graph Expressions


Parallel Implementation: IAP

paper(P,A,D,L) :- author(A), date(D), loc(P,A,D,L).

Possible CGE:

indep(A) & indep(D) => author(A) & date(D),loc(P,A,D,L)
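
The condition part of a CGE is a run-time test. As a rough sketch of what such a test could check, the hypothetical helper independent/2 below succeeds when two terms share no unbound variable. It is an assumption made for illustration, not the primitive used by any particular system, and it only performs the check; it does not run anything in parallel.

```prolog
:- use_module(library(lists)).   % member/2

% independent(+T1, +T2): true when T1 and T2 share no unbound variable.
independent(T1, T2) :-
    term_variables(T1, Vs1),
    term_variables(T2, Vs2),
    \+ ( member(V1, Vs1), member(V2, Vs2), V1 == V2 ).

% ?- independent(author(A), date(D)).   % succeeds: A and D are distinct
% ?- independent(author(A), date(A)).   % fails: both goals mention A
```

When the test fails, a system would fall back to running the goals sequentially.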

Parallel Implementation: IAP
- Cross-product of solutions
- Recomputation

qsort([], []).
qsort([P|T],L) :- partition(T,P,A,B), qsort(A,L1), qsort(B,L2), append(L1,[P|L2],L).
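
The quicksort clauses above rely on a partition predicate that the slide does not define. The sketch below is a self-contained variant that runs on a sequential Prolog; the names qs/2 and split/4 are hypothetical and were picked to avoid clashing with library predicates. For a ground input list, the two recursive calls share no unbound variable after split/4 has finished, so they are the kind of goals IAP could run in parallel.

```prolog
:- use_module(library(lists)).   % append/3

% split(+List, +Pivot, -Small, -Big): hypothetical stand-in for the
% partition step; Small holds elements =< Pivot, Big the rest.
split([], _, [], []).
split([X|Xs], P, [X|S], B) :- X =< P, split(Xs, P, S, B).
split([X|Xs], P, S, [X|B]) :- X > P,  split(Xs, P, S, B).

qs([], []).
qs([P|T], L) :-
    split(T, P, A, B),
    qs(A, L1),               % reads A, binds only L1
    qs(B, L2),               % reads B, binds only L2
    append(L1, [P|L2], L).

% ?- qs([3,1,2], L).
% L = [1, 2, 3].
```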

Parallel Implementation: DAP
- Goals that share variables can proceed in parallel
- Producer and consumer
  - Chosen at compile-time or runtime
  - one value or stream
- Compiler support

Parallel Implementation: DAP

producer(N,Out) :- N > 0, N1 is N - 1, Out = [ferrari|Ms], producer(N1,Ms).
producer(0,Out) :- Out = [].

consumer([ferrari|Ms]) :- go_ride_ferrari, consumer(Ms).
consumer([]).
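
The producer/consumer interplay can be imitated on a sequential engine with coroutining. The sketch below assumes a Prolog that provides freeze/2 (SWI-Prolog, SICStus and YAP do; it is not part of ISO Prolog). The consumer is started first and suspends on the shared stream variable; each cell the producer binds wakes it up. The counter arguments and run/2 are hypothetical additions that replace the slide's side-effect goal so the result can be observed. This only interleaves the two goals; it is not a parallel implementation.

```prolog
% producer(+N, -Out): binds Out to a list of N ferrari atoms, cell by cell.
producer(N, Out) :- N > 0, N1 is N - 1, Out = [ferrari|Ms], producer(N1, Ms).
producer(0, []).

% consumer(?Stream, +Count0, -Count): waits until the next cell of
% Stream is bound, then consumes it.
consumer(Stream, C0, C) :- freeze(Stream, consume(Stream, C0, C)).

consume([], C, C).
consume([ferrari|Ms], C0, C) :- C1 is C0 + 1, consumer(Ms, C1, C).

% run(+N, -Count): the consumer goal is called before the producer,
% yet it only makes progress as the producer extends the stream.
run(N, Count) :- consumer(S, 0, Count), producer(N, S).

% ?- run(3, Count).
% Count = 3.
```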

Parallel Implementation: DetAP
- Goals that match at most one clause can be executed first and in parallel
- Compiler support
- Reduction of search space

Parallel Platforms
- Shared-memory
- Distributed memory
- Distributed-shared memory
- Implicit x Explicit Parallelism
- Programming Model
  - Process or processor-based

Shared-memory Or-Parallel Systems
- Aurora
  - WAM-based
  - processor-based
  - shared stacks
  - binding arrays

Aurora: Binding Arrays

Shared-memory Or-Parallel Systems
- Scheduling in Aurora
  - Wavefront
  - Argonne
  - Manchester
  - Bristol
  - Dharma

Shared-memory Or-Parallel Systems
- Wavefront, Manchester and Argonne: topmost dispatching
- Bristol and Dharma: bottom-most dispatching
  - speculative work

Shared-memory Or-Parallel Systems
- Muse
  - WAM-based
  - processor-based
  - stack copying

Muse: Stack Copying
- Multiple environments maintained via stack-copying
- Memory space divided into identical address spaces to avoid pointer relocation
- Incremental copying

Shared-memory Or-Parallel Systems
- Scheduling in Muse
  - Sophisticated operations to avoid data races
  - Workers keep data structures about idle and busy workers below their subtrees
  - Shadowing
  - Preference to leftmost work

Shared-memory And-Parallel Systems
- &-Prolog
- &ACE
- DASWAM

Shared-memory And-Parallel Systems
- &-Prolog
  - RAP-WAM
  - CGEs
  - compiler support
- &ACE
  - based on &-Prolog
- DASWAM
  - DAP and IAP, producer determined at runtime

Shared-memory And+Or Systems
- Andorra-I
- ACE
- SBA
- ParAKL
- Penny
- DAOS

Andorra-I
- Determinate and-parallelism
- or-parallelism
- side-effects, cuts and commits
- teams of workers
- scheduling
- reduction of search space

Andorra-I
- DetAP phase -> ORP phase when #det goals = 0
- ORP phase -> DetAP phase when #det goals <> 0
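
Running determinate goals before nondeterminate ones can shrink the search space. The sketch below imitates that effect by reordering goals by hand; it is not Andorra-I's mechanism, which selects determinate goals at run time. All predicate names are hypothetical. The nondeterminate generator digit/1 records every value it hands out, and tries/2 counts how many values were produced before the first solution.

```prolog
:- use_module(library(lists)).   % member/2
:- dynamic tried/1.

% digit/1 is nondeterminate; each value it produces is recorded.
digit(D) :- member(D, [1,2,3,4,5]), assertz(tried(D)).

% is_five/1 matches at most one clause: a determinate goal.
is_five(5).

prolog_order(X)      :- digit(X), is_five(X).   % left to right, as in Prolog
determinate_first(X) :- is_five(X), digit(X).   % determinate goal runs first

% tries(+Goal, -N): N values were produced by digit/1 up to the
% first solution of Goal.
tries(Goal, N) :-
    retractall(tried(_)),
    once(Goal),
    findall(D, tried(D), Ds),
    length(Ds, N).

% ?- tries(prolog_order(_), N).        % N = 5
% ?- tries(determinate_first(_), N).   % N = 1
```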

Shared-memory And+Or Systems
- ACE
  - IAP + ORP
  - Stack copying
  - IAP a la &-Prolog
  - Composition tree
  - Last parallel call optimisation

ACE

SBA
- IAP + ORP
- Stack sharing
- Shared Binding Arrays
- IAP a la &ACE
- Binding array divided into fixed-size segments
- Conditional variable bound to a pair <seg#,offset>


Performance

Andorra-I

Performance

prog name      Andorra-I  JAM   Aurora  Muse
nrv400         8.25       8.37  ----    ----
bt_cluster     9.37       9.70  ----    ----
bt_wms         3.32       ----  ----    ----
road_markings  6.24       ----  ----    ----
chat_80_db5    7.30       ----  7.30    5.91
5x4x3_puzzle   9.66       ----  9.51    8.69
warplan        1.20       ----  2.63    1.06
protein_all    6.81       ----  9.49    8.64
protein_1st    2.78       ----  4.10    3.12
fly_pan        6.88       ----  ----    ----
scanner        5.47       ----  ----    ----
cipher         5.65       ----  ----    ----

Performance

Pgm        map   8queen  Xword  8queenp  zebra  flypan
Prolog     5003  383146  6377   133612   19404  10539
Andorra-I  1047  214918  835    8496     5757   1517

Reduction in search space

Performance: bt_cluster

Performance: chat-80

Performance: floorplan design

Applications
- Optimisation Problems
- Databases
- Natural Language Processing
- Data Mining
- Constraint Satisfaction Problems
- …

Conclusions
- Logic programming: high level of abstraction
- Favours Implicit Parallelism
- Several applications
- Good performance on small to medium parallel architectures
- High performance is coming!

Future Work
- More efficient methods to combine and + or parallelism
- Scheduling is an important issue
- Sophisticated compiler support
- Memory management
- Parallel constraint logic programming
- Efficient cluster implementations
- Applications

Future Work

Ideal System


Perspectives

