Copyright © 2003, SAS Institute Inc. All rights reserved.

Developing Client/Server Applications to Maximize SAS® 9 Parallel Capabilities

Cheryl Doninger, SAS Institute

The SAS Intelligence Value Chain

Usability

Interoperability

Manageability

Scalability

Scalability – SAS 9: SAS Scalable Architecture

[Diagram labels: Clients; Scalable Servers (Stored Process, OLAP, Metadata); Scalable SAS/ACCESS (Oracle, DB2, Sybase, Teradata); Scalable Performance Data Server; Threaded Procedures (Thread 1, Thread 2, … Thread N); SAS and SAS CONNECT sessions on CPU 1 and CPU 2 and on a remote host, linked by piping]

Scalability with SAS 9

parallel threads

parallel processes

Single Threaded V8 SAS

Multiple Processes

SAS 9 Multiple Threads

Multiple Processes and Multiple Threads

A Very Satisfied MP CONNECT Customer…

“I've been dreaming of this capability within SAS for approximately 12 years. The first day back in the office after the course, within 30 minutes I was able to apply the technique to an existing program and reduce processing time by over 50%.”

David Walker

Centers for Disease Control and Prevention

Independent Parallelism

[Diagram: Data Source A with its Proc Sort and Data Source B with its Proc Sort, drawn side by side along an elapsed-time axis starting at 0]
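
The idea of independent parallelism can be sketched outside SAS. The following is a minimal Python analogy, not MP CONNECT syntax: two sorts that share no data are submitted at the same time and the caller waits for both. The function names are hypothetical, and Python threads stand in for the separate SAS sessions that MP CONNECT would start.

```python
from concurrent.futures import ThreadPoolExecutor


def sort_source(rows):
    """Stand-in for one Proc Sort step: return a sorted copy of the input."""
    return sorted(rows)


def run_independent(source_a, source_b):
    """Sort two unrelated data sources at the same time and wait for both."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Neither task needs the other's output, so both can start at time 0.
        future_a = pool.submit(sort_source, source_a)
        future_b = pool.submit(sort_source, source_b)
        # result() blocks until the corresponding task has finished.
        return future_a.result(), future_b.result()


sorted_a, sorted_b = run_independent([5, 3, 9, 1], [8, 2, 7, 4])
print(sorted_a, sorted_b)
```

Because the two tasks are independent, elapsed time is bounded by the slower of the two rather than by their sum, provided the hardware has spare CPU and I/O capacity.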

MP CONNECT – Independent - Scale Up

[Diagram: SMP server with tasks that execute simultaneously: "Read and Summarize SAS Data" (shown twice) and "Extract Oracle Data", built from DATA steps and PROC steps]

MP CONNECT – Independent - Scale Out

[Diagram: Parent SAS Session connected to SAS Session 2 through SAS Session n]

Piping – Worth the Price of Admission to SAS 9…

“…piping is the big one that has made a difference to our day - jobs have been cut by up to 60% meaning we can deliver in a much quicker time frame at end of month.”

Charles Pollack

SUNCORP METWAY

Pipeline Parallelism

[Diagram: Data Step and Proc Sort stages drawn along an elapsed-time axis starting at 0, with the stages overlapping in time]
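
Pipeline parallelism differs from independent parallelism in that the second step consumes the first step's output while it is still being produced. The following is a minimal Python analogy, not SAS piping syntax: a bounded queue plays the role of the pipe, a producer thread plays the DATA step, and a consumer thread plays the summarizing PROC step. All names and the doubling transformation are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the stream


def data_step(raw_rows, pipe):
    """Producer: transform each row and push it into the pipe immediately."""
    for row in raw_rows:
        pipe.put(row * 2)  # stand-in transformation
    pipe.put(_DONE)


def proc_step(pipe, out):
    """Consumer: summarize rows as they arrive, without waiting for all of them."""
    total = 0
    while True:
        row = pipe.get()
        if row is _DONE:
            break
        total += row
    out.append(total)


def run_pipeline(raw_rows):
    pipe = queue.Queue(maxsize=4)  # small buffer, so the two steps must overlap
    out = []
    producer = threading.Thread(target=data_step, args=(raw_rows, pipe))
    consumer = threading.Thread(target=proc_step, args=(pipe, out))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return out[0]


print(run_pipeline([1, 2, 3]))
```

No intermediate data set is written in full between the two steps, which is where the time saving in a piped job would be expected to come from.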

MP CONNECT – Piping – Scale Up

[Diagram: SMP server with overlapped execution: "Read and Summarize SAS Data" carried out by a DATA step feeding a DATA step feeding a PROC step]

MP CONNECT – Piping – Scale Out

[Diagram: Parent SAS Session connected to SAS Session 2 through SAS Session n]

When to Use MP CONNECT

long running jobs

independent data sources

independent tasks

tasks that can be overlapped

utilize SMP hardware or processors on a network

Considerations for MP CONNECT

I/O bottlenecks

WORK library

CPU bottlenecks

Gartner’s Definition of Grid Computing

“a grid is a collection of resources owned by multiple organizations that is coordinated to allow them to solve a common problem”

MP CONNECT in Cluster Environment

32 node Linux cluster / MOSIX

1 GHz Intel P3 processors

1 GB RAM per processor

100 Mb backplane

MP CONNECT in Cluster Environment

i Host  Host    No. Iter  Work Time/20 Iter  Estimated Time for Entire Problem  Distribution Efficiency  Wait Time/20 Iter
4       task4   3940      0:04:17            446:05                             96%                      0:00:07
17      task17  3920      0:04:17            446:03                             96%                      0:00:09
18      task18  3900      0:04:17            445:26                             96%                      0:00:10

Total elapsed time: 14:30:03

Cumulative working time: 447:46

Cumulative waiting time: 15:14:54

Scaling efficiency: 96.50%

MP CONNECT in Grid Environment

100 heterogeneous nodes

Windows 2000 (W2K), Windows XP (WXP), and a variety of Unix operating systems

combination of V8 SAS and SAS 9

MP CONNECT in Grid Environment

i Host  Host    No. Iter  Work Time/30 Iter  Estimated Time for Entire Problem  Distribution Efficiency  Wait Time/30 Iter
7       ld055   570       0:15:18            1060:11                            204%                     0:00:00
48      in028   1230      0:06:52            476:07                             91%                      0:00:00
97      hd204   3120      0:02:40            184:42                             35%                      0:00:01

Total elapsed time: 5:12:19

Cumulative working time: 468:41

Cumulative waiting time: 0:39:42

Scaling efficiency: 90.04%
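
The slides do not state how scaling efficiency is calculated. The reported figures are, however, consistent with dividing cumulative working time by the number of nodes times total elapsed time. The Python sketch below checks that reading against both runs. It assumes that the two-part times (447:46, 468:41) are hours:minutes, that the three-part times are hours:minutes:seconds, and that all 32 cluster nodes and all 100 grid nodes took part. The helper names are hypothetical.

```python
def to_seconds(text):
    """Parse 'H:MM' (assumed hours:minutes) or 'H:MM:SS' into seconds."""
    parts = [int(p) for p in text.split(":")]
    if len(parts) == 2:
        parts.append(0)  # no seconds field given
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds


def scaling_efficiency(working, elapsed, nodes):
    """Fraction of the available node-time that was spent doing work."""
    return to_seconds(working) / (nodes * to_seconds(elapsed))


cluster = scaling_efficiency("447:46", "14:30:03", nodes=32)
grid = scaling_efficiency("468:41", "5:12:19", nodes=100)
print(f"cluster {cluster:.2%}, grid {grid:.2%}")
```

Under these assumptions the script prints 96.50% for the cluster and 90.04% for the grid, matching the slide values.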

Combining Parallel Processes and Threads

SAS 9 Partitioned Data Model

[Diagram comparing two storage layouts]

SAS® 8: data, index, metadata

SAS 9 SPDE Engine & SPD Server®: data1, data2, data3, data4; Bitmap/B-tree hybrid index (shown twice); metadata

MP CONNECT and SPDE Engine

single input, 4.8 GB, 20 million obs

two data steps, two PROC FREQs

4-way Unix box

six iterations of implementation

MP CONNECT and SPDE Engine

Data Step 1 Data Step 2 Proc Freq 1 Proc Freq 2

partitioned input

partitioned USER=

4 parallel sessions
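
The slide does not show how the work was divided among the four sessions. As a general illustration of partitioned input read by parallel workers, the Python sketch below splits one input into four partitions, counts frequencies in each partition concurrently, and merges the partial counts. It is an analogy only: the round-robin split, the function names, and the use of threads in place of SAS sessions are all assumptions, and it does not reproduce SPDE behavior.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows round-robin into n_parts lists (a stand-in for data1..dataN)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def freq(rows):
    """Stand-in for a frequency count over one partition."""
    return Counter(rows)


def parallel_freq(rows, n_parts=4):
    """Count each partition in its own worker, then merge the partial counts."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial_counts = list(pool.map(freq, parts))
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


print(parallel_freq(list("aabbbc")))
```

The merged result is the same as a single pass over the whole input, since frequency counts from disjoint partitions simply add.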

MP CONNECT and SPDE Engine

total improvement in elapsed time of 65%

MP CONNECT and Threaded SUMMARY

two raw input files (~1.5 GB each)

8-way 900 MHz Unix box

two data steps, two PROC SUMMARYs, and a merge

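MP CONNECT and Threaded Summary

[Diagram: Sales.txt and Goals.txt each pass through a Data step and a Summary, and the two results are combined in a Merge; the diagram also carries the label "1 Step"]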

MP CONNECT and Threaded Summary

total improvement in elapsed time of 70%

Considerations for Combining MP CONNECT and Threading

tune threads per session on SMP

−CPUCOUNT

−THREADS/NOTHREADS

−OS processor set command

depends on

−application

−data

−hardware configuration
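
One simple starting point for this tuning, offered here as an assumption rather than a SAS recommendation, is to divide the available CPUs evenly among the parallel sessions so that the threaded procedures in each session do not oversubscribe the machine. The Python helper below is hypothetical and only does the arithmetic; the result would then be a candidate value to try and measure, since the right setting depends on the application, the data, and the hardware.

```python
def threads_per_session(total_cpus, sessions):
    """Even split of CPUs across sessions, never fewer than one thread each."""
    if total_cpus < 1 or sessions < 1:
        raise ValueError("total_cpus and sessions must both be at least 1")
    return max(1, total_cpus // sessions)


# Example: an 8-way box running two parallel sessions.
print(threads_per_session(8, 2))
```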

For More Info…

Scalability and Performance Community

−http://support.sas.com/rnd/scalability
