Copyright © 2003, SAS Institute Inc. All rights reserved.
New directions in SAS Enterprise Miner® for SAS® 9.1
Sascha Schubert, Product Manager Data Mining, SAS EMEA
David Duling, EM Development Director, SAS Inc.

Agenda
• Data Mining in the Intelligence Value Chain
• Objectives for SAS 9.1
  − Usability
  − Scalability
  − Interoperability
  − Manageability
• Demo

Data Mining in the Intelligence Value Chain
Product Manager, Data Manager, Call Center, Data Miner
Data Miner
• Builds tables from queries
• Creates statistical models
• Passes models to consumers
Consumers of data mining
• BI planning
• IT scoring
• Front office automation

Usability
• SAS Client - EM 4.3: business analysts & statisticians
• Java Client - EM 5.1: business analysts & statisticians
• SAS Code Node: add custom analytical functionality

EM 4.3 - SAS Client
− Continuation of our existing client/server system
− New Interactive Grouping node
− Multi-threaded procedures
− Generates SAS, C, and Java score code
− Model registration to the EM Model Repository

EM 5.1 - Thin Client
Enhancements
• Java based
• Multiple active windows
• Integrated graphics
• Consistent controls
• On-line help
• Asynchronous processing
• Model Packages

EM 5.1 Visual Data Exploration
Interactive graphics available for any data at any time

EM 5.1 Model Assessment and Selection Tools
Consistent printed and graphical output for all models

EM 5.1 New Nodes
• Rule Induction - useful for classifying rare events
• AutoNeural - automatically searches for the optimal neural network architecture
• DMINE Regression - builds a model using PROC DMINE
  − Binning, grouping, interaction building
• Principal Component - creates principal components from a covariance or correlation matrix
• StatExplore - creates univariate and multivariate descriptive statistics for the most important variables
• Merge - merges observations from two or more data sets into a single observation in a new data set
• Drop Variables - drops unwanted variables from the data for efficiency
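
As a rough illustration of what the Merge node description amounts to, the Python sketch below combines observations from several data sets into one observation per key. Matching rows on a shared `id` field is an assumption made for this example, and the names `merge_observations`, `demographics`, and `purchases` are hypothetical; this is not Enterprise Miner code.

```python
def merge_observations(key, *data_sets):
    """Combine rows that share the same key value into a single row.

    Assumption: every row is a dict that contains the key field.
    If two data sets carry the same variable, the later one wins.
    """
    merged = {}
    for data_set in data_sets:
        for row in data_set:
            merged.setdefault(row[key], {}).update(row)
    # Return one merged observation per key, ordered by key value.
    return [merged[k] for k in sorted(merged)]


# Hypothetical input tables
demographics = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
purchases = [{"id": 1, "spend": 120.0}, {"id": 2, "spend": 75.5}]

merged = merge_observations("id", demographics, purchases)
```

Each element of `merged` then holds the variables from both tables for one `id`.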

New Procedures
• Support Vector Machines (experimental) - popular algorithm for general classification problems
• Web Path Analysis - provides efficient and scalable mining of frequent paths from click-stream data
• Taxonomy - supports hierarchical associations to populate rules at different levels in the hierarchy
• ARBOR - decision tree procedure that enables interactive training on the server and provides improved performance on disk-resident data
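
To make "mining of frequent paths from click-stream data" concrete, here is a minimal Python sketch that counts contiguous page sequences across sessions. It is not the Web Path Analysis procedure: the function name `frequent_paths`, the fixed path length, the session-level support threshold, and the sample sessions are all assumptions for illustration.

```python
from collections import Counter


def frequent_paths(sessions, length, min_support):
    """Return contiguous page paths of the given length that occur in
    at least `min_support` sessions, with their session counts."""
    support = Counter()
    for pages in sessions:
        # A set ensures each path is counted at most once per session.
        paths = {
            tuple(pages[i:i + length])
            for i in range(len(pages) - length + 1)
        }
        support.update(paths)
    return {path: n for path, n in support.items() if n >= min_support}


# Hypothetical click-stream sessions (ordered page visits)
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "search", "product"],
    ["home", "product", "cart"],
]

result = frequent_paths(sessions, length=2, min_support=2)
```

With these sample sessions, the path `("home", "product")` occurs in only one session and is filtered out, while the other three two-page paths each occur in two sessions.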

New Tree Application
Interactive training on the entire training table

Scalability
• Multi-tasking model training: parallel processing in the PFD (process flow diagram)
• Multi-threaded procedures: PROC DMINE, DMREG, Sort, Summary, ...
• Wide data set support: real-world data with thousands of variables

Enhanced Performance
• Uses MP CONNECT technologies to distribute mining processes across multiple CPUs, providing the ability to run nodes in parallel
• Uses SAS multithreaded procedures
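
The idea of running nodes in parallel can be sketched by grouping the nodes of a process flow into stages whose members do not depend on one another. The Python below is a generic dependency-levelling sketch, not how MP CONNECT schedules work; the function `parallel_stages` and the example flow (including the "Input Data" and "Assessment" node names) are assumptions for illustration.

```python
def parallel_stages(flow):
    """Group nodes into stages; nodes within a stage can run concurrently.

    `flow` maps each node to the list of nodes it depends on.
    Assumption: every dependency also appears as a key of `flow`.
    """
    remaining = {node: set(deps) for node, deps in flow.items()}
    stages = []
    while remaining:
        # Nodes with no unfinished dependencies are ready to run.
        ready = sorted(n for n, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("flow contains a cycle or an unknown dependency")
        stages.append(ready)
        for node in ready:
            del remaining[node]
        for deps in remaining.values():
            deps.difference_update(ready)
    return stages


# Hypothetical process flow: three models trained from one input table
flow = {
    "Input Data": [],
    "Tree": ["Input Data"],
    "Regression": ["Input Data"],
    "Neural Network": ["Input Data"],
    "Assessment": ["Tree", "Regression", "Neural Network"],
}

stages = parallel_stages(flow)
```

In this example the three modeling nodes land in the same stage, so a scheduler with enough CPUs could train them at the same time.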

Performance Tests
• Train four identical models
  − Tree
  − Logistic Regression
  − MLP Neural Network
• %gendata(rows,vars);
  − rows = 1, 4, 16, 64, 256K
  − vars = 4, 16, 64, 256
• Tasks = 1, 2, 4
• N = 3*4*4*3 = 144 models
• Measure real time
[Chart legend: 4 tasks, 2 tasks, 1 task]
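
A timing harness in the spirit of this test might look like the Python sketch below: four identical jobs are run with 1, 2, or 4 concurrent tasks and the elapsed real (wall-clock) time is recorded. Reading "Tasks" as the number of jobs allowed to run at once is an assumption, and `train_stub` is a hypothetical placeholder workload, not a SAS procedure.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def train_stub(rows, variables):
    """Placeholder workload standing in for training one model."""
    total = 0
    for r in range(rows):
        for v in range(variables):
            total += (r * v) % 7
    return total


def time_four_models(n_tasks, rows, variables):
    """Run four identical jobs with at most n_tasks running at once.

    Returns (elapsed wall-clock seconds, list of job results).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        futures = [pool.submit(train_stub, rows, variables) for _ in range(4)]
        results = [f.result() for f in futures]
    return time.perf_counter() - start, results


timings = {}
for n_tasks in (1, 2, 4):
    elapsed, results = time_four_models(n_tasks, rows=1000, variables=16)
    timings[n_tasks] = elapsed
```

Note that CPython threads do not speed up CPU-bound pure-Python work, so this sketch shows the measurement structure only; a process pool or a real multithreaded procedure would be needed to observe scaling.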

EM 5.1 Scalable Performance on SMP Systems
Multiple tasks running concurrently in one process flow diagram
Linear scalability!

Interoperability
• SAS Metadata Server: exchange metadata between SAS solutions
  − ETL Studio table definitions: integration with IT
  − Model Repository: push models to SAS solutions
• Model Repository Viewer: Web interface to model results
• Batch processing: SAS application development
• Java API: Java application development
• PMML score code: scoring everywhere

ETL Studio Integration with Enterprise Miner
• ETL Studio
  − Define a Process Job to create a table
  − Register the table to the Metadata Server
• Enterprise Miner
  − Read the table from the Metadata Server for mining

Model Repository

Web-enabled model viewing

Solutions Integration
SAS Solutions
• Marketing Automation
• Banking Intelligence
• Credit Scoring
• Process Intelligence
• Genetics / Micro Array
• Human Capital Management
Customer Solutions
• HP-ZLE

Manageability
• True client/server architecture: presentation clients & compute and data management servers
• SAS Management Console: server and user management
• Java Web Start client: zero-install and admin

EM Architecture
[Architecture diagram]
Components: SAS Servers; SAS System; Middleware Server; Web Server; EM 5.1 Java Web Clients; SAS Metadata Server; Web Clients; ETL Studio; SAS/EM 4.3; EM Projects; Data Sources; EM Java Server (optional)
Connection labels: HTTP; SAS/Connect; IOM; SAS/Access - tcp/ip; RMI
Role labels: Builds Models; Collects Data; Deploys Models; Needs Models
Note: IOM is a SAS Integrated Object Model for network communications. We provide Java and Windows interfaces to IOM services.

EM Metadata in the SAS Management Console

GUI deployed through Java Web Start
• GUI deployed through the Internet
• No maintenance needed on the client

EM 5.1 Demo

Questions?