Tools for Managing Big Data Analytics on z/OS
Mike Stebner, Joe Sturonas PKWARE, Inc.
Wednesday, March 12, 2014
Session ID 14948
Test link: www.SHARE.org
Introduction Heterogeneous Analysis
Addressing the process of packaging and transferring z/OS-based information to an off-board analytic platform in an Effective, Cost-efficient and Secure manner.
What are some major hurdles that exploitation of advanced System z facilities can overcome in this venue?
Introduction Heterogeneous Analysis
• Data Transformation
  • Code page differences (EBCDIC/ASCII)
  • Data Structures (Binary, Endian mode numerics, Parsing)
• Portability between dissimilar file system formats
  • Data Packaging (multiple discrete components)
  • Data Protection
  • Data Volume
    • Total raw size
    • Number of exchanges
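The transformation hurdles listed above can be made concrete with a short sketch. It assumes a hypothetical 15-byte fixed-length record (an 8-byte EBCDIC name, a 4-byte big-endian binary integer, and a 3-byte packed-decimal amount) and code page 037; neither the layout nor the code page comes from the slides.

```python
import struct

def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for b in raw[:-1]:
        digits.append(b >> 4)
        digits.append(b & 0x0F)
    digits.append(raw[-1] >> 4)
    sign = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    # 0xB and 0xD are the negative sign nibbles; everything else is treated as positive
    return -value if sign in (0x0B, 0x0D) else value

def parse_record(record: bytes) -> dict:
    """Parse the hypothetical layout: name (EBCDIC), count (big-endian), amount (packed)."""
    name = record[0:8].decode("cp037").rstrip()      # code page translation
    count = struct.unpack(">i", record[8:12])[0]     # big-endian binary numeric
    amount = unpack_comp3(record[12:15])             # packed-decimal numeric
    return {"name": name, "count": count, "amount": amount}

# Build a sample record the way it might look on z/OS (0x40 is the EBCDIC space)
sample = (
    "SHARE".encode("cp037").ljust(8, b"\x40")
    + struct.pack(">i", 14948)
    + bytes([0x12, 0x34, 0x5D])
)
print(parse_record(sample))
```

A plain byte-for-byte EBCDIC-to-ASCII translation of the whole record would corrupt the binary and packed fields, which is why the parsing has to be field-aware.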
Finding the Sweet Spot
What is the business impact of selected designs and facilities?
Focus on experiences with System z Facilities that help address two areas
• Data Transformation
  • Code page differences (EBCDIC/ASCII)
  • Data Structures (Binary, Endian numerics, Parsing)
• Portability between dissimilar file system formats
  • Data Packaging (multiple discrete components)
  • Data Protection - Encryption
  • Data Volume – Hardware Assisted Compression
    • Total raw size
    • Number of exchanges
Data Protection Data-Centric Encryption using ICSF
Machine        Algorithm Supported              Crypto Hardware
z10-EC 2097    DES, 3DES, AES 128, 192, 256     CPACF, CEX2C, CEX3C
z10-BC 2098    DES, 3DES, AES 128, 192, 256     CPACF, CEX2C, CEX3C
z196 2817      DES, 3DES, AES 128, 192, 256     CPACF, CEX3C
z114 2818      DES, 3DES, AES 128, 192, 256     CPACF, CEX3C
zEC12 2827     DES, 3DES, AES 128, 192, 256     CPACF, CEX3C, CEX4C
zBC12 2828     DES, 3DES, AES 128, 192, 256     CPACF, CEX3C, CEX4C
Application Design Cryptographic Design Influences
• Data Exchange Format
  • Collection with associative constructs
• Data Transport (Container Format)
  • In-flight and ‘at rest’ security
  • Authentication and decryption service availability
• Cryptographic Identity and Associated Key Management
  • Dynamic vs. Static Keys
  • Inter-system Key Coordination
• Data Recovery (Contingency Keys)
• Resource Capacity
  • Timeliness of service
Key Exposures – The need for Key Management
Crypto Facilities
[Diagram labels: OpenPGP Keyrings; Native X.509 Certificates; Proprietary Certificate Store; RACF/ACF2/Top Secret; Certificate Cryptographic; X.509 Certificates Public; LDAP Administration; Application Services; ICSF CKDS & PKDS; Certificate Authority; CEXnC / CPACF / Software Crypto]
Data-Centric Encryption ICSF Data Encipherment Algorithms
• RSA PKI Encryption
  • Losing ground for longevity due to the high processing cost of increased key lengths
• Symmetric Clear Key
  • DES class, AES (128 – 256 bit key strength)
  • May be employed with a passphrase-generated key or a CKDS stored key
• Symmetric Protected Key (SYMCPACFWRAP)
• CKDS Secure Key
Symmetric Key Operational Comparison
              “Clear”                        “Protected”                        “Secure”
Assessment    Fast, but Risky                Fast & Secure                      Slow
Runs on       ICSF Software -or-             System z CPACF                     Cryptographic Card
              System z CPACF
Key source    Passphrase Value -or-          ICSF CKDS Registered (encrypted)   ICSF CKDS Registered (encrypted)
              ICSF CKDS Registered (clear)
Leverage ICSF CKDS to Protect Passphrase Derived Keys
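To make "passphrase derived key" concrete, here is a minimal sketch of deriving an AES-length key from a passphrase. The slides do not say which derivation function is used; PBKDF2-HMAC-SHA256, the salt size, and the iteration count are assumptions for illustration, and `derive_key` is a hypothetical helper. No ICSF call is shown: the output is the clear key value that would then need CKDS protection.

```python
import hashlib
import os

def derive_key(passphrase: str, salt: bytes, bits: int = 256,
               iterations: int = 100_000) -> bytes:
    """Derive an AES-sized clear key from a passphrase (assumed: PBKDF2-HMAC-SHA256)."""
    if bits not in (128, 192, 256):
        raise ValueError("AES key length must be 128, 192, or 256 bits")
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=bits // 8
    )

salt = os.urandom(16)                      # stored alongside the data, not secret
key = derive_key("example passphrase", salt)
print(len(key) * 8, "bit key derived")
```

Anyone who learns the passphrase and salt can regenerate this key, which is the exposure that registering the key in the CKDS in encrypted form is meant to reduce.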
Illustrate Registered ICSF CKDS Key Set
CKDS Policy Control – Duplicate Key Value Protection
RACF key ring/certificate with PKDS
Label: MSTEBNERSHARETEST ← RACF Label (r_datalib API access)
Certificate ID: 2QPVweLV4uPFwtXF2fLw8P1A
Status: TRUST
Start Date: 2013/12/17 19:00:25
End Date: 2014/01/18 19:00:24
Serial Number: 10F0F1FF3C718DEE4D24BBEDA47A49D0
Issuer's Name: CN=UTN-USERFirst-Client Authentication and Email.OU=http://www.usertrust.com.O=The USERTRUST Network.L=Salt Lake City.SP=UT.C=US
Subject's Name: [email protected]=Mike Stebner.OU=Corporate Secure Email.OU=Issued through PKWARE E-PKI Manager.O=PKWARE.648 N PLANKINTON AVE.L=MILWAUKEE.SP=WI.53203.C=US
Key Usage: HANDSHAKE
Key Type: RSA
Key Size: 2048
Private Key: YES
PKDS Label: SHARE2014MSTEBNER ← ICSF PKDS Label (implied access)
What is the business impact of selected designs and facilities?
Inherited OpenPGP Data Flow
• Onion layer concept
  • Encryption Layer
  • Compression Layer
  • Literal Data layer
• Data stream packets on each layer
[Diagram: Literal Data Layer, Compression Layer, Encryption Layer]
Consider the Basic Data Flow
Simple copies from phase to phase
Understand OpenPGP Internal Stream Formatting (RFC 2440 or 4880)
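The two inner layers of the stream format can be sketched with the RFC 4880 new-format packet header. This is an illustration only, not a usable OpenPGP implementation: the encryption layer is omitted, the helper names are hypothetical, and the literal packet carries an empty file name and a zero timestamp for brevity.

```python
import struct
import zlib

def new_format_packet(tag: int, body: bytes) -> bytes:
    """Prefix body with an RFC 4880 new-format packet header (tag plus length)."""
    header = bytes([0xC0 | tag])
    n = len(body)
    if n < 192:                                   # one-octet length
        header += bytes([n])
    elif n < 8384:                                # two-octet length
        n -= 192
        header += bytes([(n >> 8) + 192, n & 0xFF])
    else:                                         # five-octet length
        header += b"\xff" + struct.pack(">I", n)
    return header + body

def literal_packet(data: bytes, name: bytes = b"") -> bytes:
    """Literal Data packet (tag 11): format, name length, name, date, data."""
    body = b"b" + bytes([len(name)]) + name + struct.pack(">I", 0) + data
    return new_format_packet(11, body)

def compressed_packet(inner: bytes) -> bytes:
    """Compressed Data packet (tag 8), algorithm 1 (ZIP): a raw Deflate stream."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    body = b"\x01" + comp.compress(inner) + comp.flush()
    return new_format_packet(8, body)

literal = literal_packet(b"A" * 1000)
wrapped = compressed_packet(literal)
print(len(literal), "byte literal packet ->", len(wrapped), "byte compressed packet")
```

Each layer has to frame the output of the layer inside it, which is where the extra data manipulation between phases comes from compared with a simple phase-to-phase copy.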
OpenPGP Data Flow Overhead
Additional data manipulation logic from phase to phase
Illustration of Container Format Influence on Encipherment Facilities
[Matrix: columns Symmetric Keys, X.509 Certificates, OpenPGP; rows RACF/ACF/CA-TSS, ICSF PKDS, ICSF CKDS, FIPS 140-2; each cell rated GOOD, WORK REQUIRED, or NOT AVAILABLE]
Compression Why is it important?
[Diagram: Data acquisition; APPLICATION SERVICES (GCP/zIIP/zEDC); Result: Compressed & Encrypted Data on Target Platform]
Data is offloaded, encrypted, and compressed.
What Compression Facilities are Available on System z?
Software-based
• General CP (e.g. gzip, OpenPGP, PKZIP, zlib)
  • Any viable cross-platform compatible algorithm chosen for implementation
  • Deflate (RFC 1951) is a commonly used algorithm that combines LZ77 sliding dictionary compression with Huffman coding.
• Software using zIIP offload
  • Execute software routines on a System z9 or later
  • Requires APF authorization to run SRB enclave scheduling
  • Provides economic compression, but may not improve latency performance.
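As a small illustration of the software path, the sketch below produces and consumes a raw RFC 1951 Deflate stream with Python's standard zlib binding. The sample records and the helper names `deflate` and `inflate` are made up for the example; nothing here is specific to z/OS.

```python
import zlib

def deflate(data: bytes, level: int = 6) -> bytes:
    """Return a raw RFC 1951 Deflate stream (no zlib or gzip wrapper)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return comp.compress(data) + comp.flush()

def inflate(stream: bytes) -> bytes:
    """Expand a raw Deflate stream produced on any platform."""
    return zlib.decompress(stream, -zlib.MAX_WBITS)

# Hypothetical, highly repetitive record data of the kind LZ77 handles well
records = b"".join(b"ACCT%08d BALANCE 0000100.00 USD\n" % i for i in range(5000))
packed = deflate(records)
assert inflate(packed) == records
print(f"{len(records)} bytes -> {len(packed)} bytes")
```

Because the stream format is fixed by the RFC, data deflated on one platform can be inflated on another regardless of which implementation produced it.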
What Compression Facilities are Available on System z?
Hardware-based
• System z CMPSC Static Dictionary hardware compression
  • Available since the early 1990s
  • Static dictionary LZ77
  • Limited applicability outside of z/OS
• zEnterprise Data Compression (zEDC) hardware
  • New with zEC12 and zBC12 systems
  • PCIe adapter card
  • Implements Deflate algorithm
Compression Facility Functional Comparison
Facility                   Requirements
Software General CP        General CP Capacity
Software on zIIP           System z9, zIIP Capacity (APF)
CMPSC Static Dictionary    Pre-defined data structures
zEDC                       zEC12/zBC12, z/OS 2.1, zEDC Card
[Each facility is also rated for "Portable" and "Generalized Compression" as GOOD, WORK REQUIRED, or NOT AVAILABLE]
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (I)
IBM Announcement; Document Number: ZSB03059USEN
• Implements RFC 1951 Deflate compression
• “When zlib uses zEDC, there can be up to 118X reduction in CPU and up to 24X throughput improvement”
• One or more PCIe cards servicing multiple partitions (15)
• Currently supported only under a native z/OS LPAR
  • Check IBM statements of direction
• Optimized for larger amounts of data
  • Has configurable minimum size limits (4K floor)
• PTFs available for z/OS 1.12 and 1.13 to inflate (decompress) data
  • Also see SMP/E FIXCAT(IBM.Function.ZEDC)
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (III)
• System Use Cases
  • SMF
• Phased Roll-out intentions
  • BSAM/QSAM (infrastructure layer)
  • DFSMSdss™/DFSMShsm™ backup/restore
  • z/OS Java™ Technology Edition, Version 7
• Detailed SHARE sessions
  • 15209: Experiences with IBM zAware and zEDC
  • 15099: zEnterprise Data Compression: What is it and How Do I Use it? (Wed. 4:30 PM)
  • 15080: z/OS zEnterprise Data Compression Usage and Configuration
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (IV)
• z/OS V2R1.0 MVS Callable Services for HLL (Ch. 13-15)
  • Deflate stream compatible with GZIP, PKZIP, OpenPGP
  • Checks to determine hardware availability
  • IBM-provided compatible C library functions
• APF Authorized API for single-block compress/inflate
• Unauthorized zlib interface (streaming data)
IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (V)
• z/OS V2R1.0 MVS Callable Services for HLL (Ch. 13-15)
• Unauthorized zlib interface (streaming data)
  • Uses zlib.net z_stream programming interface (subset)
  • Raw Deflate Stream or GZIP modes (CRC32 with GZIP)
  • libzz.a include wrapper
  • Controlled by SAF-protected FACILITY class resource FPZ.ACCELERATOR.COMPRESSION
  • z/OS UNIX _HZC_COMPRESSION_METHOD environment control variable
  • May fall back to zlib software routines depending on zEDC requirements, including size limitations
  • PARMLIB IQPPRMxx DEFMINREQSIZE (4K) and INFMINREQSIZE (16K)
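The streaming pattern behind the z_stream interface can be sketched with Python's zlib binding, which wraps the same zlib.net library. This is not the z/OS C interface and it does not exercise zEDC; the helper names and the 64K chunk size are assumptions, with the chunk size chosen to sit above both minimum request sizes quoted above.

```python
import io
import zlib

RAW_DEFLATE = -15        # wbits: headerless RFC 1951 stream
GZIP = 16 + 15           # wbits: gzip header plus CRC32 and length trailer

def stream_compress(src, dst, wbits, chunk_size=64 * 1024):
    """Deflate src into dst one chunk at a time, as a z_stream caller would."""
    comp = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    while chunk := src.read(chunk_size):
        dst.write(comp.compress(chunk))
    dst.write(comp.flush())

def stream_decompress(src, dst, wbits, chunk_size=64 * 1024):
    """Inflate src into dst one chunk at a time."""
    decomp = zlib.decompressobj(wbits)
    while chunk := src.read(chunk_size):
        dst.write(decomp.decompress(chunk))
    dst.write(decomp.flush())

payload = bytes(range(256)) * 2000
packed = io.BytesIO()
stream_compress(io.BytesIO(payload), packed, GZIP)
restored = io.BytesIO()
stream_decompress(io.BytesIO(packed.getvalue()), restored, GZIP)
assert restored.getvalue() == payload
```

In GZIP mode the trailer carries a CRC32 of the uncompressed data, so the receiving platform can verify the payload; raw Deflate mode leaves integrity checking to the enclosing container.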
IBM zEnterprise Data Compression PKWARE Early Test Program Experience
• Objective
  • Assess compression using software GCP, zIIP and zEDC
• zEC12
  • 5 General CPs, 2 zIIPs, 1 zEDC
• Workloads – Single system (no LPAR sharing of zEDC)
  • “Large” (1 GB+) linear with multiple parallel (80 concurrent)
  • “Small” (256 KB) high volume
• Metrics
  • Elapsed Time
  • Processor time
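The two metrics can be collected with a harness along these lines. This is a generic sketch, not the test program used in the study: `measure` is a hypothetical helper, the payload is synthetic, and software zlib stands in for whichever facility is under test.

```python
import time
import zlib

def measure(func, payload: bytes, repeats: int = 5) -> dict:
    """Run func(payload) repeatedly and report elapsed and processor time."""
    wall_start = time.perf_counter()     # elapsed (wall-clock) time
    cpu_start = time.process_time()      # CPU time consumed by this process
    for _ in range(repeats):
        out = func(payload)
    return {
        "elapsed_s": time.perf_counter() - wall_start,
        "cpu_s": time.process_time() - cpu_start,
        "in_bytes": len(payload),
        "out_bytes": len(out),
    }

small = bytes(range(256)) * 1024         # 256 KB, matching the "small" workload size
result = measure(zlib.compress, small)
print(result)
```

Tracking both numbers matters because an offload facility can cut processor time sharply while elapsed time stays bound by I/O or by other application processing.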
zEDC Operations
Console Display General PCIE Status
zEDC Operations
Display zEDC PCIE Adapter Status
zEDC Operational Monitoring (II)
zEDC Processing Characteristics
• Multi-tasking with the zlib API is available
• zlib API may not run on the zEDC hardware (per design)
  • Different minimum buffer size thresholds for deflate & inflate
• Only one ‘level’ of zEDC Deflate compression
  • 9 levels available in zlib software
  • Internal implementations of RFC 1951 Deflate may differ
  • May experience varying compression ratios (based on level) right around the minimum buffer size restriction.
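A rough way to reason about these characteristics is sketched below. The threshold constants are the 4K and 16K values quoted for DEFMINREQSIZE and INFMINREQSIZE; treating a request at or above the floor as hardware-eligible is an assumption, as are the helper names and the definition of the ratio as space saved. The level comparison uses software zlib only.

```python
import zlib

DEFMINREQSIZE = 4 * 1024     # deflate floor quoted in the slides
INFMINREQSIZE = 16 * 1024    # inflate floor quoted in the slides

def expected_path(request_size: int, inflate: bool = False) -> str:
    """Guess which path a request takes (assumption: at or above the floor is eligible)."""
    floor = INFMINREQSIZE if inflate else DEFMINREQSIZE
    return "hardware-eligible" if request_size >= floor else "software"

def space_saved(data: bytes, level: int) -> float:
    """Fraction of the input removed by software Deflate at the given level."""
    return 1.0 - len(zlib.compress(data, level)) / len(data)

sample = b"".join(b"2014-03-12 SESSION 14948 ROW %06d OK\n" % i for i in range(2000))
for level in (1, 6, 9):
    print(level, round(space_saved(sample, level), 3))
print(expected_path(2048), expected_path(8192), expected_path(8192, inflate=True))
```

An application whose buffers straddle a threshold may therefore see output produced by two different Deflate implementations, which is one plausible source of the varying ratios noted above.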
IBM zEnterprise Data Compression PKWARE Early Test Program Experience
Initial Results Overview (I)
• zEDC sustained 1 GB+ per second of raw compression
• zEDC capacity exceeded application resource constraints
  • The effects of I/O and application processing prevented saturation of zEDC
• Under appropriate conditions, zIIP met or exceeded application performance when compared to zEDC.
• Optimized zlib C routines showed benefits over the libzz.a wrapper code under some conditions.
  • Small files under the minimum buffer size
  • Inflation
IBM zEnterprise Data Compression PKWARE Early Test Program Experience
Initial Results Overview (II)
• ETP limitations of first implementation identified
  • Buffer allocation issues
  • Buffer release
  • Rejected concurrent requests for the same size buffer
• Compression ratio (77% vs. 89% for software implementations)
Effect of Resource Availability zEDC vs. zIIP
Incorporate Design with Facility Transactional Example (1.5mb each)
Summary Slide
• The Mainframe is typically the source of record for critical business data
• Data needs to move off the mainframe quickly, efficiently and securely.
• Numerous facilities on z/OS exist to make this quick, efficient and secure – zIIP, Crypto Express4S, CPACF, zEDC
• Proper Transformation is critical to reduce hardware dependencies and facilitate long term viability