Teradata Leaders in Enterprise Data Warehousing
John Tulley, Vice President, Teradata Canada
Email: [email protected]
Phone: 905-478-8997
NCR Corporate Overview
• Fortune 500 company
• Global operations in more than 100 countries & territories
• 28,500 employees
• 2004 Revenue $5.984B
• 1999-2004 >51% revenue growth
2004 Revenue by Business Unit (chart legend): Teradata, Financial, Retail, Systemedia, Customer Service, Payment & Imaging, Other
Teradata Data Warehouse
Retail Solutions
Financial Solutions
Worldwide Customer Services
Systemedia
Top Industry Leaders Rely on Teradata
• Leading industries
> Banking
> Government
> Insurance & Healthcare
> Manufacturing
> Retail
> Telecommunications
> Transportation Logistics
> Travel
• World class customer list
> More than 800 customers
> Over 1200 installations
• Global presence
> Over 100 countries
• 4,000 world-wide professionals dedicated to data warehousing
Teradata Top 10 (FORTUNE Global Rankings, July 2005)
• 50% of Top 10 Global Retailers
• 60% of Top 10 Most Admired Global Companies
• 80% of Top 10 Global Telco Firms
• 60% of Top 10 Global Airlines
• 50% of the Top 10 Transportation Logistic Firms
The Teradata Difference
What We Do…
• Enterprise data warehouse
• Windows 2003/Unix/Linux scales from Intel laptop to MPP
• Analytic capabilities transform data into information
• Extreme high availability
• Industry leader in analytical applications
• Integration with SAP, Siebel, Hyperion
• Partnerships include Accenture, BearingPoint, Capgemini, Deloitte, EDS, Lockheed Martin
• Strong customer references
All we do is Data Warehousing!
Figure: Gartner ASEM Ratings 2004 (scale: Best to Worst)
Platforms rated:
• HP: HP9000, HP-UX, Oracle
• IBM SP: RS/6000, AIX, DB2 EEE
• Sun: Enterprise, Solaris, Oracle
• Generic: Intel IA-32, Win2000, SQL Server
• Unisys: ES7000, Win2000, SQL Server
• IBM: S/390, OS/390, DB2 EEE
• Compaq: Alpha, Tru64, Oracle
• Teradata
Criteria: Data Mgmt.; Query Perform.; Scalability and Suitability; Concurrent Query Mgmt.; DW Track Record; Data Admin.
Source: Gartner ASEM Ratings 2004
Teradata - the recognized leader in data warehousing and high-performance decision analytics. (Gartner ASEM)
Industry Leadership Recognition
• Gartner - “Dominant Lead” – 5th Consecutive Year
> “DBMS is surely the place where NCR Teradata sets the gold standard. As in previous years, the Teradata score was 98%, leaving little scope (and need) for improvement.”
– Gartner's [Application Server Evaluation Model] ASEM Data Warehouse Server Update, A. Butler, K. Strange, J. Enck, M. Chuba, November 2004
> “Teradata [database management system] DBMS capabilities remain unchallenged by its competitors in the market.”
– Gartner’s Magic Quadrant for Data Warehouse DBMSs, 2004, Kevin H. Strange, June 2004
> “Teradata continues to drive a strong vision.”
– Gartner Research, MarketScope: Customer Relationship Marketing, 1Q04, G. Herschel, J. Radcliffe, Feb 2004
> Gartner Dataquest recognized Teradata as the growth leader in the RDBMS market, with above-market growth of 17.4% (2005)
> Teradata is rated “Positive” in Gartner’s MarketScope for Campaign Management, the highest rating awarded (2005)
• META Group
> “Teradata has displayed unmatched (but often copied) strength of vision and focus in the [enterprise data warehouse] EDW market.”
– METAspectrum Market Summary, Enterprise Data Warehouse METAspectrum(SM) Evaluation, 2004
Industry Awards and Recognition - 2005
BI Excellence Award (Sponsor: Gartner Group)
• Continental Airlines - winner
• Cardinal Health - finalist
Technology Leadership Award (Sponsor: Frost & Sullivan)
• Teradata selected for Leadership Award – CRM Analytics
TDWI Best Practices Award
• sunrise TDC Switzerland AG – winner - Customer Relationship Management
1to1 Impact Award (Sponsor: Peppers & Rogers)
• Continental Airlines recognized as Technology Optimization winner
Editors’ Choice Awards (Sponsor: Intelligent Enterprise)
• Teradata selected for the “Dozen” Most Influential BI Companies
• Winner, Customer Analytics category
NEXUS Awards (Sponsor: New Zealand Direct Marketing Association)
• Bank of New Zealand, silver award - data mining & analytics; bronze award - data management
Government Agencies with Teradata Presence
• US Air Force
• US Navy
• US Transportation Command
• Defense Commissary Agency
• Army, Air Force Exchange
• Intelligence Community
• US Postal Service
• Italian Post Office
• Dept. of Justice
• Dept. of Housing and Urban Development
• Dept. of Agriculture
• Arizona, Iowa, Florida, Texas, Illinois, New York, Utah, Michigan
• RAMQ – Quebec
• Australian Tax Office
• South African Tax Office
Teradata Solutions Methodology
Figure: grid of services arranged by phase, under Project Management.
Phases: Strategy, Research, Analyze, Design, Equip, Build, Integrate, Manage
Services shown: Opportunity Assessment; Enterprise Assessment; Business Value; EDW Roadmap; Value Assessment; Application Requirement; Logical Model; Data Mapping; Infrastructure & Education; System Architecture; Package Adaptation; Custom Component; Test Plan; Education Plan; Hardware Platform; Software Platform; Support Management; Operational Mentoring; Technical Education; Physical Database; ECTL Application; Information Exploitation; Operational Applications; Backup & Recovery; Components for Testing; System Test; Production Install; Initial Data; Acceptance Testing; User Curriculum; User Training; Capacity Planning; System Performance; Business Continuity; Data Migration; HW/SW Upgrade; Availability SLA; System DBA; Solution Architect; Help Desk; Information Sourcing; Technology Neutral Services
Teradata’s success is the combination of hardware, software and methodology
Data Warehouse Needs Will Evolve
Figure axes: Workload Complexity (vertical) vs. Data Sophistication (horizontal)
Stages:
• REPORTING: WHAT happened?
• ANALYZING: WHY did it happen?
• PREDICTING: WHAT WILL happen?
• OPERATIONALIZING: WHAT IS happening?
• ACTIVATING: MAKE it happen!
Workload labels on the chart: Primarily Batch & Some Ad Hoc Reports; Increase in Ad Hoc Analysis; Analytical Modeling Grows; Continuous Update/Short Queries; Event-Based Triggering Takes Hold
Workload legend: Batch; Ad Hoc; Analytics; Continuous Update/Short Queries; Event-Based Triggering
• Query complexity grows
• Workload mixture grows
• Data volume grows
• Schema complexity grows
• Depth of history grows
• Number of users grows
• Expectations grow
Enterprise Analytical Topologies
(Diagram elements: Sources, Users, Marts, Middleware, ODS, DW)

Data Mart Centric: Independent Data Marts
Pros
• Easy to Build Organizationally
• Easy to Build Technically
Cons
• Business Enterprise view unavailable
• Redundant data costs
• High ETL costs
• High App costs
• High DBA and operational costs

Virtual, Distributed, Federated: Leave Data Where it Lies
Pros
• No need for ETL
• No need for separate platform
Cons
• No ETL
• Meta data issues
• Network bandwidth and join complexity issues
• Only viable for low volume

Hub-and-Spoke Data Warehouse: Dependent Data Marts
Pros
• Allows easier customization of user interfaces & reports
Cons
• Business Enterprise view challenging
• Redundant data costs
• High DBA and operational costs
• Data latency
• ODS duplication

Enterprise Data Warehouse: Centralized Integrated Data With Direct Access
Pros
• Enterprise view
• Design consistency & data quality
• Data reusability
Cons
• Requires vision
• Requires Data Owners to willingly participate
Typical Data Warehouse Architecture
(Diagram layers: Transaction Systems; Operational Data Stores; Central store, Hub, Clearing house; Data Marts)
What’s wrong with this picture?
1. There are too many copies of the data. Will they all be the same?
2. There is too much latency - too long to get the data to the people who need it. Everyone sees different, inconsistent points in time.
3. The solution is too complex. Every line on the chart represents an ETL process that requires $$ for Life Cycle Maintenance.
4. The solution is too expensive. There are numerous components that lead to increased costs. Costs often hidden in distributed organization.
Teradata’s Enterprise Data Warehouse: An Integrated, Centralized Data Warehouse Solution
Diagram components: Transactional Data; Transactional Users; Decision Users; Data Transformation; Optional ETL Hub; Operational Data Store (ODS); “Enterprise” Data Warehouse; Data Replication; Data Marts; Optional (label shown twice)
User types: Strategic Users; Tactical Users; Reporting OLAP Users; Event-driven/Closed Loop; Data Miners
Supporting layers: Enterprise, System, & Database Management; Metadata; Logical Data Model; Physical Data Base Design; Middleware/Enterprise Message Bus; Business & Technology – Consultation Support & Education Services
Normalized model (entities and attributes):
• CUSTOMER: CUSTOMER NUMBER, CUSTOMER NAME, CUSTOMER CITY, CUSTOMER POST, CUSTOMER ST, CUSTOMER ADDR, CUSTOMER PHONE, CUSTOMER FAX
• ORDER: ORDER NUMBER, ORDER DATE, STATUS
• ORDER ITEM BACKORDERED: QUANTITY
• ITEM: ITEM NUMBER, QUANTITY, DESCRIPTION
• ORDER ITEM SHIPPED: QUANTITY, SHIP DATE
Dimensional model (star schema):
• SALES: PERIOD KEY, PRODUCT KEY, CUSTOMER KEY, MARKET KEY, DOLLARS, UNITS
• PERIOD: PERIOD KEY, DATE, DAY, MONTH, YEAR, QUARTER, TRIMESTER
• CUSTOMER: CUSTOMER KEY, CUSTOMER NAME, CUSTOMER CITY, CUSTOMER POST, CUSTOMER ST, CUSTOMER ADDR, CUSTOMER PHONE, CUSTOMER FAX
• PRODUCT: PRODUCT KEY, PRODUCT NAME, DISTRIBUTOR, PRODUCT DESCRIPTION, PRODUCT HEIGHT, PRODUCT WIDTH, PRODUCT DEPTH, PRODUCT WEIGHT
• MARKET: MARKET KEY, CITY, STATE, ZIP, ZIP4, DISTRICT, REGION, COUNTRY
Other diagram labels: Optional ELT; Logical (Views) Application; Dimensional Co-Located; Dependent DM; Virtual Views; Single version of data
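The "Virtual Views" and "Single version of data" labels suggest that dimensional shapes can be presented as views over the normalized tables instead of as copied data marts. The following is a minimal sketch of that idea, not Teradata SQL: it uses Python's built-in sqlite3 as a stand-in database, a subset of the slide's CUSTOMER / ORDER / ORDER ITEM SHIPPED columns, and a hypothetical view name (sales_fact) and sample rows that are assumptions for illustration.

```python
import sqlite3

# SQLite (in memory) stands in for the warehouse here; this is not Teradata SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized base tables: a subset of the slide's CUSTOMER / ORDER / ORDER ITEM SHIPPED model.
-- ORDER is a reserved word in SQL, so the table is named orders (an assumption).
CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, customer_name TEXT, customer_city TEXT);
CREATE TABLE orders (order_number INTEGER PRIMARY KEY, customer_number INTEGER, order_date TEXT, status TEXT);
CREATE TABLE order_item_shipped (order_number INTEGER, item_number INTEGER, quantity INTEGER, ship_date TEXT);

-- Hypothetical dimensional view: a SALES-like fact computed from the base tables at query time.
CREATE VIEW sales_fact AS
SELECT o.customer_number AS customer_key,
       s.item_number     AS product_key,
       s.ship_date       AS period_key,
       SUM(s.quantity)   AS units
FROM order_item_shipped AS s
JOIN orders AS o ON o.order_number = s.order_number
GROUP BY o.customer_number, s.item_number, s.ship_date;
""")

# Illustrative sample data (invented for this sketch).
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [(1, "Acme", "Toronto"), (2, "Globex", "Ottawa")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)",
                 [(100, 1, "2005-01-05", "SHIPPED"), (101, 2, "2005-01-06", "SHIPPED")])
conn.executemany("INSERT INTO order_item_shipped VALUES (?, ?, ?, ?)",
                 [(100, 10, 5, "2005-01-07"), (100, 10, 3, "2005-01-07"), (101, 20, 2, "2005-01-08")])


def sales_rows():
    """Read the dimensional view; no separate copy of the data exists."""
    query = ("SELECT customer_key, product_key, period_key, units "
             "FROM sales_fact ORDER BY customer_key, product_key, period_key")
    return conn.execute(query).fetchall()


rows_before = sales_rows()

# A new shipment lands in the base table only; the view reflects it immediately.
conn.execute("INSERT INTO order_item_shipped VALUES (101, 20, 4, '2005-01-08')")
rows_after = sales_rows()

print(rows_before)
print(rows_after)
```

Because the view is evaluated against the base tables, there is nothing to reload or reconcile after the extra shipment is inserted, which is the point the slide makes against maintaining separate physical marts.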
TERADATA is an Open System
Figure: TERADATA at the center, surrounded by integration interfaces.
Interfaces shown: CORBA, ODBC, IIOP, .NET, OLE-DB, ASP, WEB, JAVA, JDBC, JSP, EJB, JMS
Integration components: TERADATA Utilities, Queues, Messages, Adapter(s), Message Bus, Publish & Subscribe, TAP Appl
Virtually any application or middleware framework can be integrated with TERADATA!
Teradata Active Data Warehouse in action
1. Continuous transaction feeds on supplies usage
2. Conditioning & loading of transaction data
3. Stored procedures: trigger-based event detection sends alert to Warfighter, Warfighter Support, & DOD Supplier via MSTR Narrowcaster
4. And/or DOD Vendor notified and reorders
5. Warfighter receives alert via Secure Blackberry, adjusts Battle Plans to align with rush replenishment
Diagram components:
• Environments: Decision Making Environment; Transactional Environment
• Data Acquisition: Fast Load, Multi Load; T-Pump, MQ Adapter; Ascential, Informatica
• Information Exchange: Fast Export; Web Services; Enterprise Application Integration; T-Pump, MQ Adapter; Direct Data Access
• TERADATA: Stored Procedures, Q Tables, UDF, Triggers
• Parties and networks: Legacy Systems; Secure DOD Network; Front Line; Business Services; Secure Wireless Warfighter Support; Base Supply; DOD Supplier
• Middleware and engines: WebSphere; Tibco (EAI); .NET; OLAP Queries; Intel Agents; Event Engine; Rules Engine; Strategic & Tactical Queries
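The trigger-based event detection in step 3 can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for the warehouse, an alert_queue table stands in for a queue table, and the table names, the item name and the reorder-point rule are all hypothetical. Delivery of the alert (the slide mentions MSTR Narrowcaster) is out of scope.

```python
import sqlite3

# SQLite (in memory) stands in for the active data warehouse; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supply_level (item TEXT PRIMARY KEY, on_hand INTEGER, reorder_point INTEGER);
CREATE TABLE alert_queue (item TEXT, on_hand INTEGER, message TEXT);

-- Fires after each usage update and queues one alert at the moment
-- stock crosses from above the reorder point to at or below it.
CREATE TRIGGER low_stock_alert
AFTER UPDATE OF on_hand ON supply_level
WHEN NEW.on_hand <= NEW.reorder_point AND OLD.on_hand > OLD.reorder_point
BEGIN
    INSERT INTO alert_queue VALUES (NEW.item, NEW.on_hand, 'rush replenishment needed');
END;
""")


def record_usage(item, quantity):
    """Apply one continuous-feed usage transaction to the supply level."""
    conn.execute("UPDATE supply_level SET on_hand = on_hand - ? WHERE item = ?", (quantity, item))


conn.execute("INSERT INTO supply_level VALUES ('fuel filter', 100, 20)")
record_usage("fuel filter", 50)  # 50 on hand, above the reorder point: no alert
record_usage("fuel filter", 35)  # 15 on hand, crosses the reorder point: alert queued
record_usage("fuel filter", 5)   # 10 on hand, already below: no duplicate alert

alerts = conn.execute("SELECT item, on_hand, message FROM alert_queue").fetchall()
print(alerts)
```

Comparing the old and new values in the trigger condition is a design choice that keeps the queue from filling with repeat alerts while stock stays low.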
So what is Teradata?
What is Teradata?
• RDBMS designed to run the world’s largest databases
• Latest Intel technology nodes
• UNIX MP-RAS, Windows 2003
• Linux in Fall 2005
• Scales linearly from Laptop to MPP
• Has a parallel-aware optimizer that allows multiple complex queries to run concurrently
• Standard access language (SQL)
• Uses a “Shared-Nothing” architecture
• Unlimited, unconditional parallelism
• Linear Scalability allows for increased workload without decreased throughput.
Teradata Hardware Architecture
• SMP Nodes
> Latest Intel SMP CPUs
> Configured in 2 to 8 node cliques
> Windows, Unix or Linux
• BYNET Interconnect
> Fully scalable bandwidth
> 1 to 1024 nodes
• Connectivity
> Fully scalable
> Channel - ESCON
> LAN, WAN
• Storage
> Independent I/O
> Scales per node
• Server Management
> One console to view the entire system
Diagram: Server Management over SMP Nodes 1-4, each running PEs and AMPs, connected by the BYNET Interconnect.
Teradata Shared Nothing Architecture
• Similar to Large SMP, except Interconnect runs at I/O Rates and not Memory Rates
• Longer Lifetime: I/O Interfaces have a 3-5 Year Lifetime
• Scaling Is By Increasing Link Data Rates and Parallel Links
Diagram: four nodes, each with two processors (P) on an FSB with its own Memory and I/O.
SMP vs. MPP: The Teradata Advantage
• 2-Way SMP
> 1.8 Relative CPUs
> 4 GB Memory
> 3.2 GB/Sec Bus
> 3.2 GB/Sec Memory
> 1.5 GB/Sec I/O
• 4-Way SMP
> 3.1 Relative CPUs
> 4 GB Memory
> 3.2 GB/Sec Bus
> 3.2 GB/Sec Memory
> 1.5 GB/Sec I/O
• 2 2-Way Teradata Nodes
> 3.6 Relative CPUs
> 8 GB Memory
> 6.4 GB/Sec Bus
> 6.4 GB/Sec Memory
> 3 GB/Sec I/O
• 32 2-Way Teradata Nodes
> 57.6 Relative CPUs
> 128 GB Memory
> 102.0 GB/Sec Bus
> 102.0 GB/Sec Memory
> 48 GB/Sec I/O
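The multi-node figures on this slide are the single 2-way node figures multiplied by the node count. A small sketch of that arithmetic follows; the class and field names are illustrative assumptions, and the per-node values are taken from the 2-Way SMP bullet. Note that 32 × 3.2 GB/Sec works out to 102.4, where the slide lists 102.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Resources of one configuration, in the units used on the slide."""
    relative_cpus: float
    memory_gb: float
    bus_gb_per_sec: float
    io_gb_per_sec: float

    def scaled(self, nodes: int) -> "NodeSpec":
        """Shared-nothing scaling: every resource grows with the node count."""
        return NodeSpec(
            relative_cpus=round(self.relative_cpus * nodes, 1),
            memory_gb=round(self.memory_gb * nodes, 1),
            bus_gb_per_sec=round(self.bus_gb_per_sec * nodes, 1),
            io_gb_per_sec=round(self.io_gb_per_sec * nodes, 1),
        )


# Per-node values from the "2-Way SMP" bullet.
TWO_WAY_NODE = NodeSpec(relative_cpus=1.8, memory_gb=4, bus_gb_per_sec=3.2, io_gb_per_sec=1.5)

two_nodes = TWO_WAY_NODE.scaled(2)
thirty_two_nodes = TWO_WAY_NODE.scaled(32)

print(two_nodes)
print(thirty_two_nodes)
```

By contrast, the slide's 4-way SMP adds processors inside one box (3.1 relative CPUs) while memory, bus and I/O bandwidth stay at the 2-way figures, which is the contrast the slide draws with adding nodes.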
Teradata Data Distribution: Dividing the Work
• Rows are distributed evenly by hash partitioning
> Done in real-time as data are loaded, appended, or changed.
> No reorgs, repartitioning, space management
• Shared nothing software:
> Each VAMP owns an equal slice of the data.
> Each VAMP works exclusively & independently on its rows
> Nothing centralized: No single point of control for any operation (I/O, Buffers, Locking, Logging, Dictionary)
Diagram: rows of Table A, Table B and Table C pass by Prime Index through the Teradata Parallel Hash Function to VAMP1, VAMP2, VAMP3, VAMP4 … VAMPn. Each stored row consists of RowHash (Hash Bucket) plus Data Fields.
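The hash distribution described above can be sketched in a few lines. This is an illustration only: SHA-256 stands in for the Teradata hash function, which the slide does not specify, and the bucket count and the modulo mapping from bucket to VAMP are assumptions made for the example.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 65536  # assumed bucket count for this sketch


def row_hash_bucket(prime_index_value) -> int:
    """Hash the prime index value to a bucket (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(repr(prime_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


def owning_vamp(prime_index_value, num_vamps: int) -> int:
    """Map the bucket to the VAMP that owns it (simple modulo map, an assumption)."""
    return row_hash_bucket(prime_index_value) % num_vamps


def distribute(prime_index_values, num_vamps: int) -> Counter:
    """Count how many rows each VAMP would own."""
    return Counter(owning_vamp(value, num_vamps) for value in prime_index_values)


# 100,000 rows keyed by a unique integer, spread over 8 VAMPs.
rows_per_vamp = distribute(range(100_000), num_vamps=8)
for vamp in sorted(rows_per_vamp):
    print(f"VAMP{vamp + 1}: {rows_per_vamp[vamp]} rows")
```

Because placement depends only on the key's hash, the same row always lands on the same VAMP and no central directory is consulted. An even spread does depend on the key having many distinct values.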
File System
• File system architecture is fundamentally different
> Broke all the rules
> No Pages, BufferPools, TableSpaces, Extents,...
> Data location and management are entirely automatic
> Space allocation is entirely dynamic
• Absolutely minimal labor required
> No reorgs
– Don’t even have a reorg utility
> No index rebuilds
> No re-partitioning
> No detailed space management
> Easy database and table definition
> Minimum ongoing maintenance
– All performed automatically
Self Managing Architecture
• Teradata’s self-managing philosophy provides the lowest total cost of ownership of any RDBMS
> Automatic, random and even data distribution
> Parallel-aware optimizer eliminates query tuning
> Parallel utilities with low setup and checkpoint restart
> Single operational view of entire MPP complex (AWS)
> Single point of control for the DBA (Teradata Manager)
> SQL-ready database management information (log files)
Teradata DBAs Don’t Worry About!
1. Install the Database
2. Understand, monitor and tune extensive operating system parameters
3. Understand, monitor and tune extensive database parameters
4. Determine the size and physical location and/or space allocations of tables and index partitions
5. Perform periodic table and index re-orgs
6. Manually restart multi-step load process when failure occurs
7. Ability to run queries and data maintenance 24x7
8. Sort data before loading
9. Calculate and configure fail-over plans in a clustered multiprocessing environment
10. Spend a lot of time planning and expanding the system
11. Query tuning for decision support
Teradata High Availability
• Teradata software provides high availability beyond other databases
> Compensates for hardware failures:
– Automatic failover for dynamic workload rebalancing (migrating VPROCS)
– Online, continuous backup (Fallback)
> Recycles before the operating system completes its reboot (multi-node system)
Diagram: SMP Nodes 1-4, each running PEs and AMPs, connected by the BYNET Interconnect; two AMPs are drawn apart from Node 3.
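The "migrating VPROCS" idea can be sketched as a rebalancing step: when a node fails, the AMPs it hosted restart on the surviving nodes. The sketch below is illustrative only. The least-loaded placement rule, the function name and the node and AMP labels are assumptions, not Teradata's actual migration logic.

```python
def migrate_vprocs(clique: dict, failed_node: str) -> dict:
    """Return a new node -> AMP list map with the failed node's AMPs moved to survivors."""
    survivors = [node for node in clique if node != failed_node]
    if not survivors:
        raise ValueError("no surviving node left to host the AMPs")
    placement = {node: list(clique[node]) for node in survivors}
    for amp in clique[failed_node]:
        # Place each displaced AMP on the currently least-loaded survivor.
        target = min(survivors, key=lambda node: len(placement[node]))
        placement[target].append(amp)
    return placement


# Four nodes with four AMPs each, loosely mirroring the slide's diagram.
clique = {
    "node1": ["amp0", "amp1", "amp2", "amp3"],
    "node2": ["amp4", "amp5", "amp6", "amp7"],
    "node3": ["amp8", "amp9", "amp10", "amp11"],
    "node4": ["amp12", "amp13", "amp14", "amp15"],
}

after_failure = migrate_vprocs(clique, failed_node="node3")
for node, amps in after_failure.items():
    print(node, amps)
```

All sixteen AMPs remain online after the failure, so every slice of the data stays reachable and the surviving nodes each take on a slightly larger share of the work.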
Teradata’s Multidimensional Scalability (It’s more than just big data)
• Amount of Detailed Data
• Concurrent Users
• Multiple Subject Areas (diagram: the CUSTOMER, ORDER, ORDER ITEM BACKORDERED, ITEM and ORDER ITEM SHIPPED model)
• Sophisticated Queries
> Simple Direct at the start
> Moderate Multi-table Join
> Regression analysis
> Query tool support
EDW Requires Multi-dimensional Scalability
Chart axes: Data Volume (Raw, User Data); Schema Sophistication; Query Freedom; Query Complexity; Query Concurrency; Mixed Workload; Query Data Volume; Data Freshness
The Teradata Difference: “Multi-dimensional Scalability”
Chart axes: Data Volume (Raw, User Data); Schema Sophistication; Query Freedom; Query Complexity; Query Concurrency; Mixed Workload; Query Data Volume; Data Freshness
• Competition Scales One Dimension at the Expense of Others: Limited by Technology!
• Teradata can Scale Simultaneously Across Multiple Dimensions: Driven by Business!
The Teradata Difference: “Multi-dimensional Scalability”
Chart comparing Others with Teradata along six dimensions:
• Data Storage (raw, user data): 5 TB; 10 TB; 15 TB; 20 TB; 100’s TBs +
• Schema Sophistication: Simple Star; Multiple, Integrated Stars; Multiple, Integrated Stars and Normalized; Normalized
• Query Complexity: 3-5 Way Joins; 5-10 Way Joins; 15+ way Joins + OLAP operations + Aggregation + Complex “Where” constraints + Views
• Workload Mix: Batch Reporting, Repetitive Queries; “Iterative”, Ad Hoc Queries, Data Analysis/Mining; Near Real Time Data Feeds; Active Data Warehousing
• Query Data Volumes: MBs; GBs; TBs
• # of Concurrent Queries: 1,000’s
Other chart label: Parallelism
State of Michigan, Department of Community Health (DCH)
Teradata Customer Since 1991
Customer Profile
As the largest department in the State of Michigan, DCH is responsible for managing delivery of health care services to more than 1.2 million clients and overseeing an annual budget of $9.5 billion. DCH administers many of the state’s most critical programs, including Medicaid, WIC, and child immunizations.
Implementation Summary
• Integrated data from nine separate health-related agencies
• Managed and used by agency subject matter/programmatic experts, not by the IT department
• Over 200 users in Medicaid and 8,000 state-wide
Business Solutions
• Data warehouse integrates claims/encounters; beneficiary eligibility data; provider data; birth records; death records; long-term care assessments; WIC data; immunizations; lead screening; newborn screening; & notifiable diseases.
• Fraud & abuse
• Contract management with health plans
• Healthcare cost & quality assessment
• Overpayment & COB analysis
• Program effectiveness
• Predict State’s healthcare needs
• Prioritize health initiatives for future
Realizations and ROI
• Estimated annual savings of $75 million–$100 million due to advanced health care analysis
• Medicaid administrative costs have been reduced by 25 percent
• Recoveries for Medicaid fraud have doubled
• Maximized Medicaid program savings while sustaining quality care
• Warehouse helped Michigan go from “last to first” in child immunization rates
• Track and substantiate savings in Medicaid pharmacy costs
• 2004 TDWI Best Practice Award Winner – Government and Non-Profit Category
The New York State Department of Health (DoH)
Teradata Customer Since 1999
Customer Profile
New York’s Medicaid program provides critical health care services to more than 3.7 million participants – 2.4 million in New York City alone. To serve this constituency, the state processes and analyzes more than 300 million claims totaling more than $38 billion annually. It is the largest Medicaid program in the US.
Implementation Summary
• More than five years of history
• 1.3 billion claims
• 650 users from 17 counties, expected to grow to thousands
Business Solutions
New York is making more rapid, informed decisions about programs, policies, and people across its vast Medicaid system.
• Fraud & abuse
• Tracking bio-terrorism indicators daily by combining pharmaceutical purchases with acute illness data from hospital emergency rooms
• Determining disease patterns and trends and the best possible treatment
• Tracking drug usage patterns to prevent abuse
• Program effectiveness
• Service delivery effectiveness
• Enhanced audit control
• Forecasting the cost and utilization of expensive prescription drugs
• Identification of overpayments
• Responding quickly to legislative inquiries
Realizations and ROI
• First year in operation paid for entire implementation of the DW!
• Better analysis of integrated data resulted in recoveries in the millions!
• $16 million - Coordination of Benefits, $5 million - duplicate payments, $1 million - overpayments
• $187 million saved due to better policy decisions based on medical and pharmaceutical analysis
• Millions saved due to efficiency of analysis, such as the audit process reduced to 2 hours from 8 weeks
• 2004 NASCIO Award – Best Information Architecture Category
Iowa Department of Revenue
Business Benefits: Tax Compliance
• Have more accurate leads because of better information
• Experienced substantial savings; staff can:
> Analyze greater volumes of data
> Manage a greater number of cases
> Exercise a higher level of control over taxpaying behavior
> Before the EDW, this additional work would have required a 20-25% increase in audit staff
• Generated $69.7M in incremental collections and refund reductions in 2003
> $30.6M through office examinations
> $17.4M in refund reductions
> $9.1M from tax gap revenues
> $7.5M in out-of-state audits of multi-state businesses
> $5.1M from in-state field audits
The Teradata Mission
Teradata Active Data Warehousing
Strategic, tactical, event-driven decision making in a single, centralized, mission-critical, up-to-date version of the enterprise data
“Any Question, By Any User, At Any Time”
All Decision Making…from One Copy of the Data.
Diagram: Sources feed the Active Data Warehouse, which serves strategic and tactical Users.