
Service Oriented BI - Université libre de...



Service Oriented BI

Services
IaaS
› Cloud computing
PaaS
› BigTable
› MapReduce
SaaS
› CRM
› SCM
BaaS
› SOA
› QoS
BI on Services
› BPM
› KPI

05/07/2011 Alberto Abelló & Oscar Romero


“Services are economic activities offered by one party to another, most commonly employing time-based performances to bring about desired results in recipients themselves or in objects or other assets for which purchasers have responsibility. In exchange for their money, time and effort, service customers expect to obtain value from access to goods, labor, professional skills, facilities, networks, and systems; but they do not normally take ownership of any of the physical elements involved.”

Lovelock & Wright


“All managerial themes unique to services are founded in customers providing significant inputs into the production process.”

Sampson


Focus on its core competence
Decreases cost
Access to latest technology
› Without investment
Leverages benefits from a supplier
› Economies of scale


Loss of direct control over quality
Exposure to data security issues
Dependence on one supplier
Coordination expense and delays
Atrophy of in-house capacity


Convenience
› Locality issues
Dependability
› Reliability
Personalization
Price
Quality
› Expectations vs perception
Reputation
Speed
Security
› Confidentiality
› Availability


Business Process as a Service

Software as a Service

Platform as a Service

Infrastructure as a Service


• IBM WebSphere
• Oracle SOA suite
• webMethods
• Apache ServiceMix
• Microsoft Connected Services Framework
• Open ESB
• etc.

• Salesforce.com
• Cloud9
• Oco
• RightNow
• Microstrategy
• Quantivo
• Oracle on Demand
• etc.


• Google BigTable
• Amazon SimpleDB
• Microsoft SDS
• FathomDB
• Aster DB
• Vertica
• K2 Analytics
• etc.

• Amazon EC2
• IBM SmartCloud
• Google App Engine
• etc.


“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”

NIST (National Institute of Standards and Technology)


On-demand self-service
Broad network access
Resource pooling
Rapid elasticity
Measured service


Illusion of infinite resources
Elimination of up-front commitment
Pay-per-use


Buy a cow
• High upfront investment
• High maintenance cost
• Produces a fixed amount
• Stepwise scaling

Buy bottled milk
• Pay-per-use
• Lower maintenance cost
• Linear scaling
• Fault-tolerant

Daniel Abadi analogy

Availability of service
Data lock-in
Data confidentiality
Data transfer bottlenecks
Performance unpredictability
Scalable storage
Debugging
Scaling quickly
Reputation fate sharing
Software licensing


Deployment
› Localization
› Routing
› Authentication
Tuning
› Placement
› Resource partitioning
› Service level objectives
› Dynamically varying workloads


Software in the cloud
› DBMS
› Workflow management
› Versioning
Cloud software


HBase
› Based on Google BigTable (2006)
Hadoop
› Based on Google MapReduce (2004)


key → value

family1, family2, …, familyn
column1, column2, …, columnm
version1, version2, …, versionp

(row:string, column:string[, time:int64]) → string
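The signature above can be read as a sorted, multi-level map. The following is a minimal in-memory sketch of that reading, not BigTable's or HBase's actual API: the class name `SparseTableSketch`, its methods, and the "family:column" naming of column keys are assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch; class and method names are illustrative only.
class SparseTableSketch {
    // row key -> column ("family:column") -> timestamp (newest first) -> value
    private final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> rows = new TreeMap<>();

    void put(String row, String column, long time, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, TreeMap<Long, String>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(time, value);
    }

    // Without a timestamp, return the most recent version (or null if absent).
    String get(String row, String column) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    // With a timestamp, return the newest version written at or before it.
    String get(String row, String column, long time) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        if (versions == null) return null;
        // Keys are in descending order, so "ceiling" is the largest timestamp <= time.
        Map.Entry<Long, String> entry = versions.ceilingEntry(time);
        return entry == null ? null : entry.getValue();
    }

    private TreeMap<Long, String> versionsOf(String row, String column) {
        TreeMap<String, TreeMap<Long, String>> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("com.example.www", "contents:html", 1L, "<html>v1</html>");
        table.put("com.example.www", "contents:html", 2L, "<html>v2</html>");
        System.out.println(table.get("com.example.www", "contents:html"));
        System.out.println(table.get("com.example.www", "contents:html", 1L));
    }
}
```

Keeping rows in a sorted map means that rows with nearby keys sit next to each other, which is one way to picture why the choice of row key affects data locality.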


Key

Supports single row transactions
Compression per block can be enabled
The schema determines the locality of data


Map, Merge-Sort, Reduce

Processes pairs [key, value]
Hides parallelization, fault-tolerance, data distribution and load balancing


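// Simplified word count. write(...) stands for emitting an output pair;
// in the Hadoop API this is done through context.write(...).
public void map(LongWritable key, Text value) {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
        // Emit [word, 1] for every token of the line
        write(new Text(tokenizer.nextToken()), new IntWritable(1));
    }
}

public void reduce(Text key, Iterable<IntWritable> values) {
    int sum = 0;
    for (IntWritable val : values) {
        sum += val.get();  // add up the 1s emitted for this word
    }
    write(key, new IntWritable(sum));
}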


Map output (one pair per token):
The 1
Project 1
Gutenberg 1
Ebook 1
of 1
The 1
Outline 1
Of 1
Science, 1
Vol. 1
1 1
(of 1
4), 1
by 1

Reduce input: [“The”, [1,1,1,1,…]]

Reduce output: The 57631
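The whole flow for word count can be tried without a cluster. The sketch below is a single-process simulation under stated assumptions: plain Java collections stand in for Hadoop's `Text` and `IntWritable`, a `TreeMap` stands in for the merge-sort step, and the names `WordCountSketch`, `shuffle` and `run` are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical single-process simulation of map, merge-sort and reduce.
class WordCountSketch {
    // Map: emit a [word, 1] pair for every token of the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<String, Integer>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // Merge-sort stand-in: group the emitted values by key, in key order.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<Integer>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: sum all values that share a key.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Run the three phases over a list of input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("The Outline Of Science", "The end")));
    }
}
```

As in the slide example, tokens are compared case-sensitively, so "of" and "Of" are counted as different words.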

Programming model simple yet expressive
Able to process structured or unstructured data
Elastically scalable
Fine grained fault tolerance


1. The input data is partitioned into blocks
2. Replicate them in different nodes

Input file

Block

Replicate


3. Each map subplan reads a subset of blocks (i.e., split)
4. Divides it into records
5. Executes the map for each record and leaves them in memory divided into spills

Split

Itemize Map


6. Each spill is then partitioned per reducers
7. Each partition is sorted independently
8. Store the spills into disk

Partition spills

Combine

Store
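Steps 6 and 7 can be sketched as hash partitioning followed by an independent sort of each partition. This is an illustration under assumptions, not Hadoop's code: `PartitionSketch` and its methods are hypothetical names, keys are plain strings, and the hash-and-modulo rule follows the same arithmetic as Hadoop's default `HashPartitioner`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of partitioning a spill per reducer.
class PartitionSketch {
    // Clear the sign bit so the result is never negative, then take the modulo.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Split the keys of one spill into one partition per reducer and sort each one.
    static List<List<String>> partition(List<String> spill, int numReducers) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new ArrayList<String>());
        }
        for (String key : spill) {
            partitions.get(partitionFor(key, numReducers)).add(key);
        }
        for (List<String> p : partitions) {
            Collections.sort(p);  // each partition is sorted independently
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> spill = Arrays.asList("The", "Outline", "Of", "Science", "The", "by");
        List<List<String>> partitions = partition(spill, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("reducer " + i + ": " + partitions.get(i));
        }
    }
}
```

Because the partition depends only on the key, every occurrence of a word ends up at the same reducer, no matter which mapper produced it.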

9. Spill partitions are merged
10. Each merge is sorted independently
11. Store the result into disk

Merge partitions per reducer

Combine

Store


12. Reducers fetch data from mappers
13. Mappers output is merged and sorted
14. Reduce function is executed per key
15. Store the result into disk

Fetch

…Reduce Store


Combine is executed locally
› Assumes uniform random distribution of input
› Reduces the number of tuples sent to reducers
Only possible when the reducer function is:
› Commutative
› Associative
Only makes sense if |I|/|O| >> #CPU
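A small sketch can show why commutativity and associativity matter. Here the reducer function is a sum, so partial sums computed on each mapper can be added again at the reducer without changing the result. The class `CombinerSketch` and its methods are hypothetical, and plain Java maps stand in for the tuples that would travel over the network.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of local combining before the reduce step.
class CombinerSketch {
    // Combine: pre-aggregate the [word, 1] pairs produced by one mapper.
    static Map<String, Integer> combine(List<String> words) {
        Map<String, Integer> partial = new TreeMap<>();
        for (String word : words) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    // Reduce: add up the partial counts received from all mappers.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> mapper1 = Arrays.asList("the", "the", "of", "the");
        List<String> mapper2 = Arrays.asList("the", "by");
        Map<String, Integer> p1 = combine(mapper1);
        Map<String, Integer> p2 = combine(mapper2);
        int sentWithoutCombine = mapper1.size() + mapper2.size();
        int sentWithCombine = p1.size() + p2.size();
        System.out.println("tuples sent without combine: " + sentWithoutCombine);
        System.out.println("tuples sent with combine: " + sentWithCombine);
        System.out.println(reduce(Arrays.asList(p1, p2)));
    }
}
```

The saving depends on how often keys repeat within one mapper's output: if almost every key is distinct, the combine step does extra work and sends nearly as many tuples as before.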


Does not benefit from compression
Writes intermediate results to disk
› Reduce tasks pull intermediate data
Defines the execution plan on the fly
› Schedules one block at a time


Bypass the storage system or even OS
Add/Use indexing structures
Follow some programming rules
Provide sorting alternatives to merge-sort
Offer alternatives to block granularity for scheduling

Jiang et al., VLDB 2010


Ashish Thusoo, et al. “Hive - A Warehousing Solution Over a Map-Reduce Framework”. VLDB’2009

Sergey Melnik, et al. “Dremel: Interactive Analysis of Web-Scale Datasets”. VLDB’2010

Jens Dittrich, et al. “Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)”. VLDB’2010

Alexander Alexandrov, et al. “MapReduce and PACT - Comparing Data Parallel Programming Models”. BTW’2011


No ACID
No standard
Low-level query

Michael Stonebraker


Do It Yourself
• Expensive
• Ad hoc development

Off the Shelf
• Economies of scale
• Concrete functionalities

Florian Waas analogy

“… But to really unlock the power of Hadoop, you must be able to efficiently extract data stored across multiple (often tens or hundreds) of nodes with a user-friendly ETL (extract, transform and load) tool that will then allow you to move your Hadoop data into a relational data mart or warehouse where you can use BI tools for analysis.”

Ian Fyfe, Pentaho


DM DM DM

ETL (Extraction, Transformation and Load)

DW


WWW

Trademark
• Expensive
• Many functionalities
• Mature

Open source
• Free
• Simple functionalities
• Young


MapReduce Java Oracle

Oracle Grid Engine

Hadoop Distributed File System (HDFS)


                       On-premises   Service-based
Customization          +             +/-
Implementation time    +/-           +/-
Application shut-off   +             -
Hidden fees            -             -
Security of data       +             -
Process integrity      +             -
Guarantee of quality   +             -

[Chart: Service %, vertical axis from 0% to 35%]


Increase the efficiency of transactions
Reduce the costs of maintenance
› Reduce obsolescence costs
Reduce the loss of sales
Reduce the average time to deliver a product
› Improve the cash flow
  › Shorter delivery and charging time
› Reduce the costs of dealing with urgent orders
› Improve the service to the client
  › Smaller delivery time and price


360º view of clients
Automate and administer sales processing
Reduce service costs
Improve collaboration
Speed-up in sales cycle
Improve efficiency in the management of contacts
Increase sales in the clients base


Only one repository
› Finances and accounting
› Material management
› Human resources
› Manufacturing
› Sales and marketing
Modules include
› SCM
› CRM


Business Process Management
Service Composition
Service Infrastructure and Management

Business Processes

Business Services

Utility Services

DB, MapReduce, MDM, CRM, SCM, ERP

Reusability
Loose coupling
Contract
Abstraction
Composability
Autonomy
Statelessness
Discoverability


                    Distributed components   SOA
Design              Functionality            Process
Designed to …       Last                     Change
Development cycle   Long                     Interactive and iterative
Centered on …       Cost                     Business
Coordination        Blocks                   Orchestration
Coupling            Tight                    Loose (agile and adaptive)
Technologies        Homogeneous              Heterogeneous
Programming         Objects                  Messages
Encapsulation       Partial                  Full (contracts)

Primitive activity
Complex activity
› Atomic transaction
› Business activity
Orchestration
Choreography


[Diagram: Client, Service broker (UDDI) and Service provider (business object). Labels: find, register, description (WSDL), request (SOAP), response (SOAP)]

Service Engineering and Design
Service Adaptation and Monitoring
Service Quality


Definition
› Difference between perceived and expected
Negotiation
› Service Level Agreement
  › Service Level Objectives
Assurance


Input
› Supply
› Cost
Process
› Performance
› Security
Outcome
› Customization
› Satisfaction
Systemic
› Reproducibility
› Sustainability


Turns manufacturers into services
Becomes a barrier for competitors
Coding customers makes it possible to:
› Instruct staff
› Manage queues in call centers
› Target offers
› Share data with other firms
› Sell data to other firms


“You cannot control what you cannot measure”


BI can benefit from services at four levels
› IaaS
› PaaS
› SaaS
› BaaS
Services benefit from BI
› KPI and Balanced Scorecards


Mell, P., Grance, T.: The NIST Definition of Cloud Computing. Special Publication 800-145, National Institute of Standards and Technology (January 2011), draft

Abadi, D.J.: Data management in the cloud: Limitations and opportunities. IEEE Data Engineering Bulletin 32(1), 3-12 (2009)

Stonebraker, M., et al.: MapReduce and parallel DBMSs: friends or foes? Communications of the ACM 53(1), 64-71 (2010)

Hostmann, B.: Business Intelligence as a Service: Findings and Recommendations. Research G00164653, Gartner (January 2009)

Erl, T.: Service Oriented Architecture. Prentice Hall (2006)

Castellanos, M., et al.: Automating the loading of business process data warehouses. In EDBT’2009. pp. 612-623. ACM
