

European Commission

DG Eurostat

G

Guidelines for describing Statistical Services definitions

Version 1.0, 27/03/2015

ESS.VIP.CRC.SERV: Version 1.0

Status: Final

The document is a deliverable based on cooperation between the Eurostat Enterprise Architecture team and the project “shared SERVices” which is part of the ESS Vision 2020 implementation programme.


1 About this document
1.1 CSPA
1.2 Target Audience
1.3 How to use these guidelines
1.4 The Service Catalogue
1.5 SOA Lifecycle
2 CSPA Service Definition
2.1 Service Identification
2.2 Business Process and GSBPM
2.3 Business Service
2.4 Business Goals
2.5 Ownership
2.6 Exposure
2.7 Outcomes
2.8 Restrictions and policies
2.9 Service Input / Output
2.10 Patterns
2.11 Anti-patterns
3 CSPA Service Specification
3.1 Business Function Identification
3.2 Input / Output / Metadata
3.3 Metrics and Key Performance Indicators
3.1 Business Process
3.2 Security
3.3 Multilingual support
3.4 Identifying assets for re-use
3.5 Applicable Statistical Methodologies
3.6 Patterns
3.7 Anti-patterns
4 CSPA Service Implementation
4.1 Invocation protocols
4.2 Supported protocols
4.3 Data by reference protocols
4.4 Canonical Data Models
4.5 Non-canonical data models
4.6 Distribution
4.7 Service Contract
4.8 Requirements for security
4.9 Policies
4.10 Non-functional characteristics (QoS)
4.11 Technical dependencies
4.12 SOA Layering
4.13 Service creation scenario
5 Using the Services
5.1 Discovering services
5.2 Reusable services in the local architecture
5.3 Using a statistical service
6 Bibliography
7 Annexes



1 About this document

The document is a deliverable based on cooperation between the Eurostat Enterprise Architecture team and the project “shared SERVices” which is part of the ESS Vision 2020 implementation programme.

This document provides guidelines to facilitate the sharing of statistical services across the members of the European Statistical System (ESS). Producing official statistics is a highly regulated activity: international regulations and standards lead each country to adopt similar requirements for statistical production. Similar requirements result in analogous challenges and eventually in the development and deployment of similar but not interchangeable solutions by National Statistical Institutes (NSIs). The need to conform to international regulation is reinforced by a trend of decreasing budgets and by the ‘digital revolution’ that is changing the operation of governments around the world. In this environment, an opportunity arises to pursue international cooperation by focusing on standardizing statistical production and developing reusable statistical services.

Data collection and dissemination are constantly changing and evolving processes. User needs in statistical production change regularly (e.g. a new methodology to be used for generating a new set of indicators) and therefore need to be defined and documented. This document supports the analysis and definition of user needs in a standardized way. It helps to define new business needs and to update existing ones. Following these guidelines will develop knowledge about existing (statistical) processes and help to improve and identify functions that are shared among production processes for different statistical domains. The present document aims at combining the everyday needs tackled by statistical services with compliance with available standards and services in order to achieve:

- Identification and description of statistical services that can be shared.
- Development of statistical services that can be reused; the application of the available standards can ensure reusability.
- Implementation and sharing of services across member states and organizations.

From the guidelines provided in this document, business analysts will learn how to describe service definitions in the context of agreed standards. The main foundation for the methods applied in this document is the set of recommendations described in CSPA V1.5 (1). The guidelines centre on providing uniform service descriptions for deployment according to the principles of service-oriented architecture.

CSPA (1) recommends describing services at different levels. These guidelines use the following CSPA levels for describing statistical service definitions:

- Statistical Service Definition (CSPA conceptual layer)
- Statistical Service Specification (CSPA logical layer)
- Statistical Service Implementation (CSPA physical layer)

1.1 CSPA

The Common Statistical Production Architecture (CSPA) is currently under development in a project carried out under the responsibility of the United Nations Economic Commission for Europe (UNECE), on behalf of the international statistical community. It is expected that this specification will drive future ventures to deliver new statistical services.

The CSPA defines foundational principles for exposing re-usable and interoperable services in statistical organizations. The recommendations proposed by the CSPA address common statistical production issues such as interoperability, re-use, avoiding the repetition of similar but incompatible functions across organizations, governance, and technological and functional compatibility. They also aim to reduce the cost burden of incompatible software solutions running in disparate organizations. The excerpt below from CSPA (1) sets out the main business goals of that specification:

“The value proposition of CSPA, in providing statistical organizations with a standard framework, is to:

- Facilitate the process of modernization and support the modernization efforts within statistical organizations
- Provide guidance for transformation within statistical organizations
- Apply a consistent enterprise architecture approach within and across statistical organizations to respond to the challenges of emerging information sources such as big data
- Facilitate the reuse / sharing of solutions and services and the standardization of processes, and thus a reduction in costs of production
- Encourage interoperability of systems and processes
- Provide a basis for flexible information systems to accomplish their mission and to respond to new challenges and opportunities
- Leverage the wider statistical community to more rapidly develop capabilities in areas of emerging need such as the ability to harness alternative data sources
- Enable international collaboration initiatives for building common infrastructures and services
- Provide the ability to supplement internal capability by drawing on skilled resources from across the statistical community
- Foster alignment with existing industry standards such as the Generic Statistical Business Process Model (GSBPM) and the Generic Statistical Information Model (GSIM)” (1)

1.2 Target Audience

This document is addressed to several types of recipients within organizations in the ESS: business leaders and a number of IT roles. This work aims primarily at providing guidelines for delivering projects aligned to the ESS Vision 2020; however, it could benefit all statistical service development and reuse projects between ESS countries and also provide useful guidelines to NSIs outside the ESS. In places, this document refers to service provisioning entities as “providers”.

Business Leaders

Business leaders and senior managers will have an interest in ensuring that the future capabilities of their organization are met while minimizing cost. They should be involved in identifying services that the NSI wants to deliver if these do not already exist. Business leaders become CSPA Investors in the process and can represent a variety of management roles. They will have an interest in assessing candidate statistical services and ensuring resources are available for developing and implementing reusable services.

Enterprise Architects



Enterprise architects will support the identification and definition of services to meet the goals of their NSI and the ESS. They will be interested in the Service Definition, and making sure that the services implemented fit in the overall Enterprise Architecture. Enterprise Architects should use established standards and understand how the application architecture maps against the business architecture.

IT Architects

Software components to support service delivery are designed by software architects based on service implementation descriptions, which outline the most important aspects of architectural decomposition for SOA-based information systems. Architects will be most interested in the Service Definition and in the Service Specification. IT architects should ensure that the application architecture of the statistical service meets CSPA principles.

Project Managers

In scoping and managing their projects, ESS and local project managers will understand the impact of certain decisions on how they plan and deliver their projects. Project managers will be interested in finding services that meet the strategic needs of their NSI, and in ensuring that services exist or are developed that meet the goals of their projects. Project managers will be most interested in Service Definitions and should work with Enterprise Architects to ensure a consistent approach. The work at the business architecture level is used to link to the business vision; the definitions can then be linked to user needs and used to map overall programme dependencies.

Development Leads

Development leads will receive specifications from the IT architects. They will understand the technology context within which they are developing the statistical service and the need to apply standards to ensure reuse. They will be interested in the Service Specification and will be responsible for developing the Service Implementation documentation.

Business Analysts (internal or external)

This document is used by business analysts to support them in identifying services and describing their purpose. The resulting service descriptions enable assembly, deployment planning, building, etc. Business analysts follow a three-step procedure to transform a business design into a software component design; this procedure is described in the next chapter.

Service Certification Committee

Members of a Service Certification Committee use this document to validate the compliance of proposed software components with CSPA and with other standards in force. The aim is to ensure interoperability, sustainability and alignment of IT and business. The certification body cooperates closely with an SOA center of excellence to deliver an interoperable and re-usable service which can later be registered in a Global Service Catalogue.



1.3 How to use these guidelines

These guidelines can be used as an autonomous source, but they also include references to existing frameworks and roles. It is suggested that, where possible, you consider these guidelines within the context of a team that covers more than one role. If you are working on an ESSNet project that is implementing deliverables towards Vision 2020, consider who in your consortium and in your NSI is carrying out the different roles and will be interested in the different stages of statistical service reuse.

In accordance with the practice described in CSPA, services are described at three levels of detail: the Service Definition, the Service Specification and the Service Implementation. The document is accompanied by three supplementary templates, one for each level. The service description for each level is created from the corresponding template and filled in at each step of the analysis according to these guidelines.

The three CSPA levels are related to each other. Service Specifications build on Service Definitions, and Service Implementations build on Service Specifications. The three main chapters of this document (Definition, Specification and Implementation) explain what to describe at each description layer and how. Each chapter contains diagrams depicting how the information of one description layer (definition, specification, implementation) is related to another layer.

Service Definition (conceptual)

This is the first step to take when describing services for all new ventures. Service description starts with a service definition (conceptual part): a non-technical, high-level overview of a statistical service. This part is defined by the CSPA as “(CSPA160) The CSPA Service Definition is at a conceptual level. In this document, the capabilities of a Statistical Service are described in terms of the GSBPM sub process that it relates to, the business function that it performs and GSIM information objects which are the inputs and outputs.” (1). Its main goal is to define the service in high-level business terminology. The service is described from the perspective of business goals, the statistical business process and the data which is processed. Business analysts defining a statistical service should create business process and information models aligned with agreed process and information frameworks (such as GSBPM (2) and GSIM (3)). The results of business process modelling (see the BPM@EC) can be exploited to give more detailed information on the processes participating in the modelling. Furthermore, the business analyst includes elements of business vision and business goals, which are used at later stages to derive service policies and performance indicators that measure how far business goals are achieved.

Service specification (logical)

The second step of service description, the service specification, consists of a more detailed specification of the anticipated business functions. The business definition is broken down into more detailed conceptual descriptions of the business functions provided by the service. A number of decomposition techniques described later in this document aid analysts in the discovery of business requirements. At this stage the business goals and vision set out in the service definition are transformed into policies (“terms of use”). Performance indicators used to measure the effectiveness of production processes are redefined at a higher level of detail.

CSPA defines this part as “(CSPA162) The CSPA Service Specification is at a logical level. In this layer, the capabilities of a CSPA Service are fleshed out into business functions that have GSIM implementation level objects as inputs and outputs. This document also includes metrics and methodologies.” (1). According to that definition, the focus shifts from a high-level definition to a functional description of the capabilities provided by the service. For the first time, the capabilities of the service are described in detail as a black box with a well-defined input and output.

Service implementation (physical)

The Service Implementation Description (physical view) aims to describe all technical details required for the delivery of software solutions which implement the requirements defined in the service specification. This part of the description outlines service contracts, software components, assembly procedures, non-functional requirements, etc. CSPA describes these activities as “(CSPA164-165) The CSPA Service Implementation Description is at an implementation (or physical) level. In this layer, the functions of the CSPA Service are refined into detailed operations whose inputs and outputs are GSIM implementation level objects. This layer fully defines the service contract, including communications protocols, by means of the Service Implementation Description. It includes a precise description of all dependencies to the underlying infrastructure, non-functional characteristics and any relevant information about the configuration of the application being wrapped, when applicable.” (1). The guidelines provided in this part are mainly based on the SMP@EC methodology. Well-established standards such as WS-BPEL and SDMX-ML are used for describing low-level technical details of executable processes and data models.

In general, for each CSPA Service Definition there will be one CSPA Service Specification. At the implementation level there may be different implementations depending on, for example, software constraints or the protocols to be used. Still, each implementation must implement the data format defined in the CSPA Service Specification. To complete the CSPA Service Implementation description, the business analyst should collaborate with the developers building or maintaining the statistical service concerned.
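
The relationship between the three levels can be illustrated with a small data model. The following Python sketch is not part of CSPA or of the accompanying templates: the class names, field names and example values are hypothetical and chosen only to show one definition, one specification and several implementations that must share the data format fixed in the specification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ServiceDefinition:
    """Conceptual level: business view of the service (hypothetical fields)."""
    name: str
    gsbpm_sub_process: str
    business_function: str
    gsim_inputs: list[str]
    gsim_outputs: list[str]


@dataclass
class ServiceSpecification:
    """Logical level: builds on exactly one Service Definition."""
    definition: ServiceDefinition
    business_functions: list[str]
    data_format: str  # format that every implementation must support


@dataclass
class ServiceImplementation:
    """Physical level: builds on one Service Specification."""
    specification: ServiceSpecification
    protocol: str
    data_format: str

    def __post_init__(self) -> None:
        # Each implementation must implement the data format
        # defined in the Service Specification.
        if self.data_format != self.specification.data_format:
            raise ValueError(
                f"implementation uses {self.data_format!r}, "
                f"specification requires {self.specification.data_format!r}"
            )


# Illustrative values only; they do not describe a real catalogue entry.
definition = ServiceDefinition(
    name="Example validation service",
    gsbpm_sub_process="5.3 Review and validate",
    business_function="Validate a data set against rules",
    gsim_inputs=["Data Set"],
    gsim_outputs=["Data Set"],
)
spec = ServiceSpecification(
    definition=definition,
    business_functions=["check structure", "check content"],
    data_format="SDMX-ML",
)
# Two implementations of the same specification, differing only in protocol.
rest = ServiceImplementation(spec, protocol="REST", data_format="SDMX-ML")
soap = ServiceImplementation(spec, protocol="SOAP", data_format="SDMX-ML")
```

In this sketch the data format check sits in the implementation class, so an implementation that deviates from its specification is rejected as soon as it is described.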



This guideline describes all the concepts required for service description at each of the three levels of detail. Readers will learn from this guideline how to write service descriptions using the three templates: service definition, specification and implementation.

1.4 The Service Catalogue

Once a statistical service has been developed and documented for the three levels of abstraction required for CSPA services (as well as any required additional documentation such as installation instructions or statistical methods), it is ready to be included as a draft in the Service Catalogue. Currently Eurostat is maintaining a Catalogue on behalf of the global CSPA project. In addition to the CSPA documentation, the service should also include a status such as ‘work-in-progress’ or ‘published’.

In order for the statistical service to be published, the extent to which it meets the CSPA standard has to be assessed. The current owner of the CSPA certification process is the CSPA Implementation Group. Under the certification assessment, the CSPA Definition, Specification and Implementation are iteratively assessed and enriched. Specific aspects that are assessed are:

- Compliance with GSBPM, GSIM, CSPA-LIM
- Duplication and/or fit of the service in the Service Catalogue
- Ownership and support provided for the service
- Technical dependencies and technical compliance with CSPA
- Extent to which the service meets user stories

The Centre of Excellence should also be an established certifier with strong links to the CSPA Implementation Group and membership in it.

Currently two types of certification level exist:

Full CSPA compliance. This level is defined as meeting all the criteria specified in the CSPA v1.5 standard. A fully CSPA compliant service is open source, meets SOA architectural standards, its business functionality is clearly defined using GSBPM and GSIM, and it has been made available to the community with the inclusion of relevant CSPA documentation.

Partial CSPA compliance. This level is less clearly defined, but refers to cases where most of the criteria mentioned above have been met, with some exceptions. Examples of services that have achieved partial CSPA compliance include those that have dependencies on proprietary technology (e.g. requiring a specific operating system that is not open source), and those that do not meet all the SOA architectural standards of CSPA (e.g. standalone applications rather than web services).
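
As an illustration only, the two levels can be expressed as a simple check over the four criteria listed for full compliance. This Python sketch is not the certification procedure: the field names are hypothetical, and reading “most of the criteria” as “more than half” is an assumption made here because the partial level is not precisely defined.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ComplianceCheck:
    """Criteria named for full CSPA compliance (hypothetical field names)."""
    open_source: bool
    meets_soa_standards: bool
    defined_with_gsbpm_and_gsim: bool
    cspa_documentation_published: bool


def certification_level(check: ComplianceCheck) -> str:
    """Return 'full', 'partial' or 'not certified' for a set of checks."""
    results = [getattr(check, f.name) for f in fields(check)]
    if all(results):
        return "full"
    # Assumption: "most of the criteria" is read as more than half of them.
    if sum(results) > len(results) / 2:
        return "partial"
    return "not certified"


# A service that depends on a proprietary operating system.
proprietary_dependency = ComplianceCheck(
    open_source=False,
    meets_soa_standards=True,
    defined_with_gsbpm_and_gsim=True,
    cspa_documentation_published=True,
)
level = certification_level(proprietary_dependency)
```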

Currently a clearer set of criteria to certify a service is being developed. This set of criteria is likely to take the form of a business process with assigned roles and a checklist-type set of criteria to assess the service.

1.5 SOA Lifecycle

How is this process related to the SOA lifecycle and RUP?



Modelling is a process of capturing business design based on comprehension of business requirements and business objectives. Business operations, real life business models, are translated into specifications of business processes. Capturing the business design using a sophisticated approach that includes the use of specialized tooling lets you perform what-if scenarios with various parameters the business might experience. The process can then be simulated using those parameters to predict the effect that process will have on the business and IT systems. If the achieved results do not match the business objectives, then the process definition can be refined. The model also captures key performance indicators that are important measurements of your business, such as business metrics. These key performance indicators are built into the assembly of the application. In addition, you can monitor the indicators in production, capturing critical data to measure if the objectives are being met (4). This guideline is mainly focused on the “Modelling” phase for projects which aim to align IT and business using SOA as a tool.

The RUP “business modelling”, “requirements” and “Analysis & Design” disciplines may be applied as part of service description. Business modelling corresponds to a high-level service definition. Requirements analysis and use cases are the disciplines addressed in the service specification and finally the RUP “analysis & design” discipline is exploited in the service implementation description.



2 CSPA Service Definition

The process of service description starts with the first, conceptual part of the analysis: the CSPA Service Definition. To better understand the purpose of this section, it is essential to understand how CSPA defines this item:

“(CSPA160) The CSPA Service Definition is at a conceptual level. In this document, the capabilities of a Statistical Service are described in terms of the GSBPM sub process that it relates to, the business function that it performs and GSIM information objects which are the inputs and outputs.” (1).

This provides a high level definition of a service being delivered to consumers.

At later stages of the analysis, the above concepts are unfolded to provide detailed service specifications and implementation as depicted below.

Capabilities act as an entry point for the service function definition and later for specifying technical interfaces which enable service invocation. High level service descriptions are later translated into physical implementation of business logic which models behaviours required for service consumers.

Information (GSIM) and business process (GSBPM) models define the basic high-level concepts used for information and process descriptions. They position services in the context of the generic statistical process and define abstract information models used to represent statistical information. At later stages of analysis, these models are refined into data structures described using an agreed vocabulary (for instance, that provided by GSBPM and GSIM). At the implementation level they are transformed into executable processes and canonical data models, such as BPMN / BPEL process definitions and SDMX-ML (5).



Service definition is backed by the concepts of business function, business process and business service (CSPA049 (1)). These terms are introduced in the following chapters.

The service definition is created step by step and captured from multiple perspectives. The first step is to describe the required statistical service in business terms. The main framework to facilitate this activity is the Generic Statistical Business Process Model (GSBPM). In the case of development within the ESS (such as Vision 2020 deliverables), the ESS member should relate the service to specific capabilities in the ESS Enterprise Architecture Reference Framework. If the ESS member has a Business Capability Model that relates to the GSBPM, it can be used in order to facilitate this activity. If your organisation has a Business Capability Model that provides more detail than GSBPM (for example, the ONS Business Service Model extends one level below GSBPM), this would make it easier to define fine-grained statistical services. Different ESS members can also have different approaches or policies with regard to the granularity of the statistical services they intend to build. The corresponding guidelines are described in the “” and Business chapter of this guideline. As the next step, the business goals, ownership and service exposure are outlined to better comprehend the business context. This is also the very first step in which service restrictions and policy definitions are settled.

The Definition benefits the ESS member by providing early sight of the potential space that the statistical service would occupy in the Enterprise Architecture: what its interactions with existing architectural elements are, whether the Definition contains any elements of which the NSI does not have in-depth knowledge (e.g. a new GSIM information object), and who its potential users in the business are. The granularity of the proposed service is of interest at this stage. It is suggested that any Business Capability Model or internal Service Catalogue maintained by an NSI contains the information of these internal CSPA Definitions before they are submitted to the CSPA Service Catalogue.



2.1 Service Identification

Any ESS member can initiate the process of identifying a statistical service. This might be driven by many different needs, such as:

- A particular business capability is required by a modernisation programme in that NSI
- As part of Vision 2020 the NSI is identifying services needed to meet an ESS deliverable
- An ongoing activity by the NSI to maintain and develop its Enterprise Architecture

2.1.1 Candidate Services

Statistical services identified by NSIs should be assessed by the NSI to consider whether they could have wider use. If the service is identified internally as reusable, it should be raised as a candidate service for the ESS. The development and use of these statistical services can then be facilitated and coordinated by the Centre of Excellence (CoE).

Candidate services can be:

Existing applications that the NSI wants to make available for reuse, which can involve a variable amount of effort in order to make the application a CSPA compliant statistical service.

New statistical services that the NSI wants to build, either by itself or in conjunction with partner countries.

For both types of candidate services, the CoE has to assess:

- Who are the potential users of the statistical service?
- Does it deliver any elements in the ESS EARF and SPRA?
- What are the user needs of potential users of the statistical service across the ESS?

Once users have been identified, two main activities need to take place:

- Agree roles; which of these NSIs are investors, builders or assemblers of the service?
- Identify the needs of the users

Once roles are agreed, the user needs must be captured against the Service Definition. This helps us to both assess the importance of the service ahead of its assessment and prioritisation, and to develop its Service Specification.

The user needs should preferably be captured as user stories. These can be used to understand some of the non-functional requirements of the service (such as statistical methods or security). There are many ways to capture user stories, with the main form being “As a <user>, I want <something> so that <some reason>”. For example, “As a data collection methodologist, I want to design a questionnaire so that I can develop a new business survey”. As this example shows, capturing the user story helps not just to identify user needs, but also to document who the users are and what their motivation is. The user stories can then be used to assess whether the statistical service meets the needs, so they also serve as acceptance criteria.
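The user story form described above can be captured as a small record, which keeps the user, the need and the motivation separate so that each story can later be checked as an acceptance criterion. The following is only a minimal sketch: the class name `UserStory` and its field names are assumptions made for illustration and are not defined by CSPA or by these guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    """One user need in the form: As a <user>, I want <something> so that <reason>."""

    user: str    # who the user is
    want: str    # what the user needs
    reason: str  # why the user needs it (the motivation)

    def render(self) -> str:
        # Produce the sentence form quoted in the guideline text.
        return f"As a {self.user}, I want {self.want} so that {self.reason}."


# The example story from the text above.
story = UserStory(
    user="data collection methodologist",
    want="to design a questionnaire",
    reason="I can develop a new business survey",
)
print(story.render())
```

Keeping the three parts as separate fields makes it straightforward to list all users of a candidate service or to group stories by motivation when needs are compared across NSIs.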

An example of a user needs driven approach to the development of services can be seen in the UK’s Government Digital Service Standard (https://www.gov.uk/service-manual/service-standard). Different governments and institutions are likely to have different best practices towards the development of IT services that should be considered when developing statistical services.

2.1.2 Assessing candidate services

The SERV project is currently developing a framework for assessing services. Once an ESS list of candidate services has been produced, it has to be regularly reviewed and assessed in order to promote some candidates for development. The aim is to then include these statistical services, if they meet CSPA standards, in a shared Service Catalogue. Many of these services will be developed as part of projects that deliver Vision 2020 and key strategic services (such as those outlined in SPRA) should be prioritised where a match between candidate service and to-be architecture exists. Management and coordination of the assessments, and of the candidate lists, will be carried out for the ESS by the Centre of Excellence.

The current assessment framework is under development and will include two types of assessment. The first assesses candidate services for development. This assessment is more strategic, and focuses on the business needs of the ESS countries. The outcome of this assessment is a prioritised list of services that can then be developed.

The second assessment takes place later in the lifecycle of each statistical service. This assessment focuses on compliance with the service definition and with the technology principles of CSPA. It also helps to establish the benefits of the statistical service and whether it meets the needs of ESS countries.

2.2 Business Process and GSBPM

The Generic Statistical Business Process Model (GSBPM) defines a standard terminology for liaison between independent entities that process statistical data. GSBPM is structured at four levels of detail:

Level 0, the statistical business process

Level 1, the nine phases of the statistical business process

Level 2, the sub-processes within each phase

Level 3, a description of those sub-processes

(Detailed descriptions of all the phases can be found in the GSBPM Definition(2))

Usage of well-established standards for process descriptions, such as GSBPM, fosters service re-use and increases interoperability. Standards provide a common vocabulary and a widely accepted classification of processes.


● ● ●

GSBPM defines a common vocabulary and standardizes statistical data processing

● ● ●

As a business analyst, use GSBPM to position your service in a broader context. Think of a high-level business process definition where the same service might be used in numerous contexts.

The service definition document contains only a high-level process identification: the level 1 business process and its identifier.

The service is a part of a wider statistical process, so it is necessary to change the way of thinking from “point-to-point” to “process-oriented”. Anticipate possible roles of the service in the statistical ecosystem. Think of possible uses in multiple contexts.

2.3 Business Service

What is a business function really, and how should it be understood?

“(CSPA055) A Business Service is the means of accessing a Business Function. It will perform one or more Business Processes. It is the who - or what - will undertake the work associated with each function. Business services should be scoped to support flexible sequencing and configuration of Business Functions within different Business Processes. A Business Service has an explicitly defined interface that requires the knowledge of what the service will deliver (including in what time frame) given a particular set of inputs. A Statistical Service is a kind of Business Service.” (1).



Each business function is “something an enterprise does, or needs to do, in order to achieve its objectives (CSPA052)” (1 p. 14). It is a repeatable task which brings added value for a consumer. A business process is a set of business functions linked together in a chain of actions which together aim to deliver added value to consumers. In short, this part of the service definition corresponds to the “needs to do” part of the business function definition.

A concise definition of service capabilities in terms of required function should be stated in this part of service definition.

2.3.1 Service Definition Description – how-to

This chapter briefly explains how business analysts should create the service definition description.

The service is usually defined through the following concepts presented by CSPA (CSPA162):

Business Process which is used to deliver function

Capabilities of a service provided

Information which is being processed

Business process is only identified at a higher level of abstraction in this chapter of service definition and identified according to GSBPM. Additional information related to existing results of BPM@EC analysis might alternatively be provided.

Business Function is defined in terms of service’s capabilities. This is the “needs to do” part of business function observed from consumer’s perspective. This topic will further be extended in the chapter “Business Goals”.

Consider the following example of a business function definition:

Example: Disseminate statistical data to public and foster re-usability and open standards. Use open standards to deliver service to a wider number of consumers.

GSBPM: 7.4 (Disseminate/Promote Dissemination Products).

The above example defines the following service capabilities and processes:

Capabilities (business function or “needs to do”) – disseminate statistical data

    “objectives” – deliver service to a wider number of consumers

Business process – based on the sub-process GSBPM 7.4
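The decomposition above (capability, objective and GSBPM sub-process) can be held in a simple structure. The sketch below is illustrative only: the names `GsbpmRef` and `ServiceDefinition` are assumptions, and the lookup tables contain just the one phase and sub-process quoted in this chapter rather than the full GSBPM classification (2).

```python
from dataclasses import dataclass

# Only the entries quoted in this chapter; a real catalogue would load the
# complete classification from the GSBPM definition (2).
PHASES = {7: "Disseminate"}
SUB_PROCESSES = {(7, 4): "Promote dissemination products"}


@dataclass(frozen=True)
class GsbpmRef:
    """A level 2 GSBPM identifier such as "7.4" (phase 7, sub-process 4)."""

    phase: int
    sub_process: int

    @classmethod
    def parse(cls, identifier: str) -> "GsbpmRef":
        # "7.4" -> phase 7, sub-process 4
        phase, sub_process = identifier.strip().split(".")
        return cls(int(phase), int(sub_process))

    def label(self) -> str:
        phase_name = PHASES.get(self.phase, "unknown phase")
        sub_name = SUB_PROCESSES.get((self.phase, self.sub_process), "unknown sub-process")
        return f"{self.phase}.{self.sub_process} ({phase_name} / {sub_name})"


@dataclass(frozen=True)
class ServiceDefinition:
    """The three elements derived from the example business function."""

    capability: str            # the "needs to do" part
    objective: str             # why the capability is needed
    business_process: GsbpmRef


example = ServiceDefinition(
    capability="disseminate statistical data",
    objective="deliver service to a wider number of consumers",
    business_process=GsbpmRef.parse("7.4"),
)
print(example.business_process.label())
```

Storing the GSBPM reference as structured data rather than free text makes it possible to group candidate services by phase when positioning them in a broader process context.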

You will usually derive this definition from a “vision”, “business case”, “project context”, “problem statement” or similar information available for your service.

In the above example, the business process has been identified from the following part of GSBPM (2)

“7.4. Promote dissemination products: [..]this sub-process concerns the active promotion of the statistical products produced in a specific statistical business process, to help them reach the widest possible audience[..]”(2).

The discussed example service is categorized as a dissemination service. Later in the course of analysis, the above information is refined further into business function specifications which are in turn implemented as invokable operations.

Business process definitions are subsequently defined in more detail using a standard notation such as BPMN or BPEL.

2.4 Business Goals

“(CSPA010) An enterprise architecture aims to create an environment which can change and support business goals. It shows what the business needs are, where the organization wants to be, and ensures that the IT strategy aligns with this. Enterprise architecture […] ensures that the technology is aligned to the business needs” (1).

Business function of the service is defined from the perspective of “something an enterprise does, or needs to do, in order to achieve its objectives” (1 p. 13). It is expressed in terms of an added value for either the business itself or consumers who are serviced.

It is crucial to define business goals which drive the need for new capabilities. They should preferably be defined in a written form as they form an entry point for key performance indicators which in turn enable business process measurement, monitoring and optimization. Furthermore, they allow understanding of the real purpose of business and this knowledge may contribute to better definition of business functions.

As an example of a business goal definition, take the business goals of Eurostat as a whole:

“Eurostat’s mission is to provide the European Union with a high-quality statistical information service[…]

providing the Commission and its departments with the high-quality statistical service needed to develop, implement and evaluate policies;

developing a partnership with the corresponding statistical services of the European Central Bank;

producing, with the assistance of the European Statistical System, reliable, comparable and relevant statistics covering the EU’s areas of competence;



disseminating Community statistics to the European public, businesses and decision makers, as part of its role as a public service provider;

supporting non-EU countries, particularly candidate countries, which wish to develop their statistical systems within the framework of the EU’s external relations with those countries”

The above mission statement defines a number of business goals for Eurostat. In order to measure how far this mission is accomplished, goals should be quantified and measured. The business goal definition serves as an entry point for defining process metrics (CSPA162). These metrics make it possible to measure progress in achieving business goals.

Some other business goals will be interpreted as requirements for the service. This includes functional requirements and system qualities such as performance, reliability, security, non-repudiation, etc.

For instance, a goal defined as “disseminate statistical production to public” defines not only an explicit function to be delivered by the service. It also influences the service exposure and the requirements for interoperability, and it implies certain assumptions concerning security and data models.

The following chapters describe some of the techniques used to infer business goals from other documents in a project.

2.4.1 Vision statement

Every business needs a vision which defines its purpose. The vision is an inherent part of strategic planning and is often defined as a single paragraph of text which constitutes the ultimate justification for the existence of that business. In Eurostat’s case, for instance, the vision statement could be expressed as “Eurostat’s mission is to provide the European Union with a high-quality statistical information service”. The vision helps to capture business service definitions better.

Business goals can be derived from the vision and then expressed as numerical performance indicators to measure how the vision is realized by business processes. Later, at the implementation layer, software architects will use the indicators to plan technical components capable of measuring the effectiveness of business processes. Process measurement is only the first step towards process optimization.

2.4.2 Service Definition Description – how-to

Business goals may be described in the service definition description in a tabular form. Try to find out the ultimate goal of the business. Derive business goals from the “vision”, “business case”, “project context” or “problem statement” in your project. Analyse them and derive the few most important objectives.

Remember that business functions are defined by CSPA as “something an enterprise does, or needs to do, in order to achieve its objectives” (1).

These goals have probably already been analysed at earlier stages of the project and they can just be provided in the service definition description for reference as in the example below


[Figure: Business Goals; Metrics & KPI (measure progress and bottlenecks); System qualities (performance, reliability, etc.); layers: Definition, Specification, Implementation]

Why is that important?

The goal defined as “disseminate statistical production to public” defines service exposure and requirements for interoperability. It implies certain assumptions concerning security and the data models in use. That information is used at the implementation layer to derive components securing the application and to define an interoperable canonical data model such as SDMX-ML. Policies applicable to a service will also be partly inferred from business goals.

In addition, the list of business goals is transformed into a set of performance indicators which enable decision makers to measure the progress in achieving business goals. Business processes will be measured with key performance indicators, and alerts will be issued in the event of malfunctions or bottlenecks. Business process administrators will create monitoring dashboards to depict processing status in real time. Defining these indicators creates opportunities for further process optimization.
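The link between a business goal, its key performance indicators and the alerts raised on bottlenecks can be illustrated as follows. This is only a sketch under stated assumptions: the indicator names (`monthly_downloads`, `median_response_seconds`), their target values and the measurements are invented for illustration and do not come from Eurostat or CSPA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """A numerical indicator attached to one business goal."""

    goal: str
    name: str
    target: float
    higher_is_better: bool = True

    def is_met(self, measured: float) -> bool:
        # Some indicators should stay above the target, others below it.
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


def alerts(kpis, measurements):
    """Return one alert message per KPI whose measurement misses its target."""
    messages = []
    for kpi in kpis:
        measured = measurements[kpi.name]
        if not kpi.is_met(measured):
            messages.append(f"{kpi.name}: measured {measured}, target {kpi.target}")
    return messages


GOAL = "Disseminate statistical information to the public"

# Hypothetical indicators and targets, for illustration only.
kpis = [
    Kpi(GOAL, "monthly_downloads", 10000),
    Kpi(GOAL, "median_response_seconds", 2.0, higher_is_better=False),
]

print(alerts(kpis, {"monthly_downloads": 12500, "median_response_seconds": 3.5}))
```

In a production setting the measurements would typically come from process monitoring, and the alert messages would feed the dashboards mentioned above.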

More details

In the service definition, business analysts will give more detailed descriptions of business goals in a similar manner to the example below:

Business Goal: Disseminate Statistical Information to public

Description: “Eurostat wants to promote its services among a wider community of consumers. It is a part of its public mission “to provide the European Union with a high-quality statistical information service. [..] Eurostat is committed to [..] disseminating Community statistics to the European public”.”

Business Goal: Enable clients of Eurostat to easily integrate their business with the services provided by Eurostat

Description: There is a need to maximise the sharing and re-use of informatics tools developed by ESTAT and by Member States, and to ensure the long-term maintenance of such tools. In particular, within this project, the aim is to provide certain institutional users with an automated download of ESTAT data using up-to-date internet technologies. There is a demand from certain institutional users to have an automated download of ESTAT data using up-to-date internet technologies. The principal solution consists of three elements. First, a web service will be provided for requesting data using SDMX Queries.


Examples:

Goal 1: Disseminate statistical information to the public.

Goal 2: Enable clients of Eurostat to easily integrate their business with the services provided by Eurostat.

Goal 3: Process data imputation in a more uniform way across Eurostat and National Statistical Institutes.

2.5 Ownership

“(Principle CSPA108) Ensure there is an authoritative source. Information consumed and produced by services should be sourced and updated from a single authoritative source. Information should be consistent across all relevant services”.

Funding and ownership are concepts deeply rooted in the governance of businesses. Decision rights for a service should be established from the beginning of its existence. The business owner should be known and authoritative enough to make decisions concerning service lifecycles, policies, service provisioning, assets involved and, most importantly, funding. Without knowing “who” is responsible for a service, maintaining information spread among multiple stakeholders in a coherent way may become impossible: “(CSPA108) Information consumed and produced by services should be sourced and updated from a single authoritative source”.

Think about “who” is responsible for the data being processed. He or she is responsible for the definition of the statistical methods in use and, most importantly, maintains the service lifecycle.

Decision rights and ownership are elementary governance concepts which enable lifecycle management, service maintenance and optimization and make interoperability between different stakeholders possible. Define who is responsible for the lifecycle of the service early so that stakeholders are aware of who has the authority to introduce changes and upgrades and, finally, to decide about retirement.

2.5.1 Service Definition Description – how-to

The service definition description devotes one chapter to ownership. This information is used throughout the long lifecycle of the service to enable its management (mainly in production). It identifies the stakeholder (“who”) to turn to for support, for information about roadmaps and the scope of the provided service, or to learn about backward compatibility.

Provide exact identification of the business owner and service provider. That information is used by the Modernization Committee on Production and Methods when establishing communication channels.

Example ownership: Eurostat, Unit B3: “IT for statistical production”

Example ownership: Australian Bureau of Statistics (ABS) – microdata department

[Figure: Ownership; Governance Procedures (maintenance, lifecycle, change management); layers: Definition, Specification, Implementation]

● ● ●

Establish and govern decision rights from the beginning

● ● ●

2.6 Exposure

The SMP@EC guideline specifies: “One of the important aspects to consider while taking service exposure decisions is the alignment with the business they support and/or are expected to support in the future. The EC Process Glossary classification used to structure the service operation candidate list will help to identify, namely:

Opportunities for capabilities to be considered in a broader scope than the scope of the project (e.g. provide functionality that has business meaning and value at a Policy Domain level; provide functionality that has value for a given Process Category).

Potentially overlapping service operations already considered (either under development, or operational) in other projects of the organization that may imply the decision of not developing the capability candidate identified in the project but rather reuse an existing functionality.” (6 p. 23)

The range of possible consumers for a service defines its exposure. There are six standard exposure ranges defined by SMP@EC (6). The “specific” exposure defined by SMP@EC is not discussed in this document as it concerns technical realization of a service.


The range of institutions or people being served by the ESS has a real influence on governance policies, especially the security aspect of business operation. The higher the exposure, the better the service lifecycle management must be. Knowledge sharing with regard to service construction, planned changes and roadmaps must be maintained. In the end, system qualities such as high availability, scalability, compliance with open standards, ease of use and documentation play a major role in attracting greater interest among the anticipated audience of consumers. The range of service exposure plays an important part in the service definition.

Risks

The range of exposure should be foreseen from the beginning of a service’s lifecycle. Services for which the scope is not set at the right level may quickly become non-interoperable and non-reusable. Such services may simply become yet another type of silo.

What’s next?

Define exposure to develop better policy management and service governance. This quality is important when planning security for software and infrastructure. Service designers will select information exchange protocols to better support interoperability between various consumers. Relevant standards defining the applicable vocabulary must be in place to streamline communication between international task force groups.

Learn about the different exposures from the next chapters, then use the “how-to” guideline at the end of this chapter to define the exposure for your service.

2.6.1 Exposure: Local

“The service is visible and can be used only within the owner DG's scope” (6)

Services which are supposed to be provided at this level of exposure should be seriously reconsidered. In general, services related to statistical processing should be positioned within GSBPM, as their implementation will tend to reflect parts of the statistical process.

Locally exposed services perform tasks which are so specific that they cannot be identified as being part of a globally shared process.

Maintenance of such specific, local services increases the cost burden and results in repeated projects with the same business goals. Broader service exposure can be achieved by adopting SOA and applying its implementation scenarios (such as service wrapping).

Governance: Service lifecycle depends on other services / components

Policies: A policy of a wider scope is in force for this service

Security: In most cases no specific security requirements

Open standards and interoperability: Low interoperability, probably none. Canonical data models are always preferred over legacy solutions.

2.6.2 Exposure: Eurostat ESS (Domain)

“The service is visible and can be used in the scope of the domain” (6)


Service lifecycle: well defined

Service registered in the local Artifact Repository

There is a domain-specific policy in force

Authentication: in most cases required – ECAS

Confidential information: yes

Non-repudiation: auditing

Canonical data models agreed on institutional level.

EC-established data models possible.



2.6.3 Exposure: European-Commission

“The service is visible and can be used at the European Commission scope (by all EC DGs)” (6)


Service lifecycle: clear roadmap communicated to stakeholders in advance

Focus on backward compatibility

Service registered in the Local Artifact Repository

SOA Center of Excellence drives the governance

Well defined policies reinforced by governance.

Compliance with European legislation

Well defined terms of use

Authentication: in most cases required – ECAS

Confidential information: possible

Non-repudiation: auditing

Canonical data models agreed on institutional level.

EC-established data models possible.


2.6.4 Exposure: Inter-institutional

“The service is visible and can be used outside of the European Commission scope (all EU institutions, Executive Agencies, and Member States)” (6)


Service lifecycle: clear roadmap communicated to stakeholders in advance


Service registered in the Global Artifact Repository

Modernization Committee on Production and Methods is informed and Architecture Review Board is driving definition and specification


Compliance with local and European legislation


Authentication: trust domains (service-to-service trust and user mapping)

Confidential information: possible

Non-repudiation: auditing/digital signature

Well recognized standards established by official committees and accepted by Architecture Review Board

Interoperable and Canonical Data Model for information exchanged established by an independent body (SDMX)

2.6.5 Exposure: Public

“The service is visible and can be used by the public” (6)


Service lifecycle: clear roadmap

Widely available communication channels with subscription


Service registered in the Global Artifact Repository

Modernization Committee on Production and Methods is informed and Architecture Review Board is driving definition and specification


Compliance with local and European legislation


Physical separation from existing assets.

Self-service.

Authentication: public information only or self-registration

Confidential information: rather not available

Non-repudiation: auditing/digital signature

Publicly available information systems are well separated from existing assets to reduce vulnerability to misuse

Well recognized standards established by official committees and accepted by Architecture Review Board

Interoperable and Canonical Data Model for information exchanged established by an independent body (SDMX)
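The exposure ranges described in chapters 2.6.1 to 2.6.5 form an ordered scale from narrowest to widest, which can be sketched as an enumeration. The code below is a minimal illustration, not part of SMP@EC or CSPA: the names `Exposure`, `AUTHENTICATION` and `artifact_repository` are assumptions, the “specific” exposure is omitted because it is not discussed in this document, and the values simply restate what the chapters above list.

```python
from enum import IntEnum


class Exposure(IntEnum):
    """Exposure ranges discussed in this chapter, ordered from narrowest to widest."""

    LOCAL = 1
    DOMAIN = 2
    EUROPEAN_COMMISSION = 3
    INTER_INSTITUTIONAL = 4
    PUBLIC = 5


# Authentication expectations as listed in the chapters above.
AUTHENTICATION = {
    Exposure.LOCAL: "in most cases no specific security requirements",
    Exposure.DOMAIN: "in most cases required (ECAS)",
    Exposure.EUROPEAN_COMMISSION: "in most cases required (ECAS)",
    Exposure.INTER_INSTITUTIONAL: "trust domains (service-to-service trust and user mapping)",
    Exposure.PUBLIC: "public information only or self-registration",
}


def artifact_repository(exposure):
    """Return the repository named in the chapters above for this exposure, if any."""
    if exposure >= Exposure.INTER_INSTITUTIONAL:
        return "Global Artifact Repository"
    if exposure >= Exposure.DOMAIN:
        return "Local Artifact Repository"
    return None  # no repository is stated for locally exposed services


for level in Exposure:
    print(level.name, "|", artifact_repository(level), "|", AUTHENTICATION[level])
```

Using an ordered enumeration makes comparisons such as “at least inter-institutional” explicit, which is convenient when exposure drives governance and security decisions.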

2.6.6 Service Definition Description – how-to

Types of exposure and their impact in different scenarios are quite well defined. Business Analysts provide only an indication of service exposure in the service definition description.

Example:

You may also include a high-level overview diagram to depict how consumers are supposed to use the described service. The below diagram may serve as an example:


[Figure: the example service being described, accessed over the Internet by NSI Croatia, NSI Luxembourg, a private-sector enterprise and a private person]

2.7 Outcomes

The outcome of a business service is defined by the CSPA (1) as:

“(CSPA029) A service is a representation of a real world business activity with a specified outcome” (1)

Outcomes represent the results of a service. An outcome should NOT be misunderstood and perceived only in terms of technical message formats or databases. Instead, it is a product or a service delivered in the normal course of statistical processing (business operation).

The CSPA defines the principle of “(CSPA074) Designs are output driven” (1). Services not designed to deliver data in the expected format, processed according to the expected method, will never fully support the requirements. Information modelling begins early, at this stage, to mitigate the risk of non-aligned business and IT.

What are the risks?

Services producing outcomes in a form that does not satisfy requirements or that reduces interoperability will become yet another silo: a non-reusable application with limited capabilities for exploitation in the broader scope of statistical processes.


2.7.1 Service Definition Description – how-to

In the case of statistical production, the outcome is almost always a form of statistical information. In the service definition description, briefly describe what outcome is delivered and in which form. Consult the following examples to better understand the scope of this task.

Example: The outcome of the service comprises statistical data modified by imputation and variable derivation according to the methodology XYZ to improve data quality.

Example: The service provides statistical information in an interoperable format; consumers can formulate queries to obtain detailed statistical information limited to the range of data which is requested. Consumers may formulate complex queries involving conditional statements for the data to be obtained.

Readers should avoid going into technical details, such as technical message formats, at this stage. The purpose of this chapter is only to define the service capability or function. Later, the service specification and service implementation descriptions will take these statements into account to define low-level technicalities and implement software components handling the required functions.

2.8 Restrictions and policies

“The pillars of service oriented architecture are the service repository, the service provider, the service consumers, the service description, the service policies and governance.” (7)

Services with inter-institutional and public exposure are delivered under specific conditions of use. Definitions for constraints and terms of use are governed by the organization responsible for service provisioning.

Where a service is used by independent partners, clear rules for cooperation, constituted by policies, must be defined. Even if the service is exposed to a narrower scope of consumers, policy definitions improve governance and maintenance; in short, they significantly improve cooperation.

Moreover, the policies create an entry point for better specification of security, open protocols, authentication, audit, encryption and similar qualities.

Policies are discoverable

Policies can be registered in service catalogues in the same manner as service interfaces. They are discoverable and can be freely looked up by any stakeholder to find out about ways of interoperable communication, authentication protocols and general terms of use.

Policy definitions guide business analysts when specifying services. Business functions must be aligned with business needs but also with business restrictions expressed as formal policies.

Later, at the service implementation layer, policies are transformed into:

Physical policy assertions for security, auditing, reliable messaging, etc.

Authentication mechanisms to enable consumer identification

Inter-operable protocols

“Terms of use”

Some system qualities such as response time, exception handling, availability, etc.

The next chapters discuss business and operational policies, followed by more practical guidelines on how to define them.

2.8.1 Business policies

Business policies represent business needs expressed in natural language. Service designers transform them into specific technological solutions, defining technical components capable of policy assertions at the implementation level. Policies describe restrictions on service use in terms of constraints which usually stem from legal, regulatory or managerial rules.

Services registered in artefact catalogues become discoverable to any interested party. The catalogues have the unique capability to serve information about additional properties of each component. Among other things, service policies can be looked up in exactly the same way as other types of artefacts. Consumers have easy access to the “terms of use” of a service.

Business policies are probably already defined and used in the enterprise. They are defined in a written form and applied in day-to-day operations. Very often they are stemming from legal compliance, local laws, and industrial policies. The below examples are provided to better present this idea:

2.8.2 Operational policies“The Operational policy layer provides the deployed products and software infrastructure that realizes the business solutions and the use of policy within those solutions. “ (4)

Operational policies can be understood as executable assertions to enforce policy use. Sometimes they are defined upfront, but in most cases they are just a result of top-down decomposition carried out in the specification / implementation layers. The example policy mentioned in the latter chapter “EU Directive 95/46/EC” can later be transformed in an executable assertion as follows


Example: The Service must follow the mandate of the EU Directive 95/46/EC for personal data processingExample: The legal basis for the “EGR IS” is provided by Regulation (EC) No 177/2008 of the EP and of the Council of 20 February 2008 establishing a common framework for business registers for statistical purposes and repealing Council Regulation (EEC) No 2186/93 (published in the EU Official Journal on 5 March 2008).

28

2.8.3 Service Specification Description – how-toDefine service restrictions accurately so that the service may be used in compliance with current laws and organizational policies. At this stage don’t run into the trap of any kind of technical reasoning – this is a high-level definition which is refined into technical details at later stages of service description. Define it as if it was addressed to service consumers. Policies are registered in service catalogues. They are discoverable and accessible inter-institutionally and publicly to promote re-use and interoperability.

Usually you will derive restrictions and policies from existing governance procedures. Some new policies will stem from legal compliance requirements.

An analysis of exposure for public and inter-institutional services contributes to this task. Further functional restrictions that must be put in place usually come to light as a result of this exposure analysis.

Analyse only the most important policies, those which have a real influence on service delivery.

The service definition description may also contain a chapter that describes policies in more detail:

| Restriction/Policy | Description |
|---|---|
| Copyrights for statistical data | Eurostat has a policy of encouraging free re-use of its data, both for non-commercial and commercial purposes. All statistical data, metadata, content of web pages or other dissemination tools, official publications and other documents published on its website, ….. |
| Use of confidential information in restricted rooms only | Confidential information may be viewed only by trained staff of Eurostat in facilities complying with the norm XYZABC1123,… |


Example: A 128-bit encryption key is used to encrypt personal information when transferring data over the network.
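The idea of an operational policy as an executable assertion can be illustrated with a minimal sketch based on the encryption example above. The class `TransferRequest`, its fields and the function names are hypothetical and are not part of any standard. Reading the policy as "at least 128 bits" is also an assumption of this sketch.

```python
from dataclasses import dataclass

# Assumption: the example policy is read as "at least 128 bits".
MIN_KEY_BITS = 128


@dataclass(frozen=True)
class TransferRequest:
    """Hypothetical description of one data transfer over the network."""
    contains_personal_data: bool
    encrypted: bool
    key_bits: int  # length of the encryption key in bits, 0 if unencrypted


def complies_with_encryption_policy(request: TransferRequest) -> bool:
    """Return True if the transfer satisfies the example policy.

    Transfers without personal data are outside the scope of this policy.
    """
    if not request.contains_personal_data:
        return True
    return request.encrypted and request.key_bits >= MIN_KEY_BITS


def enforce(request: TransferRequest) -> None:
    """Raise an error instead of letting a non-compliant transfer proceed."""
    if not complies_with_encryption_policy(request):
        raise PermissionError(
            "personal data must be encrypted with a key of at least 128 bits"
        )
```

In such a sketch the policy text stays in the service definition, while the assertion is only derived at the implementation layer.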

Examples:
a) Copyrights for statistical data – this is a kind of policy
b) Use of confidential information in restricted rooms only
c) The Service must follow the mandate of the EU Directive 95/46/EC for personal data processing


2.9 Service Input / Output

“(CSPA160) The CSPA Service Definition is at a conceptual level. In this document, the capabilities of a Statistical Service are described in terms of the GSBPM sub process that it relates to, the business function that it performs and GSIM information objects which are the inputs and outputs.” (1)

Based on that definition, this chapter addresses the third of the three qualities named in the above CSPA definition (1):

1) the business process which is used to deliver the function,
2) the capabilities of the service provided,
3) the information which is being processed.

Business analysts should keep in mind that the CSPA specification rests on two pillars: GSBPM (2) and GSIM (3). The General Statistical Information Model (GSIM) describes itself as follows: “[…] It provides a set of standardized, consistently described information objects, which are the inputs and outputs in the design and production of statistics. Each information object has been defined and its attributes and relationships are specified. GSIM is the result of a collaboration involving statistical organizations across the world in order to develop and maintain a generic reference model suitable for all organizations and meet the strategic goals (in particular the modernization effort) of the official statistics community” (3). In the context of some projects, GSBPM and GSIM may appear too limiting, and business analysts may choose to use other process or information standards. Whatever is chosen to describe processes and objects, it is recommended to always use well-established and standardized classifications. This facilitates communication between stakeholders belonging to different organizations.


The process typology provided by GSBPM creates a standard terminology which removes ambiguities between statistical organizations. The same applies to GSIM, which defines the standard terminology to use when defining data structures that represent statistical information. Many large organizations produce large amounts of statistical information every year, and many of them have in the past created their own ways of defining data structures, including their own naming conventions. To streamline communication between these organizations and enable effective integration between them, a standard terminology is needed, which is why CSPA places so much emphasis on the role of GSBPM and GSIM.

Why define inputs and outputs at this stage?

The service designer uses this information to specify functions in more detail at a later stage of analysis, the service specification. There, the GSIM implementation objects are refined into detailed structures describing the information, still independently of any particular technology (such as XML). At the implementation layer, the input/output objects are transformed into technical data structures which define document and message semantics at the highest level of detail. Software designers use industry standards such as WSDL, XSD or WADL to define data formats, schemas and interfaces.



What are the risks?

Input and output should be defined in general terms, without diving into specific technical details.

GSIM is the standard proposed by CSPA (1), but other standards exist in the statistical domain as well. The Data Documentation Initiative (8) and SDMX (5) might be considered as possible alternatives. The next paragraph puts forward a few examples to illustrate this concept.

2.9.1 Service Definition Description – how-to

Business analysts describe the input and output information in relation to the statistical problem which the service aims to resolve. They should choose which data structures are concerned and use the vocabulary of GSIM (3) to describe the information being processed. You may also consider using an alternative terminology, but this must be clearly indicated. Only well-established standards developed by independent standardization organizations, such as DDI (8), SDMX (5) and similar, should be used for describing statistical information.

Describe input and output concisely, in statistics-related terminology. The aim is to enable communication between numerous, geographically distributed international interest groups. Use the vocabulary and the statistical data and metadata classifications defined by well-established organizations with an international reach. This applies especially to services exposed inter-institutionally and publicly.
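As a minimal sketch, a conceptual input/output description can be recorded as a small structure that names each information object together with the standard whose vocabulary it uses. The classes `InformationObject` and `ServiceInterfaceDescription`, the service name and the set `ACCEPTED_STANDARDS` are hypothetical and only serve as an illustration; the object names follow the GSIM and SDMX terms used in this chapter.

```python
from dataclasses import dataclass, field

# Assumption: the standards named in this chapter are the accepted ones.
ACCEPTED_STANDARDS = {"GSIM", "SDMX", "DDI"}


@dataclass(frozen=True)
class InformationObject:
    """One input or output, named in the vocabulary of a chosen standard."""
    name: str      # e.g. "Data Set"
    standard: str  # e.g. "GSIM"


@dataclass
class ServiceInterfaceDescription:
    """Conceptual inputs and outputs of a service, without technical detail."""
    service_name: str
    inputs: list[InformationObject] = field(default_factory=list)
    outputs: list[InformationObject] = field(default_factory=list)

    def standards_used(self) -> set[str]:
        """Return the standards referenced by any input or output."""
        return {obj.standard for obj in self.inputs + self.outputs}

    def unrecognised_standards(self) -> set[str]:
        """Return referenced standards that are not in the accepted set."""
        return self.standards_used() - ACCEPTED_STANDARDS


# Hypothetical description of a data set dissemination service.
description = ServiceInterfaceDescription(
    service_name="Data set dissemination",
    inputs=[
        InformationObject("Data Set", "GSIM"),
        InformationObject("Concept", "GSIM"),
    ],
    outputs=[InformationObject("Data Set", "SDMX")],
)
```

A check such as `unrecognised_standards()` makes it visible when an alternative terminology has been introduced, so that it can be clearly indicated in the description.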

2.10 Patterns

CSPA, SMP@EC and other standards define good practices which should preferably be followed when defining a service. CSPA (1) puts forward a number of principles which should be followed.

2.10.1 Output-driven designs

“CSPA074: Ensure the whole statistical process is output-driven. Output is the reference starting point; the statistical production process starts from the output desired, that is from required products, and goes backwards, defining the various aspects of the process.” (1)

Business analysts should focus on the ultimate need of consumers, which in the case of statistical production very often means that correct statistical information should be delivered. It should be possible to trace all definitions, specifications and later software implementations back to that need.

2.10.2 Capture information as early as possible

“CSPA106: Information should be captured in a standard structured manner at the earliest possible point in the statistical business process to ensure it can be used by all subsequent services.” (1)

This rule stems from the experience of the statistical community, where precious statistical information is often not captured at the first possible occasion or is lost due to extensive data transformation. Original raw data should be captured and stored for further processing. Clear governance policies concerning information processing should be defined in advance, before the information is even collected, to avoid unreliable information.

Example of an “input”: Inquiries for data sets contain either an identifier component (GSIM III.I Data Set) of the requested data set or any other query to obtain specific statistical information containing concepts (GSIM III.I Concepts).

Example of an “output”: The service provides data sets as defined by SDMX (5) and GSIM (3), Paragraph III.I (data sets).

2.11 Anti-patterns

Anti-patterns are well-recognized bad practices which should be avoided whenever possible. Ensure that your organization is not repeating common bad practices.

2.11.1 Big-bang

This anti-pattern is also known as “bite more than you can chew”. It is observed when SOA is viewed as a panacea, which leads to an attempt to change all enterprise systems and the architecture at once. Such a big-bang adoption could result in failures that are then blamed on SOA (4).

Think big, act small: create a SOA adoption roadmap. Migrate services in a well-known domain first and advance slowly, always in close cooperation with the SOA center of excellence.

2.11.2 SILO Approach

Services are identified based on isolated applications rather than by applying a more holistic, enterprise-wide focus, so the same services are identified by different groups under different names. As a result, no common services or service sharing are realized (4).

To avoid this, establish a SOA center of competence as soon as possible, and introduce knowledge sharing and a quarterly bulletin to inform stakeholders about the progress of the SOA transformation. Establish governance procedures and work in close cooperation with the SOA center of excellence and the enterprise architecture team.

2.11.3 So what’s new

A lack of understanding of the differences between SOA and previous computing paradigms leads sceptics to claim that SOA is just a new name for the same old technique (4).

The SOA center of excellence collects knowledge from multiple domains, organizes training and shares knowledge regularly based on a defined education plan. It facilitates communication between international groups of interest spread over multiple continents.

2.11.4 Misbehaving registries

“Duplicate service registries and overlapping, unclear ownerships result in governance nightmare and runtime confusion, potential bad performance and unplanned costs due to duplication” (4).




3 CSPA Service Specification

“(CSPA162) The CSPA Service Specification is at a logical level. In this layer, the capabilities of a CSPA Service are fleshed out into business functions that have GSIM implementation level objects as inputs and outputs. This document also includes metrics and methodologies.” (1).

This part of the service description is created as a functional service description. Here, the service is treated as a black box with its inputs, outputs and restrictions. Its functions are described in detail; the data being exchanged are refined and presented as GSIM implementation objects. We also introduce techniques to describe the required service quality, and present some guidelines on how to assign metrics that enable the measurement of service effectiveness. The guidelines provided in the following chapters are mainly based on the European Commission’s SMP@EC SOA guidelines (6) and on the SOA Reference Architecture as defined by the Open Group committee (9) and IBM (4).

Some additional concepts such as security, multilingual support and “assets for re-use” are discussed in the following chapters to give a better understanding about the scope of this analytical phase.

To start with the service specification, we must first understand the definition of a business function. GSIM (3) describes it as “something an enterprise does, or needs to do, in order to achieve its objectives”. Later in the document we outline some techniques that aid business function identification. All of them focus on supporting providers in the delivery of business-aligned services.

Functional analysis lies at the foundation of any business-aligned design. To meet consumer needs and expose functions which bring real value to consumers, it is necessary to conduct a top-down functional decomposition, which results in a service portfolio: a catalogue of functional capabilities.

The list of business functions defined in a service specification is used later at the implementation layer for defining detailed service contracts including invokable operations, their parameters, pre-conditions and restrictions. It is worth mentioning, that one service specification may potentially be used by multiple software vendors to deliver more than one software implementation.

What are the risks?

Note that this chapter does not discuss “how” to realize functions or how to technically support a sufficiently high quality of service by physical means. The service specification is an intermediate step between service definition and implementation which provides a clear view of “what” is required by consumers. It is much simpler to correct errors made at this stage than to modify detailed designs at later stages of analysis.

We want to warn business analysts about the risk of falling into the trap of chatty definitions, where functions are defined using a bottom-up approach. Implementations such as CRUD¹ operations are translated directly into functions which have little correspondence to the business, and the business analyst has to wade through a tangle of technical information to describe the required functionality. This chapter therefore discusses top-down decomposition techniques and use case definition methods rather than particular technologies.

¹ Create, read, update, delete (sometimes search) operations. Technical implementation of persistent storage and information lookup - http://en.wikipedia.org/wiki/Create,_read,_update_and_delete

3.1 Business Function Identification

“The purpose of the Service Identification activity is to find suitable service capability candidates (implemented by service operations that are grouped under SOA services) to be included in the IT system to develop in order to support the business needs” (6)

This chapter presents a number of business function identification techniques that facilitate the description of a business service.

These techniques help business analysts to develop a list of business functions which must be provided to consumers. The next paragraphs describe them in a simple and practical way, using a contrived business case as an example. At the end we give practical tips on describing service specifications.

Let’s imagine a business case which will serve as an example for later chapters. In this scenario a statistical data provider, “YellowStat”, wants to improve its operations. The business goal for this service is defined as:

“The survey collection process is ineffective because processes in the enterprise are not well defined. Data are lost due to the disparate systems used for data registration and processing. As a result, many business opportunities are lost and the statutory mission of the organization, to collect and effectively disseminate information, cannot be realized. “YellowStat” is planning to expand its business in the coming years and to increase the number of collected surveys by 50%. In order to maintain trust and information security, certain European restrictions related to personal information security (EU Directive 95/46/EC) must be adhered to.”

In the next paragraphs we decompose this case into a set of business function descriptions aligned with the business goals of our contrived organization.

3.1.1 Top down functional decomposition

As the very first step, we should learn about statistical processes and try to establish a widely understandable vocabulary. The GSBPM (2) specification proposes a common statistical process typology. During the course of the analysis it becomes clear that the statistical data provider is involved in three parts of GSBPM: statistical framework building, data collection and dissemination.



Business analysts apply a top-down approach to decompose the generic business process into major functional areas. This decomposition is applied iteratively, increasing the level of detail as the analysis progresses. Functional areas are further decomposed into more granular business functions which are aligned with business needs. Business functions are then translated into fine-grained use cases, which describe the service through interactions between consumers and service providers.

In the diagram, the “Process Surveys” service includes three GSBPM sub-processes: building, collection and dissemination. Not all of them have been analysed yet; only the “Build” part of the business process is described in detail. The GSBPM 3 (“Build”) sub-process is decomposed into two required business functions, “Create new project” and “Define questionnaire”. These definitions are still at too high a level of abstraction, but they are later refined iteratively into use case descriptions such as “Register Collector” or “Register Candidate”.
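The decomposition described above can be sketched as a simple tree. This is only an illustration: the `Node` class and the helper functions are hypothetical, and the names of the elements are taken from the example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the decomposition: service, sub-process or function."""
    kind: str
    name: str
    children: list["Node"] = field(default_factory=list)


def leaves(node: Node) -> list[str]:
    """Return the names of the most detailed elements reached so far."""
    if not node.children:
        return [node.name]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def not_yet_analysed(node: Node) -> list[str]:
    """Return sub-processes that have no business functions assigned yet."""
    found = []
    if node.kind == "sub-process" and not node.children:
        found.append(node.name)
    for child in node.children:
        found.extend(not_yet_analysed(child))
    return found


# The "Process Surveys" example: only "Build" has been decomposed so far.
service = Node("service", "Process Surveys", [
    Node("sub-process", "Build", [
        Node("business function", "Create new project"),
        Node("business function", "Define questionnaire"),
    ]),
    Node("sub-process", "Collect"),
    Node("sub-process", "Disseminate"),
])
```

Calling `not_yet_analysed(service)` lists the parts of the process that still need another iteration of the decomposition.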

3.1.2 Goal-to-function identification

The top-down decomposition technique addresses functional requirements very well, but there is a risk that business goals are not covered by this kind of analysis. For instance, concepts such as service quality expressed in terms of KPIs, or requirements for reporting, are usually not covered by functional analysis.

The statistical service definition should devote some space to the definition of business goals and KPIs. Key Performance Indicators measure progress towards defined business goals (see “Business Goals”). The business analyst will usually work with subject-matter specialists, statisticians and business executives to define goals and KPIs.

At later stages of a project, KPIs are translated into technical software components which carry out quantified measurements to monitor daily operations and long-term goals.



Here is an example showing how KPIs and metrics can be derived from business goals:

| Goal | KPIs / Metrics | Function |
|---|---|---|
| Provide self-service functions for survey registration and processing | Surveys per month; increased number of surveys processed; 2 000 | Register survey result; Register Candidate; Register Collector |
| Build trust among stakeholders by protecting identity | Number of incidents / records viewed per month * 100%; number of personal information reviews compared to the number of complaints regarding data protection; 0,1 % | Activity Log supporting non-repudiation |
| Increase number of processed surveys and reduce time-to-market. | Duration of time between survey collection and data dissemination expressed in working days; time between survey collection and data dissemination should be limited; 2 working weeks | Reporting service; Business event log; Process measuring services |
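A minimal sketch of the goal-to-function check is given below. The goals and functions are taken from the table above; the dictionary layout and the function name `functions_missing_from` are assumptions made for this illustration. The sketch compares the functions required by the goals with those found by top-down decomposition alone.

```python
# Goals and the functions that support them, as listed in the table above.
GOAL_TO_FUNCTIONS = {
    "Provide self-service functions for survey registration and processing": [
        "Register survey result",
        "Register Candidate",
        "Register Collector",
    ],
    "Build trust among stakeholders by protecting identity": [
        "Activity Log supporting non-repudiation",
    ],
    "Increase number of processed surveys and reduce time-to-market": [
        "Reporting service",
        "Business event log",
        "Process measuring services",
    ],
}


def functions_missing_from(top_down: set[str]) -> dict[str, list[str]]:
    """Return, per goal, the required functions absent from `top_down`."""
    missing = {}
    for goal, functions in GOAL_TO_FUNCTIONS.items():
        gap = [name for name in functions if name not in top_down]
        if gap:
            missing[goal] = gap
    return missing


# Assumption: top-down decomposition found only the registration functions.
found_top_down = {
    "Register survey result",
    "Register Candidate",
    "Register Collector",
}
gaps = functions_missing_from(found_top_down)
```

Under this assumption, `gaps` contains the activity log and the reporting-related functions, which only become visible through the analysis of business goals.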

Provide self-service functions for survey registration and processing

Currently, survey collectors use spreadsheets for survey creation and registration. There is a central shared drive to which surveys are supposed to be uploaded in a timely manner, but this process cannot be controlled and, as a result, many surveys arrive late.

The above requirements have already been analysed as part of the top-down decomposition, and the identified business functions are now checked once more for their business alignment. The result of this analysis represents the typical functionality of a service with broad exposure that is based on the principle of self-service.

Build trust among stakeholders by protecting identity

There are certain mandates put in place by EU institutions regarding personal information processing. Their enforcement requires techniques supporting non-repudiation and audit trails.

In our contrived case, no functional requests from customers could be a source of non-repudiation requirements. Nowhere is it explicitly stated that the system must provide such a capability. Only through the analysis of the business goal “Build trust among stakeholders by protecting identity” is it possible to infer the need for monitoring user activities, or for an activity log to support non-repudiation.



At the implementation layer, this requirement is further decomposed to deliver software components which can be used for auditing. Such a component tracks consumer activities and enables auditors to determine who accessed personal information and when.
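The following sketch shows one possible shape of such an auditing component. The names `AccessEvent` and `ActivityLog` are hypothetical. Chaining the entries with SHA-256 hashes is an assumption of this sketch: it makes later changes to the log detectable, but full non-repudiation would additionally require measures such as digital signatures, which are not shown here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def _digest(previous_hash: str, user: str, record_id: str, accessed_at: datetime) -> str:
    """Hash one log entry together with the hash of the entry before it."""
    payload = f"{previous_hash}|{user}|{record_id}|{accessed_at.isoformat()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AccessEvent:
    user: str
    record_id: str
    accessed_at: datetime
    previous_hash: str
    entry_hash: str


class ActivityLog:
    """Append-only record of who accessed which personal record, and when."""

    def __init__(self) -> None:
        self._events: list[AccessEvent] = []

    def record(self, user: str, record_id: str, accessed_at: datetime = None) -> AccessEvent:
        accessed_at = accessed_at or datetime.now(timezone.utc)
        previous_hash = self._events[-1].entry_hash if self._events else ""
        entry_hash = _digest(previous_hash, user, record_id, accessed_at)
        event = AccessEvent(user, record_id, accessed_at, previous_hash, entry_hash)
        self._events.append(event)
        return event

    def who_accessed(self, record_id: str) -> list[tuple[str, datetime]]:
        """Return (user, time) pairs for every access to the given record."""
        return [(e.user, e.accessed_at) for e in self._events if e.record_id == record_id]

    def is_intact(self) -> bool:
        """Recompute the hash chain and report whether any entry was altered."""
        previous_hash = ""
        for e in self._events:
            expected = _digest(previous_hash, e.user, e.record_id, e.accessed_at)
            if e.previous_hash != previous_hash or e.entry_hash != expected:
                return False
            previous_hash = e.entry_hash
        return True
```

An auditor would call `who_accessed` for the record in question and `is_intact` to check that the log itself has not been modified.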

Increase number of processed surveys and reduce time-to-market.

Remove bottlenecks in the process. Measure activities to provide inputs for reporting and process optimization. At the technological level, the business process can be monitored and alerts may be defined for KPIs that cross their thresholds. “(CSPA173) The Platform for reporting is responsible for enabling real-time monitoring and near-real-time presentation of user defined business key performance indicators (KPIs). Examples of how this mechanism could be achieved are; Static Dashboard or Business Activity Monitoring (also generates alerts and notifications to user when these KPIs cross specified thresholds).” (1).
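A minimal sketch of threshold-based alerting is shown below. It is not a Business Activity Monitoring product; the class `KpiThreshold`, the KPI names and the function `alerts` are hypothetical. The limits are taken from the example table in this chapter, and the conversion of “2 working weeks” into 10 working days assumes a five-day working week.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiThreshold:
    name: str
    limit: float
    higher_is_better: bool  # True: alert when the value falls below the limit


# Limits from the example table; the KPI names are chosen for this sketch.
EXAMPLE_THRESHOLDS = [
    KpiThreshold("Surveys per month", 2000, higher_is_better=True),
    KpiThreshold("Data protection incident rate (%)", 0.1, higher_is_better=False),
    KpiThreshold("Working days from collection to dissemination", 10, higher_is_better=False),
]


def crossed(threshold: KpiThreshold, value: float) -> bool:
    """Return True if the measured value is on the wrong side of the limit."""
    if threshold.higher_is_better:
        return value < threshold.limit
    return value > threshold.limit


def alerts(thresholds: list[KpiThreshold], measurements: dict[str, float]) -> list[str]:
    """Return one message per KPI whose measured value crosses its limit."""
    messages = []
    for threshold in thresholds:
        value = measurements.get(threshold.name)
        if value is not None and crossed(threshold, value):
            messages.append(f"{threshold.name}: measured {value}, limit {threshold.limit}")
    return messages
```

KPIs without a measurement are skipped, so the same function can be used while only some of the process steps are instrumented.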

Reporting is one of the most neglected functions in functional software decomposition techniques, although it enables an organization to optimize its operations and achieve its defined goals.

3.1.3 Outcomes analysis

“CSPA113 […] A service is a representation of a real world business activity with a specified outcome”. (1)

“CSPA074: Ensure the whole statistical process is output-driven. Output is the reference starting point; the statistical production process starts from the output desired, that is from required products, and goes backwards, defining the various aspects of the process.” (1)

Service capability (or service function) is largely determined by what is delivered to end consumers. Outcomes analysis comprises part of the input for functional decomposition.

To better understand what the result of an outcome analysis can look like, consider the following example of a functional requirement:

Example: The outcome of the service is comprised of survey results stored in the European unemployment record. The data is modified by imputation and variable derivation according to the methodology XYZ to improve the data quality.

From the above, one could reason that there is a business function to derive new random variables and improve data quality through “data imputation”. It is defined in terms of the expected results of processing.

Output-driven analysis enables business analysts to better comprehend the capabilities that consumers require.

The strength of this analysis is that it addresses one of the main CSPA principles: “(CSPA074) Designs are output driven” (1). By applying CSPA principles, business analysts benefit from the statistical community’s experience and avoid the risk of building non-reusable applications with data redundancy and limited capabilities.


3.1.4 Analyze information, process and processing rules

SMP@EC (6) proposes some further valuable techniques, described in the chapter “4.2 Decompose Business Model” (6 p. 16) of that methodological guide. It assumes that business functions are derived from the business entities model, the business process model and the business rules model. These techniques take into account the information being processed, the processing rules and the business process as a whole.

Top-down decomposition methodologies ensure the right service granularity and alignment with business goals, whereas the techniques defined by SMP@EC focus on details.

3.1.5 Notes on service granularity

"The overall quantity of functionality encapsulated by a service determines the service granularity. A service's granularity is determined by its functional context"². The service functions obtained by top-down decomposition determine service granularity; this is the primary principle. Business functions should correspond to and address a business need. A secondary rule is to “model as coarse grained as possible. While fine-grained services are also possible, ultimately, the challenge is to find the balance between coarse and fine grained services that meet the business needs” (4 p. 125). Top-down and goal-to-function techniques help in reaching that aim. Additional valuable information on this matter may also be found in SMP@EC (6 p. 26) (What is the "Correct Service Granularity").

3.1.6 Service Specification Description – how-to

The result of service identification is a list of business functions. Each business function should be briefly characterized from the perspective of the interactions between the consumer and the service provider. Business functions are later assigned metrics to measure business performance. This topic is discussed in more detail in the “Metrics and Key Performance Indicators” chapter.

² http://serviceorientation.com/soaglossary/service_granularity


Example:

Business Function: Register Survey Result

Description: Survey collectors register the result of a single survey in the system for further processing. Once the set of surveys assigned to a collector is registered, the data undergo a process of data imputation (augmentation) and are prepared for further dissemination.

Metrics: the time between the survey collection order and the survey result registration is measured to monitor the KPI “survey results should be registered within a period of 2 working weeks from the moment of collection order”
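The metric in the example above can be sketched as a small calculation. The function names are hypothetical, and two assumptions are made: “2 working weeks” is read as 10 working days, and working days are Monday to Friday with public holidays ignored.

```python
from datetime import date, timedelta

# Assumption: 2 working weeks of 5 working days each.
KPI_LIMIT_WORKING_DAYS = 10


def working_days_between(start: date, end: date) -> int:
    """Count Monday-to-Friday days after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days += 1
    return days


def meets_registration_kpi(order_date: date, registration_date: date) -> bool:
    """Check the example KPI: registration within 2 working weeks of the order."""
    return working_days_between(order_date, registration_date) <= KPI_LIMIT_WORKING_DAYS
```

For a collection order placed on a Monday, a registration exactly two weeks later still meets the KPI in this sketch, while a registration on the following day does not.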


3.2 Input / Output / Metadata

CSPA072: “Ensure the design, composition, operation and management of business processes, including all input and output interactions, are metadata driven and automated wherever possible.”

Data inputs and outputs are defined at a high level as part of the service definition. In the service specification, the same information should be described in greater detail. GSIM builds a terminology and describes the relationships between concepts which are common in statistical data descriptions. This concerns both data coming from statistical experiments and the metadata which define the structures that contain this information.

The CSPA074 principle, defined as “Ensure the whole statistical process is output-driven. Output is the reference starting point; the statistical production process starts from the output desired, that is from required products, and goes backwards, defining the various aspects of the process.”, prescribes that the specification of a statistical service or function should be output-driven. The analysis should focus on the final information provided to consumers as part of the service. This is a kind of bottom-up way of thinking, working backwards from the output. All requirements for the service should be traceable back to the requirements defined for the information delivered by the service.

The information contained in this chapter is used at later stages of the service description (the physical layer) to describe canonical data models and communication protocols. Usually services are invokable and available through web services or other means of communication. WSDL, XML schema definitions, interface descriptions and message descriptions are created from the specifications developed during the specification phase.

3.2.1 Service Specification Description – how-to

The service specification description which results from service analysis describes data models in terms of GSIM (3). Business analysts build conceptual models to describe information and base them on the GSIM terminology.

In situations when models already exist, they are translated to the terminology of GSIM or any other standard terminology agreed to be used at Eurostat. GSIM is an agreed standard developed by numerous independent statistical institutions to streamline communication between them.

Below we present a technique based on GSIM for developing a brand new data model. We use a contrived example of a statistical study of unemployment in European countries to demonstrate techniques for data structure definition.

Let’s assume that unemployment information in the member countries is communicated through the national statistical institutes and then processed globally by Eurostat on behalf of every member country. We must make sure that the data can be described effectively using an agreed terminology; in this case we will use GSIM.



Note: This is a contrived scenario, and the results of this analysis should not be used in any real production processing.

Usually the results of such exercises are already available in the form of data model specifications (such as SDMX-ML). In general, business analysts should make use of existing well-established data models and standards before developing new ones.

Scenario

Our task is to build a data model to contain information about the unemployed population across Europe. First, let’s take a closer look at the basic terms defined by GSIM and express them in a more informal way that is easier to comprehend.

Among others, GSIM defines the following foundation terms:

Unit – a particular person or business for which the required characteristic is measured

Unit Type – a way of identifying an abstract type of units: persons, small businesses, etc.

Population – a group of units, for instance the population of all small businesses in the EU with a yearly turnover of less than 1M EUR

Variables – measure particular characteristics of a population (e.g. unemployment)

Categories – define characteristics (e.g. male, female)

Category Sets – collections of categories ( {Male, Female} )

Code Lists – codified collections of categories (Male = M, Female = F)

Data Structures – capable of holding the required data sets

Concept – a unit of thought differentiated by characteristics

Let’s try to identify the basic business objects for our scenario. We begin the analysis by identifying the “Concepts” required to represent the information we are interested in.

In GSIM, a concept is used as one of the following: a variable, a population, a unit type or a category. All of these are defined by the GSIM specification (3).



Below is a summary of the GSIM “concepts” applicable in our situation:

| Concept | Description |
|---|---|
| Variable | Unemployment |
| Variable | Temporal |
| Population | Adults of working age in the area of the EU12 (12 members of the European Union) |
| Unit Type | Number of unemployed |
| Categories for “Number of Unemployed” | Male, Female |

From this definition we can create a code list for male and female and define it, for instance, as:

Sex := {Male = M; Female = F}

An example data set helps to better comprehend the required data structures:

| EU12 | Male | Female |
|---|---|---|
| 2010 Q1 | 7% | 12% |
| 2010 Q2 | 6% | 10% |
| 2010 Q3 | 6% | 11% |
| 2010 Q4 | 8% | 14% |

The dimensional data structure to describe such information could be a cube with three dimensions (temporal, sex and geographical location) and one measure (unemployment):

| | Concept |
|---|---|
| Dimension | Temporal |
| Dimension | Sex |
| Dimension | Geography |
| Measure | Unemployment |
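The example data set and the three dimensions described above can be sketched as a small cube. The representation chosen here, a dictionary keyed by (period, sex code, geography), and the names `SEX_CODES`, `OBSERVATIONS` and `slice_cube` are assumptions made for illustration only; they are not prescribed by GSIM or SDMX.

```python
# Code list for the "Sex" dimension, as defined in the text.
SEX_CODES = {"Male": "M", "Female": "F"}

# Each observation is keyed by (period, sex code, geography).
# The value is the unemployment measure in percent, from the example data set.
OBSERVATIONS = {
    ("2010-Q1", "M", "EU12"): 7,
    ("2010-Q1", "F", "EU12"): 12,
    ("2010-Q2", "M", "EU12"): 6,
    ("2010-Q2", "F", "EU12"): 10,
    ("2010-Q3", "M", "EU12"): 6,
    ("2010-Q3", "F", "EU12"): 11,
    ("2010-Q4", "M", "EU12"): 8,
    ("2010-Q4", "F", "EU12"): 14,
}


def slice_cube(observations, period=None, sex=None, geography=None):
    """Return the observations matching the given dimension values.

    A value of None means "any value" for that dimension.
    """
    wanted = (period, sex, geography)
    return {
        key: value
        for key, value in observations.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    }


# All observations for women, selected through the code list.
female_rates = slice_cube(OBSERVATIONS, sex=SEX_CODES["Female"])
```

Fixing one or more dimensions in this way corresponds to taking a slice of the cube, for example all quarters for one sex or both sexes for one quarter.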

This is just an example of an information analysis based on GSIM. It could be extended to more complex data structures and applied more widely in the organization to establish a GSIM-like way of thinking.

The service specification description usually contains only the results of the data model analysis. Data models are depicted in the form of diagrams or described textually. In any case, the model should be defined using GSIM or another well-established terminology.

The presented scenario shows a very simplified procedure which could be followed based on existing statistical methods.

3.3 Metrics and Key Performance Indicators

“CSPA162 – The CSPA Service Specification is at a logical level [...] This document also includes metrics and methodologies.”

Key Performance Indicators help organizations to measure progress towards defined business goals. They are identified based on the vision and business goals.

KPIs can be understood as quantified measures of progress towards business goals, expressed as percentages or progress indicators (for example, the number of data sets processed per month).

They are used to detect bottlenecks and to support process optimization.

Business goals, metrics and KPIs together help business analysts to describe system qualities such as performance, reliability and interoperability.

At the implementation layer, KPIs are transformed into physical mediation or notification components which are built into BPMN or BPEL processes. They enable organizations to monitor their processes and optimize them. Daily process activity is monitored in real time, and reports on the achievement of long-term business goals may be generated for decision makers.

What are the risks?

Not defining performance indicators is a common mistake, because it makes process measurement and optimization impossible. This affects service manageability and governance to a high degree: it becomes impossible to verify how business goals are being approached, and reporting becomes a difficult task.



3.3.1 Service Specification Description – how-to

The list of business goals developed in the service definition should now be decomposed further.

Let’s take the previous example of business goals from the service definition.

If the business goal is to provide statistical data to the public, then think about measuring the number of consumers who access the information. Find out about the assumptions: how many institutions should be able to access the information in the first year of operation, and what response time is acceptable for consumers (1 hour, 1 day, 2 weeks, etc.). This is only an example which can be extended to other situations in your project.

Very often KPIs are already defined by decision makers. In this case the task is simpler: provide the KPI definitions here as a reference.

Provide a bullet list of the most important key performance indicators identified in the course of analysis. Take into consideration the whole business process, which may be spread over multiple institutions. Details of how much time each significant step in the process should take (e.g. 2 days, 5 minutes) can be defined later. Process administrators will use the indicators to monitor executions of real processes.

KPIs must be quantified, that is, expressed in numerical form such as a percentage or another indicator. Consult the following examples.

A corresponding chapter of the service specification may contain additional descriptions in tabular form, as in the examples below:

Example: Goal 1: Disseminate statistical information to the public.

Example: Service Usage – number of institutions using this service

Example: Dissemination Deadlines – time spent between acceptance of the statistical data coming from new surveys and the moment the data are available to the public


KPI: Service Usage
Description: number of institutions using this service. 10000 requests for the top 10 data sets will be serviced in the first year. Every following year is going to bring a 10% increase in the number of requests.

This describes the business goal (not a performance quality). It is used for measuring how far away that business goal is.

KPI: Dissemination deadlines
Description: time spent between acceptance of the statistical data coming from new surveys and the moment the data are available to the public. There are certain deadlines in the dissemination process. For instance, certain data sets in the Eurobase database should become available to the public very quickly, in less than 2 working days.
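The two example KPIs above can be made measurable with a few lines of code. The following is only a minimal sketch: the function names are hypothetical, the 10% growth is assumed to compound yearly, and working days are assumed to be Monday to Friday with public holidays ignored.

```python
from datetime import date, timedelta


def projected_requests(first_year: int, growth: float, year: int) -> int:
    """Expected number of requests in a given year of operation (year 1 = first year).

    Assumes the yearly increase compounds, e.g. 10000 requests growing by 10% per year.
    """
    return round(first_year * (1 + growth) ** (year - 1))


def working_days_between(accepted: date, published: date) -> int:
    """Count the Monday-to-Friday days after `accepted`, up to and including `published`.

    Public holidays are ignored in this sketch.
    """
    days = 0
    current = accepted
    while current < published:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0..4 are Monday to Friday
            days += 1
    return days


def meets_dissemination_deadline(accepted: date, published: date, limit_days: int = 2) -> bool:
    """True if the data became public in less than `limit_days` working days."""
    return working_days_between(accepted, published) < limit_days
```

For instance, data accepted on a Friday and published on the following Monday count as one working day and therefore meet the example deadline of less than 2 working days.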

3.1 Business Process

Business process modelling is a long-term activity aiming at effective service provisioning, where one task may be handled by one or more actors working for multiple organizations. People handle tasks in a well-defined workflow which is described using formal languages like BPMN. A business process is a set of linked business functions (tasks) that brings added value to consumers. IBM gives a more precise definition of a business process in its SOA whitepaper:

“As a gross generalization, a service is a repeatable task within a business process. So, if you can identify your business processes, and within that the set of tasks that you perform within the process, then you can claim that the tasks are services and the business process is a composition of services”³

The statistical business process is described at a very high level of abstraction by the GSBPM (3). By applying the knowledge collected in the GSBPM you benefit from the collective experience of the statistical community. The terminology used in the GSBPM is widely recognized and enables communication between people of different backgrounds.

3.1.1 Service Specification Description – how-to

Decompose your business process into pieces and relate them to GSBPM levels 2 and 3. Describe how the service realizes these parts of the process.

3 IBM’s SOA Foundation “An Architectural Introduction and Overview”



Such an analysis should be supported by the results of business process analysis (e.g. in Eurostat BPM@EC) using the BPMN notation. A service specification may depict one or more processes in which the service is supposed to be used.

3.2 Security

The CSPA (1) specifies one of its fundamental principles as “Principle CSPA064: Maintain community trust and information security. Conduct all levels of business in a manner which build the community's trust. This includes the community's trust and confidence in the statistical organization's decision making and practices and their ability to preserve the integrity, quality, security and confidentiality of the information provided.” (1)


To build trust, it is essential to ensure that correct protocols are followed. Data delivered inter-institutionally is coherent, integral and of high quality. Confidential information is accessed only by entitled parties. Organizations trace consumer activities by establishing means of accountability, which is often referred to as “non-repudiation”.

The result of this analysis is used later, at the implementation layer, for defining authentication protocols and auditing components and for selecting appropriate encryption techniques. This may even go as far as creating new deployment scenarios where multiple instances of a software component are physically deployed separately from each other to protect confidential information.

Service exposure and policies are the main entry points for defining security requirements. Existing policies, exposure decisions and compliance rules are the building blocks of the security requirements. Learn more about the links between exposure decisions and security in the chapter “2.6 Exposure”. The security requirements should be defined in terms of:

confidentiality of information
non-repudiation
authentication
self-registration
trust domains
resource-level authorization

Risk analysis

The principle CSPA150 mandates a risk analysis regarding the security aspects of service provisioning. This concept is described in more detail in the chapter “3.2.7 Risk assessment (CSPA150)”.

3.2.1 Confidential information

“Confidentiality refers to the protection of individuals' and organizations’ information, and ensuring that the information is not made available or disclosed to unauthorized individuals or entities” (5)

Requirements related to confidentiality usually ask for information to be protected against disclosure to unauthorized parties. At a technical level this is achieved using cryptography-based techniques, which ensure that no unauthorized access is possible either through in-transit interception or through access to stored data. For classified and sensitive information there are even stronger obligations. Such information usually cannot be stored outside the European Commission and can only be viewed in restricted facilities, that is, rooms equipped according to strict security norms.

Definition of such requirements at early project stages allows service designers to define mechanisms applicable to a particular situation.

3.2.2 Non-repudiation

According to Webopedia: “In reference to digital security, non repudiation means to ensure that a transferred message has been sent and received by the parties claiming to have sent and received the message. Non repudiation is a way to guarantee that the sender of a message cannot later deny having sent the message and that the recipient cannot deny having received the message”.

Such guarantees are achieved in two ways:



Access Audit

Access to classified information is tracked and audit reports are created based on such data. At any time the security officer may prove actions taken by a particular user. This way, service users cannot deny actions they perform on the data.

Electronic Signature

An electronic signature gives the highest possible guarantees of document integrity and non-repudiation in the digital world. It guarantees that information is not altered in transit and that the person who claims to have created a document really is that person. Unfortunately, introducing electronic signatures requires the deployment of very costly and complex solutions such as a public key infrastructure. It also makes life much more difficult for consumers, because programs requiring a digital signature may be demanding for the average non-technical person. Digital signatures should be considered only in situations where there is a legal basis or a very solid justification.
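The access audit described above can be illustrated with a small in-memory sketch. It is not a prescribed design: the class and field names are hypothetical, and a real audit component would write to protected, tamper-evident storage rather than to a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    """One recorded action of one user on one resource (hypothetical structure)."""
    timestamp: datetime
    user: str
    action: str
    resource: str


class AuditTrail:
    """Append-only record of accesses, from which per-user reports can be produced."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, user: str, action: str, resource: str,
               timestamp: Optional[datetime] = None) -> AuditEntry:
        # Use the current UTC time unless the caller supplies a timestamp.
        entry = AuditEntry(timestamp or datetime.now(timezone.utc), user, action, resource)
        self._entries.append(entry)
        return entry

    def report_for(self, user: str) -> List[AuditEntry]:
        """All recorded actions of one user, in the order they were recorded."""
        return [e for e in self._entries if e.user == user]
```

A security officer could call `report_for("some_user")` to list the actions recorded for that user.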

3.2.3 Authentication

Authentication is the act of confirming a person’s identity.

“Authentication allows a service provider to confidently identify consumers of a service. These might be end users, other services, processes, or computers. Many authentication techniques and standards exist. Typically, traditional applications at EC already use a centralized system: ECAS (European Commission Authentication Service). In the context of this Reference Architecture, ECAS is the de-facto authentication provider to any SOA.” (7)

Service providers with a local, domain (Eurostat) or European Commission exposure are encouraged to take advantage of existing security solutions such as ECAS, which is the security gate for the European Commission.

Service providers delivering outside of the EC, inter-institutionally or publicly, should consider authentication requirements carefully. The following points should be taken into account when planning service delivery:

3.2.4 Trust domains

Processes spanning multiple institutions are executed by people working for disparate organizations or units. Inter-institutional cooperation is only possible when trust and confidence bind the stakeholders together. At a technical level, such a relationship can be maintained by building trust domains between IT systems. Wherever two organizations cooperate closely and their IT systems must be integrated, a trust domain is established so that consumers in one domain can access resources in another domain without the need for double authorization. Many modern middleware products support the maintenance of trust domains out of the box. They provide capabilities to reduce the user identification burden by applying user mappings between systems.

Perhaps the most important point about trust domains is that their successful implementation depends on formal arrangements between the parties. Trust must first be established by decision makers, and only then can a technical solution be created. Deploying trust domains requires strict policy governance over the long term.
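The user mapping between trusted systems mentioned in this chapter can be sketched as follows. This is only an illustration under stated assumptions: the domain name, the user identifiers and the function name are hypothetical, and real middleware products provide their own configuration for this.

```python
from typing import Optional

# Hypothetical configuration: domains we have a formal trust agreement with,
# and the mapping of their users to local identities.
TRUSTED_DOMAINS = {"partner.example"}
USER_MAPPING = {
    ("partner.example", "jdoe"): "local-0042",
}


def resolve_local_user(domain: str, remote_user: str) -> Optional[str]:
    """Map a user authenticated in a trusted domain to a local identity.

    Returns None when the domain is not trusted or the user has no mapping,
    in which case the consumer would have to authenticate locally.
    """
    if domain not in TRUSTED_DOMAINS:
        return None
    return USER_MAPPING.get((domain, remote_user))
```

The set of trusted domains stands for the formal arrangement between the parties: without an entry there, no mapping is applied.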



3.2.5 Self-registration

Services which are available to the public and offer a customized, profile-based experience usually need some means of user authentication to maintain the profiles and properties defined separately by each user. In these cases, business functions such as self-registration, password management and profile management should be provided to reduce the operational effort related to customer management.

3.2.6 Resource-level authorization

Resource-level authorization assumes role-based access to resources. Consumers are categorized and each of them is assigned roles, each role granting permission to a small chunk of information.
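A role-based check of this kind can be sketched in a few lines. The roles, users and resource names below are hypothetical examples, not part of any existing system.

```python
# Hypothetical role definitions: each role grants a small set of
# (resource, action) permissions.
ROLE_PERMISSIONS = {
    "public_reader": {("dataset:aggregates", "read")},
    "nsi_analyst": {("dataset:aggregates", "read"), ("dataset:microdata", "read")},
}

# Hypothetical assignment of roles to service consumers.
USER_ROLES = {
    "alice": {"nsi_analyst"},
    "bob": {"public_reader"},
}


def is_authorized(user: str, resource: str, action: str) -> bool:
    """True if at least one of the user's roles grants the action on the resource."""
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

Unknown users and unknown roles are denied by default, which keeps the check conservative.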

3.2.7 Risk assessment (CSPA150)

“(CSPA150) For the purpose of this document [CSPA], the security concern relates to controls that are put in place to mitigate the risk that a CSPA Service or the data it controls is misused. This section provides some basic guidance on some of these controls. However, in general it is strongly advised that each CSPA Service implementation complete a Risk Assessment and document a Risk Mitigation Plan for high and extreme risks identified in the assessment.” (1)

Security of information and trust in information exchange is paramount for statistical data processing at European level.

Business analysis should provide a separate description discussing possible security risks, especially with regard to personal or confidential records. This part of the service description may already have been defined in other documents, such as a corporate security policy (“security convention”). In such cases, the service specification can simply refer to the existing document.

Example of risk analysis:

Risk: Leakage of personal information to unauthorized parties
Definition: Personal information is protected under EU Directive 95/46/EC. Personal data must not be available to unauthorized parties.
Mitigation: Confidentiality of data; personal information may only be viewed in restricted rooms; data transmission is confidential.

Risk: Data leakage regarding companies
Definition: Companies’ information must not be available to unauthorised parties, or must not be published as it is.
Mitigation: Confidentiality of data; companies’ information may only be viewed in restricted rooms; companies’ data must not be published as it is, and cannot be derived from aggregated values; data transmission is confidential.



3.2.8 Service Specification Description – how-to

The service specification should define security requirements in at least the following categories:

Confidential information
Non-repudiation
Authentication
Self-registration
Trust domains
Resource-level authorization

Such information should be described in a separate chapter of the service specification that addresses all security-related topics. Furthermore, ancillary requirements concerning the required security infrastructure, such as security hardware (e.g. smartcards) or the need for secured facilities, should be described there.

Vulnerability analysis can be conducted to find out about security flaws and anticipate requirements to remove them.

In the context of user identification, business analysts are encouraged to analyse the feasibility of using the existing ECAS solution.

Requirements for audit trails, non-repudiation mechanisms and security reporting should become part of the service specification.

Implementing trust domains requires formal agreements between stakeholders to enable software integration where trust domains are configured.

3.3 Multilingual support

Services delivered inter-institutionally and publicly are accessed by people of different origins, spread across continents or even the whole globe. Such services should be personalized for consumers. The following topics may be taken into consideration when defining multilingual services:

Compliance with local laws – local regulations regarding the use of official languages may influence the requirements for a service

GSBPM – certain steps of business processes, especially data collection (GSBPM 4.3) and dissemination (GSBPM 7.2), may be executed by various international groups of people; one example is survey enquiries addressed to average citizens, where surveys written in one language cannot be re-used between countries.

Meta-data – data consumed and delivered are described in multiple languages; such data are often presented on the premises of National Statistical Institutes, but also publicly, by any statistical data consumer. Special care must be taken to provide translated metadata for statistical information to increase the range of consumers.


CSPA061: Capitalize on and influence national and international developments


Artefact Catalogues – enquiries to service catalogues may be formulated in multiple languages. Policies and metrics should be addressed to the major groups of consumers, and certain descriptions need to be translated

Customized communication – communication for public and inter-institutional audiences, such as e-mail subscriptions, surveys, etc., may need to be translated into multiple languages.

3.3.1 Service Specification Template – how-to

Define requirements for internationalization based on the exposure decision. Try to address all categories which could potentially prevent international re-use of the service.

3.4 Identifying assets for re-use

Over the years, statistical organizations have developed various IT systems to support their business. Software solutions, standards and physical platforms are used in a controlled manner for statistical information collection and dissemination. Each of these assets represents a part of the statistical process, a small piece of business encapsulated in it. The data, software and procedures are investments which must not be disregarded. They are valuable assets which can be used to leverage the current business.

The following chapters describe some basic techniques to identify existing assets for re-use.

3.4.1 Bottom-up technique

The bottom-up decomposition technique is also known as “analyse existing IT Assets” and is described in SMP@EC (6 p. 21). In summary, this technique boils down to creating a service portfolio for existing assets which might potentially be used to support the business on a wider scale. The portfolio is a kind of map which can be used to find functional capabilities for re-use. Later chapters introduce the concept of service catalogues, which are planned to catalogue all existing software components, policies, data formats, etc.

3.4.2 Artefact catalogues

“The pillars of service oriented architecture are the service repository, the service provider, the service consumers, the service description, the service policies and governance.” (7)


Examples:
* Data set descriptions in query results are provided in three EU official languages
* Local laws in Southern Republic of Africa mandate use of encryption when storing and transmitting personal information. This concept is detailed in the “security” chapter of this template
* All the e-mail communication for subscribed consumers is provided in the three EU official languages
* Meta-data for the inter-EU information on the import of goods must be provided in three official languages


Re-usable and shareable services are registered in artefact catalogues. Depending on their exposure, services are registered in global or local catalogues. The catalogues enable service discovery and provide access to asset databases through user interfaces or standard machine-to-machine protocols (preferably UDDI). Each catalogued service or application has a policy of use and related performance metrics. Services can easily be looked up based on keywords from service descriptions.
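The keyword lookup described above can be illustrated with a small in-memory sketch. It does not use UDDI or any real catalogue product: the entry structure, the field names and the `search` function are hypothetical and only show the idea of filtering by keywords and exposure.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class CatalogueEntry:
    """A catalogued service (hypothetical structure)."""
    name: str
    description: str
    exposure: str  # assumed values: "local" or "global"


def search(entries: Iterable[CatalogueEntry], keywords: Iterable[str],
           exposure: Optional[str] = None) -> List[CatalogueEntry]:
    """Return entries whose name or description contains every keyword.

    Matching is case-insensitive; `exposure` optionally restricts the result
    to one catalogue type.
    """
    wanted = [k.lower() for k in keywords]
    result = []
    for entry in entries:
        text = f"{entry.name} {entry.description}".lower()
        if all(k in text for k in wanted) and (exposure is None or entry.exposure == exposure):
            result.append(entry)
    return result
```

A search for the keywords "seasonal" and "adjustment" with `exposure="global"` would return only globally exposed services whose description mentions both words.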

Technical data format definitions expressed as XML schemas can be shared and re-used. New data types can be defined based on existing data types and shared.

Read more about catalogues in the governance guidelines for SOA.

Catalogues are split into two categories:

Local catalogues

They are used for service discovery within one institution only (e.g. the European Commission). The services registered there are so specific that they cannot or should not be used to support inter-institutional processes; only services which realize tasks solely for a particular institution (such as Eurostat) are registered there. At the beginning of SOA adoption, local catalogues will also contain information about assets which are not interoperable and do not apply well-established international standards, but are good candidates for service exposure and could be provided to a broader range of consumers. Services with a local exposure cannot be accessed inter-institutionally.

Global Artefact Catalogue

It is the principal tool of the SOA center of excellence for exchanging information about re-usable assets. Services registered in this catalogue are interoperable because they apply well-established and agreed standards. Their lifecycle is defined and managed in the catalogue. Policies regarding their operation are described in terms of standard policy assertions (such as WS-Policy) and “terms of use”. Essentially, this catalogue contains only assets which are re-usable inter-institutionally and publicly.

3.4.3 SOA Adoption

During the period of SOA adoption, institutions joining the CSPA initiative will register their assets in local catalogues. The institutions use these local catalogues to look up resources for local re-use. Legacy applications and other software solutions supporting statistical operations are wrapped and exposed as interoperable services. Lifecycle and policies are defined in cooperation with the SOA center of excellence, which works towards a high level of interoperability and streamlines communication between independent entities. Such wrapped services are later certified and registered in the Global Artefact Catalogue.

3.4.4 Service Specification Description – how-to

Use local and global artefact catalogues to find out about existing assets for re-use. Access them through the user interface or UDDI to look up statistical services or databases. Search for assets which can be re-used or wrapped. Plan to expose existing assets as services.

Create a service portfolio containing service descriptions of all local services. Give each service a unique identifier.



Discover services which are part of a global process in the Global Artefact Catalogue. Candidates for service wrapping and specific services provided locally are looked up in a Local Artefact Catalogue. Provide a name or reference for each service for unique identification.

3.5 Applicable Statistical Methodologies

CSPA069: “Consider all capability elements (e.g. methods, standards, processes, skills, and IT) to ensure the end result is well integrated, measurable, and operationally effective.”

Statistical services differ from other types of business activity in that they process data according to well-defined methods, which are almost always based on long-standing mathematical research. All the knowledge required for service provisioning is used behind the scenes of statistical processing.

Information about the statistical methods in use is an inherent part of the service specification. The statistical methods used for data transformation and processing should always be well documented.

3.5.1 Service Specification Template – how-to

Define statistical methods to streamline communication between entities. Make this information discoverable through Artefact Catalogues.

If statistical methods concerning the particular service are already defined in other documents, then a reference should be provided as part of service specification.

Consider summarizing the methods which are used, such as social science methods (cross-sequential, surveys, correlational research, experimental research, etc.) or mathematical methods (hypothesis tests, estimators, regression, etc.).

3.6 Patterns

3.6.1 Use agreed models and standards

“CSPA105: All information used as inputs and outputs to Statistical Services should be described using a common, business-oriented, reference model.”

Interoperability and re-use are the pillars of SOA and CSPA. They can be supported only by applying well-recognized standards, preferably defined by independent committees. There should be one standard to represent information in a uniform way to enhance communication between stakeholders cooperating internationally.

3.6.2 Service catalogues

Create a roadmap to build service catalogues. They will be used to keep track of all software components supporting statistical processing.

3.6.3 Describe information and provide metadata

“CSPA072: Ensure the design, composition, operation and management of business processes, including all input and output interactions, are metadata driven and automated wherever possible.”

Statistical information without meta-data is meaningless. It is just a set of digits and numbers. Meta-data give meaning to the information. Assuming that statistical information is collected and then disseminated among multiple, independent stakeholders, meta-data should be defined in a language understandable by all parties.

3.6.4 Maintain independence between design and implementation

“CSPA135: The descriptions of CSPA Services are layered in conceptual (CSPA Service Definition), logical (CSPA Service Specification) and implementation (CSPA Service Implementation Description)”.

Design first, implement later. Development of new services should be driven by business goals. Service specification is a technology-agnostic service description for business aligned ventures in the statistical industry. One specification may result in multiple implementations.

3.6.5 Maximize Service Autonomy

“CSPA140: Maximize service autonomy (completeness) to enable share-ability and reusability (External & Internal).”

Services should preferably be independent from each other, so they can be governed as completely separate entities. Communication between services occurs only through well-defined channels constrained by policies. Service implementation defines formal contracts (so called service contracts) to define boundaries for each service.

3.7 Anti-patterns

3.7.1 Bottom-up CRUD⁴ services

A data-centric architectural style is applied to define new services. Service functions are defined based on information structures or bottom-up identification only. The software analysis does not include steps for top-down functional decomposition, so the resulting service is not aligned with the business needs and does not achieve the expected results.

3.7.2 Web service = SOA

“When architects equate SOA with web services they run the risk of replacing existing APIs with web services without proper architecture. This could result in identifying SOA services that are not business aligned” (4).

3.7.3 Technicalities everywhere

The service specification describes all, even the smallest, details of service operation. Algorithms used to process information are depicted in detailed activity diagrams, and there is a lot of focus on handling technical errors. This approach to service specification contradicts the idea of separation of concerns. Service definition, service specification and service implementation should be described separately. The service specification addresses the functional definition of a service from the consumer’s perspective, where the system is described as a black box with its inputs and outputs.

3.7.4 Deploy and forget

New services are planned without a vision and with little or no business alignment. Business analysis for a new service is superficial. Business processes are unknown or not well documented. There are no measures defined to monitor progress in achieving business goals. Once the service is in production, bottlenecks in the operation occur often and no remedies can be applied due to a lack of knowledge about the process. Service monitoring and reporting were not planned, so service operation cannot be optimized.

4 Create, read, update and delete

3.7.5 Chatty services

“This antipattern describes a common mistake developers usually make when they realize a service by implementing a number of web services where each communicates a tiny piece of data. This will result in the implementation of a large number of services leading to degradation in performance and costly development” (4).
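The difference between a chatty and a coarse-grained interface can be shown with a small sketch that simply counts calls. The class and method names are hypothetical, and each method call stands in for one remote invocation.

```python
class DataSetService:
    """Toy service exposing the same data in a chatty and a coarse-grained style."""

    def __init__(self, records: dict) -> None:
        self._records = records
        self.calls = 0  # every method call stands for one remote invocation

    def get_field(self, key: str, field: str):
        """Chatty style: one invocation per tiny piece of data."""
        self.calls += 1
        return self._records[key][field]

    def get_record(self, key: str) -> dict:
        """Coarse-grained style: one invocation returns the whole record."""
        self.calls += 1
        return dict(self._records[key])


def fetch_chatty(service: DataSetService, key: str, fields: list) -> dict:
    # One round trip per field.
    return {f: service.get_field(key, f) for f in fields}


def fetch_coarse(service: DataSetService, key: str) -> dict:
    # A single round trip for the whole record.
    return service.get_record(key)
```

Fetching a record with three fields costs three invocations in the chatty style and one in the coarse-grained style; over a network, each extra invocation adds latency and overhead.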

3.7.6 Performance optimization first

Many software architects have had bad experiences with the early XML processing stacks, which had a low performance profile. Processing large amounts of data often resulted in out-of-memory issues on the application server, and the desired results were achieved a few orders of magnitude more slowly than in the corresponding legacy solutions. New non-standard protocols were developed to overcome the limitations of XML and SOAP. This resulted in a point-to-point integration style where only two applications can operate together, but are not interoperable in a general sense.

Modern middleware solutions have long since overcome the limitations of the first implementations. Services should be planned according to business needs, based on top-down functional decomposition. Performance is an important non-functional characteristic, but optimization is only a small part of IT operations. Over the long term, it is much more expensive to maintain multiple non-interoperable applications than to optimize well-designed applications which apply the principle of separation of concerns.



4 CSPA Service Implementation

The statistical service implementation description covers the last layer of CSPA. It is defined by CSPA as “(CSPA164) The CSPA Service Implementation Description is at an implementation (or physical) level. In this layer, the functions of the CSPA Service are refined into detailed operations whose inputs and outputs are GSIM implementation level objects. (CSPA165) This layer fully defines the service contract, including communications protocols, by means of the Service Implementation Description. It includes a precise description of all dependencies to the underlying infrastructure, non-functional characteristics and any relevant information about the configuration of the application being wrapped, when applicable.” (1)

This service description outlines the architectural design of a proposed software solution which will support a business service. It conveys the information required by software implementers to create executable software components. In SOA, software implementation has a broader meaning than in traditional programming models. SOA assumes a model in which applications may be created not only by applying programming techniques, but also by assembling executable components. For instance, Oracle Fusion SOA products provide components for constructing “composite applications” by linking together standard building blocks like mediation components, BPEL processes, human tasks, callouts to external systems, etc.

To summarize, the 3rd part of service description is about describing the:

Detailed operations (service contract)
Inputs and outputs (GSIM)
Communication protocols (including patterns like publish/subscribe, request/response)
Dependencies to underlying infrastructure
Non-functional characteristics
Configuration details
Metrics



The next chapters provide guidelines on describing the service implementation for the points mentioned above.

With regard to “(CSPA156 – process metrics) A CSPA Service will generally capture metrics related to the function that it performs. To all intents and purposes, these process metrics are treated by the CSPA Service as just one of its outputs and should be reflected as such in the CSPA Service Specification.” (1), the implementation description also addresses, at a technical level, the requirements to measure business performance (business process metrics, KPIs).

4.1 Invocation protocols

“(CSPA114) CSPA Services have invokable interfaces that are called to perform business processes.” (1)

There are various protocols which may be used for service invocation. Despite the wide choice of communication protocols, vendors should make a firm decision as to which protocols are to be supported. That decision should be based on the range of consumers interested in using the service, that is, the service exposure. Furthermore, constraints described in policies and governance regulations should be considered when planning communication channels.

The existing SMP@EC (6) standard suggests choosing only one or at most two protocols to provide uniform access to functionality. These are the so-called “canonical protocol” and “dual protocol” patterns.

“To avoid supporting different communication protocols that might compromise interoperability, a SO architecture should establish a single main protocol. Most common protocol used in SOA is SOAP over HTTP. If a single communication protocol is not feasible for some reasons, check the Dual Protocols pattern […]. However, this might not always be feasible for different reasons like performance issues or required transaction support. With the Dual Protocols patterns, the inventory will contain two levels of services” (6)

Based on that, it is advised to:

Decide on a primary protocol, preferably HTTP and SOAP
Choose a secondary protocol

Vendors should make every effort to use only standard and well-recognized communication protocols to foster re-use and interoperability. Less diversity in this respect contributes to lower costs and better interoperability.

What are the risks?

Use of non-standard protocols creates an obstacle to service reuse (CSPA045). Software designers may be tempted to use non-standard protocols, especially where performance is paramount or functional requirements are specific; however, this is a short-term strategy. Where non-standard protocols are used, organizations that want to re-use existing assets sooner or later have to resort to protocol transformation, a non-trivial mechanism that increases computing power consumption and lowers data quality. In the long term, no savings are made by using non-standard protocols.

SOA is focused on interoperability and re-use. Performance optimization is only one of numerous SOA techniques, and it applies only at the technical (implementation) layer. In other words, performance is not the primary concern of SOA. Vendors should always prefer standard and well-established communication protocols over legacy solutions with local exposure.

Implementing several different invocation protocols increases the complexity and cost of maintenance and management. The service lifecycle becomes increasingly complex when multiple channels are used to access exposed services.

4.2 Supported protocols

The following sections describe multiple data transfer protocols that can be used as a transport layer for statistical information. These protocols are described separately from the data formats, which are covered in the Canonical Data Models and Non-canonical data models chapters.

4.2.1 SOAP

Invoking web services using SOAP over HTTP is currently the most standard and best documented approach (see also the supported protocols described by SMP@EC (6)).

It is advised that schema definitions used in WSDL definitions are externalized to separate files and included in the WSDL, so that common data types are defined externally and re-used across the organization. WSDL and XML schema definitions may be published in artefact catalogues for easy re-use.



SOAP, originally defined as Simple Object Access Protocol, is a protocol for exchanging structured information in the implementation of web services in computer networks. It was originally developed as a simple protocol to transfer information about objects between computers, and that simplicity resulted in numerous interoperability issues. Fortunately, these problems have been addressed by the WS-I⁵ standardization initiative, now part of OASIS, which defines rules to follow for interoperable web-service provisioning. In particular, the WS-I Basic Profile mandates the use of only document/literal or RPC/literal bindings. Software architects should take into account the limitations of raw SOAP and make sure that their implementations apply one of the WS-I profiles. Usually WS-I compatibility is ensured by application server vendors.

Modern application servers implement very robust mechanisms to overcome former performance issues related to XML processing⁶.
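To make the document/literal style more concrete, the following is a minimal sketch of how a SOAP 1.1 request envelope could be assembled with the Python standard library. The service namespace `http://example.org/stat-service` and the element names `GetDatasetRequest` and `datasetId` are hypothetical and chosen only for illustration; in a real service they would come from the XML schema referenced by the WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/stat-service"  # hypothetical service namespace


def build_request(dataset_id: str) -> bytes:
    """Build a document/literal style request as UTF-8 encoded XML."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # document/literal: the body carries one element defined by an XML schema
    request = ET.SubElement(body, f"{{{SVC_NS}}}GetDatasetRequest")
    ET.SubElement(request, f"{{{SVC_NS}}}datasetId").text = dataset_id
    return ET.tostring(envelope, encoding="utf-8")
```

In practice such envelopes are usually generated by the web-service stack of the application server from the WSDL, which is also where WS-I conformance is normally enforced.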

4.2.2 REST

Representational State Transfer (REST), strictly speaking an architectural style rather than a protocol, has gained wide acceptance across the Web. It is much easier to implement RESTful web services than SOAP web services; on the other hand, REST doesn’t enforce any strict message specifications, which may lead to interoperability issues. For instance, there is no widely accepted standard similar to WSDL to define RESTful interfaces. Furthermore, REST itself doesn’t define any message formats (XML, JSON, plain text, etc.) to standardize communication; this flexibility may unfortunately cause interoperability issues. REST only defines service addressing and a basic dependency on existing protocols (HTTP). REST is not maintained by any standardization organization yet.

It is advised to use the REST protocol only if there is a need to invoke web services using hand-held devices with limited capabilities or Web components executed in internet browsers (JavaScript). It should only be considered as a secondary protocol and never as a primary protocol.

4.2.3 Messaging

There are also a number of other protocols which are acceptable. These are:

Microsoft Message Queuing - Service is an MSMQ consumer

Java Message Service - Service is a JMS consumer

There are many other well-established messaging products available on the software market. To mention a few of the most popular:

Oracle Fusion

IBM WebSphere MQ

Microsoft MSMQ

ActiveMQ (open source)

RabbitMQ (open source)

What’s interesting about messaging?

5 http://ws-i.org/ and http://www.oasis-ws-i.org/ This initiative has historically been started by Microsoft, IBM and Sun to overcome compatibility issues in web-service stack implementations of their application servers.

6 JAX-WS, operating at XML message level: http://docs.oracle.com/cd/E17904_01/web.1111/e13734/provider.htm. Why to use the StAX API: http://docs.oracle.com/javase/tutorial/jaxp/stax/why.html

Messaging middleware has established its own place in the landscape of integration solutions. These solutions usually have properties which significantly ease integration between disparate software systems:

Guaranteed delivery – messages which cannot be delivered immediately, are stored in a durable data repository and re-delivered automatically later based on pre-defined policies

Transactions – operations to send or receive a message can be enlisted in a global transaction encompassing one, two or more ongoing transactions with databases of different vendors⁷

Easy transfer of large payloads – messaging middleware solutions often provide functions for message fragmentation and compression which enable large messages to be transferred reliably and with high performance

Clustering and load balancing – with messaging it is easier to balance high load among multiple servers; there are a number of techniques exploiting messaging to avoid overloads

There exist a series of enterprise integration patterns (10); this is a set of well-defined and tested techniques which are based on messaging solutions
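The guaranteed-delivery property listed above can be illustrated with a minimal store-and-forward sketch. It is not how any particular middleware product works internally: the class name `StoreAndForwardQueue`, the `send` callback and the retry limit are assumptions made for illustration, and an in-memory queue stands in for the durable repository a real product would use.

```python
from collections import deque
from typing import Callable


class StoreAndForwardQueue:
    """Keeps undelivered messages and retries them on later delivery rounds."""

    def __init__(self, send: Callable[[str], bool], max_attempts: int = 3) -> None:
        self._send = send                  # returns True when delivery succeeded
        self._max_attempts = max_attempts  # assumed re-delivery policy
        self._pending = deque()            # stand-in for a durable store
        self.dead_letters = []             # messages given up on

    def publish(self, message: str) -> None:
        self._pending.append((message, 0))

    def deliver_pending(self) -> int:
        """Try every pending message once; return how many were delivered."""
        delivered = 0
        for _ in range(len(self._pending)):
            message, attempts = self._pending.popleft()
            if self._send(message):
                delivered += 1
            elif attempts + 1 >= self._max_attempts:
                self.dead_letters.append(message)
            else:
                self._pending.append((message, attempts + 1))
        return delivered

    @property
    def pending_count(self) -> int:
        return len(self._pending)
```

A producer calls `publish` and returns immediately; a scheduler would call `deliver_pending` periodically, so a temporarily unavailable consumer does not cause message loss.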

Risk: messaging is not a specification

Unlike SOAP and REST, messaging is not a formal specification but rather a class of software solutions implementing different standards, very often closed or legacy ones. Messaging solutions operate at the network layer and don’t define any particular message formats. Usually, only textual or binary messages can be transferred, and the definition of their structure is left to software vendors. It is the role of software architects to decide which canonical or non-canonical formats describe the information structure: SOAP, SDMX-ML, CSV or any other information format.

Software vendors should always describe message formats whenever discussing message interchange using messaging middleware products.

Risk: products are non-interoperable

Unfortunately, products of different vendors have limited interoperability out of the box. For instance, MSMQ cannot communicate with IBM WebSphere MQ directly without additional bridging mechanisms. Not every vendor provides bridging solutions as part of their standard deployment packages.

7 This is possible thanks to the two-phase commit protocol defined by X/Open XA (Open Group)



This limitation may be overcome with the help of JMS⁸ adapters, so that messages in one communication channel are translated into messages in another channel.

In recent years there have been efforts to standardize message exchange protocols between multiple vendors, but they are not advanced enough yet. STOMP⁹ and AMQP¹⁰ are the most promising proposals, but neither is widely implemented in commercial products yet.

When planning a messaging solution, the above limitations must be taken into consideration. The costs of commercial middleware deployments for all possible consumers have to be calculated to avoid situations where communication becomes impossible because of budget constraints. This applies especially to services with inter-institutional and public exposure. In particular, national statistical institutes may have limited financial capabilities.

Plan for messaging gateways to enable interoperability. Consider the efforts of OASIS¹¹ to standardize the AMQP protocol when choosing messaging or bridging capabilities. Take into account the different financial situations of national statistical institutes (NSIs) and other potential consumers.

Publicly available services should not assume the use of any non-standardized middleware products. In case of public services, it is advised to always use simple and widely accepted protocols such as HTTP, XML and SOAP.

8 Java Message Service

9 https://activemq.apache.org/stomp.html

10 http://www.amqp.org/

11 https://www.oasis-open.org/



Messaging will usually be used for services available inter-institutionally or internally at the European Commission, in particular those which process large amounts of information. For publicly exposed services, the use of open standards such as HTTP and SOAP should be carefully considered.

4.2.4 File exchange

It is strongly advised to consider the use of messaging or standard protocols such as HTTP POST/PUT or FTP for file exchange to ensure interoperability.

File exchange will usually be used for services available internally at the European Commission. For inter-institutional or publicly exposed services, the use of open standards such as HTTP and SOAP should be carefully considered.

Limitations - addressing

The HTTP protocol identifies resources by URLs. Knowing the URL, any HTTP-enabled piece of software is able to obtain the information. Direct file exchange, by contrast, has a serious limitation which may cause interoperability issues: there is no common addressing scheme to identify resources.

Limitations – reliability

File exchange is simple, and that simplicity has certain implications. For instance, there are no guarantees of file delivery. An interrupted file exchange cannot be restarted automatically by the operating system itself, and in many cases it cannot even be detected easily. Additional controls must be in place to handle failure situations. By contrast, messaging middleware solutions support automatic message re-delivery in case of failure.

Limitations - integrity

File exchange protocols do not provide any means of checking message integrity. Files may be corrupted or malformed in transmission, and such flaws cannot be detected easily without additional means of control. This limitation is usually overcome by middleware solutions. It is advised to think about message integrity early in a project to increase the service’s reliability.
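One common way to add such an integrity control is to publish a checksum alongside each file and to verify it after transfer. The sketch below uses SHA-256 from the Python standard library; the function names `sha256_of` and `verify_transfer` are hypothetical, and the way the expected checksum reaches the receiver (a side-car file, a message header, a catalogue entry) is left open as an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file as a lowercase hex string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # read in chunks so that large files do not have to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path: Path, expected_hex: str) -> bool:
    """Compare the received file against the checksum announced by the sender."""
    return sha256_of(path) == expected_hex.lower()
```

A checksum detects corruption in transit but does not by itself prove who sent the file; that concern belongs to the security requirements of the service.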

Limitations - security

File access permissions are usually determined by the host operating system, which may cause serious management and governance issues.

Advantage – no infrastructure, less cost and simplicity

In most cases, required investments in infrastructure are much lower than in case of messaging middleware. File transfers are supported by most operating systems out of the box, so it is also the simplest way for information exchange in the short term.

4.3 Data by reference protocols

CSPA (CSPA131-133) defines an additional protocol for service invocation based on file exchange. It represents a mixture of file exchange and operation invocation.



This protocol allows service consumers to upload files using HTTP protocol and then notify a service so that it can begin to process that file. Clients can optionally poll for results of processing.

For service consumers, a typical session is comprised of three steps:

1. File to be processed is uploaded

2. Service is invoked using REST or SOAP

3. Consumer polls for the result (optional)
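The three steps above can be sketched with a small in-memory stand-in for such a service. This is only an illustration of the session flow, not of any CSPA-defined interface: the class `DataByReferenceService`, its method names and the trivial "count the lines" processing are all assumptions. In a real deployment the upload would go over HTTP or FTP and the invocation over REST or SOAP.

```python
import uuid


class DataByReferenceService:
    """In-memory stand-in for the upload / invoke / poll session."""

    def __init__(self) -> None:
        self._files = {}
        self._jobs = {}

    def upload(self, content: bytes) -> str:
        """Step 1: store the file and return a reference to it."""
        file_ref = uuid.uuid4().hex
        self._files[file_ref] = content
        return file_ref

    def invoke(self, file_ref: str) -> str:
        """Step 2: notify the service that the referenced file is ready."""
        if file_ref not in self._files:
            raise KeyError(f"unknown file reference: {file_ref}")
        job_id = uuid.uuid4().hex
        # processed synchronously here only to keep the sketch short
        line_count = len(self._files[file_ref].splitlines())
        self._jobs[job_id] = {"status": "done", "result": line_count}
        return job_id

    def poll(self, job_id: str) -> dict:
        """Step 3 (optional): ask for the processing result."""
        return self._jobs.get(job_id, {"status": "unknown"})
```

Note that the service is passed only a reference in step 2, never the data itself, which is what distinguishes this style from an ordinary operation call with a payload.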

Strengths

There are a number of advantages when using this protocol:

Use of well-defined web standards: HTTP/FTP protocols

Use of robust protocols for data upload/download in case of large files

The server-side solution doesn’t poll; instead it is notified about data which are ready to be processed

Risk: low exposure

The presented protocol is neither formally defined nor widely accepted, so it should not be used for services with public exposure. It would also be difficult to maintain its implementation for inter-institutional usage.

For newly created services, more reliable means of communication, such as messaging middleware, should be considered. The data-by-reference protocol doesn’t address topics such as data integrity or guarantees of delivery. Usage of this protocol may lead to point-to-point integration and a lack of re-use on a wider scale.

We may expect further extension and standardization of data-by-reference protocol in the future.



4.3.1 Service implementation description – how-to

Analyse exposure decisions, take into account open standards and interoperability, and choose invocation protocols. Research weaknesses and opportunities when choosing the best options for processing large amounts of information.

A primary and a secondary invocation protocol should be selected. Software architects should address protocol details. For instance, the HTTP protocol supports multiple “methods” such as GET, POST, PUT, DELETE and others. In the case of messaging, messages can be represented as text or binary data, which has implications for upper message size limits and for capabilities such as data fragmentation or transactions.

4.4 Canonical Data Models

The service specification defines data models which are used to represent information processed by statistical services. At the implementation layer, the models must be transformed into technical data structures to define physical information formats.

The “Canonical Data Model” is an enterprise integration pattern (10) to minimize dependencies when integrating applications that use different data formats. It is sometimes referred to as “Canonical Protocol”¹².

“The Canonical Data Model provides an additional level of indirection between application's individual data formats. If a new application is added to the integration solution only transformation between the Canonical Data Model has to created, independent from the number of applications that already participate. […] The use of Canonical Data Model may seem overly complicated if only a small number of applications participate in the integration solution. However, the solution quickly pays off as the number of applications increases.” (10)

It is especially important to use canonical data models for services with public and inter-institutional exposure. Canonical models reduce the overall number of message translations (see the design pattern). They create a common vocabulary that allows data formats to be understood unambiguously by geographically dispersed consumers. It is best to use canonical models defined by independent standardization bodies to promote interoperability and reinforce governance.
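The reduction in message translations can be quantified with simple arithmetic. The sketch below assumes that every application has its own format and that translation is needed in both directions; under that assumption, direct integration needs one translator per ordered pair of formats, whereas a canonical model needs only one translator to and one from the canonical format per application. The function names are hypothetical.

```python
def point_to_point_translators(n_formats: int) -> int:
    """Translators needed when every format is mapped directly to every other."""
    return n_formats * (n_formats - 1)


def canonical_translators(n_formats: int) -> int:
    """Translators needed when every format is mapped to and from one canonical model."""
    return 2 * n_formats


for n in (2, 3, 6, 10):
    print(n, point_to_point_translators(n), canonical_translators(n))
# 2 2 4
# 3 6 6
# 6 30 12
# 10 90 20
```

With two applications the canonical model costs more, which matches the remark in the quoted pattern description that it may seem overly complicated for small integrations; from four applications onwards it requires fewer translators.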

Eurostat is currently adopting SDMX-ML (5) and its web services framework specification on a broad scale. There are alternatives which could be considered (for example DDI (8)), but currently SDMX is always the preferred canonical data model.

SDMX is a standard developed along with the GSIM and other statistics-related standards by the statistical working group¹³. It has been designed as a generic data format to represent statistical information. It is maintained by an independent committee comprised of major players in the statistical data processing business.

4.4.1 Service implementation template – how-to

Choose the canonical data model and describe its details. For instance, in the case of SDMX, choose which version of SDMX and which schemas (generic, compact or structure-specific) are going to be used. There is also a difference between SDMX-ML and web services¹⁴. Data models will usually be translated to physical structures defined by SDMX-ML.

12 http://soapatterns.org/design_patterns/canonical_protocol

13 http://sdmx.org/wp-content/uploads/2011/03/SDMX-Statistical-Working-Group-members.pdf

The implementation description should contain references to XML schemas or other means of technical data model definition.

4.5 Non-canonical data models

Various existing solutions for processing statistical information make use of “legacy” or non-canonical data models which were historically built to support specific use cases. Those models may appear to be good enough to be re-used for services exposed at the level of the European Commission or lower.

What are the risks?

Non-canonical data models are not built on the foundation of GSIM (or any other established terminology) and they are not adapted to the needs of GSBPM. Use of such formats may leave services insufficiently aligned with business goals. Furthermore, non-canonical formats may significantly increase the governance cost burden in inter-institutional integration.

New organizations trying to re-use existing services would always need to study and adapt to non-canonical models. It may be challenging for new organizations to understand legacy information formats, especially if they are not sufficiently documented. There is also the problem of vocabulary differences between numerous information definitions, which may impair communication between organizations.

Non-canonical models will usually have their own development lifecycle. Wherever IT governance has not reached a sufficiently high maturity level, changes in the models may have a wide impact on consumers.

SOA Adoption

The roadmap for SOA adoption should include initiatives to unify the way information is exchanged. In particular, non-canonical data models should in the long term be either standardized or replaced by standardized canonical models. Non-canonical models shall be analysed in depth, understood and then included in the SOA adoption roadmap.

4.5.1 Service implementation description – how-to

In the service implementation description, provide references to the formal data model definitions. It is especially important for non-canonical models to have detailed documentation which may be used for communication with external entities wanting to exploit statistical services based on these models. Corresponding XML schema files or any other structure definitions should be attached to the service implementation description.

Lack of documentation for non-canonical data models will lead to point-to-point (stovepipe) integration, which stands in opposition to the basic SOA principles of interoperability and re-use.

14 http://sdmx.org



4.6 Distribution

“CSPA167: In general, there will be one Service Specification corresponding to a Service Definition, to ensure that standard data exchange can occur. However, it is recognised that there may be occasions where an additional Service Specification is required, it is likely that this will be associated with variations in the methodology encapsulated within the statistical service. At the implementation level, services may have different implementations (software dependencies, protocols, supported methodologies) reflecting the environment of the supplying organization. Each implementation must rigidly adhere to the data format specified in the Service Specification.” (1)

CSPA anticipates two possible deployment scenarios for the service implementation. The first, referred to as “sharing”, assumes multiple implementations of the same service for multiple organizations. The second, “re-use”, assumes that a service is implemented only once and re-used among all possible consumers.

The CSPA specification assumes scenarios where multiple implementations of the same service may coexist at the same time. This can occur in the case of a technological shift in the implementing organization, or when additional execution platforms such as application servers need to be supported. It may also happen that a number of organizations require the same kind of service, but implemented differently. This may be dictated by local circumstances such as vendor lock-in or existing infrastructure.

Although CSPA allows multiple implementations, this option should be carefully analysed in the light of costs, governance policies and future interoperability.

What are the risks?

An increasing number of software solutions leads to a soaring cost burden.

Lifecycle management for many applications providing the same functionality will pose serious management difficulties

Governance policies cannot be respected easily; knowledge transfer becomes a challenging task when consumers use multiple solutions instead of reusing one uniform service

Data redundancies or inconsistencies may result when multiple systems are accessing data stores



The alternative of having multiple implementations is viable during transitional phases, when one technology is replaced with another.

4.6.1 Sharing

“CSPA039: Sharing means exchanging concepts, designs or software, where each user of a service creates and operates its own implementation of that service. There are levels of sharing. A limited form of sharing would be to provide another participant with the means to replicate (make a copy of) the asset (for example give the source code) (i.e. they share an aspect of the asset only). A more involved form of sharing would entail that the asset is made entirely common (in this case the asset is also reused).” (1).

Strengths

Each organization has its own distinguishing characteristics. Statistical business processes are customized for a specific institution, even though in general a given process is handled in the same manner in every organization. The service specification describes the service functionality, and every organization can use it to create its own implementation which also takes into account its specific requirements.

Personal data protection policies may oblige institutions to keep some information under special control, disallowing data transfer outside of that institution.

Organizations which have invested in infrastructure such as hardware, middleware or specific software may not want a change that involves additional cost. A technological shift is difficult to achieve in large organizations, so some organizations will prefer to re-implement services and host them in their established execution environments.

Weaknesses



Although CSPA allows multiple implementations, this option should be carefully analysed in light of costs, governance policies and future interoperability.

An increasing number of software solutions leads to a soaring cost burden.

Lifecycle management for many applications providing the same functionality will pose serious difficulties

Governance policies cannot be respected easily; knowledge transfer becomes a challenging task when consumers use multiple solutions instead of reusing one uniform service

Data redundancies or inconsistencies may result when multiple systems are accessing data stores

4.6.2 Reuse

“CSPA040 Reuse means common use of a single implementation of a service, with only one organization acting as the service provider (the one who runs the service)” (1).

Strengths

Governance policies can be easily introduced to control data and software lifecycle

Fewer communication channels between stakeholders

Lack of data redundancies

Lower total cost of ownership

Weaknesses

Data migration projects may be needed to load re-used databases.

Not all organizations may afford the shift to new service providers (see the strengths of sharing)

Not all information may be freely re-used; in particular personal or sensitive medical information must not be transferred between different organizations

Statistical services, once built, can be reused through three different approaches, as described in the ESS Enterprise Architecture Reference Framework (ESS EARF):

Interoperable: Coordination is through interoperability. The NSIs have the autonomy to design and operate their own statistical service, as long as they have the ability to exchange information and operate together effectively (through their respective information systems);

Replicated: Coordination is through duplication. Each NSI has implemented its own copy of the statistical service and runs it in its own environment;

Shared: There are common, distributed services, shared and accessible to all the NSIs. A centralised environment (e.g., an NSI or Eurostat platform) makes the statistical service available via an API or web service to be invoked remotely by Member States’ applications. The instance is shared.

Each of these options has implications for the development of statistical services as well as for a number of non-functional requirements. For example, a shared instance of a specific statistical service could have implications in terms of security or performance relative to a replicated instance of the same statistical service. The ESS EARF provides specific examples of which architectural building blocks in the ESS might involve which approach to reuse.

In some cases reuse can be achieved by the use of Cloud-type environments. Sharing could be provided by access to an existing Cloud service (for example, multiple countries having accounts in a dissemination tool which hosts member country data); in other cases a copy of a Cloud service might be replicated by deploying it in a private cloud in the member state.

4.7 Service Contract

CSPA138 Principle: “Implement using GSIM. Statement: Manage standardized service contracts based on CSPA Logical Information Model (LIM)” (1)

CSPA applies the principle of design by contract¹⁵ to define formal software specifications. The software analyst describes a contract for each invokable operation, detailing functions, metrics, pre-conditions, post-conditions and other properties which are discussed in the following paragraphs. The service being described is treated as a black box with its inputs, outputs and some processing logic. The contract defines guarantees for service consumers. It is then transformed by software implementers into technical representations such as WSDL or XML schema definitions.

Operation’s function

It describes functional capabilities of the operation. Operations are derived from business function descriptions specified at logical CSPA layer (see the chapter Business Function Identification).

15 Design by contract® (DbC), also known as contract programming, programming by contract and design-by-contract programming, is an approach for designing software. It prescribes that software designers should define formal, precise and verifiable interface specifications for software components, which extend the ordinary definition of abstract data types with preconditions, post-conditions and invariants. These specifications are referred to as "contracts", in accordance with a conceptual metaphor with the conditions and obligations of business contracts. "Design by Contract" is a registered trademark of Eiffel Software in the United States.



Pre/Post conditions

This is the core concept of “Design by contract”® developed by Bertrand Meyer, the designer of the Eiffel programming language and one of the fathers of object-oriented programming.

Pre-conditions define the state the system must be in before an operation can be executed. Originally, “the precondition expresses requirements that any call must satisfy if it is to be correct” (11). This is a “Description of the conditions and checks that a requester of the operation has to validate in order to assure optimal, secure, performant, and error-free execution” (6).

Meyer says that “the post-condition expresses properties that are ensured in return by the execution of the call” (11). In modern software specifications it is more commonly understood as the state of the system after the operation has been executed.

For the sake of SOA service contracts we will assume that pre/post conditions are defined in functional terms and not technical terms.

CSPA advises that technical validation should always be part of business processing, and it discourages extensive use of XML Schema Definitions for that purpose: when validation rules live in the schema, any change in data validation requirements forces an update of the service’s interface, which in turn affects backward compatibility.

The technical validation rules for input and output parameters are defined later by software designers, in the form of assertions in the source code or XML schemas.
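As an illustration of how a functional contract may end up as assertions in source code, the following minimal sketch checks a pre-condition on entry and a post-condition on exit of a hypothetical `upload_survey` operation. The class `SurveyStore`, the exception `PreconditionError` and the parameter names are assumptions made for the example; the conditions themselves are deliberately simple.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


class PreconditionError(ValueError):
    """Raised when the caller has not met the operation's pre-conditions."""


@dataclass
class SurveyStore:
    saved: list = field(default_factory=list)

    def upload_survey(self, surname: str, birth_date: Optional[date], answers: dict) -> int:
        # Pre-condition (caller's obligation): identifying fields are non-empty.
        if not surname.strip() or birth_date is None:
            raise PreconditionError("surname and birth date must be non-empty")
        count_before = len(self.saved)
        self.saved.append({"surname": surname, "birth_date": birth_date, "answers": answers})
        # Post-condition (service's guarantee): exactly one new survey is stored.
        assert len(self.saved) == count_before + 1
        return len(self.saved) - 1  # index of the stored survey
```

Keeping the checks in code rather than in the XML schema means the rules can evolve without changing the published interface, in line with the CSPA advice on validation.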

Input/Output

The service specification analyses the high-level GSIM information model which is used in the service to represent information. The software analyst derives technical input parameters and defines service output values directly from the GSIM objects defined at the specification layer of CSPA.

Operation parameters represent information required to process requests. They are obtained from consumers when servicing operation invocations. Operation parameters may be defined for example in a tabular form in the service implementation description.

Outputs are defined in the same manner as input parameters. While describing service implementation, keep in mind the rule of thumb “CSPA074: Designs are output driven” (1). All outputs should be traceable to the GSIM objects and results of the “2.7 Outcomes” analysis.

Example of pre-condition: respondent’s surname and birth date are non-empty.

Example of post-condition: a new survey is saved in the system and the business process to validate, accept and save survey results begins.

Metrics

Progress in achieving business goals should be measurable¹⁶; goals must therefore be quantified. This part of the service contract defines how the effectiveness of an operation contributes to measuring the achievement of business goals. Metrics are inferred by software analysts from key performance indicators. They are described as part of the contract to show how a given operation realizes business goals. All metrics and KPIs are traceable to business goals.

Metrics are translated by software implementers into physical components to measure and trace service activities.
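A physical component for a metric can be as small as a counter that the operation updates. The sketch below assumes a metric that counts successfully uploaded surveys per interviewer; the class `UploadMetrics` and its method names are hypothetical, and a production service would typically forward such counts to a monitoring or business activity monitoring tool rather than keep them in memory.

```python
from collections import Counter


class UploadMetrics:
    """Counts successful survey uploads per interviewer."""

    def __init__(self) -> None:
        self._successful = Counter()

    def record_upload(self, interviewer_id: str, succeeded: bool) -> None:
        # only successful uploads contribute to the metric
        if succeeded:
            self._successful[interviewer_id] += 1

    def survey_uploads(self, interviewer_id: str) -> int:
        return self._successful[interviewer_id]
```

Because the counter is updated inside the operation, the metric becomes just another output of the service, as CSPA156 suggests.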

Business exceptions

This paragraph of the service implementation description addresses exceptional situations during business process execution. Such situations are usually described at use case level to indicate alternative scenarios in case of obstacles that prevent successful completion of an operation. Exceptions should not be confused with alternative flows: exceptions, as opposed to alternative flows, always result in unsuccessful function realization.

Compensation

Business process execution may create situations of non-transactional (in the sense of ACID¹⁷) data processing in two or more information stores. Long-running transactions have always posed a great challenge: a business process may have an extended duration and comprise a number of separate tasks executed by different business units, possibly on different software platforms. Traditional transactional platforms are best suited to handling single business operations.

For complex operations involving long-running business processes, a compensation mechanism should be defined for cases where some operations complete successfully whereas others fail. Business process administrators will use compensation to bring business operations back to normal after incidents in the flow of the business process.

Automatic compensation components may be built into the BPMN or BPEL process definitions, but manual handling should be considered a viable alternative in some cases.
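The idea of automatic compensation can be sketched independently of BPMN or BPEL: each step of a long-running process is paired with an action that undoes it, and when a later step fails the completed steps are undone in reverse order. The function `run_with_compensation` and the step tuple layout below are assumptions made for illustration, not part of any process engine API.

```python
from typing import Callable, List, Tuple

# (step name, action, compensating action)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, compensate completed steps in reverse.

    Returns a log of what happened, which an administrator could inspect.
    """
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return log
        completed.append((name, compensate))
        log.append(f"done:{name}")
    return log
```

Unlike a database rollback, compensation is a new business operation (for example, cancelling a registration that was already stored), so each compensating action has to be designed explicitly.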

Invocation Protocols

16 See http://docs.oracle.com/cd/E23943_01/dev.1111/e10224/bam_adapter.htm

17 Atomicity, Consistency, Isolation, Durability


Example: Metric “survey uploads”: number of successfully uploaded surveys per interviewer. This metric is used to measure the performance of interviewers.

Example of business exception: questions answered in the uploaded survey do not correspond to the topic. Exception handling: the decision for the given survey is made by the survey administrator.


The corresponding section of the service implementation description should outline the supported communication channels which can be used to invoke the operation. Invocation protocols are discussed in detail in chapter “4.1 Invocation protocols”.

The list of available protocols should be treated as advice to support interoperability. Non-standard invocation protocols should be proposed only as a last resort and after discussion with the enterprise architecture team.

The list of supported protocols is the following:

- HTTP SOAP, POST or PUT
- Messaging SOAP, XML or CSV
- File based HTTP POST or PUT

SOAP over HTTP is the most standard protocol and is preferred for public and inter-institutional services.

In the service implementation description, give details of message formats (XML schemas or CSV format definitions) and URIs to invoke the single operation being described. Standard protocols such as SDMX are preferred for public and inter-institutional statistical data exchange.
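To illustrate what “message format and URI” mean for a single operation, the following sketch builds a SOAP 1.1 request with the Python standard library. The endpoint URL, the service namespace `http://example.org/survey`, the operation `registerSurvey` and the parameter `surveyId` are hypothetical placeholders; a real service would take them from its WSDL. The request is only constructed, not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/survey"  # hypothetical service namespace


def build_soap_request(endpoint: str, operation: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST request carrying a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SERVICE_NS}}}{name}").text = str(value)
    payload = ET.tostring(envelope, encoding="utf-8", xml_declaration=True)
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{SERVICE_NS}/{operation}"',
        },
        method="POST",
    )


request = build_soap_request(
    "http://example.org/services/survey",  # hypothetical endpoint URI
    "registerSurvey",
    {"surveyId": "S-2015-001"},
)
```

Passing `request` to `urllib.request.urlopen` would perform the call; the service implementation description would document the same information (URI, operation, schema of the payload) in prose or as a WSDL reference.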

4.8 Requirements for security

“CSPA150 For the purpose of this document, the security concern relates to controls that are put in place to mitigate the risk that a CSPA Service or the data it controls is misused.”

Security Mechanisms

This chapter defines security mechanisms which should be addressed in the service implementation. They are split into the following categories:

- Non-repudiation and integrity
- Authentication
- Self-registration
- Authorization
- Encryption

Chapter “3.2 Security” describes the above concepts in detail.


Example: Survey interviewers are identified by their identifier and password

Example: Survey interviewers undergo a self-registration process in ECAS


The corresponding chapters of the service implementation description should address all security topics defined in chapter “3.2 Security”, in order to define the security mechanisms, software, middleware, infrastructure, etc.

Data Protection

SMP@EC (6) created a confidentiality classification which follows the Personal Data Protection Regulation (EC) 45/2001¹⁸ and the Confidentiality, Integrity, Availability and Security Requirements of Information Systems¹⁹ defined in Commission Decision C(2006) 3602 of 16.08.2006, Annex I.

The seven confidentiality levels determine other qualities of service such as data integrity, service availability and personal data processing.

Column: values

- Category: [Select only one]
- Confidentiality: PUBLIC, LIMITED BASIC, LIMITED HIGH, EU RESTREINT, EU CONFIDENTIAL, EU SECRET, EU TOP SECRET
- Integrity: MODERATE, CRITICAL, STRATEGIC
- Availability: MODERATE, CRITICAL, STRATEGIC
- Personal Data: NO, YES

The required confidentiality levels should be discussed in the service implementation description to support the European Commission's foundation requirements regarding the processing of personal information. These security requirements are largely related to exposure decisions (see the chapter “Exposure”).
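A service implementation description could record the chosen classification in a machine-checkable form. The sketch below is one possible representation, assuming that exactly one value is selected for each attribute; the class names `Confidentiality`, `Level` and `DataProtectionProfile` are hypothetical and are not part of SMP@EC.

```python
from dataclasses import dataclass
from enum import Enum


class Confidentiality(Enum):
    """The seven confidentiality levels of the classification."""
    PUBLIC = "PUBLIC"
    LIMITED_BASIC = "LIMITED BASIC"
    LIMITED_HIGH = "LIMITED HIGH"
    EU_RESTREINT = "EU RESTREINT"
    EU_CONFIDENTIAL = "EU CONFIDENTIAL"
    EU_SECRET = "EU SECRET"
    EU_TOP_SECRET = "EU TOP SECRET"


class Level(Enum):
    """Values used for both the integrity and the availability attribute."""
    MODERATE = "MODERATE"
    CRITICAL = "CRITICAL"
    STRATEGIC = "STRATEGIC"


@dataclass(frozen=True)
class DataProtectionProfile:
    """Hypothetical record of the classification chosen for one service."""
    confidentiality: Confidentiality
    integrity: Level
    availability: Level
    personal_data: bool


# Looking a value up by its label rejects anything outside the classification.
profile = DataProtectionProfile(
    confidentiality=Confidentiality("LIMITED HIGH"),
    integrity=Level("CRITICAL"),
    availability=Level("CRITICAL"),
    personal_data=True,
)
```

Using enumerations means that a misspelled or unknown level raises a `ValueError` at the moment the profile is created, rather than surfacing later as an inconsistent catalogue entry.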

18 http://www.cc.cec/RUPatEC_Standard/#practice.gen.european-commission.base/guidances/supportingmaterials/data_protection_guidelines_ec_86A90824.html

19 http://www.cc.cec/RUPatEC_Standard/#practice.gen.european-commission.base/guidances/supportingmaterials/issp_ec_32191A27.html


Example: The survey upload system available at the NSI of Estonia establishes a trust domain with the survey service at the EC so that user authentication occurs only once, at the NSI. A user mapping mechanism is put in place; user mapping privileges belong to security officers at the NSI.


4.9 Policies

“Additional conditions that may apply to use the service in a specific context are formalized by a service policy. Such service policies may be included in the service description and most of the time will be maintained at a different stage (separation of duty). The Web Services Policy Framework (WS-Policy) is an open standard used for that purpose. [...] Services descriptions and service policies are often stored in a central repository, the service registry. This repository can be later queried by consumers to identify the services they need and gather the services description and technical details.” (7)

“Policies should be defined and applied to multiple services rather than redefining a new policy for each service to increase consistency and avoid redundancy.” (7)

Policies discussed in the implementation description will be translated by the software assembler into executable rules for security, data format assertions and others, which can all be categorized as:

- Access rights and security concerns: authentication headers and methods, digital signatures, WS-Security tokens, encryption, auditing
- Quality of service: response time, throughput, error rate
- Compliance: validity of exchange messages according to WS-I, XML schema definition and custom compliance rules²⁰

They are derived from the service definition and specification layers (see chapters “2.8 Restrictions and policies” and “3.2 Security”). The analysis of policy assertions usually also takes into account “4.10 Non-functional characteristics (QoS)”.

To become familiar with service policies at the implementation layer, learn more about WS-Policy and assertion policies in popular SOA-enabled products such as Oracle Fusion for SOA or IBM Process Server. Policies are one of the pillars of SOA governance. They can be shared and re-used. The defined policies are discoverable by service consumers, which makes them a part of the service contract between the service provider and the service consumer.

Terms of Use

Basic service guarantees and limitations may be described by the “terms of use” statement. It is a human-readable statement which is uploaded to the service catalogue so that consumers can easily retrieve it later.

4.10 Non-functional characteristics (QoS)

“Principle CSPA141: Include non-functional requirements. Non-functional requirements form a key input in design decisions.” (1)

“The purpose of documenting the non-functional requirements is to clearly specify the expected non-functional requirements required by a consumer of a given SOA Service” (6 p. 36).


20 See “Security considerations for Simple Data Services” (7), http://docs.oracle.com/cd/E15523_01/integration.1111/e10224/sca_policy.htm#CDDECJAC and https://www.soa.com/solutions/policy_governance



Non-functional requirements respond to business goals related to quality of service. “Key architectural decisions might need to be made to ensure that the QoS can be delivered” (6). The important qualities which must be supported by the service are derived from high-level business goals. Metrics or KPIs may also contribute to the analysis of non-functional characteristics.

What are the risks?

A lack of defined system qualities leads to poor service levels, which may compromise the realization of business goals.

Both SMP@EC (6) and CSPA (1) define certain quality criteria which should be specified for a service at the implementation layer. They are summarized in the following paragraphs. Ensure that the service implementer is able to deliver a product of the right quality to support consumers' requirements and that the non-functional characteristics of the service are available.

4.10.1 Reliability

“Reliability ensures integrity and consistency of the application and all its transactions” (12).

The reliability of a service is described in terms of the number or the kinds of errors which may occur.

The concept of reliability is closely related to transactions (atomicity, consistency, isolation, durability). It addresses inconsistencies in data, even when information is stored in multiple data stores. Business transactions may last for days or months, so traditional transactional models as defined in the relational database discipline cannot be applied to business processes. In the world of SOA it is more common to deploy a compensation mechanism than to use database transactions.

For the data in transit, reliability may be understood as message integrity supported by checksums or digital signatures.
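A checksum for a file in transit can be computed and verified as sketched below. SHA-256 is used here as an assumption; the guideline does not prescribe a particular algorithm, and the function names are illustrative only.

```python
import hashlib
import hmac


def sha256_checksum(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hexadecimal string."""
    return hashlib.sha256(payload).hexdigest()


def verify_checksum(payload: bytes, expected_hex: str) -> bool:
    """Check that the payload received matches the checksum sent with it."""
    # compare_digest compares in constant time, avoiding timing side channels
    return hmac.compare_digest(sha256_checksum(payload), expected_hex.lower())


# The sender computes the checksum and ships it alongside the file.
data = b"survey_id;answer\n1;yes\n"
checksum = sha256_checksum(data)

# The receiver recomputes it; any change to the content is detected.
intact = verify_checksum(data, checksum)
tampered = verify_checksum(data + b"2;no\n", checksum)
```

A plain checksum detects accidental corruption only. Where the sender must also be authenticated, a digital signature is the appropriate mechanism.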


Example: Updates of single survey information should occur in transactions (“all or nothing” data updates)

Example: Files uploaded for data imputation should be accompanied by a checksum

Example: The number of data imputation errors cannot exceed 1 per 5 million records.

Example: Message delivery to consumers is reliable in the sense of guaranteed delivery (JMS).


The service description will list the requirements for reliability in relation to long and short term transactions, message transit and errors which may occur during processing. Specific KPIs may be defined as a rate of errors per unit of time (for instance, the number of transaction failures per month).

4.10.2 Availability

“Availability ensures that a service/resource is always accessible” (12).

“Typically measured by the probability that the system will be operational when needed” (6)

Requirements for availability should be described in terms of the time the service is available without interruption. Availability can be expressed as a ratio A = uptime / total time * 100% (for instance per month or per year).

This requirement can be constrained in many ways. For instance it can apply to only working days or working hours.
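The ratio can be worked through for a constrained window. The sketch below assumes a month with 22 working days, a service window of 8:00 to 18:00 (10 hours per day) and 9 hours of outage inside that window; these figures are invented purely for illustration.

```python
def availability_percent(uptime_hours: float, total_hours: float) -> float:
    """A = uptime / total time * 100%, computed over the agreed service window."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return uptime_hours / total_hours * 100.0


# Assumed figures: 22 working days x 10 hours, with 9 hours of outage.
service_window_hours = 22 * 10
outage_hours = 9
availability = availability_percent(service_window_hours - outage_hours, service_window_hours)

# Compare against an assumed target of 95% during working hours.
meets_target = availability >= 95.0
```

With these figures the service is up for 211 of 220 hours, which is just under 96% and therefore meets a 95% target. Outages outside the agreed window do not enter the calculation at all.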

Availability requirements will have a strong influence on the target software architecture. For instance to support services with very high availability requirements, it may be required to employ application server clusters, data replication techniques or hot stand-by servers.

4.10.3 Performance

“The performance requirement is usually measured in terms of response time for a given screen transaction (in this case service call) per user. In addition to response time, performance can also be measured in transaction throughput, which is the number of transactions in a given time period, usually one second” (12).

It can be described either in terms of latency: “Refers to the maximum amount of time between the arrival of a request and the completion of that request” (6) or capacity “Refers to the number of concurrent requests that can be handled by the service in a given time period. It is possible to specify the maximum number of concurrent requests that can be handled by a service in a set block of time” (6).

“CSPA155 No specific guidance is provided on the performance characteristics. However, they should be declared in the CSPA Service Implementation Description and it is recommended that examples of performance level are included.” Performance requirements must be modelled with respect to business goals and key performance indicators.
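Declared performance levels are easier to agree on when they can be measured in the same way by provider and consumer. The following sketch measures the maximum latency and the sequential throughput of an operation; `register_survey_stub` is a hypothetical stand-in for a real service call, and the measurement is sequential, so it does not capture capacity under concurrent requests.

```python
import time
from typing import Callable, Dict, List


def measure(operation: Callable[[], None], calls: int) -> Dict[str, float]:
    """Call the operation repeatedly and report latency and throughput."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    latencies: List[float] = []
    started = time.perf_counter()
    for _ in range(calls):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    return {
        "max_latency_s": max(latencies),
        "throughput_per_s": calls / elapsed if elapsed > 0 else float("inf"),
    }


def register_survey_stub() -> None:
    """Hypothetical stand-in for invoking a remote operation."""
    time.sleep(0.001)


result = measure(register_survey_stub, calls=20)
# Compare against an assumed requirement of at most 3 seconds per call.
within_limit = result["max_latency_s"] <= 3.0
```

In practice the stub would be replaced by a real call to the service, and the thresholds would come from the requirements stated in the service implementation description.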


Example: the service should be available at least 95% of the time during working hours 8:00 – 18:00 (working days only). This especially applies to the peak times between 10:30 and 16:00.

Example: end-to-end response time for the “register survey” operation cannot exceed 3 seconds.

Example: the number of survey records processed in one second must be at least 100.


Performance requirements affect software architecture and, in particular, determine modularization and communication protocols.

Note on the use of XML in the context of performance

Some IT architecture specialists point to the low performance of XML-based technologies such as SOAP. It is true that XML-based solutions are much less performant than analogous plain-text applications. Despite that, the main focus of CSPA and SOA is interoperability and re-use. Performance is important, but should be considered a requirement of secondary importance. Today's technology portfolio offers multiple performance optimizations such as XML streaming APIs (StAX), XML accelerators, binary XML and others. Large data interchanges may also be handled by more traditional means of communication. The use of the plain HTTP protocol without SOAP (even for such a communication pattern it is possible to provide a WSDL) may be considered a viable alternative to standard SOAP web-service software stacks.

In any case, interoperability and re-use are paramount, because in the long term it is much easier to optimize performance than to maintain numerous software solutions which have the same function but are not interoperable.
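Streaming APIs avoid loading a whole XML document into memory. StAX is a Java API; the sketch below uses Python's `xml.etree.ElementTree.iterparse`, which offers a comparable incremental style, and is shown only to illustrate the principle. The element name `survey` and the sample document are assumptions.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO


def count_records(xml_stream: BinaryIO, record_tag: str) -> int:
    """Count record elements while reading the document incrementally."""
    count = 0
    for _event, element in ET.iterparse(xml_stream, events=("end",)):
        if element.tag == record_tag:
            count += 1
            # Drop the record's children and text once it has been processed.
            element.clear()
    return count


# A small in-memory document stands in for a large file or network stream.
document = io.BytesIO(
    b"<surveys><survey id='1'><q>a</q></survey><survey id='2'/></surveys>"
)
total = count_records(document, "survey")
```

Each record is handled as soon as its closing tag has been read, so processing can start before the end of the document arrives.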

4.10.4 Multilingual support

“CSPA144 The CSPA Services must be able to support input and output in multiple languages where applicable. If the service includes manual operations, it also must be possible to change the language of the GUI. Preferably, even the presentation style should be adaptable. Additionally, services must be documented at least in English in addition to the local language(s) of the organization developing the CSPA Service. It is highly recommended that organizations that have made translations of the documentation of a CSPA Services in additional languages make them available to the community.” (1)

This CSPA requirement may also be understood in a broader scope. Services with a high exposure (public and inter-institutional) must address some aspects of international service use. A dedicated chapter of the service implementation description should provide requirements related to multilingual support in the context of:

- Service description and specification
- Meta-data for data sets and other GSIM objects
- Descriptions of business processes (GSBPM)
- Descriptions in artefact catalogues, especially policies and terms of use
- User interfaces for human tasks in business processes
- Textual queries
- Customized communication such as subscribed e-mails, surveys, etc.
- Error messages

The service description should provide a list of requirements for multilingual support. These requirements are traced back to business goals and exposure decisions.
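A common way to meet such requirements is to store labels per language and fall back to English when a translation is missing. The sketch below assumes this fallback rule and uses invented data set names; neither is prescribed by CSPA.

```python
from typing import Mapping

DEFAULT_LANGUAGE = "en"  # assumed fallback language


def localized(labels: Mapping[str, str], language: str) -> str:
    """Return the label in the requested language, or in English if missing."""
    if language in labels:
        return labels[language]
    return labels[DEFAULT_LANGUAGE]


# Hypothetical data set name provided in three languages.
dataset_name = {
    "en": "Population by age",
    "fr": "Population par âge",
    "de": "Bevölkerung nach Alter",
}

french = localized(dataset_name, "fr")
estonian = localized(dataset_name, "et")  # no Estonian label, falls back to English
```

The same lookup can serve metadata labels, error messages and user interface texts, provided that each of them is stored with a language code.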


Example: data set name should be provided in 3 official EU languages: English, French and German


4.10.5 Error Handling

“CSPA157 Error handling, in this case, relates to situations where the service fails. The service must report this to the communication infrastructure if applicable. Error handling is left to the communication platform to handle as required. Generally there will be protocol specific requirements for flagging errors. The error codes and their meanings need to be documented in the CSPA Service Implementation Description.” (1)
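One way to keep error codes and their documented meanings in step is to define them in a single place and generate the documentation table from it. The codes and meanings below are hypothetical examples, not codes defined by CSPA.

```python
from enum import Enum
from typing import Dict


class ServiceError(Enum):
    """Hypothetical error catalogue: each member holds (code, meaning)."""
    INVALID_INPUT = ("E001", "The request message failed schema validation")
    NOT_AUTHORIZED = ("E002", "The caller is not allowed to invoke the operation")
    BACKEND_UNAVAILABLE = ("E003", "A required data store could not be reached")

    def __init__(self, code: str, meaning: str) -> None:
        self.code = code
        self.meaning = meaning


def error_catalogue() -> Dict[str, str]:
    """Return the code-to-meaning table to publish in the implementation description."""
    return {error.code: error.meaning for error in ServiceError}


catalogue = error_catalogue()
```

Because the service raises the same enumeration members that feed the published table, a code cannot be returned to a consumer without also appearing in the documentation.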

4.10.6 Process Metrics

This part of the service implementation description should define requirements regarding the capability of the service to measure the performance of processes and to collect other metrics for assessing how well business goals are being achieved.

4.11 Technical dependencies

Service implementation descriptions will be created to illustrate how the created software solution is related to other components in the operating environment. CSPA states that it should include a “List of technical requirements of the service in terms of:

- Operating system(s) (specify version)

- Runtime platforms – any additional software that has to be installed on the machine the service is installed on (e.g. SAS, R, Java virtual machine, .net runtime, J2EE container, etc. – Specify version)

- Database(s)

- Other dependencies (libraries, packages etc.)” (1).

4.11.1 Service implementation description – how-to

Software architects design the hosting environment to deploy services based on existing frameworks. The service implementation description should capture planned solutions at multiple layers:


Example: the number of survey uploads per month should be monitored

Example: the number of SDMX queries for exposed data sets should be monitored


The upper platform layer consists of products such as Web servers, application servers, and various types of middleware.

The lower platform layer is comprised of the operating system environment and associated low-level systems services.

The hardware platform layer includes computing hardware such as servers, storage hardware such as storage arrays, and networking hardware such as switches and routers, and associated peripherals.²¹

4.12 SOA Layering

“CSPA165 CSPA Service Implementation Description […]. It includes a precise description of all dependencies to the underlying infrastructure, non-functional characteristics and any relevant information about the configuration of the application being wrapped, when applicable.” (1)

Reference SOA architectures, for example those of the European Commission DG DIGIT (IPCIS) (7) and the Open Group (9), define the concept of logical layering. These specifications propose a standard service typology that defines the technical building blocks for creating composite SOA applications. The “layering” defines the key abstractions of the implementation.

The software architect will describe the SOA layering of the service implementation in order to:

- Define the high-level service design
- Define relationships between system components and service models
- Identify existing services for re-use and new ventures to deliver new functions

A well-established reference architecture should be used for SOA logical layering. The European Commission has defined a SOA Reference Architecture which outlines a standard service typology and its implementation using specific Oracle products. Other reference architectures may be used to derive more detailed descriptions as well, for instance the Open Group SOA Reference Architecture (9) or the IBM SOA Foundation (4).

21 SunTone 3-D Architecture Methodology

The service implementation description may include a diagram defining the logical layers of the SOA solution. An example diagram embedded as an MS Visio object is depicted further in this chapter. This diagram assigns architectural components to layers, indicating the role of each component in the architecture. Furthermore, it illustrates the dependencies between layers and components.

The second diagram shows the results of service identification. Each service is assigned to one of the pre-defined service types. The diagram is an embedded MS Visio object which can be re-used to describe a service implementation. It has been inspired by the Open Group SOA Reference Architecture (9 p. 67).



4.12.1 Service implementation description – how-to

Create layering and service type diagrams to enable service implementers to assemble high quality executables. Service layering should be created based on the Open Group SOA Reference Architecture (9).

4.13 Service creation scenario

“CSPA165 CSPA Service Implementation Description […]. It includes a precise description of all dependencies to the underlying infrastructure, non-functional characteristics and any relevant information about the configuration of the application being wrapped, when applicable.” (1)

“CSPA174 CSPA provides guidance on the way that organizations should go about building new or wrapping existing Statistical Services” (1)

This chapter is addressed to software architects who are responsible for wrapping existing legacy applications. Sometimes, software vendors may find it easier to repeat a pattern already applied in other projects to expose an application as a service. This and the following chapters outline a number of architectures which can easily be applied in particular situations where a new service needs to be created. This is a practical guideline which software architects may choose to apply, or they may decide on another solution that is more applicable in a given situation.



CSPA promotes the re-use of existing assets, which is also one of the basic principles of applied SOA. Before a brand new solution is developed, an in-depth analysis of existing assets should be conducted. The Service Specification assumes that service identification is one of the most important parts of the service description (see the chapter “Identifying assets for re-use”). The results of bottom-up service identification may be used to leverage existing assets in a broader context.

The following paragraphs discuss multiple exposure scenarios and assume that an existing service fully supports the business requirements. In reality, an in-depth analysis should always be conducted to align IT with business. As a result, existing solutions may need to be refactored or “wrapped” to support business requirements.

The list of scenarios should not be treated as definitive. Members of SOA center of excellence at European Commission may use their project experience to develop new scenarios in order to improve service delivery.

4.13.1 Scenario - upgrade existing application to a service

An existing application with a well-defined interface which fully supports the required business functions is exposed as a service using the technological capabilities of the platform it is hosted on. The existing application interface is transformed into a WSDL and a new web-service endpoint is created to serve web-service calls that invoke the existing function.

As an illustration, imagine an existing application hosted on an Apache HTTP server instance. The program is written in the PHP programming language and its functions can currently only be invoked by human users through internet browsers.

In our scenario, the PHP application is extended. A new PHP executable file is generated from the WSDL. Software developers implement the generated web service endpoint methods so that they invoke the existing functions. The service can now be used by any web-service compliant client.

A similar approach may be applied to any platform supporting web services. Unfortunately it bears a high risk of producing services that are not aligned with the business, so the WSDL file should result from an in-depth functional analysis.



4.13.2 Scenario - upgrade existing JEE application to a service

This is an example of an existing JEE application which does not provide any interoperable machine-to-machine interfaces. The JEE technology is component-oriented, which makes it possible to seamlessly expose any of the existing business components (EJB, POJO) as a web service.

A new WSDL is created to support interoperable invocation of an existing service. The web-service stub is created (often automatically generated) top-down from the WSDL, and then a new web service endpoint is generated and included in the JEE application as an additional EJB module. The enterprise bean which acts as the service interface (a “wrapper”) looks up the target component using JNDI. That target component implements the business logic which has to be reused. It is assumed that the component implements either a remote or a local EJB interface. The web service endpoint is implemented so that it invokes operations in the existing service and optionally provides some limited data transformation capabilities. The existing application must be repackaged, but the components implementing business logic do not need to be altered.

Risk: non-business aligned function

Software architects should carefully consider this scenario as it may lead to operations which do not support business requirements

Opportunities

This scenario allows an existing service to be re-used in a broader context without re-implementing it. The same scenario might be considered to expose an existing application which already has a publicly accessible interface but uses non-interoperable data formats.

4.13.3 Scenario - expose existing application as a service indirectly (proxy service)

An existing application is re-used “as-is”. A new “wrapping” component is created using a SOA platform to expose the existing function as a web service. A WSDL is created from scratch, based on the results of top-down architectural decomposition. The web-service component to serve requests is generated from that WSDL. The legacy application is accessed using a Java Connector Architecture (JCA) adapter, JMS, the Java Database Connectivity API, FTP, messaging or any other means supported by the existing SOA/ESB solution. A mediation component is configured to translate data from the legacy format to the canonical data model.

Data mapping usually occurs in mediation components and is physically implemented as an XSL transformation or a data mapping table. A custom executable program to transform information may also be written in a scripting language or in Java.
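A data mapping table can be as simple as a dictionary from legacy field names to canonical ones. The sketch below is illustrative only: the legacy field names (`SRV_ID`, `RESP_DT`, `CNTRY`) and the canonical names are invented, and a real mediation component would normally also convert data types and code lists.

```python
from typing import Any, Dict, Mapping

# Hypothetical mapping table: legacy field name -> canonical field name.
FIELD_MAP: Dict[str, str] = {
    "SRV_ID": "surveyId",
    "RESP_DT": "responseDate",
    "CNTRY": "country",
}


def to_canonical(
    legacy_record: Mapping[str, Any],
    field_map: Mapping[str, str] = FIELD_MAP,
) -> Dict[str, Any]:
    """Translate a legacy record into the canonical model.

    Fields that are not listed in the mapping table are dropped.
    """
    return {
        canonical: legacy_record[legacy]
        for legacy, canonical in field_map.items()
        if legacy in legacy_record
    }


canonical = to_canonical({"SRV_ID": "S-1", "CNTRY": "EE", "INTERNAL_FLAG": 1})
```

Keeping the table separate from the code means that a change in the legacy format only requires an update of the table, not of the wrapping component.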

A similar scenario has already been applied in Eurostat. The SDMX Dissemination Web Service implements that strategy. In that implementation, an SDMX-compliant web-service has been deployed on a separate application server to access two existing legacy data stores – Comext and NavTree. Unfortunately in that scenario, the ESB hasn’t yet been used for data mapping.

In cases where new requirements for business logic are defined, the wrapping is not enough to support new functions. Such a new service may be implemented as a separate business application and deployed on an additional application server to provide new functions. The service is communicating with the existing application using JCA adapters or by any other means enabling access to its functions.



This pattern will usually be applied to existing CORBA²², PL/SQL²³ and similar technologies.

Opportunities

- An existing legacy application may expose its functions to a wider range of consumers
- The “proxy” component may implement additional business requirements or align the existing function with new requirements

4.13.4 Scenario - deliver a new ESS service

This chapter is devoted to the scenario where a completely new service is going to be defined. The bottom-up decomposition shows that some of the existing business functions can be re-used, but many others must be implemented from scratch. The new service is defined using top-down architectural decomposition. Service orchestration mechanisms are used to coordinate complex interactions between new and existing components. Workflows reflecting business operations must be modelled to support users in their daily operations. In general, users work with the new application to complete the tasks they are assigned, which are later handed over to other employees to continue the workflow (business process).

A new SOA composite application is created. It is built from components implementing BPMN business processes, automated algorithms modelled as BPEL, business decision tables and adapters to access information in external data stores or legacy applications. The BPMN workflow defines human tasks which are presented using web forms. The web portal exposes some of the existing web applications using the portlet technology. The composite application is assembled from Business Process, Business Activity, Decision Service, Data Service, Connectivity Service and Infrastructure Service models. SOA architects are advised to learn more about SOA composite applications and “SCA – service component architecture” before planning new SOA implementations.

22 http://docs.oracle.com/javase/7/docs/technotes/guides/idl/corba.html

23 http://docs.oracle.com/cd/B14099_19/web.1012/b14027/plsqlservices.htm



The SOA composite application is deployed in the SOA reference platform maintained at the European Commission.

Key performance indicators are defined to measure the performance of business processes. The composite application has built-in notification and tracing facilities to provide metrics for indicators derived from business goals. They are used to define new reports which can be used by decision makers.



5 Using the Services

Using statistical services involves finding those services, managing or setting up the right architecture for their use, and putting the service into use. These activities can be carried out by different roles. For example, finding the service will be carried out by business leaders or roles closer to the business architecture. Setting up the context of the service within the architecture and environments of the organisation will be carried out by IT architects, and putting the service into use will include both the technology infrastructure staff that deploys the service and the users of the service.

5.1 Discovering services

There are many cases in which an NSI will reuse a statistical service produced by a different NSI:

1. They have a specific business need and they search in the Service Catalogue to find out whether a statistical service already exists to meet that need

2. They are part of a consortium that is developing a statistical service and they are investing in the service to use it

In the first case, three outcomes are possible:

- The NSI finds a suitable service that meets their needs. In this case they contact the Builder or Service Provider organisation for assistance in assembling and configuring the service.

- The NSI finds a service that closely resembles their needs at the Definition level, but has a somewhat different Specification or Implementation (for example, a non-functional requirement about security). In this case they contact the Builder or Service Provider organisation to assess whether the service can be enriched in order to meet the needs of the user NSI. This activity may lead to a new development cycle.

- The NSI does not find a service that meets their needs. In this case they have to define the service with a Service Definition, and raise it for prioritisation. If users are found and the candidate is successful, a consortium is set up and a development cycle starts. In some cases the organisation will already be following some CSPA principles and the service might be relatively mature by the time it is released to third parties.

5.2 Reusable services in the local architectureThe architects of each NSI have to give consideration to how the statistical service works in the context of their own NSI. Some of the concerns at this stage are the same ones found when identifying and adopting a new service, and concern all four domains of the Enterprise Architecture. Before deploying and making the service live, architects will want to consider:

If the service is replicated, does it become available for a particular system or is it made available as a reusable service across the NSI (for example, a local web service)?

If the service is shared, does it meet the security and performance requirements of the NSI?

How is the service going to be managed: who owns it, how is it going to be kept up to date?

Does the service have information or data dependencies with other services in the application architecture of the NSI?

Do the appropriate resources and commitment exist in order to assemble and configure the service?

Is the business ready to use the service? Do the appropriate business roles, information feeds, tools, etc., that are needed to make use of the service exist?

If the service needs to store data, is a suitable environment or storage available? Does the environment/storage meet security standards?

Does the NSI have an established way of orchestrating services that uses the service?

Does the service have technology dependencies that are not part of the favoured technologies of the NSI? If so, how is this conflict resolved?

Do other services or applications that interact (or exist in close proximity) with the adopted service need changes in order for the NSI to adopt the service?
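As an illustration only, the questions above could be recorded as a simple readiness checklist, so that unresolved concerns are visible per architecture domain before the service goes live. The sketch below is not part of CSPA or of these guidelines: the class name ReadinessCheck, the function names, the domain labels and the sample answers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ReadinessCheck:
    """One checklist question together with the architect's answer."""
    domain: str       # assumed labels: business, information, application, technology
    question: str
    satisfied: bool
    note: str = ""


def open_issues(checks):
    """Group the unsatisfied checks by architecture domain."""
    blocked = {}
    for check in checks:
        if not check.satisfied:
            blocked.setdefault(check.domain, []).append(check)
    return blocked


def ready_for_deployment(checks):
    """The service is considered ready only when every check is satisfied."""
    return all(check.satisfied for check in checks)


# Hypothetical answers for one adopted service.
checks = [
    ReadinessCheck("application", "Meets NSI security and performance requirements?", True),
    ReadinessCheck("business", "Business roles and information feeds in place?", False,
                   note="No owner assigned for the input feed"),
    ReadinessCheck("technology", "Dependencies within the favoured technologies?", True),
]

print(ready_for_deployment(checks))   # False
print(sorted(open_issues(checks)))    # ['business']
```

In this sketch a single unsatisfied check blocks deployment. An NSI might instead weight the checks or accept documented exceptions, depending on its own governance.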

5.3 Using a statistical service

Once the service meets the architectural standards of the adopting organisation, it can be tested to ensure it is ready for live deployment. Allow for early testing of the statistical service by expert users, so that they can assess how well it meets their needs. Any new user stories that arise at this stage can be collected and fed back to the building organisation for a new iteration. However, care must be taken to establish whether each user story relates to the local implementation of the service or to the service implementation itself. Where possible, adapt the local implementation of the service and leave the reusable components untouched.

Uses and implementations of reusable services should be promoted by the Centre of Excellence to facilitate their adoption by other NSIs.



6 Bibliography

1. United Nations Economic Commission for Europe (UNECE). Common Statistical Production Architecture v1.5. 2015.

2. United Nations Economic Commission for Europe (UNECE). The Generic Statistical Business Process Model. 2013.

3. United Nations Economic Commission for Europe (UNECE). Generic Statistical Information Model (GSIM): Specification. 2013.

4. John Ganci, Amit Acharya, Jonathan Adams (IBM). Patterns: SOA Foundation Service Creation Scenario. s.l. : IBM Redbooks, 2006.

5. Statistical Data and Metadata eXchange. SDMX Specification 2.1. s.l. : Statistical Data and Metadata eXchange, 2011.

6. Jean Gigot, Luis Sequeira, Pablo Crespo. Service Modelling Practice Guide for the European Commission (SMP@EC). s.l. : Eurostat DIGIT.01 MIA, 2013.

7. Geoffroy de Lamalle (DIGIT). IPCIS SOA Reference Architecture. s.l. : European Commission, 2012.

8. DDI Alliance. Data Documentation Initiative. s.l. : DDI Alliance, 2009.

9. Open Group. SOA Reference Architecture. 2011. ISBN 1-937218-01-0.

10. Hohpe, Gregor. Enterprise Integration Patterns: designing, building and deploying messaging solutions. 2004. ISBN 0-321-20068-3.

11. Meyer, Bertrand. Applying "Design by Contract". s.l. : Interactive Software Engineering.

12. Cade, Mark. Sun Certified Enterprise Architect for Java EE study guide. s.l. : Sun Microsystems Inc., 2002. ISBN 978-0-13-148203-6.

13. ESS Statistical Production Reference Architecture. 2015.

14. ESS EA Reference Framework. 2015.

15. ESSnet SCFE Deliverable D1.1 - Guidance: Sharing Common Functionalities in the European Statistical System.



7 Annexes

Annex I: Service Definition Description Template

Annex II: Service Specification Description Template

Annex III: Service Implementation Description Template

