Database access and data retrieval (a user's view)


Database access and data retrieval Lisbon 18/02/09 R. Coelho 1/29

Database access and data retrieval (a user's view)

Outline

1 – General overview of fusion databases

2 – Data storage/retrieval methods and data structures

3 – SDAS at ISTTOK

R. Coelho

Associação EURATOM/IST, Instituto de Plasmas e Fusão Nuclear


I - General overview of fusion databases

Databases play a fundamental role in fusion plasma research

Essential for storage of seminal/standard benchmarking discharges.

Assist the derivation of elementary scaling laws and the design phase of fusion devices (what to expect for confinement, MHD, transport, …)

Assist the modeling effort by providing a validated set of input experimental data (cross sections, machine dependent data,…) and experimental plasma data on which to validate the codes.

Databases offer a clear display of community achievements


Fusion databases: three notable examples

International Multi-Tokamak Profile Database (ITPA)

Atomic Data and Analysis Structure (ADAS)

Experimental Nuclear Reaction Data (EXFOR)


International Multi-Tokamak Profile Database (ITPA)

• Objectives

– Provide all the information required for transport codes to simulate discharges from a variety of tokamaks.

– Provide data to be compared against the predicted outputs from the codes.

– Provide data and modelling results to be used as part of the ITER physics basis.

• Coverage

– Released publicly in 1998.

– Built from 201 shots from 21 devices. Recent data has been added to secondary but remains for “working group” access only.


International Multi-Tokamak Profile Database (ITPA)

• Storage/access

– MDS+ server; data stored as MDS+ trees.

– A relational database holding comments and 0D and 1D/2D metadata assists database queries.

http://tokamak-profiledb.ukaea.org.uk/

C M Roach, M Walters, R V Budny, F Imbeaux, T W Fredian et al., Nucl. Fusion 48, 125001 (2008)


Atomic Data and Analysis Structure (ADAS)

• Objectives

– Provide an interconnected set of computer codes and data collections for modelling the radiating properties of ions and atoms in plasmas.

– Assist in the analysis and interpretation of spectral emission and support detailed plasma models (crucial at the plasma edge).

• Coverage

– Plasmas ranging from the interstellar medium through the solar atmosphere and laboratory thermonuclear fusion devices to technological plasmas.


Atomic Data and Analysis Structure (ADAS)

• Access

– A key range of routines for accessing the database and delivering data to user codes is included. FORTRAN, C, C++, IDL and MATLAB are supported.

http://open.adas.ac.uk/index.php

Assisting fusion since JET was born…(1983)


Experimental Nuclear Reaction Data (EXFOR)

• Objectives

– Provide an extensive compilation of experimental nuclear reaction data.

• Coverage

– Neutron-induced reactions have been compiled systematically since the discovery of the neutron.

– Charged-particle and photon reactions have been covered less extensively.

– Data from 17700 experiments, with their bibliographic information and experimental information about the data. The status (e.g., the source of the data) and history (e.g., date of last update) of each data set are also included.


Experimental Nuclear Reaction Data (EXFOR)

• Repository

– Stored at the International Network of Nuclear Reaction Data Centres (NRDC). http://www-nds.iaea.org/exfor/exfor.htm


II - Data storage/retrieval methods

MDS+

HDF5

Universal Access Layer method

Paradigm for data retrieval methodologies


MDSplus (MDS+) http://www.mdsplus.org


MDSplus (MDS+)

SOME CONCEPTS

• The Data Hierarchy - Trees, Nodes, and Models. A self-descriptive hierarchy called a TREE, consisting of large numbers of named NODES which make up the branches (structure) and leaves (data) of each tree.

– MDSplus SHOTS are trees created from a special type of tree called a MODEL, a template which contains all of the structure and setup data for an experiment or code.

• Node Characteristics - Self Description: each node carries metadata including the data type, array dimensions, data length, units, independent axes, the parents and children of the node, tag names, the date when the data was stored, the name of the user who wrote the data, and so forth.


MDSplus (MDS+)

TREE EXAMPLE

• The node on the far right "Ip" is an example of a MEMBER, a type of node used to contain data

• Child and member nodes are analogous to the directories and files of a typical operating system.
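The tree, node and model concepts can be illustrated with a small, self-contained Python sketch. It is not the MDSplus API: the `Node` class, the `create_shot` helper, the dot-separated path syntax and the node names are all hypothetical, chosen only to show children acting as "directories", members acting as "files", and a shot being created as a copy of a model.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Node:
    """One named node: children give structure, members hold data."""
    name: str
    data: Any = None                 # payload, meaningful for members
    units: Optional[str] = None      # one example of self-describing metadata
    children: Dict[str, "Node"] = field(default_factory=dict)
    members: Dict[str, "Node"] = field(default_factory=dict)

    def add_child(self, name):
        node = Node(name)
        self.children[name] = node
        return node

    def add_member(self, name, units=None):
        node = Node(name, units=units)
        self.members[name] = node
        return node

    def find(self, path):
        """Resolve a dot-separated path (hypothetical, simplified syntax)."""
        node = self
        for part in path.split("."):
            if part in node.children:
                node = node.children[part]
            else:
                node = node.members[part]
        return node


def create_shot(model, shot_number):
    """Create a shot tree by copying the structure of the model tree."""
    shot = copy.deepcopy(model)
    shot.name = f"{model.name}_{shot_number}"
    return shot


# The model holds structure and setup only; each shot is filled independently.
model = Node("experiment")
magnetics = model.add_child("MAGNETICS")
magnetics.add_member("IP", units="A")

shot = create_shot(model, 40573)
shot.find("MAGNETICS.IP").data = [0.0, 1.0, 2.0]  # placeholder values
```

Writing to the shot leaves the model untouched, which is the point of using a template: `model.find("MAGNETICS.IP").data` is still empty after the assignment above.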


MDSplus (MDS+)

DETAILS ON THE API

• The basic calls as they would be ordered in an application are, in generic syntax:

 

mdsconnect,'server_name'

mdsopen,'tree_name',shot_number

result = mdsvalue('expression')

mdsput,'node_name','expression'

mdsclose,['tree_name',shot_number]

mdsdisconnect
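The ordering of these calls (open a tree, read or write, always close) can be sketched in Python. The method names `openTree`, `get`, `put` and `closeTree` are assumed here to mirror those of the `Connection` class in the MDSplus Python package; the `RecordingConnection` class is a hypothetical offline stand-in that only records what it is asked to do, so the sketch runs without a server.

```python
class RecordingConnection:
    """Offline stand-in for a server connection; records every call."""

    def __init__(self):
        self.calls = []
        self.store = {}

    def openTree(self, tree, shot):
        self.calls.append(("open", tree, shot))

    def get(self, expression):
        self.calls.append(("get", expression))
        return self.store.get(expression)

    def put(self, node, expression):
        self.calls.append(("put", node, expression))
        self.store[node] = expression

    def closeTree(self, tree, shot):
        self.calls.append(("close", tree, shot))


def with_tree(conn, tree, shot, action):
    """Open the tree, run the action, and close the tree even on error."""
    conn.openTree(tree, shot)
    try:
        return action(conn)
    finally:
        conn.closeTree(tree, shot)


conn = RecordingConnection()
with_tree(conn, "tree_name", 1, lambda c: c.put("node_name", "42"))
value = with_tree(conn, "tree_name", 1, lambda c: c.get("node_name"))
```

Wrapping the open/close pair in one helper keeps the close call from being forgotten when the read or write raises an exception.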


MDSplus (MDS+)

ACCESSING JET DATA (workaround, since JET data is not stored natively on an MDS+ server)

• MATLAB

>> mdsconnect('mdsplus.jet.efda.org')

>> [y,status]=mdsvalue('_sig=jet("ppf/magn/ipla",40573)')

>> [x,status]=mdsvalue('dim_of(_sig)')

>> mdsdisconnect

 

• IDL

IDL> mdsconnect,'mdsplus.jet.efda.org'

IDL> y=mdsvalue('_sig=jet("ppf/magn/ipla",40573)')

IDL> x=mdsvalue('dim_of(_sig)')

IDL> plot,x,y

IDL> mdsdisconnect
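The same two-step pattern (evaluate the signal into a variable, then ask for its independent axis with `dim_of`) can be sketched in Python. The helper names `jet_expression` and `fetch_trace` are hypothetical, and `CannedConnection` is an offline stand-in with placeholder numbers; with a real connection object, only a `get(expression)` method is assumed.

```python
def jet_expression(signal_path, pulse, variable="_sig"):
    """Build the expression used above, e.g. _sig=jet("ppf/magn/ipla",40573)."""
    return f'{variable}=jet("{signal_path}",{pulse})'


def fetch_trace(conn, signal_path, pulse):
    """Return (x, y): evaluate the signal first, then its independent axis."""
    y = conn.get(jet_expression(signal_path, pulse))
    x = conn.get("dim_of(_sig)")
    return x, y


class CannedConnection:
    """Offline stand-in: answers from a dictionary instead of a server."""

    def __init__(self, answers):
        self.answers = answers

    def get(self, expression):
        return self.answers[expression]


conn = CannedConnection({
    '_sig=jet("ppf/magn/ipla",40573)': [10.0, 20.0],  # placeholders, not JET data
    "dim_of(_sig)": [0.0, 1.0],
})
x, y = fetch_trace(conn, "ppf/magn/ipla", 40573)
```

The order matters: `dim_of(_sig)` only makes sense after the first expression has assigned `_sig` on the server side.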


HDF5 http://www.hdfgroup.org/index.html

• HDF5 is a self-describing file format and library for storing scientific data.

• A versatile data model that can represent very complex data objects and a wide variety of metadata (different datatypes on the same tree) with direct access to parts of the file without parsing the entire file.

• A completely portable file format with no limit on the number or size of data objects in the collection.

• A software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces.


Universal Access Layer (UAL)

MOTIVATION

• HDF5 and MDSplus represent successful tools for a common data format and organization, thus allowing effective data sharing among different applications.

• But will these standards survive the lifespan of ITER? A more generic approach is envisaged and is being implemented within the ITM-TF.

• Consistent Physical Objects (CPO) - a generic view in trees and sub-trees of the data organization, transparent to the actual method used for data storage.

G. Manduchi et al, Fusion Engineering and Design 83, 462-466 (2008)


Universal Access Layer (UAL)

DATA STRUCTURE


Universal Access Layer (UAL)

DATA STRUCTURE: MSE CPO


PARSING THE DATA STRUCTURE

• The tree-like hierarchical structure of a CPO is defined through language-independent XML schemas. These can easily be parsed into each programming language.
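As an illustration of how an XML schema can be turned into a language-level structure, the following Python sketch walks a tiny schema with the standard library and produces a nested dictionary. The schema shown is a made-up example, not an actual ITM-TF CPO definition: the element names `mse`, `time`, `setup` and `position` are assumptions for the purpose of the sketch.

```python
import xml.etree.ElementTree as ET

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical, heavily simplified schema for illustration only.
SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="mse">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="time" type="xs:double"/>
        <xs:element name="setup">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="position" type="xs:double"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def element_to_tree(element):
    """Return the type name for a leaf, or a dict of children for a branch."""
    child_path = f"{XSD_NS}complexType/{XSD_NS}sequence/{XSD_NS}element"
    children = element.findall(child_path)
    if not children:
        return element.get("type")
    return {child.get("name"): element_to_tree(child) for child in children}


root = ET.fromstring(SCHEMA)
top = root.find(f"{XSD_NS}element")
structure = {top.get("name"): element_to_tree(top)}
```

The same walk could emit a Fortran derived type, a C struct or a Java class instead of a dictionary, which is what makes the schema a language-independent source of truth.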


Universal Access Layer (UAL)

DATA FLOW (D.COSTER)

• The multi-level UAL manages the CPO I/O between codes as a common data bus and the data retrieval (MDS+ or HDF5 stored data)


Universal Access Layer (UAL)

CPO I/O

• euitm_open(name,shot,run)

• euitm_get(path, output_structure)

– the location of the CPO is specified by the string argument “path”

– output_structure is language dependent and will hold the output data.

• euitm_put(path, input_structure)

– the location of the CPO is specified by the string argument “path”

– input_structure is language dependent and will hold the input data.
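The open/get/put pattern, with the storage method hidden behind the access layer, can be sketched in Python as follows. This is not the UAL implementation: the `AccessLayer` and `InMemoryBackend` classes, their method names and the CPO contents are hypothetical. A dictionary-backed store stands in for MDS+ or HDF5 to show that calling code does not change when the backend does.

```python
class InMemoryBackend:
    """Stand-in storage; a real backend might wrap MDS+ or HDF5 instead."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data[key]


class AccessLayer:
    """Storage-agnostic front end exposing open, put and get."""

    def __init__(self, backend):
        self._backend = backend
        self._entry = None

    def open(self, name, shot, run):
        # Select the database entry that later get/put calls refer to.
        self._entry = (name, shot, run)

    def _key(self, path):
        if self._entry is None:
            raise RuntimeError("call open(name, shot, run) first")
        return self._entry + (path,)

    def put(self, path, cpo):
        # Store a copy so later changes by the caller do not leak in.
        self._backend.write(self._key(path), dict(cpo))

    def get(self, path):
        return dict(self._backend.read(self._key(path)))


layer = AccessLayer(InMemoryBackend())
layer.open("test_db", 1, 1)
layer.put("mse", {"time": [0.0, 0.1]})   # placeholder CPO content
loaded = layer.get("mse")
```

Because the caller only ever sees `open`, `put` and `get`, replacing `InMemoryBackend` with another class offering `write` and `read` requires no change to the analysis code.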


Universal Access Layer (UAL)

ACCESSING EXPERIMENTAL DATA

Courtesy of J. Signoret and F. Imbeaux


Methodologies for data retrieval

• WHAT IS A SIGNAL?

“Any kind of data that describes a particular measurement during a discharge and contains some information about plasma properties”, e.g. time-series data, 2D/3D data, contour maps, images…

• OUTPUT PER SHOT?

Diagnostics at JET top 10 Gbytes/shot… much smaller than the values expected for ITER!

• WHAT IS MEASURED?

Physical properties manifest as patterns, with a direct parallel between the physical behaviour and the structural shapes that are generated (spikes in Dα emission during edge localised modes (ELMs), soft X-ray and ECE emission during a sawtooth (ST) crash).


Methodologies for data retrieval

Traditional approach

• Query based on shot/signal

• Manual inspection of structural shapes/features

• Very tedious and long process

 


Methodologies for data retrieval

Pattern recognition approach

• Data retrieval guided by technical and scientific criteria.

• “Pattern oriented”, in line with how people behave when they analyze data.

• Relies on the following techniques for data retrieval:

– Feature extraction

  – single entity (temporal segment inside a waveform or a set of pixels within an image)

  – compound entity (more than one segment/signal)

– Classification system (supervised/unsupervised)

– Similarity measure (metrics, proximity measures)

J. Vega et al., Fusion Engineering and Design 83, 382 (2008)
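A minimal sketch of a similarity measure for the single-entity case (a temporal segment inside a waveform) is given below. It uses a plain sliding window with a normalised Euclidean distance, chosen here only for simplicity; it is not claimed to be the method of the cited work. The function names and the toy waveform are assumptions.

```python
import math


def normalise(segment):
    """Remove the mean and scale to unit norm so only the shape matters."""
    mean = sum(segment) / len(segment)
    centred = [v - mean for v in segment]
    norm = math.sqrt(sum(v * v for v in centred))
    return [v / norm for v in centred] if norm else centred


def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(waveform, pattern):
    """Slide the pattern along the waveform; return (start index, distance)."""
    target = normalise(pattern)
    n = len(pattern)
    candidates = (
        (distance(normalise(waveform[i:i + n]), target), i)
        for i in range(len(waveform) - n + 1)
    )
    best_distance, best_start = min(candidates)
    return best_start, best_distance


# Toy data: a flat trace with one spike, queried with a taller spike shape.
waveform = [0, 0, 0, 1, 5, 1, 0, 0, 0, 0]
pattern = [2, 10, 2]
start, dist = best_match(waveform, pattern)
```

Because both segments are normalised before comparison, the query spike matches the one in the waveform despite the different amplitude: the search returns the segment starting at index 3 with a distance close to zero.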


III – Shared Data Access System (SDAS)

Why another Data Retrieval Software?

The problem

• Scientists need to access data from different laboratories;

– Each laboratory has its own way of retrieving data;

– Scientists have to spend time and effort learning how the different data access schemes work, change their analysis code for each experiment and manage updated versions for each different program and library required;

• This does not mean that every association must store and retrieve data in the same way.

• The main data index is changing from shot number to time and events, where the pulse number is just one of the most relevant events against which data is catalogued.

 


Shared Data Access System (SDAS)

Why another Data Retrieval Software?

The solution

• Hide all complexity from end-users;

• Scientists only have to learn once how to access data;

• Users do not request data directly from the association's database but from a software layer;

• The software layer provides the same data access functions in all associations;

• Data blocks are tagged against specific events which happen during the life cycle of a discharge

 


Without SDAS

With SDAS


SDAS Technology

• SDAS is based on Remote Procedure Calls (RPC);

• The SDAS server is formed by an XML-RPC server and by a connector to the storage mechanism;

• Data is indexed by time and events;

• SDAS server and libraries available in Python, Java and C++;

• Read and write support (for post-processed data);

• Supported in several data analysis programs:

– Matlab, IDL, Octave, Mathematica

• Documentation in wiki: http://cdaq.cfn.ist.utl.pt:8085/

• Currently being used at ISTTOK/PT, COMPASS/CZ and TJ-II/ES
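The split between an XML-RPC front end and a storage connector can be sketched with Python's standard `xmlrpc.client` module, used here only to encode and decode messages so that no network is needed. The method name `getData`, its arguments, the `DictConnector` class and the stored values are all hypothetical and do not describe the real SDAS interface.

```python
import xmlrpc.client


class DictConnector:
    """Stand-in for the connector to a laboratory's storage mechanism."""

    def __init__(self, records):
        self._records = records

    def get_data(self, parameter, event, event_number):
        return self._records[(parameter, event, event_number)]


def handle(request_xml, connector):
    """Decode an XML-RPC request, call the connector, encode the response."""
    params, method = xmlrpc.client.loads(request_xml)
    methods = {"getData": connector.get_data}  # hypothetical method table
    result = methods[method](*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)


# Placeholder record: the parameter name, event name and values are made up.
connector = DictConnector({("PLASMA_CURRENT", "SHOT", 1): [0.0, 1.0, 2.0]})

# Client side: the same request text is produced whatever the storage is.
request = xmlrpc.client.dumps(("PLASMA_CURRENT", "SHOT", 1), methodname="getData")
response = handle(request, connector)
(values,), _ = xmlrpc.client.loads(response)
```

Only the connector knows how the data is actually stored; a different laboratory would swap in its own connector while clients keep sending the same request.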


Data access

• SDAS libraries are easily integrated into programs such as MATLAB, Mathematica and IDL;

• SDAS provides over 20 functions which allow users to:

– Search parameters and events;

– Retrieve single and multiple data

