Evaluation of the C++ binding to the Oracle Database System

Performance Issues and Benchmarks

Dirk Geppert and Krzysztof Nienartowicz, IT/DB

CERN IT Fellow Seminar, November 20, 2002


Outline - Part II

• Motivation
• To test - what and why?
• Issues
• Results
• OCCI at CERN
• Summary & Conclusions


Motivation

• To understand the possibilities and perils of dealing with a C++ binding to an RDBMS
• Relational solution to deal with bulk data
  - Still tables, tablespaces, extents, indices, views, triggers somewhere down there…
• Server side processing
  - Indices, complex queries, stored procedures
• Impedance mismatch
  - Is OCCI the natural way to go fast?
  - Are SQL99 types efficient enough to deal with objects in Oracle?
• Objects mapped onto tables
  - Limitations, advantages: try to take the best of two worlds
  - Stay pure ((pseudo)pure C++, Java, SQL, PL/SQL, XML)?
  - Or mix (C++ <-> Java <-> SQL <-> PL/SQL <-> XML)?


What to test?

• Relational
  - OCCI relational access pattern copied from DAO/ODBC/JDBC
  - We call some form of SQL and get resultsets as output
  - No mapping of objects on the client; some objects may partake in the queries on the server though
• vs. Associative
  • Objects returned from queries as resultsets
    - Iteration over results from queries or stored procedures
    - Logic identical to Relational…
• vs. Navigational mode
  - Similar to OODB: access to a “roots” table or one root object, then by Refs
• In the real world we should consider mixing Associative with Navigational mode to get optimal performance!
  - (application dependent: rethink 100x first!)
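The difference between the associative and navigational modes can be sketched in plain C++, without any Oracle calls. This is only a toy model: `Node`, `selectIdsAbove` and `navigateIdsAbove` are hypothetical names, the `next` index stands in for a Ref, and each dereference is counted as a potential round trip. Both functions return the same objects; what differs is where the selection work happens.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a persistent object stored in a table.
struct Node {
    int id;
    double value;
    std::size_t next;  // index of the next node; stands in for a Ref
};

// Associative style: one query over the whole table returns the matching objects.
std::vector<int> selectIdsAbove(const std::vector<Node>& table, double threshold) {
    std::vector<int> ids;
    for (const Node& n : table)
        if (n.value > threshold) ids.push_back(n.id);
    return ids;
}

// Navigational style: start from a root and follow references one by one.
// Assumes the links form a cycle through `root`; otherwise the walk never ends.
std::vector<int> navigateIdsAbove(const std::vector<Node>& table, std::size_t root,
                                  double threshold, std::size_t& dereferences) {
    assert(root < table.size());
    std::vector<int> ids;
    dereferences = 0;
    std::size_t current = root;
    do {
        const Node& n = table[current];
        ++dereferences;  // each dereference may cost a round trip
        if (n.value > threshold) ids.push_back(n.id);
        current = n.next;
    } while (current != root);
    return ids;
}
```

In this model the associative query touches the client once, while the navigational walk dereferences every node, which is why preselection on the server is worth considering before navigating.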


Navigational mode architectural understanding, aims

To optimize the client side:

• Limit the number of roundtrips by:
  • Cache usage
  • Prefetching
  • Implicit pinning/unpinning of objects
• Conscious commit/flush policy

[Figure: object request flow across three layers.
 Application space: the application main loop requests an object through Ref<OurObj> or Ref<OtherObj>, after checking whether it is already in the application's memory.
 Application's library space, OCCI controlled: local cache management, OCCI/OCI driven, checks whether the OCIRef or OCIComplexObj object is in the cache (create/set/get object).
 Oracle server space: a single generated SQL statement gets the requested objects using the Oracle db engine; the requested object is returned.]

• To optimize server side performance
  • Low level layout of objects
    - Tablespaces, datafiles (manual, automatic, size of extents, striping, index separation)
    - Scope of Refs (local index addressing), REF constraints
  • Associative access vs. navigational
    - Preselection on the server with optimised SQL queries vs. navigation optimisation
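The effect of prefetching on roundtrips can be estimated with a toy cost model. The model is an assumption made for illustration, not measured OCCI behaviour: one round trip returns the requested object plus the next `prefetchDepth` objects along the chain of Refs. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Toy cost model (assumption): each round trip returns the requested object
// plus the next `prefetchDepth` objects reachable through Refs.
std::size_t roundTripsForChain(std::size_t objectCount, std::size_t prefetchDepth) {
    const std::size_t perTrip = prefetchDepth + 1;
    // Ceiling division without risking overflow.
    const std::size_t trips =
        objectCount / perTrip + (objectCount % perTrip != 0 ? 1 : 0);
    assert(trips * perTrip >= objectCount);  // every object is covered
    return trips;
}
```

Under this model, walking 1000 linked objects costs 1000 round trips without prefetching and 100 with a prefetch depth of 9.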


Test case

[Figure: UML diagram of the test schema. Table cycle_t is a table of objects of type innerNode_o. innerNode_o has the attributes n1 : Number and podEmbedded : podType_o, a Ref to innerNode_o named pInnerNode_o (1 to 1), and a varray of Number named numberArray (1 to 1). podType_o has the attribute n1 : Number.]

InnerNode is the root class for access from the table level. It uses Ref<innerNode_o> to create a cyclic singly linked list, which forms "the ring" or "the cycle" (we use the two terms interchangeably).

• Navigational access using a ring-of-nodes structure:
  • InnerNode objects with an embedded varray of numbers in a cyclic linked list
  • Easy to model worst case scenario
  • Cache size influence
  • Embedded dynamic array access
  • Non native Oracle Number type performance, casting impact
  • Prefetching effect
  • Small server side cache
• Aim:
  • To test raw IO performance within a HEP-like model in OCCI
    - Model simplified during initial test with 9.0 due to numerous bugs…

[Figure: the ring of nodes InnerNode(1), InnerNode(2), InnerNode(3), ..., InnerNode(n-1), InnerNode(n)]
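The ring structure of the test case can be mirrored in plain C++ to make the traversal pattern concrete. This is a sketch under assumptions: the struct names follow the slide (innerNode_o, podType_o), but the layout, the use of `double` for Number, and the index standing in for Ref<innerNode_o> are illustrative choices, not the OCCI-generated classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-ins for the Oracle object types of the test case (assumed layout).
struct PodType {
    double n1 = 0.0;
};

struct InnerNode {
    double n1 = 0.0;                  // Number attribute
    PodType podEmbedded;              // embedded object
    std::vector<double> numberArray;  // stands in for the embedded varray
    std::size_t next = 0;             // stands in for Ref<innerNode_o>
};

// Build a ring of `ringSize` nodes, each carrying `arraySize` numbers.
std::vector<InnerNode> buildRing(std::size_t ringSize, std::size_t arraySize) {
    assert(ringSize > 0);
    std::vector<InnerNode> ring(ringSize);
    for (std::size_t i = 0; i < ringSize; ++i) {
        ring[i].n1 = static_cast<double>(i);
        ring[i].numberArray.assign(arraySize, 1.0);
        ring[i].next = (i + 1) % ringSize;  // last node points back to the first
    }
    return ring;
}

// Follow the links from node 0 until the walk returns to node 0,
// summing every embedded array element on the way.
double traverseRing(const std::vector<InnerNode>& ring, std::size_t& visited) {
    double sum = 0.0;
    visited = 0;
    std::size_t current = 0;
    do {
        for (double v : ring[current].numberArray) sum += v;
        ++visited;
        current = ring[current].next;
    } while (current != 0);
    return sum;
}
```

Varying `ringSize` and `arraySize` corresponds to the two size parameters of the benchmark: the number of root objects and the size of the embedded array.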


Cache impact

• Cache size:
  • Impacts how many objects can stay in memory without a roundtrip to the server
  • Issue: embedded objects do not count!
    - Problem with memory usage if embedded objects tend to be big: the application may occupy much more memory than wanted
    - Calculate cache size for “root” objects or use REFs instead
• Cache size calculation
  • Max cache size, optimal cache size (8MB)
    - Max size: how many objects may stay resident until they are flushed to the server
    - Optimal size: how much memory for objects should remain after hitting the max limit

[Figure: cache managed memory holding InnerNode(1), InnerNode(2), InnerNode(3), ..., InnerNode(n-1), InnerNode(n), with the optimal size and the maximal size marked]
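The interplay of maximal and optimal size can be illustrated with a toy cache. It is a simplified model and not the OCCI object cache: objects are admitted until the total exceeds `maxBytes`, then the least recently used objects are evicted until the total is at or below `optBytes`. The class and function names, and the LRU policy itself, are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Toy object cache (illustration only): trims down to `optBytes`
// whenever the resident size exceeds `maxBytes`.
class ToyObjectCache {
public:
    ToyObjectCache(std::size_t maxBytes, std::size_t optBytes)
        : maxBytes_(maxBytes), optBytes_(optBytes) {
        assert(optBytes_ <= maxBytes_);
    }

    // Returns true on a cache hit, false when a simulated round trip was needed.
    bool access(std::size_t id, std::size_t bytes) {
        auto it = index_.find(id);
        if (it != index_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recent
            return true;
        }
        ++roundTrips_;
        lru_.push_front({id, bytes});
        index_[id] = lru_.begin();
        used_ += bytes;
        if (used_ > maxBytes_) {
            // Evict least recently used objects, always keeping the newest one.
            while (used_ > optBytes_ && lru_.size() > 1) {
                used_ -= lru_.back().bytes;
                index_.erase(lru_.back().id);
                lru_.pop_back();
            }
        }
        return false;
    }

    std::size_t roundTrips() const { return roundTrips_; }

private:
    struct Entry { std::size_t id; std::size_t bytes; };
    std::list<Entry> lru_;
    std::unordered_map<std::size_t, std::list<Entry>::iterator> index_;
    std::size_t maxBytes_, optBytes_, used_ = 0, roundTrips_ = 0;
};

// Walk a ring of equally sized objects `laps` times and report the round trips.
std::size_t roundTripsForRing(ToyObjectCache& cache, std::size_t ringSize,
                              std::size_t objectBytes, std::size_t laps) {
    for (std::size_t lap = 0; lap < laps; ++lap)
        for (std::size_t id = 0; id < ringSize; ++id)
            cache.access(id, objectBytes);
    return cache.roundTrips();
}
```

In this model a ring that fits under the maximal size is fetched once and then served from memory, while a ring larger than the cache misses on every access of every lap, which is the worst case a cyclic traversal makes easy to reproduce.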


Lost memories…

• Since only cache objects are garbage collected, deallocation of embedded objects is the user's responsibility!

• Lack of custom deallocation code may cause huge leaks!

• Always bear in mind that cache_size != memory occupied

[Figure: two panels, each showing InnerNode objects in cache managed memory, with their varrays and an embedded object (EmbObj) allocated on the heap outside the cache. The first panel lists InnerNode(1), InnerNode(3), ..., InnerNode(n-1), InnerNode(n); the second also lists InnerNode(2).]

One of the memory deallocation problems in version 9.0

Fixed in 9.2…
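The rule cache_size != memory occupied can be put into numbers with a back-of-the-envelope estimate. The function below is a hypothetical helper, and the per-object and per-element sizes passed to it are assumptions the caller must supply; it only separates what the cache accounts for (root objects) from what lives on the heap (embedded arrays).

```cpp
#include <cassert>
#include <cstddef>

struct FootprintEstimate {
    std::size_t cacheBytes;  // root objects, the part the cache accounts for
    std::size_t heapBytes;   // embedded arrays outside the cache accounting
    std::size_t totalBytes;  // what the process actually occupies in this model
};

// Estimate memory for `residentObjects` root objects, each with an embedded
// array of `arrayElements` elements. All sizes are caller-supplied assumptions.
FootprintEstimate estimateFootprint(std::size_t residentObjects,
                                    std::size_t rootObjectBytes,
                                    std::size_t arrayElements,
                                    std::size_t bytesPerElement) {
    FootprintEstimate e{};
    e.cacheBytes = residentObjects * rootObjectBytes;
    e.heapBytes = residentObjects * arrayElements * bytesPerElement;
    e.totalBytes = e.cacheBytes + e.heapBytes;
    assert(e.totalBytes >= e.cacheBytes);  // guards against wrap-around
    return e;
}
```

For example, 1000 resident root objects of 64 bytes, each with 5000 embedded elements of 8 bytes, give 64 000 bytes counted by the cache against roughly 40 MB actually occupied.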


Results, approach

• Environment
  • Server
    - Solaris 2.8, with Oracle 9.2.0.0 0 – Beta.
    - Sun 280R with 2x750MHz CPUs, 1GB RAM.
    - Storage: EMC CLARiiON FC4500 system, RAID5, with 2 storage processors, 512 MB of cache each. 14 disks.
  • Used 4 disks for striped tablespace.
    - Raw OS throughput is roughly 50MB/s for writing and 33MB/s for reading.
  • 1 to 9 clients running on 1 to 3 sibling Suns.


Results, approach, numbers

• Test methodology
  • Tests conducted for various settings of
    - Maximum cache size: 5%, 15%, 30%, 50%, 75%, over 100% of all resident objects’ size
    - Size of ring between 10 and 100K
    - Size of embedded number array between 10 and 5k
  • Maximal size of objects per test case: ~25 million
  • Repeated 5-10 times per test case on average
  • All in all hundreds of billions of object creations/traversals/deletions, some days of test runs
• Results
  • Gathered in the database, tens of queries, tens of multidimensional plots
    - http://knienart.home.cern.ch/knienart/OCCIResults/OCCITests9.2.mdb


Versions speed comparison, figure


Server’s vs. client’s side processing, fig.


Cache speed, fig.


Write speed, fig


Read speed, fig


Read to write ratio, fig.


Cache size consideration…


Worlds in collision…

• Very often one must know the database configuration/limitations to avoid bottlenecks while trying to optimise

• Example:
  • Short vs. long objects
  • If size of Embedd_object < max_small, save in the same block/extent; else save in a separate overflow area

• Optimisation turns out to be a decelerator (for write; an accelerator for read… see the Read to write ratio fig.)

Hint: Always differentiate between the access patterns of REFs and embedded objects…

[Figure: two storage layouts. First: InnerNode(1), InnerNode(2), InnerNode(3), ..., InnerNode(n-1), InnerNode(n) stored together with their varrays in the table's contiguous row space. Second: the same nodes in the table's contiguous row space, with the varrays stored in additional table space (overflow extent).]
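The short vs. long objects rule can be written down as a small decision function. This is a sketch of the rule as stated on the slide, not of Oracle's storage engine; `maxSmallBytes` mirrors the slide's max_small, and its actual value is an assumption the caller must supply.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Placement { InRow, Overflow };

// Slide rule: embedded objects smaller than the threshold stay in the same
// block/extent as the row; everything else goes to a separate overflow area.
Placement placeEmbedded(std::size_t embeddedBytes, std::size_t maxSmallBytes) {
    return embeddedBytes < maxSmallBytes ? Placement::InRow : Placement::Overflow;
}

// Count how many embedded objects of the given sizes would land in overflow.
std::size_t countOverflow(const std::vector<std::size_t>& embeddedSizes,
                          std::size_t maxSmallBytes) {
    assert(maxSmallBytes > 0);
    std::size_t overflow = 0;
    for (std::size_t bytes : embeddedSizes)
        if (placeEmbedded(bytes, maxSmallBytes) == Placement::Overflow) ++overflow;
    return overflow;
}
```

Running the embedded array sizes of a test case through such a check shows where a benchmark crosses the threshold, which helps explain a sudden change in the read to write ratio.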


CERN Applied OCCI

• Compass migration
  • Bulk inserts using OCCI relational mode
  • Framework of dozens of distributed OCCI clients managed by a single manager, hot statistics, framework state in DB, using MPI for management/coordination
    - OCCI pros: exception handling, easily embedded SQL, server side processing, easy persistency layer separation for experiments
  • 5 billion rows…

• Conditions DB
  • Transparent OCCI access without changes to users’ API

• Linux RAC


Some Linux RAC results…

• Interesting study of how distributed application performance depends on the underlying hardware:
  • No network separation for RAC messages and data flow
  • Single shared drive pool for multiple RAC nodes
  • File block level coherency thanks to Sistina distributed FS
  • 20-150 clients running from 10 nodes using MPI synchronization for controlled DB stressing
  • Stress put on ingesting: inserted up to 200*10^6 records with BLOBs a few KB in size
• Results:
  - Usually 5-8 times slower than a single instance (stable though: no significant degradation when the number of client processes increased from e.g. 20 to 50)


Performance related work

• OCCI could be used as a convenient front end for NTUPLE analysis

• NTUPLE analysis using SQL
  • Multidimensional query optimisation
    - Using bitmap indices, function indices, materialized views, etc.
  • Comparison with Root
    - Automatic creation of schema for Oracle (evolution of software written for Alpha++)
    - Automatic import of Root NTuples into Oracle
    - Root speed tests
    - Database tests, refer to:
      - http://knienart.home.cern.ch/R2O/


Summary

• OCCI is quite effective if general rules are obeyed:
  • Use cache carefully
  • Be conscious of what happens with memory
  • Understand server side clues
  • Preselect on the server if you can
  • Server side speed can be achieved if we do not kill performance with roundtrips
  • Physical DB design is taken into account during app analysis/design
  • Bound to Oracle only, no source code available whatsoever…
  • Is C++ the future to keep objects anyway? B&V say Java…

• Benchmarks
  • Were good for learning Oracle internals
  • And for optimisation of performance
  • And for finding pitfalls while designing an OCCI application

• Learnt:
  • There is almost always a way to optimise by tuning the server without changes in the OCCI code
  • Raw performance worse but comparable to OODB, huge speedups possible using old nifty RDB techniques (indices etc.)
  • Speed can change dramatically even with a minor version change!
  • Oracle types are not C++ types: OCCI conversion is pretty expensive. 10i?


Tools:
• mpatrol
• strace
• Inescapable for the first releases of OCCI…

