MGRID
Implementing MGRID
CHAR(11), Cambridge, July 12, 2011
Portavita
• Chronic disease management
• Largest online electronic health record (EHR) in the Netherlands
• Largest telemedicine project in Europe
Portavita’s growth
[Figure: number of patients in the Netherlands (axis 0 to 200,000), June 2002 to March 2011; series: Anticoagulation, Diabetes, COPD, CVRM, Total]
Scaling up
• Orientation period of several months, talked with VARs
• Took three months to implement
Cost of scaling up
Benefits of scaling out
• parallelize tuple streams for increased speed
• eliminate single-server memory bandwidth, processor and I/O bottlenecks
• control query latency with shard size
• machines can be each other's replicas
• no negative economy of scale
The MGRID Solution
• Make parallel PostgreSQL
  • that can scale out
  • that has built-in redundancy
  • that allows online adding of hardware
  • that supports all features of core PostgreSQL (ACID, stored procedures, etc.)
• That supports medical data
  • ISO-21090 Healthcare Datatypes
Healthcare Datatypes
Physical Quantities: example
create table patient (name text, height pq, weight pq);
CREATE TABLE
insert into patient values
  ('Jack', '1.92 m', '92 kg')
 ,('Julia', '150 cm', '50 kg')
 ,('John', '188 cm', '84.3 kg')
 ,('Luke', '78 cm', '11800 g');
INSERT 0 4
create or replace function bmi(height pq, weight pq)
returns pq
as $$
  select convert($2, 'kg') / convert($1, 'm')^2;
$$ language sql immutable;
CREATE FUNCTION
select *, bmi(height, weight) from patient where height > '1.70 m'
order by weight;
 name | height | weight  |            bmi
------+--------+---------+---------------------------
 John | 188 cm | 84.3 kg | 23.8512901765504753 kg/m2
 Jack | 1.92 m | 92 kg   | 24.9565972222222222 kg/m2
(2 rows)
/* PQ contains most units used in science and engineering and can be used
 * outside the medical vertical. E.g. what is the mean travel time of light
 * from the sun to the earth?
 */
select convert(pq '1 AU' / '[c]', 's');
        convert
------------------------
 499.0047838061356433 s
(1 row)
Physical Quantities
• PQs are used to document observations
• Based on the Unified Code for Units of Measure
  • 294 units, including units from SI, ISO 1000, ISO 2955, ANSI X3.50, CGS, and unified U.S. & British Imperial units
• Operations supported:
  • Comparison: <, > and friends
  • Arithmetic: +, −, /, ∗, power
  • Aggregation: min, max, avg, sum, var, stddev
• Indexable
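The unit handling listed above can be illustrated outside the database. The following is a minimal Python sketch, not MGRID's implementation: the `PQ` class, the `TO_BASE` table and the helper `tall_patients_by_weight` are hypothetical names, and only the four units from the patient example are covered. It mirrors the BMI query by converting to kg and m before doing arithmetic and comparison.

```python
from dataclasses import dataclass

# Hypothetical conversion table (assumption): unit -> (dimension, factor to base unit).
TO_BASE = {
    "m": ("length", 1.0),
    "cm": ("length", 0.01),
    "kg": ("mass", 1.0),
    "g": ("mass", 0.001),
}


@dataclass(frozen=True)
class PQ:
    """A toy physical quantity: a number plus a unit string."""
    value: float
    unit: str

    def convert(self, target: str) -> "PQ":
        """Return the same quantity expressed in the target unit."""
        dimension, factor = TO_BASE[self.unit]
        target_dimension, target_factor = TO_BASE[target]
        if dimension != target_dimension:
            raise ValueError(f"cannot convert {self.unit} to {target}")
        return PQ(self.value * factor / target_factor, target)


def bmi(height: PQ, weight: PQ) -> float:
    """Body mass index in kg/m2, whatever units the inputs use."""
    return weight.convert("kg").value / height.convert("m").value ** 2


# Same rows as the patient table in the example slide.
PATIENTS = [
    ("Jack", PQ(1.92, "m"), PQ(92, "kg")),
    ("Julia", PQ(150, "cm"), PQ(50, "kg")),
    ("John", PQ(188, "cm"), PQ(84.3, "kg")),
    ("Luke", PQ(78, "cm"), PQ(11800, "g")),
]


def tall_patients_by_weight(min_height: PQ):
    """Patients taller than min_height, ordered by weight, with their BMI."""
    limit = min_height.convert("m").value
    rows = [p for p in PATIENTS if p[1].convert("m").value > limit]
    rows.sort(key=lambda p: p[2].convert("kg").value)
    return [(name, bmi(height, weight)) for name, height, weight in rows]


if __name__ == "__main__":
    for name, value in tall_patients_by_weight(PQ(1.70, "m")):
        print(name, round(value, 2), "kg/m2")
```

Comparing and sorting on converted values is what lets '188 cm' and '1.92 m' be filtered and ordered consistently even though they are stored in different units.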
Intervals and sets of points in time: example
select canonical(ivl_ts '[2004;2005[' + ivl_ts '[2005;2006[') as plus,
       canonical(ivl_ts '[2002;2010]' - ivl_ts '[2004;2005]') as minus;
    plus     |          minus
-------------+-------------------------
 [2004;2006[ | [2002;2004[;]2005;2010]
(1 row)
create table medication (name text, effectivetime ivl_ts);
insert into medication values ('Pete', '[20100316;20100514]'),
                              ('Pete', '[20100420;20100701]'),
                              ('Pete', '[20101220;20110119]'),
                              ('John', '[20100516;20100614]'),
                              ('John', '[20100620;20100801]'),
                              ('John', '[20101220;20110119]');
INSERT 0 6
select * from medication where effectivetime @> '20100620';
 name |    effectivetime
------+---------------------
 Pete | [20100420;20100701]
 John | [20100620;20100801]
(2 rows)
select name, canonical('2010' - sum(effectivetime)) as nomeds
from medication group by name;
 name |                           nomeds
------+-------------------------------------------------------------
 John | [20100101;20100516[;]20100614;20100620[;]20100801;20101220[
 Pete | [20100101;20100316[;]20100701;20101220[
(2 rows)
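The "nomeds" query subtracts the union of a patient's medication intervals from the year 2010. A minimal Python sketch of that set arithmetic follows. It is an illustration only: `merge` and `gaps` are hypothetical helpers, intervals are plain (start, end) date pairs, the open/closed boundary flags of ivl_ts are ignored, and the year 2010 is assumed to mean 1 January 2010 up to 1 January 2011.

```python
from datetime import date


def merge(intervals):
    """Merge overlapping (start, end) pairs into a sorted list of disjoint pairs."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def gaps(window, intervals):
    """Return the parts of window that no interval covers."""
    low, high = window
    result = []
    cursor = low
    for start, end in merge(intervals):
        if end < low or start > high:
            continue  # entirely outside the window
        if start > cursor:
            result.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < high:
        result.append((cursor, high))
    return result


YEAR_2010 = (date(2010, 1, 1), date(2011, 1, 1))  # assumed meaning of '2010'

# Same rows as the medication table in the example slide.
MEDICATION = {
    "Pete": [
        (date(2010, 3, 16), date(2010, 5, 14)),
        (date(2010, 4, 20), date(2010, 7, 1)),
        (date(2010, 12, 20), date(2011, 1, 19)),
    ],
    "John": [
        (date(2010, 5, 16), date(2010, 6, 14)),
        (date(2010, 6, 20), date(2010, 8, 1)),
        (date(2010, 12, 20), date(2011, 1, 19)),
    ],
}

if __name__ == "__main__":
    for name, intervals in sorted(MEDICATION.items()):
        print(name, gaps(YEAR_2010, intervals))
```

For Pete the two overlapping spring prescriptions merge into one covered period, which is why only two gaps remain.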
Coded values
• Controlled vocabularies in medical informatics
  • record information unambiguously
  • allow machine reasoning
• HL7v3 Coded value implementation
• Support for a large number of code systems:
  • Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT)
  • HL7v3 vocabularies, all editions
  • Logical Observation Identifiers Names and Codes (LOINC)
  • you can add your own
• Supports
  • hierarchical code systems
  • code system versioning
• Indexable
Coded values: example
select name, code(disorder), codesystemname(disorder),
       displayname(disorder) from observation;
  name  |   code    | codesystemname |     displayname
--------+-----------+----------------+---------------------
 Willem | 71620000  | SNOMED-CT      | Fracture of femur
 Yeb    | 66308002  | SNOMED-CT      | Fracture of humerus
 Henk   | 262994004 | SNOMED-CT      | Leg sprain
(3 rows)
select name, displayname(disorder) from observation
where disorder << '284003005|Fracture of bone'::cv('SNOMED-CT');
  name  |     displayname
--------+---------------------
 Willem | Fracture of femur
 Yeb    | Fracture of humerus
(2 rows)
select name, displayname(disorder) from observation
where disorder << '127279002|Injury of lower extremity'::cv('SNOMED-CT');
  name  |    displayname
--------+-------------------
 Willem | Fracture of femur
 Henk   | Leg sprain
(2 rows)
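The `<<` queries rely on the code system being hierarchical: a code matches when the requested concept is reachable by walking up its parents. The Python sketch below shows that walk under stated assumptions. The `PARENTS` map is a toy hierarchy built only from the three rows and two queries shown here, with direct edges where real SNOMED CT has intermediate concepts; `implies` and `patients_with` are hypothetical names, and treating a code as matching itself is an assumption about `<<`.

```python
# Toy hierarchy (assumption): child code -> set of parent codes.
PARENTS = {
    "71620000": {"284003005", "127279002"},  # Fracture of femur
    "66308002": {"284003005"},               # Fracture of humerus
    "262994004": {"127279002"},              # Leg sprain
}

# Same rows as the observation table in the example slide.
OBSERVATIONS = [
    ("Willem", "71620000"),
    ("Yeb", "66308002"),
    ("Henk", "262994004"),
]


def implies(code: str, ancestor: str) -> bool:
    """True if ancestor is the code itself or reachable through its parents."""
    stack = [code]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(PARENTS.get(current, ()))
    return False


def patients_with(ancestor: str):
    """Names of patients whose disorder falls under the given concept."""
    return [name for name, code in OBSERVATIONS if implies(code, ancestor)]


if __name__ == "__main__":
    print(patients_with("284003005"))  # Fracture of bone
    print(patients_with("127279002"))  # Injury of lower extremity
```

Because a concept may have several parents, the femur fracture appears under both query concepts, as in the SQL results.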
Parallel Processing
The Idea – a sketch
[Figure: sketch contrasting serial processing with parallel processing]
From serial queries to parallel queries
[Figure: relay and cells. Labels: Queries / Results using PostgreSQL API; Partitioned Queries; Partitioned Results]
• layout defines distribution
  • of tables
  • on cells
  • via attributes
  • using a degree of parallelism (dop)
• relay grid gateway
  • provides a standard PostgreSQL interface for clients
  • plans distributed queries
  • combines grid results
• cells hold partitioned data
• redundancy group
  • one complete copy of the data
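The roles of layout, relay and cells can be sketched in a few lines of Python. This is an illustration under loud assumptions, not MGRID's planner: rows are placed with a simple modulo on the distribution key (the deck does not say which distribution function is used), cells are in-memory dictionaries, and `cell_for`, `point_select` and `grid_avg` are hypothetical names. It shows a point query going to a single cell and an aggregate being computed per cell and combined by the relay.

```python
DOP = 4  # degree of parallelism: number of cells holding partitioned data


def cell_for(key: int, dop: int = DOP) -> int:
    """Pick the cell for a distribution key (modulo placement is an assumption)."""
    return key % dop


def build_cells(dop: int = DOP):
    """Distribute 100 toy account rows (aid -> abalance) over the cells."""
    cells = [dict() for _ in range(dop)]
    for aid in range(1, 101):
        cells[cell_for(aid, dop)][aid] = aid * 10
    return cells


CELLS = build_cells()


def point_select(aid: int):
    """Relay routes a query on the distribution key to exactly one cell."""
    return CELLS[cell_for(aid)].get(aid)


def grid_avg() -> float:
    """Relay sends the aggregate to every cell and combines partial results."""
    partials = [(sum(cell.values()), len(cell)) for cell in CELLS]
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


if __name__ == "__main__":
    print(point_select(42))
    print(grid_avg())
```

Averages cannot be combined by averaging the per-cell averages in general, so the sketch ships sums and counts back to the relay instead.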
Performance
Test platform
• Simple grid
  • one location
  • one redundancy group
  • up to 10 hosts
• Each host:
  • AMD X3 720
  • 16GB PC6400 DDR2
  • 3x WD RE3 250GB SATA
  • XFS, barrier = off
  • 1Gb network
Test database
• Consider the pgbench ERD:
  pgbench_accounts (aid, bid, abalance, filler)
  pgbench_branches (bid, bbalance, filler)
  pgbench_history (tid, bid, aid, delta, mtime, filler)
  pgbench_tellers (tid, bid, tbalance, filler)
• With a layout “per_account”
  • distribution key accounts.aid and history.aid
Select test
• Determine read-only / select speed
• Query
SELECT abalance, filler FROM pgbench_accounts WHERE aid = :aid;
• Platform configuration
  • single host or
  • layout per_account, dop ∈ [2, 4, 9] and #hosts = dop
• pgbench configuration:
  • #clients ∈ [8, 16, 32, 64, 96]
  • scale_factor ∈ [100, 200, 400, 800, 1300, 1800]
Select results
TPC-B test
• Determine “TPC-B” transaction speed
• Query
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
  VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
• Platform and test configuration as before
TPC-B results
Mixed load test
• Determine results for a Portavita-like mixed load
• Query mix:
  • select: 90% of the time
  • tpc-b: 9.9% of the time
  • complex: 0.1% of the time
• Where complex is
SELECT a.bid, avg(abalance) AS b FROM pgbench_accounts a
WHERE a.bid = :bid GROUP BY a.bid ORDER BY a.bid;
• Platform configuration as before
• pgbench configuration:
  • #clients = 16
  • scale_factor ∈ [200, 400, 800, 1800]
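The 90% / 9.9% / 0.1% mix can be expressed as integer weights per thousand transactions. The Python sketch below is a hypothetical driver fragment, not the tooling used for these tests: `MIX` and `pick` are illustrative names, and the ticket number stands in for a random draw so that the resulting proportions are easy to check.

```python
from collections import Counter

# Query mix from the slide, as weights per 1000 transactions.
MIX = [("select", 900), ("tpc-b", 99), ("complex", 1)]


def pick(ticket: int) -> str:
    """Map a ticket number to a query type using cumulative weights."""
    slot = ticket % 1000
    upper = 0
    for name, weight in MIX:
        upper += weight
        if slot < upper:
            return name
    raise AssertionError("weights must sum to 1000")


if __name__ == "__main__":
    # A real driver would draw the ticket with random.randrange(1000).
    print(Counter(pick(ticket) for ticket in range(1000)))
```

With only one complex query per thousand transactions, long runs are needed before the complex-query latencies become statistically meaningful.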
Mixed load latencies, 1 server
[Figure: four latency panels: single, scale 200; single, scale 400; single, scale 800; single, scale 1800]
Mixed load latencies, grid, constant shard size
[Figure: four latency panels: single, scale 200; grid2, scale 400; grid4, scale 800; grid9, scale 1800]
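"Constant shard size" here means that the scale factor grows in step with the number of cells. A short Python sketch of the arithmetic follows, assuming a single host counts as one cell and using pgbench's standard 100,000 account rows per scale-factor unit; the names `CONFIGS`, `shard_scale` and `accounts_per_cell` are illustrative.

```python
ACCOUNTS_PER_SCALE_UNIT = 100_000  # pgbench_accounts rows per scale-factor unit

# (label, number of cells, pgbench scale factor) for the four panels.
CONFIGS = [
    ("single", 1, 200),
    ("grid2", 2, 400),
    ("grid4", 4, 800),
    ("grid9", 9, 1800),
]


def shard_scale(scale: int, cells: int) -> float:
    """Scale factor held by each cell."""
    return scale / cells


def accounts_per_cell(scale: int, cells: int) -> int:
    """Approximate number of account rows held by each cell."""
    return scale * ACCOUNTS_PER_SCALE_UNIT // cells


if __name__ == "__main__":
    for label, cells, scale in CONFIGS:
        print(label, shard_scale(scale, cells), accounts_per_cell(scale, cells))
```

Every configuration works out to a per-cell scale of 200, so each cell scans about as much data for a complex query as the single server does at scale 200.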
Back to Portavita
Deploy with redundancy
[Figure: deployment across two locations; each redundancy group consists of hosts behind two switches. Labeled networks: private data network; VPN; application domain, load balancer; public / application data network; WAN; admin network]
Conclusions
• Parallel PostgreSQL is a solution for mixed OLTP / OLAP use cases, provided your data is partitionable
• Control complex query response time with shard size
• Using Healthcare Datatypes as UDTs (instead of an ORM) increases developer productivity
Questions
Contact
MGRID
ir. Willem Dijkstra, Partner
www.mgrid.net
T +31 886 474 302
F +31 886 474 301
M +31 611 144 118
Oostenburgervoorstraat 100
1018 MR Amsterdam
PO BOX 1287
1000 BG Amsterdam
The Netherlands
MGRID
ir. Yeb Havinga, Partner
www.mgrid.net
T +31 886 474 303
F +31 886 474 301
M +31 652 523 546
Oostenburgervoorstraat 100
1018 MR Amsterdam
PO BOX 1287
1000 BG Amsterdam
The Netherlands
Backup slides
HL7v3 reference information model
Source: Grahame Grieve
Intervals and sets of points in time
• Point in time is relevant to every query
• HL7v3 Point in Time implementation
• Operations supported:
  • Comparison: overlaps, contains
  • Arithmetic: +, −, intersect
  • Aggregation: sum
  • Construction: intervalafter and friends
• Indexable
Mixed load results
Mixed load results – zoom
Mixed load test 2
• Determine results for a Portavita-like mixed load
• Query mix:
  • select: 90% of the time
  • tpc-b: 9.9% of the time
  • complex: 0.1% of the time
• Where complex is
SELECT h.tid AS teller, SUM(delta), a.aid AS account, AVG(abalance)
FROM pgbench_history h
JOIN pgbench_accounts a ON a.aid = h.aid
WHERE h.bid = :bid GROUP BY h.tid, a.aid ORDER BY h.tid;
• Platform and test configuration as before
Mixed load results 2