
Slice: OpenJPA for Distributed Persistence

Date post: 19-May-2015
Upload: pinaki-poddar
Description:
Slice : Distributed Persistence for JPA
Transcript
Page 1: Slice: OpenJPA for Distributed Persistence

© 2009 IBM Corporation. Conference materials may not be reproduced in whole or in part without the prior written permission of IBM.

WebSphere Services Technical Virtual Conferences: world class skill building and technical enablement

Scale Your JPA Applications with Distributed Database Partitions

Session Number: D05

Dr. Pinaki Poddar, [email protected]

Page 2: Slice: OpenJPA for Distributed Persistence

Application Integration & Middleware

© 2009 IBM Corporation. Pinaki Poddar, SWG/IBM

Overview

Brief tour of JPA

– Design-time Features

– Runtime Behavior

Role of JPA in JEE

– Scalability

Horizontal Distributed Data Partition as a scaling strategy

Slice: JPA for Distributed, Partitioned Databases

– Using Slice

– Under the hood

– Future work

Q & A

Page 3: Slice: OpenJPA for Distributed Persistence

JPA uses POJO for Domain Model

Annotate as @Entity
Define persistent identity
Annotate relational mapping
Use full power of Java
– No interface to implement
– No class to inherit from
– Use Collection, List, Set, Map
– Use generics
Convention over configuration
– Implied database naming
– Implied persistence property

    @Entity
    public class Customer {
        @Id
        private long id;
        private String name;
        @OneToMany(mappedBy="customer")
        private List<Order> orders;
    }

    @Entity
    public class Order {
        @Id
        private long id;
        @ManyToOne
        private Customer customer;
    }

Page 4: Slice: OpenJPA for Distributed Persistence

Persistence Unit

• Set of persistent classes
• Mapping metadata
• Database & other configurations

Artifacts shown on the slide: *.class, META-INF/persistence.xml, orm.xml

Page 5: Slice: OpenJPA for Distributed Persistence

How to obtain a Persistence Unit?

Instantiate via bootstrap

    EntityManagerFactory emf = Persistence.createEntityManagerFactory("test");

Look up in JNDI

    InitialContext ctx = new InitialContext();
    EntityManagerFactory emf = (EntityManagerFactory) ctx.lookup("myEMF");

Inject as a resource

    @PersistenceUnit(unitName="test")
    private EntityManagerFactory emf;

WARNING: EntityManagerFactory construction is costly

Page 6: Slice: OpenJPA for Distributed Persistence

Persistence Context

• Session/Transaction
• Cache of managed instances
• Persistent operations: find(), persist(), merge(), remove(), refresh(), createQuery(), …

Page 7: Slice: OpenJPA for Distributed Persistence

How to obtain a Persistence Context?

Inject as a resource

    @PersistenceContext
    private EntityManager em;

Construct from Persistence Unit

    EntityManager em = emf.createEntityManager();

Page 8: Slice: OpenJPA for Distributed Persistence

Persistence Context manages instances in a group

One Persistence Unit, two Persistence Contexts:

    em1 = emf.createEntityManager();
    em2 = emf.createEntityManager();

    Account pc1 = em1.find(Account.class, 1245);   // first Persistence Context
    Account pc2 = em2.find(Account.class, 1245);   // second Persistence Context

SQL shown on the slide:

    SELECT ID,NAME FROM ACCOUNT t WHERE t.ID=1245

Account table:

ID    NAME  AMOUNT
2347  John  $ 12000.57
1245  Mary  $ 34568.89

Page 9: Slice: OpenJPA for Distributed Persistence

A Question about Identity

Important questions

    pc1 == pc2 ?
    pc1.equals(pc2) ?

Java supports two identities
• Reference-based identity
• Value-based identity

JPA adds another identity
• Persistence-based identity

• Persistence Identity defines uniqueness within a Persistence Context
• An instance is managed by one and only one Persistence Context at a time
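
The three identities can be tried out without a JPA provider. The following is a minimal sketch that assumes a toy ToyContext class (hypothetical, not a JPA type) that keeps at most one instance per id, and an Account whose equals() compares ids. A real entity class may define equals() differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-ins (hypothetical, not JPA classes) to illustrate the three identities. */
final class IdentityDemo {

    /** An object whose equals() is based on its persistent id (an assumption of this sketch). */
    static final class Account {
        final long id;
        Account(long id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Account && ((Account) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    /** A toy persistence context: at most one managed instance per persistent id. */
    static final class ToyContext {
        private final Map<Long, Account> managed = new HashMap<>();
        Account find(long id) {
            // A real provider would load the row here; this sketch only creates the instance.
            return managed.computeIfAbsent(id, Account::new);
        }
    }

    public static void main(String[] args) {
        ToyContext em1 = new ToyContext();
        ToyContext em2 = new ToyContext();
        Account pc1 = em1.find(1245L);
        Account pc2 = em2.find(1245L);
        System.out.println(pc1 == pc2);              // false: two contexts, two references
        System.out.println(pc1.equals(pc2));         // true: same id under this equals()
        System.out.println(pc1 == em1.find(1245L));  // true: one instance per id within a context
    }
}
```

Within one context the same id always yields the same reference, which is the uniqueness that persistence identity provides.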

Page 10: Slice: OpenJPA for Distributed Persistence

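Life of Persistence Context

    em = emf.createEntityManager();
    begin(); commit(); flush(); clear();
    begin(); commit();
    close();

Scopes shown along the time axis:
– Extended Persistence Context: lasts from createEntityManager() until close()
– Transactional Persistence Context: lasts from begin() until commit()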

Page 11: Slice: OpenJPA for Distributed Persistence

Query executes in a Persistence Context

Persistence Context is factory for Query

Query is expressed in JPQL (Java Persistence Query Language)

Selected instances are added to the persistence context

Selected instances are returned in a List

    EntityManager em = …;
    String jpql = "SELECT p FROM Person p WHERE p.name=:name";
    List result = em.createQuery(jpql)
                    .setParameter("name", "John")
                    .getResultList();

Page 12: Slice: OpenJPA for Distributed Persistence

Transaction Scaling in JPA

    em = emf.createEntityManager()
    begin() commit() flush() clear()
    begin() commit()
    close()

Scopes shown along the time axis:
– Extended Persistence Context
– Transaction-scoped Persistence Context
– L2 Data Cache
– Database Transaction

Page 13: Slice: OpenJPA for Distributed Persistence

Optimistic Versioning scales transaction

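Database row at the start: ID 1234, QTY 5, VERSION 56

Two concurrent transactions (the numbered steps of the slide animation are not recoverable from the transcript):

    // Transaction 1
    begin();
    pc1 = find(Item.class, 1234);   // Item:1234 qty:5 version:56
    pc1.setQty(30);                 // Item:1234 qty:30 version:56
    commit();

    // Transaction 2
    begin();
    pc2 = find(Item.class, 1234);   // Item:1234 qty:5 version:56
    pc2.setQty(87);                 // Item:1234 qty:87 version:56
    commit();

First commit:

    UPDATE ITEM SET QTY=30, VERSION=57 WHERE ID=1234 AND VERSION=56

Database row afterwards: ID 1234, QTY 30, VERSION 57

Second commit:

    UPDATE ITEM SET QTY=87, VERSION=57 WHERE ID=1234 AND VERSION=56

No row has VERSION=56 any more, so the second commit fails with OptimisticException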
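
The version check can be mimicked in memory. This is a minimal sketch, not OpenJPA code: ItemRow and VersionedStore are hypothetical names, and the update() method stands in for the UPDATE ... WHERE VERSION=? statement by applying the change only when the version that was read is still current.

```java
import java.util.concurrent.atomic.AtomicReference;

/** In-memory sketch of a version check; ItemRow and VersionedStore are hypothetical names. */
final class OptimisticDemo {

    /** Immutable snapshot of one ITEM row. */
    static final class ItemRow {
        final int qty;
        final int version;
        ItemRow(int qty, int version) { this.qty = qty; this.version = version; }
    }

    /** Holds a single "database row" and applies an update only if the version matches. */
    static final class VersionedStore {
        private final AtomicReference<ItemRow> row = new AtomicReference<>(new ItemRow(5, 56));

        ItemRow read() { return row.get(); }

        /** Mimics UPDATE ... SET QTY=?, VERSION=v+1 WHERE VERSION=v; false means no row matched. */
        boolean update(ItemRow readSnapshot, int newQty) {
            ItemRow current = row.get();
            if (current.version != readSnapshot.version) {
                return false; // stale read: someone else committed first
            }
            return row.compareAndSet(current, new ItemRow(newQty, readSnapshot.version + 1));
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        ItemRow pc1 = store.read();                 // both transactions read version 56
        ItemRow pc2 = store.read();
        System.out.println(store.update(pc1, 30));  // true: first writer wins
        System.out.println(store.update(pc2, 87));  // false: a provider would raise an optimistic failure here
        System.out.println(store.read().qty + " v" + store.read().version); // 30 v57
    }
}
```

No lock is held between read and write, which is why this style of concurrency control lets many read-mostly transactions proceed in parallel.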

Page 14: Slice: OpenJPA for Distributed Persistence

Multi-level caches favor read-mostly sessions

    em1 = emf.createEntityManager();
    em1.begin(); em1.query(); em1.commit();

    em2 = emf.createEntityManager();
    em2.begin(); em2.find(); em2.find(); em2.remove(); em2.commit();

Layers shown along the time axis:
– L2 Data Cache
– Database Access & Transaction

Page 15: Slice: OpenJPA for Distributed Persistence

Scaling against Data Volume

Data is growing rapidly

– 64%: compound annual growth rate of worldwide capacity of compliant records from 2003 to 2006

– Unbounded nature of the Web

• From a web site, a company can generate several gigabytes of data each day

Page 16: Slice: OpenJPA for Distributed Persistence

Distributed Horizontal Partition

Horizontal Partition

– put different rows into different tables

Distributed Horizontal Partition

– put different rows into different databases

Page 17: Slice: OpenJPA for Distributed Persistence

divide et impera

Natural partition exists in many domains

– Geographical (Customers by State)

– Temporal (PurchaseOrders by Month)

– Personal (Blog Posts by User)

Partition is natural in some scenarios

– Hosted Platforms

– Software-as-a-Service
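
A natural partition key such as the customer's state can be illustrated with plain Java collections. The sketch below assumes a hypothetical Customer class and simply groups rows by that key. Each key stands for one database in a distributed horizontal partition.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical names throughout; this only illustrates routing rows by a natural key. */
final class PartitionDemo {

    static final class Customer {
        final String name;
        final String state;
        Customer(String name, String state) { this.name = name; this.state = state; }
    }

    /** Groups customer names by state; each key stands for one database partition. */
    static Map<String, List<String>> partitionByState(List<Customer> customers) {
        return customers.stream().collect(Collectors.groupingBy(
                c -> c.state,
                TreeMap::new,
                Collectors.mapping(c -> c.name, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("John", "NY"),
                new Customer("Mary", "CA"),
                new Customer("Hari", "NY"));
        System.out.println(partitionByState(customers)); // {CA=[Mary], NY=[John, Hari]}
    }
}
```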

Page 19: Slice: OpenJPA for Distributed Persistence

What is Slice?

Slice is an OpenJPA module to transact with distributed, horizontally partitioned databases

Incubated as Apache Lab project in Jan 2008

Included as OpenJPA module since June 2008

Slice is bundled with OpenJPA within WAS v7

Slice is not the best thing since sliced bread

Page 20: Slice: OpenJPA for Distributed Persistence

What is OpenJPA?

An implementation of JPA Specification

Default persistence provider for WebSphere EJB3 Feature Pack v 6.1 and WAS v 7.x

Apache Project since May 2007

– http://openjpa.apache.org

Operational codebase since 2002

Rich, extended, ahead-of-the-curve feature set

Powerful configurability

Page 21: Slice: OpenJPA for Distributed Persistence

Architectural Tiers of JPA-based service

JPA-based User Application

OpenJPA

Standard JPA API

JDBC API

Page 22: Slice: OpenJPA for Distributed Persistence

Architectural Tiers of Slice-based service

Slice-based User Application

OpenJPA

Standard JPA API

JDBC API

Slice

OpenJPA is a pluggable platform

Page 23: Slice: OpenJPA for Distributed Persistence

Features of Slice

Slice-based User Application

OpenJPA

Standard JPA API

JDBC API

Slice

• No change to Application code or Domain Model
• User-defined Distribution & Replication Policy
• Flexible per-Slice Configuration
• Parallel Query Execution
• Heterogeneous Databases
• Master-based Sequence
• Targeted Query

Page 25: Slice: OpenJPA for Distributed Persistence

Using Slice

No change in Application Code

No change to Domain Model

OK, almost

Page 26: Slice: OpenJPA for Distributed Persistence

Persistence Unit Configuration (META-INF/persistence.xml)

    <?xml version="1.0" encoding="UTF-8"?>
    <persistence xmlns="http://java.sun.com/xml/ns/persistence"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 version="1.0"
                 xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd">
      <persistence-unit name="test" transaction-type="RESOURCE_LOCAL">
        <provider>org.apache.openjpa.persistence.PersistenceProviderImpl</provider>
        <class>domain.EntityA</class>
        <class>domain.EntityB</class>
        <properties>
          <property name="openjpa.ConnectionDriverName" value="com.mysql.jdbc.Driver"/>
          <property name="openjpa.ConnectionURL" value="jdbc:mysql://localhost/test"/>
          <property name="openjpa.jdbc.SynchronizeMappings" value="buildSchema"/>
          <property name="openjpa.Log" value="SQL=TRACE"/>
        </properties>
      </persistence-unit>
    </persistence>

Callouts on the slide:
• Governed by XML Schema
• Identified by Unit Name
• JPA Provider is pluggable
• List of known Persistent types
• Vendor-specific configuration

Page 27: Slice: OpenJPA for Distributed Persistence

Per-Slice Configuration (META-INF/persistence.xml)

        <properties>
          <!-- Activate Slice -->
          <property name="openjpa.BrokerFactory" value="slice"/>

          <!-- Declare slices -->
          <property name="openjpa.slice.Names" value="One,Two,Three"/>
          <property name="openjpa.slice.Master" value="One"/>

          <!-- Define Data Distribution Policy -->
          <property name="openjpa.slice.DistributionPolicy" value="acme.org.MyDistroPolicy"/>

          <!-- Configure each slice -->
          <property name="openjpa.ConnectionDriverName" value="com.mysql.jdbc.Driver"/>
          <property name="openjpa.slice.One.ConnectionURL" value="jdbc:mysql://localhost/slice1"/>
          <property name="openjpa.slice.Two.ConnectionURL" value="jdbc:mysql://localhost/slice2"/>
          <property name="openjpa.slice.Three.ConnectionDriverName" value="com.ibm.db2.jcc.DB2Driver"/>
          <property name="openjpa.slice.Three.ConnectionURL" value="jdbc:db2://mac3:50000/slice3"/>

          <!-- Configure common behavior -->
          <property name="openjpa.jdbc.SynchronizeMappings" value="buildSchema"/>
        </properties>
      </persistence-unit>

Page 28: Slice: OpenJPA for Distributed Persistence

Rules of Configuring slices

Each slice is identified by a logical name

All slice names can be specified by openjpa.slice.Names

– Or determined implicitly: openjpa.slice.XYZ.abc declares a slice with logical name XYZ

Each slice can be configured independently

Each slice property defaults to common configuration

– If openjpa.slice.XYZ.abc is not specified then it defaults to value of openjpa.abc property

A master slice is either configured by openjpa.slice.Master property

– Or automatically detected by convention/heuristic as the first slice

Unreachable slices are ignored at startup if openjpa.slice.Lenient property is set to true.
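
The defaulting rule can be sketched with java.util.Properties. SliceProps below is a hypothetical helper written for illustration only. It is not part of OpenJPA and does not claim to mirror how Slice resolves its configuration internally.

```java
import java.util.Properties;

/** Illustrative lookup only; SliceProps is a hypothetical helper, not part of OpenJPA. */
final class SliceProps {

    /** Returns openjpa.slice.<slice>.<key> if set, otherwise the common openjpa.<key>. */
    static String resolve(Properties p, String slice, String key) {
        String specific = p.getProperty("openjpa.slice." + slice + "." + key);
        return specific != null ? specific : p.getProperty("openjpa." + key);
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("openjpa.ConnectionDriverName", "com.mysql.jdbc.Driver");
        p.setProperty("openjpa.slice.Three.ConnectionDriverName", "com.ibm.db2.jcc.DB2Driver");

        // Slice "One" has no driver of its own, so it falls back to the common value.
        System.out.println(resolve(p, "One", "ConnectionDriverName"));   // com.mysql.jdbc.Driver
        // Slice "Three" overrides the common value.
        System.out.println(resolve(p, "Three", "ConnectionDriverName")); // com.ibm.db2.jcc.DB2Driver
    }
}
```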

Page 29: Slice: OpenJPA for Distributed Persistence

How to distribute data across slices?

User Application

    EntityManager em = …;
    em.getTransaction().begin();
    Person person = new Person();
    person.setName("John");
    person.setAge(42);
    Address addr = new Address();
    addr.setCity("New York");
    person.setAddress(addr);
    em.persist(person);
    em.getTransaction().commit();

Domain Classes

    @Entity
    public class Person {
        private String name;
        private int age;
        @OneToOne(cascade=ALL)
        private Address address;
    }

    @Entity
    public class Address {
        private String city;
    }

Data Distribution Policy

    public class MyDistributionPolicy implements DistributionPolicy {
        public String distribute(Object pc, List<String> slices, Object ctx) {
            // Persons older than 40 go to the first slice, everyone else to the second.
            return (((Person) pc).getAge() > 40)
                ? slices.get(0) : slices.get(1);
        }
    }

Page 30: Slice: OpenJPA for Distributed Persistence

Distribution Policy

    public interface DistributionPolicy {
        /**
         * Gets the name of the slice where the given newly persistent
         * instance will be stored.
         *
         * @param pc The newly persistent or to-be-merged object.
         * @param slices name of the configured slices.
         * @param context persistence context managing the given instance.
         *
         * @return identifier of the slice. This name must match one of the
         * configured slice names.
         * @see DistributedConfiguration#getSliceNames()
         */
        String distribute(Object pc, List<String> slices, Object context);
    }

Slice will call this method while persisting or merging a root instance. The instance and its closure will be stored in the returned slice.
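
A policy does not have to hard-code slice positions. The sketch below picks a slice from a hash of a partition key. To compile without OpenJPA it declares a local stand-in interface with the same method signature as shown on the slide. Customer and StateHashPolicy are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Local stand-in with the signature shown on the slide, so the sketch compiles without OpenJPA. */
interface DistributionPolicy {
    String distribute(Object pc, List<String> slices, Object context);
}

/** Hypothetical entity for the sketch. */
final class Customer {
    final String state;
    Customer(String state) { this.state = state; }
}

/** Picks a slice from the hash of the customer's state, so equal states always land together. */
final class StateHashPolicy implements DistributionPolicy {
    @Override
    public String distribute(Object pc, List<String> slices, Object context) {
        String key = ((Customer) pc).state;
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(key.hashCode(), slices.size());
        return slices.get(index);
    }
}

final class StateHashPolicyDemo {
    public static void main(String[] args) {
        List<String> slices = Arrays.asList("One", "Two", "Three");
        DistributionPolicy policy = new StateHashPolicy();
        String first = policy.distribute(new Customer("NY"), slices, null);
        String second = policy.distribute(new Customer("NY"), slices, null);
        System.out.println(first.equals(second)); // true: the same key maps to the same slice
    }
}
```

A hash-based choice spreads keys across slices, but it remaps most keys when the number of slices changes, so it suits a fixed slice list.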

Page 31: Slice: OpenJPA for Distributed Persistence

Collocation Constraint

All instances reachable from a root instance, at the time of persist(), are stored in the same slice

– Because Slice cannot join across databases

Compliant Domain Models are referred to as Constrained Tree Schema

– Customer has Orders has LineItems
– http://www.devwebsphere.com/devwebsphere/2008/01/constrained-tre.html

Diagram: Customer [1] to [0+] Order; Order [1] to [1+] LineItem

Page 32: Slice: OpenJPA for Distributed Persistence

What if schema is not a Constrained Tree Schema?

Entities in the diagram: Company, Department, Employee, Address, Country

• Partition into databases per Department
• Tree Schema Constraint is violated
• In which database should Company and Country reside?

Page 33: Slice: OpenJPA for Distributed Persistence

Replicate Master Data across slices

Annotate Company and Country as @Replicated

By default, @Replicated entities are stored in all slices

– or implement ReplicationPolicy

    @Entity
    @org.apache.openjpa.persistence.Replicated
    public class Company {..}

Page 34: Slice: OpenJPA for Distributed Persistence

Replication Policy

    public interface ReplicationPolicy {
        /**
         * Gets the name of the slices where the given newly persistent
         * instance will be replicated.
         *
         * @param pc The newly persistent or to-be-merged object.
         * @param slices name of the configured slices.
         * @param context persistence context managing the given instance.
         *
         * @return identifier(s) of the slice. Each name must match one of the
         * configured slice names.
         * @see DistributedConfiguration#getSliceNames()
         */
        String[] replicate(Object pc, List<String> slices, Object context);
    }

Slice will call this method while persisting any @Replicated instance.

Page 35: Slice: OpenJPA for Distributed Persistence

Distributed Query

Each query is executed across all slices in parallel

Query time is bounded by the size of the largest partition, not by the size of the entire dataset.
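
The fan-out can be sketched with a thread pool. This is an illustration of the idea only, not Slice's implementation: each Callable stands for the query against one slice, and ParallelQueryDemo is a hypothetical name. Because the per-slice tasks run concurrently, the elapsed time is governed by the slowest slice rather than by the sum of all of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelQueryDemo {

    /** Runs one task per slice in parallel and appends the per-slice results in slice order. */
    static List<String> queryAll(List<Callable<List<String>>> perSliceQueries) {
        List<String> appended = new ArrayList<>();
        if (perSliceQueries.isEmpty()) {
            return appended;
        }
        ExecutorService pool = Executors.newFixedThreadPool(perSliceQueries.size());
        try {
            // invokeAll returns the futures in the same order as the submitted tasks.
            for (Future<List<String>> f : pool.invokeAll(perSliceQueries)) {
                appended.addAll(f.get());
            }
            return appended;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<List<String>>> queries = new ArrayList<>();
        queries.add(() -> Arrays.asList("ROB", "BILL")); // matches in the first slice
        queries.add(() -> new ArrayList<String>());      // no matches in the second slice
        queries.add(() -> Arrays.asList("MARY"));        // matches in the third slice
        System.out.println(queryAll(queries));           // [ROB, BILL, MARY]
    }
}
```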

Page 36: Slice: OpenJPA for Distributed Persistence

Distributed Query

Results from individual slices are appended

    List result = em.createQuery("SELECT e FROM Employee e WHERE e.age < 30")
                    .getResultList();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Appended result:

NAME    AGE  JOIN_YEAR
MARY    24   2007
BILL    29   2001
ROB     22   2008

Page 37: Slice: OpenJPA for Distributed Persistence

Distributed Query (Sorting)

Results from individual slices are sorted across all slices for ORDER BY queries

    List result = em.createQuery("SELECT e FROM Employee e WHERE e.age < 30 ORDER BY e.name")
                    .getResultList();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Matching rows before the cross-slice sort:

NAME    AGE  JOIN_YEAR
MARY    24   2007
BILL    29   2001
ROB     22   2008

Result after sorting by name across all slices:

NAME    AGE  JOIN_YEAR
BILL    29   2001
MARY    24   2007
ROB     22   2008
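
When each slice returns its rows already ordered, one way to produce a globally ordered list is a k-way merge. The sketch below shows that technique with a priority queue over plain strings. It is an illustration under that assumption, and the transcript does not say which algorithm Slice itself uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class SortedMergeDemo {

    /** One cursor per slice: the current head element plus the rest of that slice's sorted list. */
    private static final class Cursor implements Comparable<Cursor> {
        String head;
        final Iterator<String> rest;
        Cursor(Iterator<String> it) { this.rest = it; this.head = it.next(); }
        @Override public int compareTo(Cursor other) { return head.compareTo(other.head); }
    }

    /** Merges lists that are each already sorted into one sorted list (k-way merge). */
    static List<String> merge(List<List<String>> sortedPerSlice) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<String> slice : sortedPerSlice) {
            if (!slice.isEmpty()) {
                heap.add(new Cursor(slice.iterator()));
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();      // cursor with the smallest head
            merged.add(smallest.head);
            if (smallest.rest.hasNext()) {      // advance and re-insert if the slice has more rows
                smallest.head = smallest.rest.next();
                heap.add(smallest);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> perSlice = new ArrayList<>();
        perSlice.add(Arrays.asList("BILL", "ROB"));  // first slice, sorted by name
        perSlice.add(new ArrayList<String>());       // second slice has no matches
        perSlice.add(Arrays.asList("MARY"));         // third slice
        System.out.println(merge(perSlice));         // [BILL, MARY, ROB]
    }
}
```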

Page 38: Slice: OpenJPA for Distributed Persistence

Distributed Top-N Query

Top-N results from each slice are merged (with ordering, if any) for queries limited by setMaxResults()

    List result = em.createQuery("SELECT e FROM Employee e ORDER BY e.age")
                    .setMaxResults(2)
                    .getResultList();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Top 2 by age from each slice (in table order):

ROB     22   2008
BILL    29   2001

HARI    31   2002
SHIVA   35   1999

MARY    24   2007
JOHN    35   2001

Merged top 2:

ROB     22   2008
MARY    24   2007
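
The reason per-slice top-N is enough can be shown with plain lists: the global top N must be among the first N rows of each slice, so only those candidates need to be merged and cut. The sketch below works on ages only. TopNDemo is a hypothetical name and the code is not taken from Slice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class TopNDemo {

    /**
     * Takes the first n ages of each slice (each slice list must already be sorted ascending),
     * sorts the combined candidates, and keeps the first n overall.
     */
    static List<Integer> topN(List<List<Integer>> sortedPerSlice, int n) {
        List<Integer> candidates = new ArrayList<>();
        for (List<Integer> slice : sortedPerSlice) {
            candidates.addAll(slice.subList(0, Math.min(n, slice.size())));
        }
        Collections.sort(candidates);
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }

    public static void main(String[] args) {
        // Ages from the three slide tables, each sorted ascending as ORDER BY e.age would return them.
        List<List<Integer>> perSlice = new ArrayList<>();
        perSlice.add(Arrays.asList(22, 29, 37));
        perSlice.add(Arrays.asList(31, 35, 41));
        perSlice.add(Arrays.asList(24, 35, 43));
        System.out.println(topN(perSlice, 2)); // [22, 24]
    }
}
```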

Page 39: Slice: OpenJPA for Distributed Persistence

Distributed Top-N Query

Top-N results from individual slices are appended for queries limited by setMaxResults() without an ORDER BY clause.

    List result = em.createQuery("SELECT e FROM Employee e")
                    .setMaxResults(2)
                    .getResultList();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Rows picked from each slice, as shown on the slide (in table order):

ROB     22   2008
BILL    29   2001

HARI    31   2002
SHIVA   35   1999

MARY    24   2007
JOHN    35   2001

Result shown on the slide:

ROB     22   2008
MARY    24   2007

Page 40: Slice: OpenJPA for Distributed Persistence

Targeted Query

Query and find() can be targeted to a subset of slices by hints

    List result = em.createQuery("SELECT e FROM Employee e WHERE e.age > 34")
                    .setHint("openjpa.slice.Targets", "slice1,slice3")
                    .getResultList();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Result (rows from the two targeted slices only):

NAME    AGE  JOIN_YEAR
SANDRA  43   1975
JOHN    35   2001
JOSE    41   1987
SHIVA   35   1999

Page 41: Slice: OpenJPA for Distributed Persistence

Aggregate Query

Aggregate results are supported when aggregate operation is commutative to partition

    Number sum = (Number) em.createQuery("SELECT SUM(e.age) FROM Employee e WHERE e.age > 30")
                            .getSingleResult();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Per-slice sums (in table order): 37, 107, 78

Combined result: 37 + 107 + 78 = 222

Page 42: Slice: OpenJPA for Distributed Persistence

Aggregate Query

Aggregate results are not supported when aggregate operation is not commutative to partition

    Number avg = (Number) em.createQuery("SELECT AVG(e.age) FROM Employee e WHERE e.age > 30")
                            .getSingleResult();

Employee rows in slice1, slice2 and slice3 (one table per slice):

NAME    AGE  JOIN_YEAR
ROB     22   2008
LEUNG   37   2005
BILL    29   2001

NAME    AGE  JOIN_YEAR
HARI    31   2002
SHIVA   35   1999
JOSE    41   1987

NAME    AGE  JOIN_YEAR
JOHN    35   2001
MARY    24   2007
SANDRA  43   1975

Per-slice averages (in table order): 37.0, 35.67, 39.0

Average of the per-slice averages: 37.22 (WRONG!)

Correct average over all six matching rows: 222 / 6 = 37.0

Page 43: Slice: OpenJPA for Distributed Persistence

Distributed Aggregate Query Limitations

Commutativity

– ability to change the order of operations without changing the end result.

SUM() or MAX() is commutative to partition

– SUM(D) = SUM(SUM(D1), SUM(D2), SUM(D3)) where Partition(D) = {D1, D2, D3}

But AVG() is not

– AVG(D) != AVG(AVG(D1), AVG(D2), AVG(D3))
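
The difference can be checked with a few lines of arithmetic. The sketch below uses the ages over 30 from the three slide tables. AggregateDemo is a hypothetical name, and the trueAvg method shows the standard workaround of combining per-partition sums and counts, which the slides do not claim Slice performs.

```java
import java.util.Arrays;
import java.util.List;

final class AggregateDemo {

    static double sum(List<Integer> xs) {
        double s = 0;
        for (int x : xs) s += x;
        return s;
    }

    /** SUM over partitions: summing the per-partition sums gives the global sum. */
    static double sumOfSums(List<List<Integer>> parts) {
        double total = 0;
        for (List<Integer> p : parts) total += sum(p);
        return total;
    }

    /** Naive AVG over partitions: averaging the per-partition averages (generally wrong). */
    static double avgOfAvgs(List<List<Integer>> parts) {
        double total = 0;
        for (List<Integer> p : parts) total += sum(p) / p.size();
        return total / parts.size();
    }

    /** Correct AVG: global sum divided by global count. */
    static double trueAvg(List<List<Integer>> parts) {
        int count = 0;
        for (List<Integer> p : parts) count += p.size();
        return sumOfSums(parts) / count;
    }

    public static void main(String[] args) {
        // Ages over 30 in each of the three partitions.
        List<List<Integer>> parts = Arrays.asList(
                Arrays.asList(37), Arrays.asList(31, 35, 41), Arrays.asList(35, 43));
        System.out.println(sumOfSums(parts)); // 222.0
        System.out.println(trueAvg(parts));   // 37.0
        System.out.println(avgOfAvgs(parts)); // about 37.22, not equal to the true average
    }
}
```

The naive result is off because partitions of different sizes get equal weight; weighting by row count removes the error.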

Page 44: Slice: OpenJPA for Distributed Persistence

Query for Replicated Entities

Replicated instances are detected and queried in a single slice

    @Entity
    @Replicated
    public class Country {..}

    Number count = (Number) em.createQuery("SELECT COUNT(c) FROM Country c")
                              .getSingleResult();

Each of slice1, slice2 and slice3 holds the same rows:

CODE     POPULATION
US       300M
GERMANY  82M
INDIA    1200M

Result: 3

Page 45: Slice: OpenJPA for Distributed Persistence

Updates

Slice remembers original slice of each instance.

– SlicePersistence.getSlice(Object pc) returns the logical slice name for the given argument.

If an instance is modified then the update occurs in the original slice.

Replicated instances are updated in many slices

– SlicePersistence.isReplicated(Object pc)

Commit will not be invoked for a slice if no update exists for that slice

Page 46: Slice: OpenJPA for Distributed Persistence

Database and Transaction

Slices can be in heterogeneous database platforms

– Each slice can use its own JDBC driver

A Master slice is identified for sequence generation

Commits are executed in parallel without any guarantee of atomicity across slices

If all JDBC drivers are XA-compliant then a 2-phase commit provision is available

– Each slice transaction is not seen by the Application Server’s Transaction Manager.

Page 48: Slice: OpenJPA for Distributed Persistence

Core Architectural constructs of OpenJPA

Components in the diagram: EntityManagerFactory, BrokerFactory, EntityManager, Broker, StoreManager, JDBCStoreManager, JDBC API, OpenJPAConfiguration

Relationships labeled on the diagram: creates, delegates, configured by

Page 49: Slice: OpenJPA for Distributed Persistence

Distributed Template Design Pattern

    // Schematic only: T stands for the interface being distributed.
    // Java cannot literally implement a type parameter.
    public class DistributedTemplate<T> implements T, Iterable<T> {
        protected List<T> _delegates = new ArrayList<T>();

        public boolean execute(String arg0) {
            boolean ret = true;
            for (T t : this)
                ret = t.execute(arg0) & ret;
            return ret;
        }

        public Iterator<T> iterator() {
            return _delegates.iterator();
        }
    }

• Distributed Template Design Pattern as main metaphor
– on JDBC artifacts (Statement, ResultSet)
– major OpenJPA artifacts such as StoreManager, Query.
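
A compilable variant of the pattern needs a concrete interface in place of the type parameter. The sketch below assumes a hypothetical one-method interface, Executable, standing in for something like a JDBC Statement. DistributedExecutable implements that same interface and forwards each call to all of its delegates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical single-method interface standing in for something like a JDBC Statement. */
interface Executable {
    boolean execute(String command);
}

/** Implements the same interface as its delegates and forwards each call to all of them. */
final class DistributedExecutable implements Executable, Iterable<Executable> {
    private final List<Executable> delegates = new ArrayList<>();

    void add(Executable delegate) { delegates.add(delegate); }

    @Override
    public boolean execute(String command) {
        boolean ret = true;
        for (Executable e : this) {
            // The delegate call is the left operand, so it runs even after an earlier delegate returned false.
            ret = e.execute(command) & ret;
        }
        return ret;
    }

    @Override
    public Iterator<Executable> iterator() { return delegates.iterator(); }
}

final class DistributedTemplateDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        DistributedExecutable all = new DistributedExecutable();
        all.add(cmd -> { log.add("slice1:" + cmd); return true; });
        all.add(cmd -> { log.add("slice2:" + cmd); return false; });
        all.add(cmd -> { log.add("slice3:" + cmd); return true; });

        System.out.println(all.execute("ping")); // false: one delegate reported failure
        System.out.println(log);                 // [slice1:ping, slice2:ping, slice3:ping]
    }
}
```

Callers hold an Executable and cannot tell whether it is a single delegate or the distributed wrapper, which is what lets layers above stay unaware of the fan-out.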

Page 50: Slice: OpenJPA for Distributed Persistence

Slice extends OpenJPA by Distributed Template

Components in the diagram: EntityManagerFactory, BrokerFactory, EntityManager, Broker, DistributedStoreManager, multiple JDBCStoreManager instances, JDBC API, DistributedConfiguration

Relationships labeled on the diagram: creates, delegates, configures

Callouts on the slide:
• applies Distributed Template Pattern
• Not aware of partitioned Databases

Page 52: Slice: OpenJPA for Distributed Persistence

Future Work: Evolving data distribution

Gradual Redistribution

– Complete migration from one slice to another is currently supported

– Gradual migration of data from one slice to another

• Read from one slice, write to another

Page 53: Slice: OpenJPA for Distributed Persistence

Future Work: Courage under Fire

Graceful degradation

– can ignore unreachable slices at bootstrap

– can cope with unreachable slices at runtime

– cannot reconnect dynamically

Page 54: Slice: OpenJPA for Distributed Persistence

Future Work: Heterogeneity

Heterogeneous Schema

– assumes each slice has identical schema

– relax this assumption

Join relation across slices

– this one is a hard problem

Page 56: Slice: OpenJPA for Distributed Persistence

Thank You!

