NoSQL for Java Developers

Post on 16-Jul-2015


Webinar: NoSQL for Java Developers

Simon Baslé - SDK Engineer

1

Agenda

▪ Motivation
▪ The new API, what does it look like?
▪ Example
▪ RxJava and the asynchronous API
▪ Error handling and batching
▪ Live Demo (N1QL DSL)
▪ Plans for the future

2

Let's Dive In!

3

Motivation

for the 2.0 SDK generation

4

Benefits of an SDK?

• Work with the database using a familiar language and the patterns associated with it
• It does the heavy lifting for you (cluster topology awareness, operation routing, dealing with protocols, ...)
• Share the work on performance and offer good abstractions

5

Previous generation (1.4.x) based on Spymemcached

• a lot of legacy; Spymemcached is perhaps too memcached-focused
• async, but with java.util.concurrent.Future (limited)
• asynchronous interdependent dataflows are hard to build

6

Time to freshen up the API!

• Find a better, more expressive way of doing async
• Offer coherent, well-thought-out abstractions
• Plan for the future and for evolvability
• Don't forget about performance!

7

2.0 is a complete rewrite with asynchronicity at its core...

(Logos: RxJava, Netty)

8

2.0 is a complete rewrite ...but a synchronous API built on top of it is offered too

9

Architectural Overview

(Diagram layer: Core)

▪ Core abstracts the low-level details
▪ Fully async & message-oriented
▪ Low overhead, high performance
▪ Java 6+

11

Architectural Overview

(Diagram layers: Core, Java)

▪ Java Client is a higher-level abstraction
▪ Exposes RxJava Observables in the Asynchronous API
▪ Adds a Synchronous API on top of it
▪ Java 6+ (especially great for clients in Java 8!)

12

Architectural Overview

(Diagram: Core, Java, Hadoop, Spring)

▪ All that can be leveraged to build connectors
▪ Spring Data, Hadoop in the works

13

Architectural Overview

(Diagram: Core, Java, Scala, JRuby, ..., Hadoop, Spring, Play!2)

▪ Other SDKs / connectors could be built
▪ even by you, the community

14

The New API - What does it look like?

(for now the Synchronous API)

15

Cluster and Bucket

Entry points. Class naming refers to Couchbase concepts.

▪ To bootstrap, connect, and manage nodes, use Cluster
    – it just needs one or more IPs or hostnames to connect
▪ To operate on data, use Bucket
▪ A Bucket is thread-safe and should be reused instead of re-created

17

The Document<T>

where the data goes

▪ Represents the data in Couchbase
▪ Holds both content() and metadata: id(), cas(), expiry()
▪ CAS: Compare-and-Swap, a sequence number used to prevent conflicting mutations

19
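To make the CAS idea concrete, here is a minimal sketch, assuming a hypothetical in-memory store. CasSketch, Versioned, and their methods are invented for illustration and are not the Couchbase SDK API. A replace succeeds only when the caller presents the CAS value it last read, so a writer holding a stale copy is rejected instead of silently overwriting.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory store; illustrates CAS checks only, not the real SDK. */
final class CasSketch {
    /** An immutable value paired with the CAS it was stored under. */
    static final class Versioned {
        final String content;
        final long cas;

        Versioned(String content, long cas) {
            this.content = content;
            this.cas = cas;
        }
    }

    private final ConcurrentHashMap<String, Versioned> data = new ConcurrentHashMap<>();
    private final AtomicLong casSource = new AtomicLong();

    /** Unconditional write; returns the stored value with its fresh CAS. */
    Versioned upsert(String id, String content) {
        Versioned stored = new Versioned(content, casSource.incrementAndGet());
        data.put(id, stored);
        return stored;
    }

    Versioned get(String id) {
        return data.get(id);
    }

    /** Writes only if expectedCas still matches; returns null on a clash or a missing id. */
    Versioned replace(String id, String content, long expectedCas) {
        Versioned next = new Versioned(content, casSource.incrementAndGet());
        Versioned result = data.computeIfPresent(id,
            (key, current) -> current.cas == expectedCas ? next : current);
        return result == next ? next : null;
    }

    public static void main(String[] args) {
        CasSketch store = new CasSketch();
        Versioned seen = store.upsert("user::1", "v1");       // two writers read this version
        Versioned first = store.replace("user::1", "v2", seen.cas);
        Versioned second = store.replace("user::1", "v3", seen.cas); // stale CAS, rejected
        System.out.println(first != null);                    // true
        System.out.println(second == null);                   // true
        System.out.println(store.get("user::1").content);     // v2
    }
}
```

The rejected writer would typically re-read the document, reapply its change, and try again with the fresh CAS.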

The Document<T>

where the data goes

▪ JsonDocument
▪ BinaryDocument
▪ SerializableDocument
▪ ...

▪ The JsonDocument's content is a JsonObject
    ▪ simple Map-like API to deal with JSON
    ▪ there's also a RawJsonDocument to deal with JSON as a String

20

Key-Value Operations

the gist of the API

▪ Get (but also getFromReplica, getAndLock, touch)
▪ Upsert (but also insert, replace)
▪ Remove
▪ Append, Prepend
▪ Unlock...

22

Key-Value Operations

the gist of the API

▪ Each operation returns a Document, by default a JsonDocument, with the updated CAS
▪ Overloads to choose a target Document type
▪ Overloads to customize the timeout for each operation
▪ Mutating operations have overloads to choose the replication factor and persistence constraints to wait for before returning

23

Management APIs

"ops stuff"

▪ cluster.createManager("admin", "pass")
    ▪ info
    ▪ hasBucket
    ▪ getBucket
    ▪ insert/remove/updateBucket
▪ bucket.bucketManager()
    ▪ info
    ▪ flush
    ▪ get/insert/upsert/removeDesignDocument
    ▪ publishDesignDocument

25

Example

26

connect to cluster
prepare JSON
insert a document vs. upsert a document
clean up

27

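import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

// connect to the cluster and open the bucket
Cluster cluster = CouchbaseCluster.create("127.0.0.1");
Bucket testBucket = cluster.openBucket("testBucket");

// create some JSON content for a new Document
JsonObject content = JsonObject.create()
    .put("type", "test")
    .put("value", "this is a test");
JsonDocument testDoc = JsonDocument.create("test", content);

// insert the fresh Document
testBucket.insert(testDoc);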

28

// try again with the same key?
// (DocumentAlreadyExistsException lives in com.couchbase.client.java.error)
try {
    testBucket.insert(testDoc);
} catch (DocumentAlreadyExistsException e) {
    // expected
    // upsert works whether the document exists or not
    testBucket.upsert(testDoc);
}

// cleans up ALL shared resources
// (also calls close() on each Bucket)
cluster.disconnect();

29

Q&A Intermission 1

any questions at this point?

30

RxJava & the Asynchronous API

Modern Async Dataflows

31

RxJava 101

aka "Async Goodness"

▪ Observable<T> and Observer<T>
▪ Observable = a stream of data
▪ Use Rx operators to manipulate the stream

33

RxJava 101

compare with Iterator

                  Single       Multiple
Sync ("pull")     T            Iterable<T>
Async ("push")    Future<T>    Observable<T>

34

RxJava 101

Observer<T> interface

Event             Iterator<T>        Observer<T>
data retrieval    T next()           onNext(T)
discover error    throw Exception    onError(Throwable)
complete          returns            onCompleted()

35
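The pull/push contrast in this table can be sketched without RxJava. SimpleObserver below is a hand-rolled stand-in that only mirrors the three callback names from the slide; it is not the real rx.Observer. For simplicity the push happens on the calling thread, whereas an asynchronous source would call back from another thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PushVsPullSketch {
    /** Hypothetical stand-in mirroring the three Observer callbacks. */
    interface SimpleObserver<T> {
        void onNext(T item);
        void onError(Throwable error);
        void onCompleted();
    }

    /** Pull: the consumer drives the loop and asks for each item via next(). */
    static List<String> pull(Iterable<String> source) {
        List<String> log = new ArrayList<>();
        try {
            Iterator<String> it = source.iterator();
            while (it.hasNext()) {
                log.add("next:" + it.next());
            }
            log.add("completed");                  // the loop simply returns
        } catch (RuntimeException e) {
            log.add("error:" + e.getMessage());    // errors surface as exceptions
        }
        return log;
    }

    /** Push: the producer drives and notifies the observer of each event. */
    static void push(Iterable<String> source, SimpleObserver<String> observer) {
        try {
            for (String item : source) {
                observer.onNext(item);
            }
        } catch (RuntimeException e) {
            observer.onError(e);                   // errors arrive as an event
            return;
        }
        observer.onCompleted();
    }

    /** Pushes the source into an observer that records every event it receives. */
    static List<String> pushAndRecord(Iterable<String> source) {
        final List<String> log = new ArrayList<>();
        push(source, new SimpleObserver<String>() {
            public void onNext(String item) { log.add("next:" + item); }
            public void onError(Throwable error) { log.add("error:" + error.getMessage()); }
            public void onCompleted() { log.add("completed"); }
        });
        return log;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b");
        System.out.println(pull(data));            // [next:a, next:b, completed]
        System.out.println(pushAndRecord(data));   // [next:a, next:b, completed]
    }
}
```

Both paths see the same three kinds of events; what changes is who is in control of the timing.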

Observable

exposes operators

▪ Transformation
▪ Filtering
▪ Combination
▪ Managing Errors

36

Transform!

"map is the first step"

▪ Observable<T> source
▪ Observable<R> transformed = source.map(mapFunction)
▪ mapFunction transforms each T into an R
    ▪ e.g. T = String and R = Integer, mapFunction returns string.length()

38

Transform!

"map is the first step"

39

Filter

when you only need part of the data

▪ first()
▪ take(int n)
▪ skip(int n)
▪ distinct()
▪ ...

41

Filter

The Heavy Artillery

▪ Observable<T> filter(filterPredicate)
▪ Only emits the Ts that match the filterPredicate

42
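As a rough, synchronous analogue of the map and filter operators described so far, the sketch below uses java.util.stream instead of RxJava. That swap is an assumption made to keep the snippet self-contained; the Rx operators of the same name apply the same kind of function to items as they are pushed. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class MapFilterSketch {
    /** map: turn each String (T) into its length (R = Integer), as in the slide's example. */
    static List<Integer> lengths(List<String> words) {
        return words.stream()
            .map(String::length)
            .collect(Collectors.toList());
    }

    /** filter: keep only the items that match the predicate. */
    static List<String> longerThan(List<String> words, int minLength) {
        return words.stream()
            .filter(word -> word.length() > minLength)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("rx", "java", "netty");
        System.out.println(lengths(words));        // [2, 4, 5]
        System.out.println(longerThan(words, 2));  // [java, netty]
    }
}
```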

Filter

The Heavy Artillery

43

Combine

several streams

▪ merge(otherObservableOfT)
▪ concat(otherObservableOfT)
▪ mergeDelayErrors, zipWith, join, ...

45

Combine

merge

46

Combine

concat

47

Back to Transformations

"improve your Rx-Fu with flatMap"

▪ Like a map...
    ▪ ...for when your transformation must also be asynchronous
▪ transforms each item into an Observable<R> (not just an R)
▪ flattens them all into the output stream to give a flat stream of Rs

50
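The flattening step can be illustrated with a single-value analogue, following the Future<T> / Observable<T> parallel drawn earlier. This sketch assumes Java 8's CompletableFuture rather than RxJava: thenApply plays the role of map and thenCompose the role of flatMap. fetchProfile is a hypothetical asynchronous call, not an SDK method.

```java
import java.util.concurrent.CompletableFuture;

final class ComposeSketch {
    /** Hypothetical async lookup standing in for any asynchronous transformation. */
    static CompletableFuture<String> fetchProfile(String id) {
        return CompletableFuture.supplyAsync(() -> "profile-of-" + id);
    }

    /** "map" with an async function: the result ends up nested one level deep. */
    static CompletableFuture<CompletableFuture<String>> nested(CompletableFuture<String> id) {
        return id.thenApply(ComposeSketch::fetchProfile);
    }

    /** "flatMap" equivalent: the nesting is flattened away. */
    static CompletableFuture<String> flat(CompletableFuture<String> id) {
        return id.thenCompose(ComposeSketch::fetchProfile);
    }

    public static void main(String[] args) {
        CompletableFuture<String> id = CompletableFuture.completedFuture("42");
        System.out.println(nested(id).join().join()); // profile-of-42, after unwrapping twice
        System.out.println(flat(id).join());          // profile-of-42
    }
}
```

With an Observable the same idea applies to every item of the stream, and the inner streams are merged into one output stream.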

Back to Transformations

"improve your Rx-Fu with flatMap"

51

The Very Last Step

subscription

▪ You have your final Observable representing your async data flow
▪ Now it is time to start emission and describe what to do with the data in the end
▪ Always subscribe!

53

The Very Last Step

subscription

▪ either pass a full implementation of Observer
▪ or use the Observable<R>.subscribe "shortcuts": just pass in an Action for the parts of the API that are relevant (e.g. just onNext)

54

I Know Rx-Fu!

©2014 Couchbase Inc.

55

Q&A Intermission 2

any questions at this point?

56

Putting That Into Action

how does it weave into the SDK?

▪ to access the Async API, use async() on both Cluster and Bucket
▪ now every operation returns an Observable<something>
▪ apply the power of Rx to work with your data as a stream

58

Async Example

"the 2014 club"

I have a list of user IDs
get their detailed profile
and, for those who registered last year (2014),
give them a badge
(make sure to persist it)
then finally display the 2014 Club

60

Async Example

"the 2014 club"

Observable.from list of IDs
flatMap to get every profile
filter on registration year
mutate to add the badge via a map
flatMap to upsert the profiles
map to the stream of updated IDs
subscribe in order to print out

61

Async Example

"the 2014 club"

Observable<String> oneYearClub = Observable.from(Arrays.asList("1", "18"))
    .map(userId -> "user::" + userId)
    .flatMap(docId -> bucket.async().get(docId))
    .filter(userDoc -> userDoc.content().getInt("yearRegistered")
        == Calendar.getInstance().get(Calendar.YEAR) - 1)
    .map(userDoc -> {
        userDoc.content().getArray("badges").add("oneYearClub");
        return userDoc;
    })
    .flatMap(userDoc -> bucket.async().upsert(userDoc))
    .map(userDoc -> userDoc.id().replace("user::", ""));

oneYearClub.subscribe(id ->
    System.out.println("Welcome user #" + id + " to the One Year Club!"));

62

Async Example

"the 2014 club"

The same chain, one step per slide:

Observable.from(Arrays.asList("1", "18"))
    .map(userId -> "user::" + userId)

63

    .flatMap(docId -> bucket.async().get(docId))

64

    .filter(userDoc -> userDoc.content().getInt("yearRegistered")
        == Calendar.getInstance().get(Calendar.YEAR) - 1)

65

    .map(userDoc -> {
        userDoc.content().getArray("badges").add("oneYearClub");
        return userDoc;
    })

66

    .flatMap(userDoc -> bucket.async().upsert(userDoc))

67

    .map(userDoc -> userDoc.id().replace("user::", ""));

68

Observable<String> oneYearClub = ...;
oneYearClub.subscribe(id -> System.out.println(
    "Welcome user #" + id + " to the One Year Club!"));

69

What if there are errors?

70

Error Handling & Batching

71

Error Handling

the basics

▪ Basic error handling uses Observer.onError(Throwable t)
▪ One can also use the default value pattern
    ▪ Observable<T>.onErrorReturn(Func1<Throwable, T> toDefault)

72
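The default value pattern can be sketched in its single-value form with CompletableFuture.exceptionally, which plays a role comparable to onErrorReturn. This is an analogue chosen to keep the snippet self-contained, not the RxJava API itself, and load is a hypothetical asynchronous read invented for the example.

```java
import java.util.concurrent.CompletableFuture;

final class DefaultOnErrorSketch {
    /** Hypothetical async read that fails for ids starting with "missing". */
    static CompletableFuture<String> load(String id) {
        return CompletableFuture.supplyAsync(() -> {
            if (id.startsWith("missing")) {
                throw new IllegalStateException("no document for " + id);
            }
            return "doc:" + id;
        });
    }

    /** Default value pattern: any failure is swapped for the fallback value. */
    static String loadOrDefault(String id, String fallback) {
        return load(id)
            .exceptionally(error -> fallback)
            .join();
    }

    public static void main(String[] args) {
        System.out.println(loadOrDefault("user::1", "n/a"));      // doc:user::1
        System.out.println(loadOrDefault("missing::2", "n/a"));   // n/a
    }
}
```

The fallback hides the error from downstream consumers, so it suits cases where a placeholder value is acceptable.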

Error Handling

more advanced

▪ Defer to a backup stream on error:
    ▪ Observable<T>.onErrorResumeNext(Observable<T> backup)

73


Error Handling

more advanced

▪ Retry on error:
    ▪ Observable<T>.retry(retryPredicate)
▪ retryPredicate takes the number of retries and the last error to determine whether another retry should happen
▪ if so, the source is re-subscribed

75


Error Handling

build on this for production-ready strategies

▪ Exponential backoff
    ▪ using the retry variants to retry with a delay that grows
▪ Throttling
    ▪ when rate of data creation > rate of data consumption
    ▪ divide the timeline into windows, only emit the first/last item of each window
▪ Concept of backpressure
    ▪ when rate of data creation > rate of data consumption
    ▪ a way to let the Observer signal to the source at what pace data production is manageable

77
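Exponential backoff can be sketched as a plain, blocking retry loop. Everything here (BackoffSketch, delayMillis, retry, and the parameter names) is an assumption made for illustration, not part of RxJava or the SDK. In an Rx pipeline the pause would be scheduled on a timer instead of sleeping the current thread.

```java
import java.util.concurrent.Callable;

final class BackoffSketch {
    /** Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped at max. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        int shift = Math.max(0, Math.min(attempt - 1, 20)); // bound the exponent
        return Math.min(baseMillis << shift, maxMillis);
    }

    /** Runs the task, retrying up to maxRetries times with a growing pause in between. */
    static <T> T retry(Callable<T> task, int maxRetries, long baseMillis, long maxMillis)
            throws Exception {
        int failures = 0;
        while (true) {
            try {
                return task.call();
            } catch (Exception e) {
                failures++;
                if (failures > maxRetries) {
                    throw e;                                  // give up, surface the last error
                }
                Thread.sleep(delayMillis(failures, baseMillis, maxMillis));
            }
        }
    }

    /** Demo: a task that fails twice, then succeeds; returns how many calls were made. */
    static int callsUntilSuccess() {
        final int[] calls = {0};
        try {
            retry(() -> {
                if (++calls[0] < 3) {
                    throw new IllegalStateException("transient failure");
                }
                return "ok";
            }, 5, 1, 10);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println(delayMillis(1, 100, 5000));   // 100
        System.out.println(delayMillis(3, 100, 5000));   // 400
        System.out.println(delayMillis(10, 100, 5000));  // 5000 (capped)
        System.out.println(callsUntilSuccess());         // 3
    }
}
```

Capping the delay keeps a long outage from producing unbounded waits; adding random jitter is a common further refinement.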

Batching

"where has bulk get gone?"

▪ Bulk operations now rely on RxJava primitives
▪ The underlying RingBuffer + Netty is quite capable of handling rapid-burst requests
▪ In a word, use Observable.from(collection of keys) and flatMap to a get

79

Batching

"where has bulk get gone?"

Observable<JsonDocument> allDocs = Observable.from(allKeys)
    // Func1's second type parameter is the return type of call()
    .flatMap(new Func1<String, Observable<JsonDocument>>() {
        @Override
        public Observable<JsonDocument> call(String id) {
            return bucket.async().get(id);
        }
    });

80

Batching

"where has bulk get gone?"

The best thing is, it is now possible for every operation, not just get!

81

Querying JSON with N1QL

developer preview

82

Let's do a Live Demo!

83

Let's do a Live Demo!*

* Totally made with Special Effects

84

Plans for the future

a glimpse at things to come

85

2.1

"Overall, More Goodness"

Geo, N1QL, ... + lots of bug fixes

86

Finalizing N1QL Support

give your feedback on developer preview

87

Spring Data Connector, 2.0

porting the adapter to the new client

88

Thank You!

the End

89

Final Q&A?

90

End Credits

Diver: CC BY-SA David Shankbone, via Wikipedia
Kitchen Funnel: CC BY-SA Donovan Govan, via Wikimedia
Then&Now: CC BY-ND NCinDC, via Flickr
Computer-Using Cat: CC BY Evan Lovely, via Wikimedia
"The Matrix": all rights reserved Warner Bros, still from the motion picture

91