Date posted: 09-Jan-2017
Category: Engineering
Uploaded by: cihan-b
GLOBAL SECONDARY INDEXES: NEW HIGH PERFORMANCE INDEXER
Cihan Biyikoglu, Dir. Product Management
©2015 Couchbase Inc. 2
Goals
• Get to know Global Secondary Indexes (GSI), the new high performance indexer for N1QL
• Look at indexing lifecycle & management with GSI
• Cover top best practices and tips with GSI
Agenda
• Overview
  • Indexing in Couchbase Server 4.0
  • Couchbase Server 4.0 Architecture
  • Indexers in Couchbase Server 4.0
  • Indexing today and indexing with GSI
• Working with Global Secondary Indexes
  • GSI Architecture
  • GSI Lifecycle: Creation & Maintenance
  • Index Availability & Rebalance
  • Index Placement and Load Balancing
  • Monitoring GSI
  • Best Practices for GSI
• Q&A
Overview
Couchbase Server Cluster Architecture
[Diagram: a six-node cluster (Couchbase Server 1 to 6). Each node is labeled with a Cluster Manager, Data Service, Index Service, Query Service, Managed Cache and Storage; data shards are spread across the nodes.]
Indexing in Couchbase Server 4.0: Multiple Indexers
• GSI (new) – Index Service: New indexing for N1QL, built for low latency queries without compromising on mutation performance (insert/update/delete). Independently partitioned and independently scalable indexes in the Index Service.
• Map/Reduce Views – Data Service: Powerful programmable indexer for complex reporting and indexing logic. Full partition alignment and paired scalability with the Data Service.
• Spatial Views – Data Service: Incremental R-tree indexing for powerful bounding-box queries. Full partition alignment and paired scalability with the Data Service.
Query and Index Today
Once upon a time in a User Profile System…
Q1: Find the top 10 most "active" customers by number of logins in Jan 2015.
{… "customer_name": "Cihan", "total_logins": {… "aug_2015": 100, …}, "type": "customer_profile" …}
[Diagram: index for Q1, labeled "Active @ Jan 2015"]
Query and Index Today
CREATE INDEX index_name ON customer_bucket(customer_name, total_logins.jan_2015)
WHERE type = "customer_profile";
SELECT customer_name, total_logins.jan_2015 FROM customer_bucket
WHERE type = "customer_profile"
ORDER BY total_logins.jan_2015 DESC LIMIT 10;
[Diagram: index for Q1, labeled "Active @ Jan 2015", spread across N nodes]
Q1: Execution plan on N nodes
• Scatter: execute Q1 on N nodes
• Gather: gather N results
• Finalize: execute Q1 on the governor node
Query and Index with GSI
CREATE INDEX index_name ON customer_bucket(customer_name, total_logins.jan_2015)
WHERE type = "customer_profile";
SELECT customer_name, total_logins.jan_2015 FROM customer_bucket
WHERE type = "customer_profile"
ORDER BY total_logins.jan_2015 DESC LIMIT 10;
[Diagram: index for Q1, labeled "Active @ Jan 2015", on a single Index Service node]
Q1: Execution plan on N nodes
• Execute Q1 on the N1QL Service node
• Scan the index on the Index Service node
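The two execution plans can be contrasted with a small toy model. This is only a sketch under stated assumptions, not Couchbase code: the data, the function names and the "nodes touched" counter are all hypothetical, and each data node is modeled as a plain Python list of (jan_2015 logins, customer_name) pairs.

```python
import heapq

def scatter_gather_top_k(partitions, k):
    """Toy model of the scatter-gather plan: every node answers, one node merges."""
    # Scatter: each node computes its local top-k rows.
    partials = [heapq.nlargest(k, part) for part in partitions]
    # Gather + finalize: merge the N partial results and cut to k again.
    merged = [row for partial in partials for row in partial]
    return heapq.nlargest(k, merged), len(partitions)

def global_index_top_k(partitions, k):
    """Toy model of the GSI plan: one sorted index holds every entry."""
    # The whole index lives on one node, so a single descending scan is enough.
    index = sorted((row for part in partitions for row in part), reverse=True)
    return index[:k], 1

# Hypothetical data: (jan_2015 logins, customer_name) spread over 3 data nodes.
nodes = [
    [(100, "Cihan"), (7, "Ana")],
    [(42, "Bo"), (85, "Dee")],
    [(63, "Eli"), (12, "Fay")],
]
top_sg, nodes_touched_sg = scatter_gather_top_k(nodes, 2)
top_gsi, nodes_touched_gsi = global_index_top_k(nodes, 2)
```

Both plans return the same rows; the difference in this model is that the first one has to touch every node while the second touches one.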
Introducing Global Secondary Indexes
What are Global Secondary Indexes? High performance indexes for low latency queries with powerful caching, storage and independent placement.
Power of GSI
• Fully integrated into N1QL query optimization and execution
• Independent index distribution for limiting scatter-gather
• Independent scalability with the Index Service (more on this later)
• Powerful caching and storage with ForestDB
Which to choose – GSI vs Views
Workloads | New GSI in v4.0 | Map/Reduce Views
Complex Reporting | Just-in-time aggregation | Pre-aggregated
Workload Optimization | Optimized for scan latency & throughput | Optimized for insertion
Flexible Index Logic | N1QL functions | JavaScript
Secondary Lookups | Single-node lookup | Scatter-gather
Tunable Consistency | Staleness false or ok or everything in between | Staleness false or ok
Which to choose – GSI vs Views
Capabilities | New GSI in v4.0 | Map/Reduce Views
Partitioning Model | Independent – Indexing Service | Aligned to data – Data Service
Scale Model | Independently scale Index Service | Scale with Data Service
Fetch with Index Key | Single node | Scatter-gather
Range Scan | Single node | Scatter-gather
Grouping, Aggregates | With N1QL | Built-in with Views API
Caching | Managed | Not managed
Storage | ForestDB | Couchstore
Availability | Multiple identical indexes, load balanced | Replica based
Deep Dive
GSI Architecture
[Diagram: GSI architecture. Data Service: a Projector & Router reads Bucket#1 and Bucket#2 and feeds a DCP stream to the Index Service. Index Service: a Supervisor (index maintenance & scan coordinator) manages Index#1 to Index#4 on the ForestDB storage engine. Query Service: the query processor (cbq-engine).]
Projector and Router
• 1 Projector and Router per node
• 1 stream of changes per bucket per Supervisor
Supervisor
• 1 Supervisor per node
• Many indexes per Supervisor
Deeper Dive into [email protected] - Architecture Track
Deep Dive into Global Secondary Indexing Architecture in Couchbase Server 4.0
John Liang, Architect, Couchbase
GSI Lifecycle
Indexing Lifecycle: Primary vs Secondary
Primary Index is a full list of document keys within a given bucket.
CREATE PRIMARY INDEX index_name
ON bucket_name USING GSI|VIEW
WITH {"nodes": ["node_name"], "defer_build": true}; /* GSI-only */
Secondary Index is an index on a field/expression on a subset of documents for lookups.
CREATE INDEX index_name
ON bucket_name (field/expression, …)
WHERE filter_expressions
USING GSI|VIEW
WITH {"nodes": ["node_name"], "defer_build": true}; /* GSI-only */
Deferred Index Building
Index building can be deferred to build multiple indexes all at once with greater scan efficiency.
CREATE INDEX … WITH {… "defer_build": true};
CREATE INDEX … WITH {… "defer_build": true};
…
BUILD INDEX ON bucket_name(index_name1, …) USING GSI;
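As a rough illustration of the pattern, the helper below generates the statement text for a batch of deferred indexes followed by a single BUILD INDEX. It is a sketch only: the function name, bucket name, index names and keys are hypothetical, and the strings are not sent to any server.

```python
def deferred_index_statements(bucket, indexes):
    """Return N1QL text: one deferred CREATE INDEX per entry, then one BUILD INDEX."""
    statements = [
        f'CREATE INDEX {name} ON {bucket}({keys}) USING GSI WITH {{"defer_build": true}};'
        for name, keys in indexes.items()
    ]
    # A single BUILD INDEX names every deferred index, so they are built together.
    names = ", ".join(indexes)
    statements.append(f"BUILD INDEX ON {bucket}({names}) USING GSI;")
    return statements

# Hypothetical bucket and index definitions (index name -> index keys).
stmts = deferred_index_statements(
    "customer_bucket",
    {"idx_name": "customer_name", "idx_logins": "total_logins.jan_2015"},
)
```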
DEMO: Quick tour of GSI
GSI Partitioning and Placement
GSI Placement and Partitioning: Place GSI Indexes Using the NODES Clause
• Each GSI resides on 1 node. You can specify the node using the nodes clause.
CREATE INDEX i1 … WITH {"nodes": "node1"};
• You can scale out the index by creating identical indexes (load balanced).
CREATE INDEX i1 … WITH {"nodes": "node1"};
CREATE INDEX i2 … WITH {"nodes": "node2"};
…
GSI Placement and Partitioning: Partition Indexes Manually with the WHERE Clause
You can partition with the WHERE clause and place the indexes on various nodes for scaling out.
CREATE INDEX i1 … WHERE zipcode BETWEEN "94000" AND "94999" …
CREATE INDEX i2 … WHERE zipcode BETWEEN "95000" AND "96000" …
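One way to reason about manual partitions is to ask which index can serve a given zipcode range. The sketch below is a toy model, not the N1QL planner: it assumes a partitioned index is only usable when the query range falls entirely inside that index's WHERE range, and the node names and the function name are hypothetical.

```python
# Hypothetical catalog mirroring the two WHERE-partitioned indexes above:
# index name -> (inclusive low zipcode, inclusive high zipcode, node).
PARTITIONS = {
    "i1": ("94000", "94999", "node1"),
    "i2": ("95000", "96000", "node2"),
}

def index_for_range(low, high):
    """Return the index whose zipcode range fully contains [low, high], else None."""
    for name, (p_low, p_high, _node) in PARTITIONS.items():
        # Fixed-width zipcode strings compare correctly as plain strings.
        if p_low <= low and high <= p_high:
            return name
    return None
```

In this model a range that straddles both partitions, such as "94900" to "95100", matches neither index, so the application would have to split it into one query per partition.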
GSI Availability and Rebalance
GSI Availability and Rebalance: Use Multiple Identical Indexes for Availability
GSI Availability
• Create multiple identical indexes on separate nodes for availability.
• GSI will automatically divert traffic if any copy goes down.
GSI & Rebalance
• Removing or failing a node that runs the Index Service removes the GSI indexes on that node.
• Adding a node with the Index Service does not automatically move any indexes to the new node.
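The availability behavior can be pictured with a toy router over identical index copies. This is a sketch under assumptions: the slides only say that traffic is load balanced and diverted when a copy goes down, so the round-robin policy, the class name and the node names here are hypothetical.

```python
import itertools

class IndexCopies:
    """Toy router over identical index copies placed on different nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()
        self._cycle = itertools.cycle(self.nodes)

    def mark_down(self, node):
        self.down.add(node)

    def pick(self):
        # Round-robin over the copies, skipping nodes that are marked down.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node not in self.down:
                return node
        raise RuntimeError("no live copy of the index")

copies = IndexCopies(["node1", "node2"])
first_two = [copies.pick(), copies.pick()]      # scans alternate between copies
copies.mark_down("node1")
after_failure = [copies.pick(), copies.pick()]  # all scans go to the live copy
```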
Monitoring GSI
Monitoring GSI Indexes
• Index size and maintenance stats
• Index scan stats
Best Practices with GSI
New Consistency Settings!
View Staleness
• Ok: unbounded; query what is available in the index/view now.
• False: query after all changes up to the request timestamp (and maybe more) have been indexed for a given index or view.
New Indexes with Couchbase Server 4.0
• Improves granularity of the consistency logical timestamp.
• New: scan consistency can be set to any logical timestamp.
• Indicate stale=false, stale=ok, and everything in between.
Flexible Consistency Settings
Time
• t1: insert (k1, v1) …
• t2: do other business logic computation …
• t3: issue query/read on (k1, v1)
Querying with t3 vs t1
• With t3: catch up all the indexes to t3 and then issue the query. Identical to "stale=false".
• With t1: catch up all the indexes to t1 and then issue the query. Improved efficiency over "stale=false".
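The difference between waiting for t1 and waiting for t3 can be sketched with a toy index that lags behind the data. This is a simplified model with hypothetical names, not the Index Service: logical timestamps are plain integers, and "catch-up work" is just a count of mutations applied before the scan is allowed to answer.

```python
class ToyIndex:
    """Toy index that lags behind the data by logical timestamp."""

    def __init__(self):
        self.indexed_up_to = 0   # last mutation timestamp applied to the index
        self.catch_up_work = 0   # mutations applied while a scan waited

    def scan(self, pending, wait_until=None):
        # wait_until=None models stale=ok: answer from whatever is indexed now.
        # pending must be sorted by timestamp.
        if wait_until is not None:
            for ts in pending:
                if self.indexed_up_to < ts <= wait_until:
                    self.indexed_up_to = ts
                    self.catch_up_work += 1
        return self.indexed_up_to

# Hypothetical mutations at logical timestamps 1 (the insert of k1), 2 and 3.
pending = [1, 2, 3]
at_t1, at_t3, stale_ok = ToyIndex(), ToyIndex(), ToyIndex()
seen_t1 = at_t1.scan(pending, wait_until=1)   # wait only for our own insert
seen_t3 = at_t3.scan(pending, wait_until=3)   # behaves like stale=false
seen_ok = stale_ok.scan(pending)              # behaves like stale=ok
```

Waiting for t1 already guarantees that (k1, v1) is visible, while doing a third of the catch-up work in this example.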
Complex Types and GSI: Indexing Complex Types
Sub-document attributes
CREATE INDEX ifriend_id ON default(friends.id) USING GSI;
SELECT * FROM default WHERE friends.id = "002819";
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
Complex Types and GSI: Indexing Complex Types
Compound keys
CREATE INDEX ifriends_id_class ON default(friends.id, friends.class) USING GSI;
SELECT * FROM default WHERE friends.id = "002819" AND friends.class = "005";
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
Complex Types and GSI: Indexing Complex Types
Sub-documents
CREATE INDEX ifriend ON default(friends) USING GSI;
SELECT * FROM default WHERE friends = {"class": "005", "id": "002819"};
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
Complex Types and GSI: Indexing Complex Types
Arrays
CREATE INDEX itags_sorted ON default(ARRAY_SORT(tags)) USING GSI;
SELECT * FROM default WHERE ARRAY_SORT(tags) = ARRAY_SORT([0,1,2,3,4,5,6,7,8,9]);
{ "id": "00000000000001", "desc": "---", "type": "friends", "tags": [0,1,2,3,4,5,6,7,8,9], "friends": { "id": "002819", "class": "005" } }
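To make the four index definitions concrete, the sketch below evaluates an equivalent of each index key expression against the sample document. It is an illustration only: plain Python stands in for the N1QL expressions, `sorted` stands in for ARRAY_SORT, and nothing here reflects how the indexer stores its entries.

```python
# The sample document from the slides.
doc = {
    "id": "00000000000001", "desc": "---", "type": "friends",
    "tags": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "friends": {"id": "002819", "class": "005"},
}

# Each lambda mimics one index key expression from the slides.
index_keys = {
    "ifriend_id": lambda d: d["friends"]["id"],                                  # friends.id
    "ifriends_id_class": lambda d: (d["friends"]["id"], d["friends"]["class"]),  # compound key
    "ifriend": lambda d: d["friends"],                                           # whole sub-document
    "itags_sorted": lambda d: sorted(d["tags"]),                                 # ARRAY_SORT(tags)
}

# The value each index would be keyed on for this document.
entries = {name: expr(doc) for name, expr in index_keys.items()}
```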
Get Started Today Couchbase Server 4.0 & N1QL
Couchbase.com/beta
Thank you.