About Me
• Lucene/Solr committer. Work for Lucidworks; focus on hardening SolrCloud, devops, big data architecture / deployments
• Operated smallish cluster in AWS for Dachis Group (1.5 years ago, 18 shards, ~900M docs)
• Solr Scale Toolkit: Fabric/boto framework for deploying and managing clusters in EC2
• Co-author of Solr In Action with Trey Grainger
Agenda
1. Quick review of the SolrCloud architecture
2. Indexing & Query performance tests
3. Solr Scale Toolkit (quick overview)
4. Q & A
Solr in the wild …
https://twitter.com/bretthoerner/status/476830302430437376
SolrCloud distilled
A subset of optional features in Solr that enable and simplify horizontal scaling of a search index using sharding and replication.
Goals: performance, scalability, high-availability, simplicity, elasticity ... and community-driven!
Collection == distributed index
A collection is a distributed index defined by:
• named configuration stored in ZooKeeper
• number of shards: documents are distributed across N partitions of the index
• document routing strategy: how documents get assigned to shards
• replication factor: how many copies of each document in the collection
Collections API:
curl "http://localhost:8983/solr/admin/collections?
action=CREATE&name=logstash4solr&replicationFactor=2&
numShards=2&collection.configName=logs"
SolrCloud High-level Architecture
ZooKeeper
• Is a very good thing ... clusters are a zoo!
• Centralized configuration management
• Cluster state management
• Leader election (shard leader and overseer)
• Overseer distributed work queue
• Live Nodes
• Ephemeral znodes used to signal a server is gone
• Needs at least 3 nodes for quorum in production
ZooKeeper: State Management
• Keep track of live nodes via the /live_nodes znode (ephemeral znodes, ZooKeeper client timeout)
• Collection metadata and replica state in /clusterstate.json
• Every Solr node has watchers for /live_nodes and /clusterstate.json (see the watcher sketch below)
• Leader election: ZooKeeper sequence numbers on ephemeral znodes
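To make the watcher mechanics concrete, below is a minimal sketch using the plain Apache ZooKeeper Java client to read /live_nodes and /clusterstate.json with watches set, roughly what each Solr node does internally. The connect string and session timeout are illustrative assumptions; the znode paths follow the SolrCloud 4.x layout described above.

import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ClusterStateWatcher {
  public static void main(String[] args) throws Exception {
    // Connect to the ensemble; the 15s session timeout matches the value suggested later in this deck
    ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 15000, new Watcher() {
      public void process(WatchedEvent event) {
        // Fires when /live_nodes children change or /clusterstate.json data changes
        System.out.println("ZooKeeper event: " + event);
      }
    });

    // Ephemeral znodes under /live_nodes disappear when a Solr node's session expires
    List<String> liveNodes = zk.getChildren("/live_nodes", true);  // true = register the default watcher
    System.out.println("Live Solr nodes: " + liveNodes);

    // Collection metadata and replica states (leader, active, recovering, down)
    byte[] state = zk.getData("/clusterstate.json", true, null);
    System.out.println(new String(state, "UTF-8"));

    Thread.sleep(60000);  // keep the session open long enough to observe watch events
    zk.close();
  }
}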
Scalability Highlights
• No split-brain problems (b/c of ZooKeeper)
• All nodes in the cluster perform indexing and execute queries; no master node
• Distributed indexing: no SPoF, high throughput via direct updates to leaders, automated failover to a new leader (see the SolrJ sketch after this list)
• Distributed queries: add replicas to scale out QPS; parallelize complex query computations; fault-tolerance
• Indexing / queries continue so long as there is 1 healthy replica per shard
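To show what this buys a client, here is a minimal SolrJ sketch (4.x-era API; the ZooKeeper connect string, collection name, and field names are illustrative assumptions). CloudSolrServer watches cluster state in ZooKeeper, routes each update straight to the correct shard leader, and spreads queries across healthy replicas, so there is no master node to go through.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class CloudClientExample {
  public static void main(String[] args) throws Exception {
    // ZooKeeper-aware client: no single Solr node is a point of failure
    CloudSolrServer solr = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    solr.setDefaultCollection("logstash4solr");

    // The update is hashed on the id field and sent directly to that shard's leader
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");
    doc.addField("text_en", "sample log message");
    solr.add(doc);
    solr.commit();

    // Queries are load-balanced across healthy replicas of every shard
    QueryResponse rsp = solr.query(new SolrQuery("text_en:sample"));
    System.out.println("hits: " + rsp.getResults().getNumFound());

    solr.shutdown();
  }
}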
Cluster sizing
How many servers do I need to index X docs?
... shards ... ?
... replicas ... ?
I need N queries per second over M docs, how many servers do I need?
It depends!
Testing Methodology
• Transparent, repeatable results
• Ideally hoping for something owned by the community
• Synthetic docs ~1K each on disk, mix of field types
• Data set created using code borrowed from PigMix
• English text fields generated using a Zipfian distribution (see the sketch after this list)
• Java 1.7u67, Amazon Linux, r3.2xlarge nodes
• Enhanced networking enabled, placement group, same AZ
• Stock Solr (cloud) 4.10
• Using custom GC tuning parameters and auto-commit settings
• Use Elastic MapReduce to generate indexing load
• As many nodes as I need to drive Solr!
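The Zipfian text bullet is what makes the synthetic documents behave like natural language: a handful of terms are very common and the long tail is rare. The actual generator borrows code from PigMix; the class below is only a hypothetical rank-based Zipf sampler to illustrate the idea (vocabulary size and skew are made-up parameters).

import java.util.Random;

// Hypothetical rank-based Zipfian term sampler, for illustration only
public class ZipfTermSampler {
  private final double[] cumulative;
  private final Random rand = new Random();

  public ZipfTermSampler(int vocabularySize, double skew) {
    cumulative = new double[vocabularySize];
    double norm = 0.0;
    for (int rank = 1; rank <= vocabularySize; rank++) norm += 1.0 / Math.pow(rank, skew);
    double running = 0.0;
    for (int rank = 1; rank <= vocabularySize; rank++) {
      running += (1.0 / Math.pow(rank, skew)) / norm;
      cumulative[rank - 1] = running;  // cumulative probability of picking a term of this rank or better
    }
  }

  /** Returns a term index; low indexes (frequent terms) dominate, as in natural language. */
  public int nextTermIndex() {
    double p = rand.nextDouble();
    for (int i = 0; i < cumulative.length; i++) {
      if (p <= cumulative[i]) return i;
    }
    return cumulative.length - 1;
  }
}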
Indexing Performance

| Cluster Size | # of Shards | # of Replicas | Reducers | Time (secs) | Docs / sec |
| 10 | 10 | 1 | 48 | 1762 | 73,780 |
| 10 | 10 | 2 | 34 | 3727 | 34,881 |
| 10 | 20 | 1 | 48 | 1282 | 101,404 |
| 10 | 20 | 2 | 34 | 3207 | 40,536 |
| 10 | 30 | 1 | 72 | 1070 | 121,495 |
| 10 | 30 | 2 | 60 | 3159 | 41,152 |
| 15 | 15 | 1 | 60 | 1106 | 117,541 |
| 15 | 15 | 2 | 42 | 2465 | 52,738 |
| 15 | 30 | 1 | 60 | 827 | 157,195 |
| 15 | 30 | 2 | 42 | 2129 | 61,062 |
Visualize Server Performance
Direct Updates to Leaders
Replication
Indexing Performance Lessons
• Solr has no built-in throttling support – will accept work until it falls over; need to build this into your indexing application logic (see the sketch after this list)
• Oversharding helps parallelize indexing work and gives you an easy way to add more hardware to your cluster
• GC tuning is critical (more below)
• Auto hard-commit to keep transaction logs manageable
• Auto soft-commit to see docs as they are indexed
• Replication is expensive! (more work needed here)
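Because Solr keeps accepting updates until it tips over, the throttling has to live in the indexing client. Below is a minimal sketch of one way to do that with SolrJ: batch documents and cap the number of in-flight update requests with a semaphore. Batch size, concurrency, collection, and field names are illustrative assumptions, not what the benchmark code does.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ThrottledIndexer {
  static final int BATCH_SIZE = 500;     // docs per update request (assumption)
  static final int MAX_IN_FLIGHT = 8;    // concurrent update requests (assumption)

  public static void main(String[] args) throws Exception {
    final CloudSolrServer solr = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    solr.setDefaultCollection("logstash4solr");

    ExecutorService pool = Executors.newFixedThreadPool(MAX_IN_FLIGHT);
    final Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);  // back-pressure: producer blocks when Solr falls behind

    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>(BATCH_SIZE);
    for (int i = 0; i < 1000000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      doc.addField("text_en", "synthetic document " + i);
      batch.add(doc);

      if (batch.size() == BATCH_SIZE) {
        final List<SolrInputDocument> toSend = batch;
        batch = new ArrayList<SolrInputDocument>(BATCH_SIZE);
        inFlight.acquire();               // throttle here instead of letting Solr fall over
        pool.submit(new Runnable() {
          public void run() {
            try {
              solr.add(toSend);           // no explicit commit; rely on auto hard/soft commits
            } catch (Exception e) {
              e.printStackTrace();        // real code would retry or dead-letter the batch
            } finally {
              inFlight.release();
            }
          }
        });
      }
    }
    if (!batch.isEmpty()) solr.add(batch);  // flush the final partial batch
    pool.shutdown();
    pool.awaitTermination(30, TimeUnit.MINUTES);
    solr.shutdown();
  }
}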
GC Tuning
• Stop-the-world GC pauses can lead to ZooKeeper session expiration (which is bad)
• More JVMs with smaller heap sizes are better! (12-16GB max per JVM ~ less if you can)
• MMapDirectory relies on sufficient memory available to the OS cache (off-heap)
• GC activity during Solr indexing is stable and generally doesn’t cause any stop-the-world collections … queries are a different story
• Enable verbose GC logging (even in prod) so you can troubleshoot issues:

-verbose:gc -Xloggc:gc.log -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
-XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime \
-XX:+PrintGCApplicationConcurrentTime
GC Flags I use with Solr

-Xss256k \
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
-XX:MaxTenuringThreshold=8 -XX:NewRatio=3 \
-XX:CMSInitiatingOccupancyFraction=40 \
-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
-XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 \
-XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=12m \
-XX:CMSFullGCsBeforeCompaction=1 \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:CMSTriggerPermRatio=80 \
-XX:CMSMaxAbortablePrecleanTime=6000 \
-XX:+CMSParallelRemarkEnabled \
-XX:+ParallelRefProcEnabled \
-XX:+UseLargePages -XX:+AggressiveOpts
Sizing GC Spaces
http://kumarsoablog.blogspot.com/2013/02/jvm-parameter-survivorratio_7.html
Query Performance
• Still a work in progress!
• Sustained QPS & execution time of the 99th percentile (Coda Hale Metrics is good for this)
• Stable: ~5,000 QPS / 99th percentile at 300ms while indexing ~10,000 docs / sec
• Using the TermsComponent to build queries based on the terms in each field
• Harder to accurately simulate user queries over synthetic data
• Need a mix of faceting, paging, sorting, grouping, boolean clauses, range queries, boosting, filters (some cached, some not), etc ... (see the sketch after this list)
• Does the randomness in your test queries model (expected) user behavior?
• Start with one server (1 shard) to determine baseline query performance
• Look for inefficiencies in your schema and other config settings
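For reference, here is a minimal SolrJ sketch of the kind of mixed request the test queries need to cover: a boolean main query plus filters (one cacheable, one a range over a Trie date field), a facet, sorting, and paging. Field names, collection, and the ZooKeeper address are illustrative assumptions, not the actual benchmark queries.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrQuery.ORDER;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MixedQueryExample {
  public static void main(String[] args) throws Exception {
    CloudSolrServer solr = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    solr.setDefaultCollection("logstash4solr");

    SolrQuery q = new SolrQuery("text_en:error");          // boolean / term clauses
    q.addFilterQuery("level_s:WARN OR level_s:ERROR");     // filter likely to hit the filterCache
    q.addFilterQuery("timestamp_tdt:[NOW-1DAY TO NOW]");   // range query over a Trie date field
    q.setFacet(true);
    q.addFacetField("host_s");                             // facet on a lower-cardinality field
    q.setSort("timestamp_tdt", ORDER.desc);                // sorting
    q.setStart(0);
    q.setRows(25);                                         // paging

    QueryResponse rsp = solr.query(q);
    System.out.println("numFound=" + rsp.getResults().getNumFound()
        + " qtime=" + rsp.getQTime() + "ms");
    solr.shutdown();
  }
}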
Query Performance, cont.
• Higher risk of full GC pauses (facets, filters, sorting)
• Use optimized data structures (DocValues) for facet / sort fields, Trie-based numeric fields for range queries, facet.method=enum for low cardinality fields
• Check sizing of caches, esp. filterCache in solrconfig.xml
• Add more replicas; load-balance; Solr can set HTTP headers to work with caching proxies like Squid
• -Dhttp.maxConnections=## (default = 5, increase to accommodate more threads sending queries)
• Avoid increasing the ZooKeeper client timeout; ~15000 ms (15 seconds) is about right
• Don’t just keep throwing more memory at Java! (-Xmx128G)
Call me maybe - Jepsen
• Solr tests being developed by Lucene/Solr committer Shalin Mangar (@shalinmangar)
• Prototype in place:
• No ack’d writes were lost!
• No un-ack’d writes succeeded
See: https://github.com/LucidWorks/jepsen/tree/solr-jepsen
https://github.com/aphyr/jepsen
Solr Scale Toolkit
• Open source: https://github.com/LucidWorks/solr-scale-tk
• Fabric (Python) toolset for deploying and managing SolrCloud clusters in the cloud
• Code to support benchmark tests (Pig script for data generation / indexing, JMeter samplers)
• EC2 for now, more cloud providers coming soon via Apache libcloud
• Contributors welcome!
• More info: http://searchhub.org/2014/06/03/introducing-the-solr-scale-toolkit/
Provisioning cluster nodes
• Custom built AMI (one for PV instances and one for HVM instances) – Amazon Linux
• Block device mapping
• dedicated disk per Solr node
• Launch and then poll status until they are live
• verify SSH connectivity
• Tag each instance with a cluster ID and username
fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
Deploy ZooKeeper ensemble
• Two options:
• provision 1 to N nodes when you launch the Solr cluster
• use existing named ensemble
• Fabric command simply creates the myid files and zoo.cfg file for the ensemble
• and some cron scripts for managing snapshots
• Basic health checking of ZooKeeper status:
echo srvr | nc localhost 2181
fab new_zk_ensemble:zk1,n=3
Deploy SolrCloud cluster

fab new_solrcloud:test1,zk=zk1,nodesPerHost=2

• Uses bin/solr in Solr 4.10 to control Solr nodes
• Set system props: jetty.port, host, zkHost, JVM opts
• One or more Solr nodes per machine
• JVM mem opts dependent on instance type and # of Solr nodes per instance
• Optionally configure log4j.properties to append messages to RabbitMQ for SiLK integration
Automate day-to-day cluster management tasks
• Deploy a configuration directory to ZooKeeper
• Create a new collection
• Attach a local JConsole/VisualVM to a remote JVM
• Rolling restart (with Overseer awareness)
• Build Solr locally and patch remote
• Use a relay server to scp the JARs to the Amazon network once and then scp them to other nodes from within the network
• Put/get files
• Grep over all log files (across the cluster)
Wrap-up and Q & A
• LucidWorks: http://www.lucidworks.com -- We’re hiring!
• Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
• SiLK: http://www.lucidworks.com/lucidworks-silk/
• Solr In Action: http://www.manning.com/grainger/
• Connect: @thelabdude / [email protected]