Date post: | 05-Jul-2015 |
Category: | Engineering |
Upload: | varun-thacker |
Meet Solr for the first time again
Varun Thacker
Apache Solr has a huge install base and tremendous momentum
• The most widely used search solution on the planet.
• 8M+ total downloads
Solr is both established & growing
• 250,000+ monthly downloads
• Solr has tens of thousands of applications in production.
• You use Solr every day.
• 2500+ open Solr jobs.
Activity Summary
30 Day Summary: Aug 18 - Sep 17, 2014
• 128 Commits • 18 Contributors
12 Month Summary: Sep 17, 2013 - Sep 17, 2014
• 1351 Commits • 29 Contributors
via https://www.openhub.net/p/solr
Search - Until recently
• Large organizations (Enterprise)
• Expensive
• Complex
• $$$$$
New Age Search
• Everyone… startups, websites
• Special use cases
• E-commerce
• Mails and personal data
• Personal data - Across devices
• Social and Local!
• Analytics
Decision making!
• Short time frame
• Confidence measure:
• Getting started quick
• Configure and see the tip of the iceberg
• Issues only surface later in the story
Until recently…
• Getting started:
• Download
• java -jar start.jar
• SolrCloud, getting started….
• Download
• Copy example directory ‘x’ times over.
• java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
• java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
• It runs!
Times… they are a changin…
• Download
• cd solr
• Standalone: bin/solr start
• SolrCloud, example, interactive:
• bin/solr start -e cloud (< 2 minutes!)
Let’s index some data
• Flexible JSON indexing - Solr accepts arbitrary JSON documents and lets you map them into the document structure you want indexed
• More reading: https://lucidworks.com/blog/indexing-custom-json-data/
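As a rough illustration of custom JSON indexing, the sketch below builds (but does not send) a request to Solr's `/update/json/docs` handler. The `split` parameter says where each Solr document starts inside the JSON, and each `f` parameter maps a JSON path to a field name. The host, the collection name `gettingstarted`, the helper name, and the sample record are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a local Solr node with a collection called "gettingstarted".
SOLR_COLLECTION_URL = "http://localhost:8983/solr/gettingstarted"


def build_json_index_request(record, split, field_mappings,
                             base_url=SOLR_COLLECTION_URL):
    """Build a POST request for /update/json/docs without sending it.

    field_mappings is a list of (solr_field, json_path) pairs.
    """
    params = [("split", split)]
    params += [("f", f"{field}:{path}") for field, path in field_mappings]
    params.append(("commit", "true"))
    url = f"{base_url}/update/json/docs?{urlencode(params)}"
    body = json.dumps(record).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


# Hypothetical sample record: one student with two nested exam entries.
student = {
    "first": "John",
    "last": "Doe",
    "exams": [
        {"subject": "Maths", "score": 77},
        {"subject": "Biology", "score": 91},
    ],
}

index_request = build_json_index_request(
    student,
    split="/exams",  # one Solr document per entry under "exams"
    field_mappings=[
        ("first", "/first"),
        ("last", "/last"),
        ("subject", "/exams/subject"),
        ("score", "/exams/score"),
    ],
)
print(index_request.full_url)
# To actually index against a running Solr:
#   urllib.request.urlopen(index_request)
```

Splitting on `/exams` is a design choice for this sample: each exam becomes its own document, with the parent's `first` and `last` values mapped alongside it.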
Managed Schema
• Solr is the schema owner
• REST APIs - Hide the implementation details
• Schema-less mode
• Update and Addition of Fields and FieldTypes
• More reading: https://lucidworks.com/blog/schemaless-solr-part-1/
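To make "Solr is the schema owner" a little more concrete, here is a minimal sketch that builds a Schema API request adding a field type and then a field that uses it. It assumes the bulk-style `add-field-type` and `add-field` commands documented for Solr 5 and later; older releases exposed a different endpoint shape, so check your version. The host, collection name, helper name, and the `text_simple` and `summary` names are hypothetical.

```python
import json
from urllib.request import Request

# Assumption: a local Solr node with a collection called "gettingstarted".
SOLR_COLLECTION_URL = "http://localhost:8983/solr/gettingstarted"


def build_schema_request(commands, base_url=SOLR_COLLECTION_URL):
    """Build a POST to <collection>/schema without sending it.

    commands maps a Schema API command name to its definition.
    Insertion order is preserved, so a field type can be listed
    before the field that refers to it.
    """
    body = json.dumps(commands).encode("utf-8")
    return Request(f"{base_url}/schema", data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


schema_request = build_schema_request({
    # Hypothetical field type: standard tokenizer plus lowercasing.
    "add-field-type": {
        "name": "text_simple",
        "class": "solr.TextField",
        "analyzer": {
            "tokenizer": {"class": "solr.StandardTokenizerFactory"},
            "filters": [{"class": "solr.LowerCaseFilterFactory"}],
        },
    },
    # Hypothetical field that uses the type defined above.
    "add-field": {
        "name": "summary",
        "type": "text_simple",
        "stored": True,
    },
})
print(schema_request.data.decode("utf-8"))
# To apply it to a running Solr:
#   urllib.request.urlopen(schema_request)
```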
Configuration APIs
• Configure Solr using APIs
• solrconfig.xml… What did you say?
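A small sketch of what configuring Solr through an API instead of editing solrconfig.xml could look like. It assumes the `set-property` command of the Config API as documented for Solr 5 and later; the host, collection name, helper name, and chosen value are illustrative assumptions, and the request is only built, not sent.

```python
import json
from urllib.request import Request

# Assumption: a local Solr node with a collection called "gettingstarted".
SOLR_COLLECTION_URL = "http://localhost:8983/solr/gettingstarted"


def build_config_request(properties, base_url=SOLR_COLLECTION_URL):
    """Build a POST to <collection>/config that sets common properties."""
    body = json.dumps({"set-property": properties}).encode("utf-8")
    return Request(f"{base_url}/config", data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


# Illustrative value: soft-commit at most every 5 seconds.
config_request = build_config_request(
    {"updateHandler.autoSoftCommit.maxTime": 5000}
)
print(config_request.data.decode("utf-8"))
# To apply it to a running Solr:
#   urllib.request.urlopen(config_request)
```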
Solr Scale Toolkit
• Easily deploy SolrCloud clusters
• Live patching and rolling restarts
• Dependency on AWS soon to go away
• Chef or Puppet are still valid approaches
• More reading: http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/
Talking about the Admin UI…
• Already improved from 3.x
• Uploading documents
• Collections API is coming soon
Collection Actions
Recently Added Features
• Document expiration and Time To Live (TTL)
• Cursors: Efficient Deep Paging
• Export Sorted Result Sets
• SSL support in SolrCloud
• Distributed Pivot Faceting
• Suggester v2
• CollapsingQParserPlugin
• ReRankingQParserPlugin
• Collections API improvements
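Of the features above, cursors are the easiest to show in a few lines. The sketch below follows the usual `cursorMark` loop: start with `cursorMark=*`, sort on something that includes the unique key, and stop once Solr hands back the same cursor it was given. The function names, the host, the collection name, and the in-memory `fake_fetch_page` stand-in are assumptions for illustration; `solr_fetch_page` shows roughly how a real call might look.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def iterate_with_cursor(fetch_page, query, sort, rows=100):
    """Yield every matching document using cursor-based deep paging.

    fetch_page takes a dict of query parameters and returns the parsed
    JSON response. sort must include the collection's unique key field.
    """
    cursor = "*"  # "*" asks for the first page
    while True:
        response = fetch_page(
            {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        )
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor: nothing left to read
            break
        cursor = next_cursor


def solr_fetch_page(params,
                    base_url="http://localhost:8983/solr/gettingstarted"):
    """Real fetch against an assumed local collection (not called below)."""
    url = f"{base_url}/select?{urlencode({**params, 'wt': 'json'})}"
    with urlopen(url) as resp:
        return json.load(resp)


def fake_fetch_page(params):
    """In-memory stand-in for Solr so the loop can run without a server."""
    docs = [{"id": f"doc{i}"} for i in range(5)]
    mark = params["cursorMark"]
    start = 0 if mark == "*" else int(mark)
    page = docs[start:start + params["rows"]]
    next_mark = str(start + len(page)) if page else mark
    return {"response": {"docs": page}, "nextCursorMark": next_mark}


ids = [doc["id"] for doc in
       iterate_with_cursor(fake_fetch_page, "*:*", "id asc", rows=2)]
print(ids)
```

Passing the fetch function in keeps the paging loop separate from the HTTP call, so swapping `fake_fetch_page` for `solr_fetch_page` is the only change needed to try it against a live node.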
There’s so much more coming up…
• Schema Bulk API
• Distributed IDF
• Query DSL
• Cross Data-center replication
• Cluster Backup and Restore
• SOLR - Make an application, not ‘war’.
It’s easy… and stable!
• Benchmarking
• Tons of users testing it
• Evolving test framework
Solr scalability is unmatched.
• 10TB+ Index Size • 10 Billion+ Documents • 100 Million+ Daily Requests
Where is it headed?
• Download
• See that server directory?
• Use start scripts
• Send a document, or a few…
• Things don’t really look the way they should?
• Use the schema APIs
• Add fields… not enough?
• Add field types and then add fields
• Configure Solr using REST APIs
For Production:
• Use Solr Scale Toolkit to deploy, patch and manage!
• Configure Solr using REST APIs
Lucidworks Fusion
[Architecture diagram. Labeled components: Intelligent Search Services/API; Recommendation Module; Signal Processing; Analytics Service; Discovery Engine; Analytics Store; Enrichment Services; Analyst Workbench; eCommerce Solution; Admin/Management; SiLK Log Analysis; Search/Discovery; Partner Solutions; Connector Framework]
Connect @
https://twitter.com/varunthacker
http://in.linkedin.com/in/varunthacker