Meet Solr For The First Time Again

Post on 05-Jul-2015

description

Anyone who has tried integrating search into an application knows how powerful Solr is, but has probably wished it were simpler to get started with and simpler to take to production. I will talk about the recent features that make Solr easier to use, and some of the changes we plan to add soon to make the experience even better.

transcript

Meet Solr for the first time again

Varun Thacker

Apache Solr has a huge install base and tremendous momentum

Most widely used search solution on the planet.

8M+ total downloads

Solr is both established & growing

250,000+ monthly downloads

Solr has tens of thousands of applications in production.

You use Solr everyday.

2500+ open Solr jobs.

Activity Summary

30 Day Summary

Aug 18 - Sep 17 2014

• 128 Commits • 18 Contributors

via https://www.openhub.net/p/solr

12 Month Summary Sep 17, 2013 - Sep 17, 2014

• 1351 Commits • 29 Contributors

Search - Until recently

• Large organizations (Enterprise)

• Expensive

• Complex

• $$$$$

New Age Search

• Everyone… startups, websites

• Special use cases

• E-commerce

• Mails and personal data

• Personal data - Across devices

• Social and Local!

• Analytics

Decision making!

• Short time frame

• Confidence measure:

• Getting started quick

• Configure and see the tip of the iceberg

• Issues only surface later in the story

Until recently…

• Getting started:

• Download

• java -jar start.jar

• SolrCloud, getting started….

• Download

• Copy example directory ‘x’ times over.

• java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar

• java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

• It runs!

Times… they are a changin…

• Download

• cd solr

• Standalone: bin/solr start

• SolrCloud, example, interactive:

• bin/solr start -e cloud (< 2 minutes!)

Let’s index some data

• Flexible JSON indexing: Solr accepts arbitrary JSON documents and can map them into the required fields at index time

• More reading: https://lucidworks.com/blog/indexing-custom-json-data/
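As a rough illustration of flexible JSON indexing, the sketch below builds (but does not send) a request to Solr's `/update/json/docs` handler, using its `split` and `f` parameters to turn one nested document into one Solr document per exam. The host, the `gettingstarted` collection name and the student document are assumptions made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_json_docs_request(base_url, collection, payload, split, field_mappings):
    """Build a POST request for /update/json/docs without sending it."""
    # "split" names the JSON path at which the payload is split into documents;
    # each "f" maps a JSON path onto a Solr field name.
    params = [("split", split)]
    params += [("f", mapping) for mapping in field_mappings]
    params.append(("commit", "true"))
    url = f"{base_url}/{collection}/update/json/docs?{urlencode(params)}"
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


# Hypothetical nested document: one student with several exams.
student = {
    "first": "John",
    "last": "Doe",
    "exams": [
        {"subject": "Maths", "marks": 90},
        {"subject": "Biology", "marks": 86},
    ],
}

request = build_json_docs_request(
    "http://localhost:8983/solr",   # assumed local Solr instance
    "gettingstarted",               # assumed collection name
    student,
    split="/exams",
    field_mappings=["first:/first", "last:/last",
                    "subject:/exams/subject", "marks:/exams/marks"],
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it to a running Solr; the sketch stops short of that so it can be read and run without a server.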

Managed Schema

• Solr is the schema owner

• REST APIs - Hide the implementation details

• Schema-less mode

• Update and Addition of Fields and FieldTypes

• More reading: https://lucidworks.com/blog/schemaless-solr-part-1/
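To make "update and addition of fields and field types" concrete, here is a minimal sketch of a Schema API call. It assumes the command-style JSON format (`add-field-type`, `add-field`) found in Solr 5.x, which may differ from the API available when these slides were written. The collection name, the field type `text_simple` and the field `title` are hypothetical.

```python
import json
from urllib.request import Request


def build_schema_request(base_url, collection, commands):
    """Build a POST to the collection's /schema endpoint without sending it."""
    url = f"{base_url}/{collection}/schema"
    body = json.dumps(commands).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


# The field type is listed before the field that uses it; Python dicts and
# json.dumps both preserve this insertion order.
commands = {
    "add-field-type": {
        "name": "text_simple",                      # hypothetical type name
        "class": "solr.TextField",
        "analyzer": {"tokenizer": {"class": "solr.StandardTokenizerFactory"}},
    },
    "add-field": {
        "name": "title",                            # hypothetical field name
        "type": "text_simple",
        "indexed": True,
        "stored": True,
    },
}

schema_request = build_schema_request(
    "http://localhost:8983/solr", "gettingstarted", commands)
print(schema_request.full_url)
print(schema_request.data.decode("utf-8"))
```

The point of the design is that the schema file itself is never touched: the client describes the change and Solr, as schema owner, applies it.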

Configuration APIs

• Configure Solr using APIs

• solrconfig.xml… What did you say?
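A small sketch of what configuring Solr through an API rather than by editing solrconfig.xml can look like. It assumes the Config API shape from Solr 5.x (a `set-property` command posted to `/config`); the host, collection name and the chosen soft-commit interval are illustrative assumptions.

```python
import json
from urllib.request import Request


def build_config_request(base_url, collection, properties):
    """Build a POST to the collection's /config endpoint without sending it."""
    url = f"{base_url}/{collection}/config"
    body = json.dumps({"set-property": properties}).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


# Assumed example: make new documents searchable within one second.
config_request = build_config_request(
    "http://localhost:8983/solr",
    "gettingstarted",
    {"updateHandler.autoSoftCommit.maxTime": 1000},
)
print(config_request.full_url)
print(config_request.data.decode("utf-8"))
```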

Solr Scale Toolkit

• Easily deploy SolrCloud clusters

• Live patching and rolling restarts

• Dependency on AWS soon to go away

• Chef or Puppet are still valid approaches

• More reading: http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/

Talking about the Admin UI…

• Already improved from 3.x

• Uploading documents

• Collections API is coming soon

Collection Actions
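The same collection actions are available over HTTP through the Collections API. The sketch below only builds the URL for a `CREATE` action; the host and the collection name `products` are assumptions for illustration.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, params):
    """Return the Collections API URL for an action such as CREATE or RELOAD."""
    query = {"action": action, "wt": "json"}
    query.update(params)
    return f"{base_url}/admin/collections?{urlencode(query)}"


create_url = collections_api_url(
    "http://localhost:8983/solr",       # assumed local SolrCloud node
    "CREATE",
    {"name": "products", "numShards": 2, "replicationFactor": 2},
)
print(create_url)
```

Opening the resulting URL against a running SolrCloud node would create a two-shard collection with two replicas per shard.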

Recently Added Features

• Document expiration and Time To Live (TTL)

• Cursors: Efficient Deep Paging

• Export Sorted Result Sets

• SSL support in SolrCloud

• Distributed Pivot Faceting

• Suggester v2

• CollapsingQParserPlugin

• ReRankingQParserPlugin

• Collections API improvements
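Of the features above, cursors are the easiest to show in a few lines. The sketch below follows the cursor protocol: start with `cursorMark=*`, sort on a clause that includes the unique key, and stop when `nextCursorMark` stops changing. `fake_fetch` is a hypothetical in-memory stand-in for Solr so the loop can run without a server; `solr_fetch` is an untested-here helper for a real `/select` URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def iterate_with_cursor(fetch_page, query, sort, rows=100):
    """Yield every matching document using cursorMark-based deep paging."""
    cursor = "*"  # initial cursor value defined by Solr
    while True:
        params = {"q": query, "sort": sort, "rows": rows,
                  "cursorMark": cursor, "wt": "json"}
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor means no more results
            return
        cursor = next_cursor


def solr_fetch(select_url):
    """Return a fetch function that queries a real Solr /select endpoint."""
    def fetch(params):
        with urlopen(f"{select_url}?{urlencode(params)}") as resp:
            return json.load(resp)
    return fetch


# Hypothetical stand-in for Solr: the cursor is simply an offset into DOCS.
DOCS = [{"id": str(i)} for i in range(5)]


def fake_fetch(params):
    start = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
    page = DOCS[start:start + params["rows"]]
    next_mark = str(start + len(page)) if page else params["cursorMark"]
    return {"response": {"docs": page}, "nextCursorMark": next_mark}


ids = [doc["id"] for doc in iterate_with_cursor(fake_fetch, "*:*", "id asc", rows=2)]
print(ids)
```

Against a real server, `iterate_with_cursor(solr_fetch("http://localhost:8983/solr/gettingstarted/select"), "*:*", "id asc")` would be the equivalent call, assuming that URL and that `id` is the unique key.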

There’s so much more coming up…

• Schema Bulk API

• Distributed IDF

• Query DSL

• Cross Data-center replication

• Cluster Backup and Restore

• SOLR - Make an application, not ‘war’.

It’s easy… and stable!

• Benchmarking

• Tons of users testing it

• Evolving test framework

Solr scalability is unmatched.

• 10TB+ Index Size • 10 Billion+ Documents • 100 Million+ Daily Requests

Where is it headed?

• Download

• See that server directory?

• Use start scripts

• Send a document, or a few…

• Things don’t really look the way they should?

• Use the schema APIs

• Add fields… not enough?

• Add field types and then add fields

• Configure Solr using REST APIs

For Production:

• Use Solr Scale Toolkit to deploy, patch and manage!

• Configure Solr using REST APIs

Lucidworks Fusion

• Intelligent Search Services/API

• Recommendation Module

• Signal Processing

• Analytics Service

• Discovery Engine

• Analytics Store

• Enrichment Services

• Analyst Workbench

• eCommerce Solution

• Admin/Management

• SiLK Log Analysis

• Search/Discovery

• Partner Solutions

• Connector Framework

Connect @

https://twitter.com/varunthacker

http://in.linkedin.com/in/varunthacker

varun.thacker@lucidworks.com