Elasticsearch for Westcoast

Post on 23-Jan-2018

Charlie Hull, Managing Director, Flax
Nick Gushlow, Systems Architect, Westcoast
Elasticsearch London Meetup

charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch

Search is never Simple
Elasticsearch for Westcoast

Building open source search applications since 2001

Independent, honest advice and analysis

Expert design & development, Apache Solr committers

Test-driven relevancy and performance tuning

Custom training & mentoring for your staff

Flexible support up to 24/7/365 with SLAs

Come to our Meetups!

Charlie Hull, Managing Director & co-founder of Flax

Nick Gushlow, Systems Architect at Westcoast

Who are we?

Why Westcoast needed a new search engine

The source data & the plan

The trouble with...

Building an admin panel for search

The search goes live

Lessons learned

What we'll cover today

Largest privately owned IT distributor in UK & Ireland

£1.5 billion turnover

Apple, HP, Lenovo, Microsoft, Samsung, Toshiba

Includes XMA / QC Supplies and Viglen

Who are Westcoast?

Old SQL-based search was not accurate enough

8,000 searches per day, 90% SKU-based

You searched for "iPad"... did you actually want an iPad?

Customers used Google / competitors to find part numbers

Static traffic numbers – 3,500 users per day, 7am to 7pm

Increase web revenue further, currently £40m

Why a new search engine?

Business approved a project to implement a change to ‘improve search’

Google Search Appliance

SLI

Apptus

Fredhopper

Elasticsearch

Time for a change

Live pricing

XML data sheets

Business user management interface

Synonyms / Exclusions

Boosts

Search vs Search vs Search

Requirements

0.5m products

Nested data (attributes)

Supplied as XML, one file per product

BUT!

– Live Pricing API will restrict results at search time

– Different for every end customer

– Based on hard to explain business rules

The source data
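
As a rough illustration of what "one XML file per product, with nested attributes" implies for indexing, the sketch below turns a made-up product file into a document that keeps each attribute as its own name/value object. The element names (product, sku, name, attribute) and the function product_to_document are assumptions for illustration only; the real Westcoast schema is not shown in the slides. Python is used for brevity, although the project's indexer was written in Java.

```python
import xml.etree.ElementTree as ET

# Hypothetical product file; the real schema is not given in the talk.
SAMPLE_XML = """
<product>
  <sku>ABC-123</sku>
  <name>Example Tablet 16GB</name>
  <attributes>
    <attribute name="Colour">Silver</attribute>
    <attribute name="Storage">16GB</attribute>
  </attributes>
</product>
"""


def product_to_document(xml_text):
    """Turn one product XML file into a dict ready to be indexed."""
    root = ET.fromstring(xml_text.strip())
    return {
        "sku": root.findtext("sku"),
        "name": root.findtext("name"),
        # Keep each attribute as its own name/value object so it can be
        # mapped as a nested field and faceted on later.
        "attributes": [
            {"name": a.get("name"), "value": (a.text or "").strip()}
            for a in root.iter("attribute")
        ],
    }


doc = product_to_document(SAMPLE_XML)
```

Keeping attributes as a list of name/value pairs, rather than one field per attribute name, avoids creating a new field for every attribute that appears across half a million products.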

Elasticsearch

Java client

Custom Java indexer (Dropwizard)

Search application (Dropwizard)

Admin panels (AngularJS)

Agile process

The plan

First, do your search

Send 5000 results to legacy pricing API

Merge the pricing information with search results

Now build your facets (including on price)

Hang on, doesn't Elasticsearch do facets for you?

The trouble with facets
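
The steps on this slide can be sketched roughly as follows: because prices (and which products a customer may see) only become known after the pricing API has been called, the facet counts have to be computed in application code rather than by Elasticsearch. Everything named here is hypothetical: the field names sku and brand, the price bands, and the prices dict that stands in for the legacy pricing API response. It is also an assumption that the pricing API "restricts" results by simply omitting products.

```python
from collections import Counter


def merge_prices(hits, prices):
    """Attach customer-specific prices; drop hits the pricing API did not return."""
    merged = []
    for hit in hits:
        price = prices.get(hit["sku"])
        if price is not None:
            merged.append({**hit, "price": price})
    return merged


def build_facets(results, price_bands):
    """Count facet values in application code, after pricing has been merged."""
    brand_counts = Counter(r["brand"] for r in results)
    price_counts = Counter()
    for r in results:
        for label, low, high in price_bands:
            if low <= r["price"] < high:
                price_counts[label] += 1
                break
    return {"brand": dict(brand_counts), "price": dict(price_counts)}


# Stand-ins for the search hits and the pricing API response.
hits = [
    {"sku": "A1", "brand": "Apple"},
    {"sku": "A2", "brand": "Apple"},
    {"sku": "H1", "brand": "HP"},
    {"sku": "L1", "brand": "Lenovo"},
]
prices = {"A1": 499.0, "A2": 329.0, "H1": 89.5}  # L1 is not visible to this customer
bands = [
    ("under 100", 0, 100),
    ("100 to 399", 100, 400),
    ("400 and over", 400, float("inf")),
]

results = merge_prices(hits, prices)
facets = build_facets(results, bands)
```

In this toy run the Lenovo item disappears from both the results and the brand facet, which is why counts taken directly from the search engine would have been wrong for this customer.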

Front end systems built by third party
– Solution: Search app with JSON API (defined by them)
– Encrypted JSON for use during sessions

Data for all the facets must be supplied to the UI
– Full result counts for applied facets need to be returned, in the order they were applied
– Solution: lots of searches

Custom facets for some customers
– Solution: an index of facet definitions

More trouble with facets
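
One possible reading of "lots of searches" is sketched below: to report a result count for each applied facet in the order it was applied, the application issues one count-only search per prefix of the filter list (no filters, the first filter, the first two, and so on). This interpretation, the field names (name, brand, colour) and the helper names are assumptions; the slides do not show the actual queries.

```python
def term_filter(field, value):
    """Build a single exact-match filter clause."""
    return {"term": {field: value}}


def queries_for_applied_facets(text, applied):
    """Return one query body per step: no filters, first filter, first two, ..."""
    bodies = []
    for n in range(len(applied) + 1):
        bodies.append({
            "query": {
                "bool": {
                    "must": [
                        {"match": {"name": {"query": text, "operator": "and"}}}
                    ],
                    # Only the first n facets, in the order the user applied them.
                    "filter": [term_filter(f, v) for f, v in applied[:n]],
                }
            },
            # Only the total is needed for a count, so fetch no documents.
            "size": 0,
        })
    return bodies


bodies = queries_for_applied_facets(
    "laptop", [("brand", "HP"), ("colour", "Black")]
)
```

Two applied facets therefore cost three searches, which is affordable at the low query load mentioned later in the talk.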

Boost for individual items
– Easy! Define in the source data

Term boosts
– e.g. some Macbooks over other Macbooks
– Harder – but still defined in source data

The trouble with boosting
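
A minimal sketch of the "easy" case, assuming the per-item boost from the source data is indexed as a numeric field: an Elasticsearch function_score query can multiply the text relevance score by that field using field_value_factor. The field names name and boost are hypothetical, and this is not necessarily how the Westcoast system applied its boosts. Term boosts are not attempted here.

```python
def boosted_query(text, boost_field="boost"):
    """Wrap a text query so each hit's score is scaled by a per-item boost field."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"name": {"query": text, "operator": "and"}}},
                "field_value_factor": {
                    "field": boost_field,
                    # Items without a boost value keep their original score.
                    "missing": 1.0,
                },
                "boost_mode": "multiply",
            }
        }
    }


body = boosted_query("macbook")
```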

A great way to run search projects!

...unless not everyone can do Agile

The trouble with Agile

Allows Westcoast to adjust
– Synonyms / Exclusions
– Remove items from index
– Test searches
– Test synonyms then push to live
• Synonyms are applied at index time because the default query operator is AND

Built in AngularJS

Building an Admin panel
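
To make the index-side synonym point concrete, here is a sketch of index settings in which the synonym filter appears only in the analyzer used at index time, while the search-time analyzer leaves query terms unexpanded. The analyzer and filter names (product_index, product_search, product_synonyms) and the sample rule are assumptions; the slides do not show the real configuration. A text field would then reference these through its analyzer and search_analyzer mapping parameters.

```python
def index_settings(synonym_rules):
    """Build analysis settings with synonyms expanded at index time only."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "product_synonyms": {
                        "type": "synonym",
                        "synonyms": synonym_rules,
                    }
                },
                "analyzer": {
                    # Used when documents are indexed: synonyms are expanded here.
                    "product_index": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "product_synonyms"],
                    },
                    # Used at query time: no expansion, so an AND query only
                    # requires the terms the user actually typed.
                    "product_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                },
            }
        }
    }


settings = index_settings(["laptop, notebook"])
```

The trade-off is that a change to the synonym list only takes effect after reindexing, which fits the "test synonyms then push to live" workflow on this slide.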

A single node for Elasticsearch

A single node for index & search applications

Ultimately mirrored for failover

Query load very low (1 QPS)
– But this may change!

Business hours support by Flax

The search goes live

Elasticsearch results were good

Business maintenance is large, boring, never-ending work

Changing customer behaviour is slow

Search results over 30% faster on average

Time savings for sales staff

Post Live

Integrating with legacy systems is hard

Business rules can be hard to understand & harder to explain

Not everything can be done with search

If you want to do Agile, make sure everyone else can

Lessons learned

Thank you!

Any questions?

charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch

Plug

3rd & 4th February 2016, Cambridge UK

Open source search for Bioinformatics

Free event near Cambridge on Wellcome Genome Campus covering both Solr & Elasticsearch, talks & hands-on workshops

http://www.ebi.ac.uk/pdbe/about/events

Plug #2

20th March 2016, Padua, Italy

First International Workshop on Recent Trends in News Information Retrieval

One-day workshop as part of the European Conference on Information Retrieval (ECIR 2016) – submission deadline end of January

http://research.signalmedia.co/newsir16/index.html

(including a free test dataset of 1m news articles!)
