Full Text Search using Azure Search. Shankar Subramanyam Senior Consultant...

Post on 18-Dec-2015


Full Text Search using Azure Search

Shankar Subramanyam, Senior Consultant

shankarblr@gmail.com | @Shankarblr

Enthusiast:
• Web/Cloud Technologies
• Financials
• Economics

About me

Agenda

• Overview of Full text search using Azure Search
• Demo – Build Movies Catalog using Azure Search
• Q & A

What is full text search?

• In text retrieval, full text search refers to techniques for searching a single computer-stored document or a collection in a full text database. Full-text search is distinguished from searches based on metadata or on parts of the original texts represented in databases (such as titles, abstracts, selected sections, or bibliographical references).

• Reference - http://en.wikipedia.org/wiki/Full_text_search

Full text search platform

• Lucene
• Solr
• ElasticSearch
• Azure Search
• FSIS
• Database Full text search
• etc.

Key words in search

• Indexes
• Documents
• Fields
  • Types of searchability
  • Retrievable
  • Non-retrievable
• Tokenization - Analyzer
• Facets
• Scoring

What is Azure Search ?

• Azure Search is a PaaS service
• ElasticSearch as a managed service
• High performance
• Horizontally scalable
• Administration and querying
  • Via REST API
  • Via C#, using Azure Search Client Library (NuGet: AzureSearchClient)
• Secured using API keys
  • Query keys (multiple)
  • Management keys (two)

Demo – Create Search Service in Azure portal

1. Log in to the Azure Portal
2. Click New
3. Select Data + Storage
4. Select Search as shown

1. Fill in all the information
2. For Pricing Tier, make sure to select Free
3. Click Create to create the Azure Search Service

Service Name and API Key

Azure Search Architecture

[Figure: a Service (a.k.a. resource) contains several Indexes, each with its own schema type (W, X, Y, Z). An Index of schema type X contains Documents of type X, stored as {json}, and each Document is made up of Fields.]

Inverted index

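[Figure: an inverted index maps Terms/Words to the Documents that contain them.]

Documents:
• 0: Little Red Riding Hood
• 1: Robin Hood
• 2: Little Women

Terms/Words and the documents they point to:
• aardvark: (none)
• hood: 0, 1
• little: 0, 2
• red: 0
• riding: 0
• robin: 1
• women: 2
• zoo: (none)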
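The inverted index on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, assuming simple whitespace tokenization and lowercasing; the function name `build_inverted_index` is hypothetical and says nothing about how Azure Search implements its index internally.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        # Assumption: split on whitespace; a real analyzer does much more.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


titles = ["Little Red Riding Hood", "Robin Hood", "Little Women"]
index = build_inverted_index(titles)

print(index["hood"])         # [0, 1]
print(index["little"])       # [0, 2]
print(index.get("zoo", []))  # [] (no document contains "zoo")
```

A lookup is then a dictionary access on the term rather than a scan over every document, which is what makes full text queries fast.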

Azure Search is structured

• A search index has a predefined structure
• It is not dynamic
• Documents support the following operations
  • Search
  • Suggestion
  • Lookup
  • Count
• Each field in the index has characteristics defined when created
  • Filterable?
  • Searchable?
  • Faceted?
  • Retrievable?
  • Sortable?

Field Characteristics: Key

• Required!
• Only one field per document can be the key
• Can be used to look up a document directly
  • Update
  • Delete

Field Characteristics: Searchable

• Makes the field full-text-search-able
• Allowed data types are string and collection of strings
• Breaks the words of the field for indexing purposes
  • “Big Red Jeep” will become separate components
• A search for “big”, “red”, “jeep”, or “big jeep” will hit this record
• Searchable fields cause bloat!
• Only make a field searchable if it needs to be

Field Characteristics: Filterable

• Does not undergo word breaking
• Exact matches only
• Only a search for “big red jeep” will hit a “big red jeep” record
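The difference between a searchable and a filterable field can be illustrated with a small sketch. It assumes lowercasing plus whitespace splitting as a stand-in for word breaking, and the helper names (`tokenize`, `searchable_match`, `filterable_match`) are hypothetical; the service's real analyzers and filter syntax are richer than this.

```python
def tokenize(text):
    """Break a value into lowercased words (a stand-in for a real analyzer)."""
    return text.lower().split()


def searchable_match(field_value, query, mode="any"):
    """Match on individual words, as a searchable field would."""
    field_terms = set(tokenize(field_value))
    hits = [term in field_terms for term in tokenize(query)]
    return all(hits) if mode == "all" else any(hits)


def filterable_match(field_value, query):
    """Match only when the whole value is identical, as a filter would."""
    return field_value == query


print(searchable_match("Big Red Jeep", "big"))                   # True
print(searchable_match("Big Red Jeep", "big jeep", mode="all"))  # True
print(filterable_match("big red jeep", "big jeep"))              # False
print(filterable_match("big red jeep", "big red jeep"))          # True
```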

Field Characteristics: Sortable

• By default, results are sorted by score

Field Characteristics: Facetable

• All data types except Geography points are facetable
• Used to group records by other notions
  • Jeeps that are sold by this {dealer}
  • Jeeps that are this {color}
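What a facet gives back can be approximated as a count of records per distinct value of a field. The sketch below assumes that reading; the `facet_counts` helper and the sample Jeep records are made up for illustration.

```python
from collections import Counter


def facet_counts(documents, field):
    """Count how many documents carry each distinct value of `field`."""
    return Counter(doc[field] for doc in documents if field in doc)


# Hypothetical sample records, not taken from the demo.
jeeps = [
    {"model": "Wrangler", "color": "red", "dealer": "North"},
    {"model": "Cherokee", "color": "black", "dealer": "North"},
    {"model": "Wrangler", "color": "red", "dealer": "South"},
]

print(facet_counts(jeeps, "color"))   # Counter({'red': 2, 'black': 1})
print(facet_counts(jeeps, "dealer"))  # Counter({'North': 2, 'South': 1})
```

A user interface typically shows these counts next to each value so that users can narrow the result set.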

Field Characteristics: Suggestions

• Used for auto-complete
• Only for string or collection of string
• False by default
• Causes bloat in the index!

Field Characteristics: Retrievable

• Allows the field to be returned in the search results
• Key fields must be retrievable
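The rules from the field-characteristics slides can be collected into a small checker. This is a sketch only: the attribute names in the field dictionaries mirror the slide headings and are assumptions rather than the exact REST schema, the type names (`Edm.String`, `Collection(Edm.String)`, `Edm.GeographyPoint`) are assumed, and the sample movie fields are hypothetical.

```python
# Assumed type names for string-like fields.
STRING_TYPES = {"Edm.String", "Collection(Edm.String)"}


def validate_fields(fields):
    """Return a list of rule violations for a list of field definitions."""
    errors = []
    keys = [f["name"] for f in fields if f.get("key")]
    if len(keys) != 1:
        errors.append(f"expected exactly one key field, found {len(keys)}")
    for f in fields:
        name, ftype = f["name"], f["type"]
        if f.get("searchable") and ftype not in STRING_TYPES:
            errors.append(f"{name}: searchable needs string or collection of string")
        if f.get("suggestions") and ftype not in STRING_TYPES:
            errors.append(f"{name}: suggestions need string or collection of string")
        if f.get("facetable") and ftype == "Edm.GeographyPoint":
            errors.append(f"{name}: geography points are not facetable")
        if f.get("key") and not f.get("retrievable", True):
            errors.append(f"{name}: key fields must be retrievable")
    return errors


# Hypothetical fields for a movies index.
movie_fields = [
    {"name": "id", "type": "Edm.String", "key": True, "retrievable": True},
    {"name": "Title", "type": "Edm.String", "searchable": True, "suggestions": True},
    {"name": "Year", "type": "Edm.Int32", "facetable": True},
]

print(validate_fields(movie_fields))  # []
```

Running the checker before creating an index is one way to catch mistakes early, since the structure cannot be treated as dynamic afterwards.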

Data Type / Properties Matrix

Demo : Movies Catalog Create Index

Query parameters

Request - GET /indexes/[index name]/docs?[query parameters]

Parameters:
• search=[string]
• searchMode=any|all
• searchFields=[string]
• $skip=#
• $top=#
• $count=true|false
• $select=[string]
• $filter=[string]
• $orderby=[string]
• facet=[string]
• api-version=[string]
• scoringProfile=[string]
• highlight=[string]
• scoringParameter=[string]
• highlightPreTag=[string]
• highlightPostTag=[string]

Example : https://rocks.search.windows.net/indexes/movies/docs?search=*&$count=true&$orderby=Year desc&api-version=2015-02-28&facet=Year
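The example URL can be assembled from its parts with the Python standard library. This is a minimal sketch that only builds the request and does not send it; the function name `build_search_url` is hypothetical, the `api-key` header value is a placeholder for one of the query keys mentioned earlier, and the space in `Year desc` is percent-encoded.

```python
from urllib.parse import quote, urlencode


def build_search_url(service, index, params):
    """Build GET /indexes/[index name]/docs?[query parameters] for a service."""
    base = f"https://{service}.search.windows.net/indexes/{index}/docs"
    # Keep "$" and "*" readable; everything else unsafe is percent-encoded.
    query = urlencode(params, quote_via=quote, safe="$*")
    return f"{base}?{query}"


url = build_search_url(
    "rocks",
    "movies",
    {
        "search": "*",
        "$count": "true",
        "$orderby": "Year desc",
        "api-version": "2015-02-28",
        "facet": "Year",
    },
)
headers = {"api-key": "<your-query-key>"}  # placeholder, not a real key

print(url)
```

The resulting `url` and `headers` can then be passed to any HTTP client to issue the GET request.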

Demo : Movies Catalog Consume Index

Q & A

References

• Azure Search REST API
  • http://msdn.microsoft.com/en-us/library/azure/dn798935.aspx
  • http://azure.microsoft.com/en-in/documentation/articles/search-api-2014-10-20-preview/
• Azure Search Client Library Getting Started
  • https://code.msdn.microsoft.com/Getting-Started-with-Azure-50b624b7
• Inverted index
  • http://en.wikipedia.org/wiki/Inverted_index