Date posted: 18-Jul-2015
Category: Technology
Uploaded by: codecampiasi
Cloud powered search
LIVIU MAZILU, RADU PINTILIE
April 25, 2015 | Cloud powered search
© EXPERT NETWORK
CODECAMP
Previous subjects
Challenges in distributed applications
SQL Azure Federation
HDInsight
DocumentDB
Agenda
Azure Search
The need for search
Search explained
Development
Case Scenarios
The need for search
Why do we search for data?
How do we store it to search efficiently?
What’s important?
Is this a search engine?
WHERE [field] LIKE '%codecamp%'
WHAT IS A SEARCH ENGINE?
Efficient indexing of data
  On all fields / combinations of fields
Analyzing data
Text search
Tokenizing
Stemming
Filtering
Understanding locations
Relevance scoring
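The text-analysis steps listed above (tokenizing, stemming, filtering) can be sketched in a few lines. This is a toy illustration only: the stop-word list and the suffix rules are hypothetical and far cruder than what a real analyzer does.

```python
import re

# Hypothetical stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "is", "if", "i"}
SUFFIXES = ("ing", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Tokenize, drop stop words (filtering), then stem what is left."""
    return [stem(token) for token in tokenize(text) if token not in STOP_WORDS]


if __name__ == "__main__":
    print(analyze("Check if SIM card exists"))
```

The terms produced by `analyze` are what would end up in the index, so a query has to go through the same steps before it is matched.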
Lucene
Document: collection of fields
Field: string based key-value pair
Collection: set of documents
Inverted index: each term lists the documents that contain it
Score: relevancy for each document matching the query
How searching works
Id | Title | UserId | ViewCount | Tags
1 | Controller Action ambiguity even with [HttpPost] decoration? (ASP.NET MVC4) | 5 | 352 | asp.net asp.net-mvc asp.net-mvc-4 f#
2 | Why can't I use a scrollwheel on a webpage? | 6 | 109 | c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3
3 | Access session variable of one site in another | 7 | 78 | asp.net .net
4 | Check if SIM card exists | 5 | 209 | c# windows-phone-8
How searching works: inverted index
Title | Doc
Access session variable of one site in another | 3
Check if SIM card exists | 4
Controller Action ambiguity even with [HttpPost] decoration? (ASP.NET MVC4) | 1
Why can't I use a scrollwheel on a webpage? | 2
UserID | Docs
5 | 1, 4
6 | 2
7 | 3
ViewCount | Doc
78 | 3
109 | 2
209 | 4
352 | 1
Query: UserID = 5
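A minimal sketch of how such a per-field inverted index could be built and queried, using the four sample rows from the slides. The function names are made up for this example; a real engine stores the postings in a far more compact structure.

```python
from collections import defaultdict

# Rows copied from the sample table in the slides (Id, UserId, ViewCount).
POSTS = [
    {"Id": 1, "UserId": 5, "ViewCount": 352},
    {"Id": 2, "UserId": 6, "ViewCount": 109},
    {"Id": 3, "UserId": 7, "ViewCount": 78},
    {"Id": 4, "UserId": 5, "ViewCount": 209},
]


def build_field_index(rows, field):
    """Map each distinct value of `field` to the sorted ids of rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row["Id"])
    return {value: sorted(ids) for value, ids in index.items()}


def lookup(index, value):
    """Return the matching document ids, or an empty list when none match."""
    return index.get(value, [])


if __name__ == "__main__":
    user_index = build_field_index(POSTS, "UserId")
    print(lookup(user_index, 5))  # documents 1 and 4, as in the slide's index
```

The query becomes a single dictionary lookup instead of a scan over every row, which is the point of inverting the table.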
How searching works: full text search
Id | Tags
1 | asp.net asp.net-mvc asp.net-mvc-4 f#
2 | c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3
3 | asp.net .net
4 | c# windows-phone-8
Term | Doc
.net | 3
asp.net | 1, 2, 3
asp.net-mvc-4 | 1, 2
c# | 2, 4
f# | 1
javascript | 2
mvc | 1
twitter-bootstrap-3 | 2
windows-phone-8 | 4
Query: “javascript” in Tags
Query: “asp.net” in Tags
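The two tag queries above can be reproduced with a small term index. This sketch assumes terms are produced by splitting on whitespace; the analyzer behind the slide's term table may split differently, so the term list it yields is not guaranteed to be identical.

```python
from collections import defaultdict

# Tags column copied from the slides; keys are document ids.
TAGS = {
    1: "asp.net asp.net-mvc asp.net-mvc-4 f#",
    2: "c# javascript asp.net asp.net-mvc-4 twitter-bootstrap-3",
    3: "asp.net .net",
    4: "c# windows-phone-8",
}


def build_term_index(docs):
    """Split each tag string on whitespace and record which documents hold each term."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Exact term match; returns the sorted ids of the documents containing it."""
    return sorted(index.get(term, set()))


def search_all(index, terms):
    """AND query: ids of documents that contain every one of the given terms."""
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


if __name__ == "__main__":
    term_index = build_term_index(TAGS)
    print(search(term_index, "javascript"))
    print(search(term_index, "asp.net"))
    print(search_all(term_index, ["c#", "asp.net"]))
```

Because whole terms are matched, a query for ".net" does not hit documents tagged only "asp.net", unlike a `LIKE '%.net%'` scan.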
Uses: auto-completion
Uses: auto-correction
Phrasing: "Iframe security" / "Security in an Iframe"
Word-level distance: grey/gray, color/colour
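One common way to catch spelling variants such as grey/gray is character edit distance (Levenshtein). Whether the slides had exactly this metric in mind is an assumption; the sketch below only shows the general idea, and `suggest` is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character inserts, deletes, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def suggest(word, vocabulary, max_distance=1):
    """Return vocabulary terms within max_distance edits of word, closest first."""
    scored = [(edit_distance(word, term), term) for term in vocabulary]
    return [term for distance, term in sorted(scored) if distance <= max_distance]


if __name__ == "__main__":
    print(edit_distance("grey", "gray"))
    print(suggest("color", ["colour", "collar", "cooler"]))
```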
Elasticsearch
Distributed: aggregates the results of searches performed on multiple shards/indices
Schema-less: document oriented, supports the JSON format
RESTful: exposes a REST interface
Faceted search: support for navigational search functionality
Replication: supports index replication
Failover: replication and the distributed nature provide built-in failover
Near real-time: supports near real-time updates
Elasticsearch: distributed & highly available
• Multiple servers (nodes) running in a cluster
  • Acting as a single service
  • Nodes in the cluster either store data or just help speed up search queries
• Sharding
  • Indices are sharded (# shards is configurable)
  • Each shard can have zero or more replicas
  • Replicas on different servers (server pools) for failover
  • One node in the cluster goes down? No problem.
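Sharding boils down to a deterministic rule that maps each document to one shard, typically a hash of a routing value modulo the number of primary shards. The sketch below illustrates that rule only; `zlib.crc32` is a stand-in hash picked for the example and is not the hash Elasticsearch itself uses.

```python
import zlib


def pick_shard(doc_id, num_primary_shards):
    """Route a document to a shard: hash of its id modulo the shard count.

    zlib.crc32 is a stand-in hash chosen for this sketch, not the hash
    Elasticsearch itself uses.
    """
    return zlib.crc32(str(doc_id).encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Group document ids by the shard they would be routed to."""
    shards = {shard: [] for shard in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    for shard, ids in distribute(range(1, 21), 5).items():
        print(shard, ids)
```

Since the shard number depends on the shard count, changing the number of primary shards would send existing documents to different shards, which is why the count is fixed when an index is created.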
Azure Search
Elasticsearch as a managed service
Platform as a service (PaaS)
Administration through a REST API
Data exchange in JSON
Where are we at
Service | Ease of use | Scalability | Easy administration
Manual search (SQL) | No | No | Partial
Elasticsearch | Yes | Yes | No
Azure Search | Yes | Yes | Yes
Azure Search: resource model
Service
  Index (schema type 1)
  Index (schema type 2)
    Document
    Document
      Field1
      Field2
      Field3
      Field4
  Indexers
Demo: Management Portal
Azure Search: index creation
POST https://codecamp-en.search.windows.net/indexes
{
  "name": "stackoverflow-posts",
  "fields": [ {
    "name": "name_of_field",
    "type": "data_type",
    "searchable": true (default where applicable) | false,
    "filterable": true (default) | false,
    "sortable": true (default where applicable) | false,
    "facetable": true (default where applicable) | false,
    "key": true | false (default),
    "retrievable": true (default) | false } ] …
}
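As an illustration of the template above, the following sketch assembles a concrete index definition for a few of the sample columns and prepares the request without sending it. The `Edm.*` type names, the requirement that the key field be a string, and the `api-version` query parameter reflect my understanding of the Azure Search REST API; the helper names and the attribute choices are assumptions made for this example.

```python
import json

SERVICE_URL = "https://codecamp-en.search.windows.net"  # service URL from the slides


def field(name, edm_type, **attributes):
    """Build one field definition; attributes are flags such as key or searchable."""
    definition = {"name": name, "type": edm_type}
    definition.update(attributes)
    return definition


def build_index_definition(index_name):
    """Index schema for a subset of the sample columns (attribute choices are assumptions)."""
    return {
        "name": index_name,
        "fields": [
            # The key field is declared as a string, so Id is stored as Edm.String here.
            field("Id", "Edm.String", key=True, searchable=False),
            field("Title", "Edm.String", searchable=True),
            field("Tags", "Edm.String", searchable=True),
            field("UserId", "Edm.Int32", filterable=True),
            field("ViewCount", "Edm.Int32", filterable=True, sortable=True),
        ],
    }


def build_create_request(index_name, api_version):
    """Return (method, url, body) for the create-index call without sending it."""
    url = f"{SERVICE_URL}/indexes?api-version={api_version}"
    body = json.dumps(build_index_definition(index_name), indent=2)
    return "POST", url, body


if __name__ == "__main__":
    # The api-version value is an assumption; use the one your service supports.
    method, url, body = build_create_request("stackoverflow-posts", "2015-02-28")
    print(method, url)
    print(body)
```

A real call would also need an admin key sent in a request header, which is left out here.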
Azure Search: index documents
Indexers
Data sources: Azure SQL Database, DocumentDB
Connect data sources with target search indexes
An indexer can be used in the following ways:
  one-time copy of the data to populate an index
  sync an index with changes from the data source on a schedule
  invoke on demand to update an index as needed
Azure Search: CRUD operations
Add, Update, Delete
POST https://codecamp-en.search.windows.net/indexes/stackoverflow/docs/index
{
  "@search.action": "upload (default) | merge | mergeOrUpload | delete",
  "key_field_name": "unique_key_of_document", (key/value pair for key field from index schema)
  "field_name": field_value (key/value pairs matching index schema)
}
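A sketch of how a batch of document actions could be assembled from the template above. As far as I know, the REST API expects the per-document objects wrapped in a top-level "value" array, which the slide template does not show; treat that wrapper, and the helper names, as assumptions.

```python
import json

VALID_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def make_action(action, document):
    """Tag one document with the @search.action to apply to it."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    tagged = {"@search.action": action}
    tagged.update(document)
    return tagged


def build_batch(actions):
    """Wrap tagged documents in a "value" array (an assumption, see the lead-in)."""
    return json.dumps({"value": actions})


if __name__ == "__main__":
    batch = build_batch([
        make_action("upload", {"Id": "4", "Title": "Check if SIM card exists"}),
        make_action("merge", {"Id": "1", "ViewCount": 353}),
        make_action("delete", {"Id": "3"}),  # a delete only needs the key field
    ])
    print(batch)
```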
Azure Search: searching through data
GET https://codecamp-en.search.windows.net/indexes/stackoverflow/docs?
search=[string]
  + (AND operator: "code+camp" matches documents with both "code" and "camp")
  | (OR operator: "code|camp" matches documents with "code" or "camp" or both)
  - (NOT operator: "code -camp" matches documents that have the "code" term and/or do not have "camp")
  * (Suffix operator: "cod*" matches terms that start with "cod", ignoring case)
  " (Phrase search operator)
  ( ) (Precedence operator: code+(camp|workshop))
searchMode=any|all
searchFields=[string]
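Because the query operators (`+`, `|`, parentheses) are also special characters in URLs, the search string has to be percent-encoded. A small sketch using only the standard library; `build_search_url` is a hypothetical helper and the `api-version` parameter is assumed rather than taken from the slides.

```python
from urllib.parse import urlencode

BASE_URL = "https://codecamp-en.search.windows.net/indexes/stackoverflow/docs"


def build_search_url(search, search_mode="any", search_fields=None, api_version=None):
    """Compose the GET url for a query, percent-encoding the search expression."""
    params = {"search": search, "searchMode": search_mode}
    if search_fields:
        params["searchFields"] = ",".join(search_fields)
    if api_version:
        params["api-version"] = api_version  # assumed parameter, not on the slide
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("code+(camp|workshop)", "all", ["Title", "Tags"]))
```

With `searchMode=all` every term must match, while `any` (used as the default here) accepts a match on any single term.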
Azure Search: filtering results
$filter=[string] (OData syntax)
$skip=#
$top=#
$count=true|false
$orderby=[string]
$select=[string]
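The parameters above combine naturally into paged, filtered queries. The sketch below derives `$skip` and `$top` from a page number and builds the query string; the helper names and the example filter expression are illustrative, and the fields used in `$filter` and `$orderby` would have to be marked filterable and sortable in the index.

```python
from urllib.parse import urlencode


def page_params(page, page_size):
    """Translate a 1-based page number into $skip/$top values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"$skip": (page - 1) * page_size, "$top": page_size}


def build_filter_query(search, odata_filter, order_by, select, page, page_size):
    """Build the query string for a filtered, sorted, paged search."""
    params = {
        "search": search,
        "$filter": odata_filter,
        "$orderby": order_by,
        "$select": ",".join(select),
        "$count": "true",
    }
    params.update(page_params(page, page_size))
    return urlencode(params)


if __name__ == "__main__":
    print(build_filter_query(
        "asp.net", "ViewCount gt 100", "ViewCount desc", ["Id", "Title"], 3, 10
    ))
```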
Azure Search: emphasizing results
facet=[string] (field names), with options:
  count
  sort
  values
  interval
highlight=[string] (field names)
highlightPreTag=[string] (default is <em>)
highlightPostTag=[string] (default is </em>)
Azure Search: suggestions
GET https://codecamp-en.search.windows.net/indexes/stackoverflow/docs/suggest?
search=[string]
suggesterName=[string]
fuzzy=[boolean]
searchFields=[string]
Sample data: Stack Overflow posts
5,215,584 records
212 MB in Title column
118 MB in Tags column
10.5 GB in Body column
Column name | Data type
Id | int
CreationDate | datetime
Score | float
ViewCount | int
Body | nvarchar
OwnerUserId | int
Title | nvarchar
Tags | nvarchar
Demo: Search API
Scaling
Capacity is measured in Search Units
1 Search Unit: 1 partition, 1 replica
Horizontal scaling by increasing the number of partitions and/or replicas
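Since one Search Unit corresponds to one partition with one replica, the capacity of a scaled-out service can be estimated by multiplying the two counts. That product rule is my reading of the capacity model rather than something stated on the slide, and `search_units` is a hypothetical helper.

```python
def search_units(partitions, replicas):
    """Search Units consumed by a service: partitions multiplied by replicas."""
    if partitions < 1 or replicas < 1:
        raise ValueError("a service needs at least one partition and one replica")
    return partitions * replicas


if __name__ == "__main__":
    # More partitions add storage; more replicas add query capacity and failover.
    for partitions, replicas in [(1, 1), (2, 3), (3, 2)]:
        print(partitions, replicas, search_units(partitions, replicas))
```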
Storage
Partition limitations:
  15 million documents
  25 GB data
Every index is split by default into 12 shards
Each partition can store 1, 2, 3, 4, 6 or 12 shards
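The limits above allow a quick sizing estimate. This sketch assumes that the allowed partition counts are the divisors of 12 (so the 12 shards always spread evenly) and that the per-partition limits simply add up across partitions; both are assumptions on top of the slide's figures.

```python
# Figures from the slide; the interpretation of them is an assumption (see lead-in).
SHARDS_PER_INDEX = 12
ALLOWED_PARTITIONS = (1, 2, 3, 4, 6, 12)
DOCS_PER_PARTITION = 15_000_000
GB_PER_PARTITION = 25


def shards_per_partition(partitions):
    """How many of the 12 shards each partition holds for a given partition count."""
    if partitions not in ALLOWED_PARTITIONS:
        raise ValueError(f"partition count must be one of {ALLOWED_PARTITIONS}")
    return SHARDS_PER_INDEX // partitions


def partitions_needed(doc_count, size_gb):
    """Smallest allowed partition count that fits the data, or None if nothing fits."""
    for partitions in ALLOWED_PARTITIONS:
        fits_docs = doc_count <= partitions * DOCS_PER_PARTITION
        fits_size = size_gb <= partitions * GB_PER_PARTITION
        if fits_docs and fits_size:
            return partitions
    return None


if __name__ == "__main__":
    # The Stack Overflow sample: 5,215,584 records, roughly 10.8 GB of text columns.
    print(partitions_needed(5_215_584, 10.8))
    print(shards_per_partition(4))
```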
USE CASE SCENARIOS
Online retail/ecommerce
User generated/social content
Not just for the web
Hybrid applications
Conclusions
The need for search
Search explained
Development
Case Scenarios
Questions?
THANK YOU