Date posted: 16-Jul-2015
Category: Technology
Uploaded by: roy-russo
Introduction to Elasticsearch: Real-Time Search and Analytics
Roy Russo, DevNexus 2015
Who Am I?
• Roy Russo
• VP Engineering, Predikto
• Co-Author - Elasticsearch in Action
  - Due ~April 2015
• ElasticHQ.org
• Other ()
  - Silverpop, JBoss, AltiSource Labs
Why Am I Here?
• What is Search?
• What is Elasticsearch?
• Real-World Use
• Scale Out
• Interacting with Elasticsearch
Search is about filtering information and determining relevance.
How does a Search Engine Work?
Select * FROM make WHERE name LIKE 'Tesla'
Search Engines use Magic
Where Magic == Inverted Index
It's FM
Inverted Index
• Take some documents
• Tokenize them
• Find unique tokens
• Map tokens to documents
[Diagram: the tokens apple, oranges, and peach mapped to Documents 1-6]
Inverted Index
[Diagram: the same token-to-document mapping, used to search for "apple peach"]
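The four steps above (take documents, tokenize, find unique tokens, map tokens to documents) can be sketched in a few lines of Python. This is a toy illustration, not how Lucene stores its index; the sample documents and the function names are made up for the example.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each unique token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every token in the query."""
    postings = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*postings) if postings else set()

# Hypothetical sample documents, not taken from the slides.
docs = {
    1: "apple oranges",
    2: "peach",
    3: "apple peach",
    4: "oranges peach apple",
}
index = build_inverted_index(docs)
print(sorted(search(index, "apple peach")))  # documents containing both tokens
```

A lookup touches only the posting sets of the query tokens, which is why it scales better than scanning every row with LIKE.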
Relevance
• How many matching tokens per document?
• How many matching tokens relative to the total number of tokens in the document?
• What is the frequency of the token across all documents?
Relevance in Elasticsearch
• At Search Time
• At Index Time
• Term Frequency
  - How often the term appears in the document
• Inverse Document Frequency (IDF)
  - How often the term appears across all documents in the collection
• Field-Length Norm
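The three factors can be sketched as follows. The formulas follow the classic Lucene TF/IDF similarity as commonly documented for Elasticsearch 1.x; simply multiplying them together is a simplification of the full scoring function, and the sample token lists are invented for the example.

```python
import math

def term_frequency(term, tokens):
    """tf: square root of how often the term occurs in the field."""
    return math.sqrt(tokens.count(term))

def inverse_document_frequency(term, all_docs):
    """idf: terms that are rare across the collection weigh more."""
    doc_freq = sum(1 for tokens in all_docs if term in tokens)
    return 1.0 + math.log(len(all_docs) / (doc_freq + 1))

def field_length_norm(tokens):
    """norm: a match in a short field counts more than one in a long field."""
    return 1.0 / math.sqrt(len(tokens)) if tokens else 0.0

def score(term, tokens, all_docs):
    """Simplified relevance: the product of the three factors."""
    return (term_frequency(term, tokens)
            * inverse_document_frequency(term, all_docs)
            * field_length_norm(tokens))

# Hypothetical, already-tokenized documents.
docs = [
    ["apple", "peach"],
    ["apple", "apple", "oranges", "peach", "plum", "pear"],
    ["oranges"],
]
for tokens in docs:
    print(tokens, round(score("apple", tokens, docs), 3))
```

In this sample the short two-token document outscores the longer one even though the longer one mentions "apple" twice, because the field-length norm outweighs the extra occurrence.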
What is Elasticsearch?
Elasticsearch is…
• Search and Analytics engine
• Document Store
  - Every field is indexed/searchable
• Distributed
What Elasticsearch is not
• Key-Value Store
  - Redis, Riak
• Column Family Store
  - C*, HBase
• Graph Database
  - Neo4J
ElasticSearch in a Nutshell
• Based on Apache Lucene
• Distributed
• Document-Oriented
• Schema free
• HTTP + JSON
• (Near) Real-time search
• Ecosystem
  - Hosting, Monitoring, Apps, Clients (SDK)
Where can I get it?
• Free and Open Source
• https://www.elastic.co
• https://github.com/elastic/elasticsearch
• Backed by a company: Elastic
  - Training
  - Support
  - Auth/AuthZ
  - Marvel for Monitoring
How do I run it?
• Download it
  - https://www.elastic.co/downloads
• bin/elasticsearch
• http://localhost:9200
{
  "status" : 200,
  "name" : "Tesla",
  "cluster_name" : "elasticsearch_royrusso",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}
Elasticsearch requires Java
You have 5 seconds to whine about it and then shut up
Some Use-Cases
ElasticSearch for Centralized Logs
• Logstash + ElasticSearch + Kibana (ELK)
• Well… and then there's Loggly
"Netflix is a log-generating company that happens to stream movies"
Elasticsearch at Predikto
• Write from Spark to Elasticsearch
• Query from Spark to Elasticsearch
• Visualize
Widely Used
Based on Apache Lucene
• Free and Open Source
• Started in 1999
• Created by Doug Cutting
• What's it do?
  - Tokenizing
  - Locations
  - Relevance scoring
  - Filtering
  - Text search
  - Date Parsing
Elasticsearch is a Document Store
Document Store
• Like MongoDB and CouchDB
• Document DBs
  - JSON documents
  - Collections of key-value collections
  - Nesting
  - Versioned
What is a document?
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}
Modeled in JSON
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "u17o8zy9RcKg6SjQZqQ4Ow",
  "_version": 1,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Schema-Free
• Dynamic Mapping
  - Elasticsearch guesses the data-types (string, int, float…)
{
  "imdb": {
    "movie": {
      "properties": {
        "country": {
          "type": "string",
          "store": true,
          "index": false
        },
        "genre": {
          "type": "string",
          "null_value": "na",
          "store": false,
          "index": true
        },
        "year": {
          "type": "long"
        }
      }
    }
  }
}
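As a rough illustration of dynamic mapping, the sketch below guesses a field type from each JSON value. The function names are hypothetical, and the real dynamic mapping does more (date detection, nested objects, arrays); only the basic value-to-type idea is shown.

```python
import json

def guess_field_type(value):
    """Guess a mapping type from a parsed JSON value."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    return "unknown"

def guess_mapping(document):
    """Build a properties block with one guessed type per top-level field."""
    return {"properties": {field: {"type": guess_field_type(value)}
                           for field, value in document.items()}}

doc = json.loads('{"genre": "Crime", "runtime": 170, "title": "Scarface", "year": 1983}')
print(json.dumps(guess_mapping(doc), indent=2))
```

Because the first document that introduces a field fixes its type, a wrong guess tends to stick, which is one reason to define explicit mappings like the one above for fields that matter.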
Elasticsearch is Distributed
Terminology
MySQL      Elasticsearch
Database   Index
Table      Type
Row        Document
Column     Field
Schema     Mapping
Index      (Everything is indexed)
SQL        Query DSL
• Cluster: 1..N Nodes w/ same Cluster Name
• Node: One ElasticSearch instance (1 java proc)
• Shard = One Lucene instance
  - 0 or more replicas
High Availability
• No need for load balancer
• Different Node Types
• Indices are Sharded
• Replica shards on different Nodes
• Automatic Master election & failover
About Indices / Shards
$ curl -XPUT http://localhost:9200/twitter -d '
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
}'
Cluster Topology
[Diagram: primary and replica copies of shards A1, A2, B1, B2, B3 spread across four nodes]
4 Node Cluster
Index A: 2 Shards & 1 Replica
Index B: 3 Shards & 1 Replica
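The number of shard copies a cluster has to place follows directly from the index settings. The arithmetic below is a small sketch with hypothetical function names; it relies on the fact that Elasticsearch does not place two copies of the same shard on one node.

```python
def total_shard_copies(primaries, replicas):
    """Each primary shard exists once, plus once per configured replica."""
    return primaries * (1 + replicas)

def nodes_needed_for_full_allocation(replicas):
    """Copies of one shard must sit on different nodes, so 1 + replicas nodes are needed."""
    return replicas + 1

index_a = total_shard_copies(primaries=2, replicas=1)
index_b = total_shard_copies(primaries=3, replicas=1)
print("Index A copies:", index_a)
print("Index B copies:", index_b)
print("Copies on the 4 node cluster:", index_a + index_b)
print("Nodes needed for 2 replicas:", nodes_needed_for_full_allocation(2))
```

With fewer nodes than 1 + replicas, the extra replica copies stay unassigned until more nodes join.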
Discovery
• Nodes discover each other using multicast
  - Unicast is an option
• Each cluster has an elected master node
  - Beware of split-brain
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
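Split-brain is usually guarded against by requiring a quorum of master-eligible nodes before a master can be elected. In Elasticsearch 1.x this is the discovery.zen.minimum_master_nodes setting, which the slides do not show; the sketch below only computes the commonly recommended value and checks why it works.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum: more than half of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1

def both_sides_can_elect(total, side_a):
    """True if a network split into side_a and the rest lets both sides elect a master."""
    quorum = minimum_master_nodes(total)
    return side_a >= quorum and (total - side_a) >= quorum

for nodes in (1, 2, 3, 5):
    print(nodes, "master-eligible nodes -> quorum of", minimum_master_nodes(nodes))
```

No split of the cluster can give both halves a quorum, so at most one side keeps (or elects) a master.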
Nodes
• Master node handles cluster-wide (Meta-API) events
  - Node participation
  - New indices create/delete
  - Re-Allocation of shards
• Data Nodes
  - Indexing / Searching operations
• Client Nodes
  - REST calls
  - Light-weight load balancers
node.data | node.master
The Basics - Shards
• Primary Shard
  - First time Indexing
  - Index has 1..N primary shards (default 5)
  - Not changeable once index created
• Replica Shard
  - Copy of the primary shard
  - Can be changed later
  - Each primary has 0..N replicas
  - HA
    • Promoted to primary if primary fails
    • Get/Search handled by primary||replica
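The reason the primary shard count cannot change after index creation is that a document's shard is derived from a hash of its routing value (by default the document id) modulo the number of primaries. The sketch below uses zlib.crc32 as a stand-in hash, which is an assumption for illustration and not the hash Elasticsearch uses.

```python
import zlib

def shard_for(doc_id, number_of_primary_shards):
    """Pick a shard as hash(routing value) modulo the number of primary shards."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards

# Hypothetical document ids.
doc_ids = [f"doc-{n}" for n in range(100)]

# Same id, same shard count: always the same shard.
print(shard_for("doc-1", 5) == shard_for("doc-1", 5))

# Changing the shard count sends many ids to a different shard.
moved = sum(1 for d in doc_ids if shard_for(d, 5) != shard_for(d, 6))
print(f"{moved} of {len(doc_ids)} documents would land on a different shard")
```

Replicas are plain copies of a primary, so their count has no effect on routing and can be changed at any time.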
Shard Auto-Allocation
• Shard Phases
  - Unassigned
  - Initializing
  - Started
  - Relocating
[Diagram: Node 1 (0P, 1R), Node 2 (1P, 0R), and an added node (0R)]
Add a Node: Shards Relocate
Allocation Awareness
• Shard Allocation Awareness
  - cluster.routing.allocation.awareness.attributes: rack
  - Shards RELOCATE to even distribution
  - Primary & Replica will NOT be on the same rack value
Cluster State
• Cluster State
  - Node Membership
  - Indices Settings
  - Shard Allocation Table
  - Shard State
curl -XGET http://localhost:9200/_cluster/state?pretty=1
{
  "cluster_name": "elasticsearch_royrusso",
  "version": 27,
  "master_node": "s3fpXfPKSFeUqo1MYZxSng",
  "blocks": {},
  "nodes": {
    "s3fpXfPKSFeUqo1MYZxSng": {
      "name": "Bulldozer",
      "transport_address": "inet[localhost/127.0.0.1:9300]",
      "attributes": {}
    }
  },
  "metadata": {
    "templates": {
      "logging_index_all": {
        "template": "logstash-09-*",
        "order": 1,
        "settings": { "index": { "number_of_shards": 2, "number_of_replicas": 1 } },
        "mappings": { "date": { "store": false } }
      },
      "logging_index": {
        "template": "logstash-*",
        "order": 0,
        "settings": { "index": { "number_of_shards": 2, "number_of_replicas": 1 } },
        "mappings": { "date": { "store": true } }
      }
    }
  }
}
Talking to Elasticsearch
REST
• HTTP Verbs: GET, POST, PUT, DELETE
• JSON
• _cat API
curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status node.total node.data shards pri relo init unassign
foo     green           3         3      3   3    0    0        0
The API
• Document
• Cluster
  - Node
• Index
• Search
Create a Document
curl -XPOST http://127.0.0.1:9200/imdb/movie/ -d '
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
Of note… (same request and response as above)
• Auto-creates Index & Type
• Auto-Gen ID
• Auto-Version
Get a Document
curl -XGET http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "found": true,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Update a Document
curl -XPUT http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y -d '
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 180,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2,
  "created": false
}
More like an Upsert
Delete a Document
curl -XDELETE http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y

{
  "found": true,
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2
}
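The create, get, update, and delete responses above share a small set of behaviours: ids are generated when missing, every write bumps the version, and a PUT to an existing id replaces the document. The in-memory class below is a hypothetical sketch of that behaviour, not the Elasticsearch implementation.

```python
import uuid

class TinyDocumentStore:
    """Toy versioned document store keyed by document id."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, source, doc_id=None):
        """Create or replace a document (an upsert); generate an id if none is given."""
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        previous = self._docs.get(doc_id)
        version = previous[0] + 1 if previous else 1
        self._docs[doc_id] = (version, dict(source))
        return {"_id": doc_id, "_version": version, "created": previous is None}

    def get(self, doc_id):
        """Return the stored source and version, or found=False."""
        if doc_id not in self._docs:
            return {"_id": doc_id, "found": False}
        version, source = self._docs[doc_id]
        return {"_id": doc_id, "_version": version, "found": True, "_source": source}

    def delete(self, doc_id):
        """Remove a document and report whether it existed."""
        found = self._docs.pop(doc_id, None) is not None
        return {"_id": doc_id, "found": found}

store = TinyDocumentStore()
movie = {"title": "Scarface", "year": 1983, "runtime": 170}
created = store.index(movie)
updated = store.index({**movie, "runtime": 180}, doc_id=created["_id"])
print(created["_version"], created["created"])
print(updated["_version"], updated["created"])
print(store.delete(created["_id"])["found"])
```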
You can also…
• Partial document updating
• Specify Version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API
How Searching Works
• How it works
  - Search request hits a node
  - Node broadcasts to every shard in the index
  - Each shard performs query
  - Each shard returns metadata about results
  - Node merges results and scores them
  - Node requests documents from shards
  - Results merged, sorted, and returned to client
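The steps above amount to a scatter-gather in two phases: a query phase that returns only ids and scores, then a fetch phase for the winning documents. The sketch below models shards as plain dictionaries and scores by raw term count; both are simplifications made up for the example.

```python
import heapq

def query_shard(shard, term, size):
    """Query phase on one shard: return (score, doc_id) pairs, not the documents."""
    hits = [(doc["text"].split().count(term), doc_id) for doc_id, doc in shard.items()]
    return heapq.nlargest(size, [hit for hit in hits if hit[0] > 0])

def search(shards, term, size=3):
    # Scatter: every shard runs the query and reports only lightweight metadata.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_shard(shard, term, size):
            candidates.append((score, doc_id, shard_no))
    # Merge: the coordinating node keeps the global top hits.
    top = heapq.nlargest(size, candidates)
    # Fetch: only now are the full documents requested from their shards.
    return [{"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id]}
            for score, doc_id, shard_no in top]

# Hypothetical two-shard index.
shards = [
    {"a": {"text": "star wars star"}, "b": {"text": "scarface"}},
    {"c": {"text": "star trek"}, "d": {"text": "star star star"}},
]
for hit in search(shards, "star", size=2):
    print(hit["_id"], hit["_score"])
```

Each shard returns at most `size` candidates, so the coordinating node never handles more than shards × size entries before fetching.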
REST API - Search
• Free Text Search
  - URL Request
• Complex Query
http://localhost:9200/imdb/movie/_search?q=scar
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
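When building such URLs from code, the query string needs escaping (spaces, brackets, colons). A minimal sketch using only the Python standard library follows; search_url is a hypothetical helper, and nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def search_url(base, index, doc_type, query):
    """Build a URI-search URL, escaping the query for use in the q parameter."""
    return f"{base}/{index}/{doc_type}/_search?" + urlencode({"q": query})

query = "(scarface OR star) AND year:[1981 TO 1984]"
url = search_url("http://localhost:9200", "imdb", "movie", query)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["q"][0])
```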
REST API - Query DSL
curl -XPOST 'localhost:9200/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "query_string": { "query": "scarface or star" } },
        { "range": { "year": { "gte": 1931 } } }
      ]
    }
  }
}'
REST API - Query DSL
• Boolean Query
{
  "bool": {
    "must": [
      { "match": { "color": "blue" } },
      { "match": { "title": "shirt" } }
    ],
    "must_not": [
      { "match": { "size": "xxl" } }
    ],
    "should": [
      { "match": { "textile": "cotton" } }
    ]
  }
}
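The semantics of must, must_not, and should can be sketched against plain dictionaries: must clauses are required, must_not clauses exclude, and should clauses only raise the score. The matching here is an exact, case-insensitive comparison and the score is a simple clause count, both simplifications chosen for the example.

```python
def matches(doc, clause):
    """Treat a 'match' clause as an exact, case-insensitive field comparison."""
    field, value = next(iter(clause["match"].items()))
    return str(doc.get(field, "")).lower() == str(value).lower()

def bool_score(doc, query):
    """Return None if the document is excluded, else the number of matching clauses."""
    if not all(matches(doc, c) for c in query.get("must", [])):
        return None
    if any(matches(doc, c) for c in query.get("must_not", [])):
        return None
    score = len(query.get("must", []))
    score += sum(1 for c in query.get("should", []) if matches(doc, c))
    return score

query = {
    "must": [{"match": {"color": "blue"}}, {"match": {"title": "shirt"}}],
    "must_not": [{"match": {"size": "xxl"}}],
    "should": [{"match": {"textile": "cotton"}}],
}
# Hypothetical product documents.
products = [
    {"color": "blue", "title": "shirt", "size": "m", "textile": "cotton"},
    {"color": "blue", "title": "shirt", "size": "m", "textile": "wool"},
    {"color": "blue", "title": "shirt", "size": "xxl", "textile": "cotton"},
    {"color": "red", "title": "shirt", "size": "m", "textile": "cotton"},
]
for product in products:
    print(product["color"], product["size"], product["textile"], "->", bool_score(product, query))
```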
REST API - Query DSL
• Range Query
  - Numeric / Date Types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range
{
  "range": {
    "founded_year": {
      "gte": 1990,
      "lt": 2000
    }
  }
}
Filters
• Filters recommended over Queries
  - Better cache support
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers / Tokenizers
curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNgram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string", "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits on non-letters
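To make the Edge NGram tokenizer concrete, here is a rough Python equivalent of an edge n-gram tokenizer over letters and digits followed by a lowercase filter, with min_gram 3 as in the settings above. It is an approximation for illustration (ASCII only, no stop filter), not the Lucene implementation.

```python
import re

def edge_ngrams(text, min_gram=3, max_gram=42):
    """Emit lowercase prefixes of each letter/digit run, from min_gram to max_gram characters."""
    grams = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            grams.append(word[:size].lower())
    return grams

print(edge_ngrams("Scarface 1983"))
```

Indexing prefixes this way lets a lowercased, untokenized search term such as "scar" match "Scarface" with a plain term lookup, at the cost of a larger index.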
Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions?
Search is about
filtering information
and determining
relevance
4
How does a Search Engine
Work
5
Select FROM make WHERE name LIKE lsquoTeslarsquo
Search Engines use Magic
6
Where Magic == Inverted Index
Itrsquos FM
Inverted Index
bullTake some documents
bullTokenize them
bullFind unique tokens
bullMap tokens to documents
7
apple oranges peach
Document 1
Document 2
Document 3
Document 4
Document 5
Document 6
Inverted Index
8
apples oranges peach
Document 1
Document 2
Document 3
Document 4
Document 5
Document 6
Search for ldquoapple peachrdquo
Relevance
bullHow many tokens per document
bullHow many tokens relative to the number of total
tokens in the document
bullWhat is the frequency of token across all
documents
9
Relevance in Elasticsearch
bullAt Search Time
bullAt Index Time
bullTerm Frequency
-Term Document
bullInverse Document Frequency (IDF)
-Term All Documents in the collection
bullField-Length Norm
10
What is Elasticsearch
11
Elasticsearch ishellip
bullSearch and Analytics engine
bullDocument Store
-Every field is indexedsearchable
bullDistributed
12
What Elasticsearch is not
bullKey-Value Store
-Redis Riak
bullColumn Family Store
-C HBase
bullGraph Database
-Neo4J
13
Elasticsearch in a Nutshell
• Based on Apache Lucene
• Distributed
• Document-Oriented
• Schema free
• HTTP + JSON
• (Near) Real-time search
• Ecosystem
- Hosting, Monitoring, Apps, Clients (SDK)
Where can I get it?
• Free and Open Source
• https://www.elastic.co
• https://github.com/elastic/elasticsearch
• Backed by a Company: Elastic
- Training
- Support
- Auth/AuthZ
- Marvel for Monitoring
How do I run it?
• Download it
- https://www.elastic.co/downloads
• bin/elasticsearch
• http://localhost:9200
  {
    "status": 200,
    "name": "Tesla",
    "cluster_name": "elasticsearch_royrusso",
    "version": {
      "number": "1.4.2",
      "build_hash": "927caff6f05403e936c20bf4529f144f0c89fd8c",
      "build_timestamp": "2014-12-16T14:11:12Z",
      "build_snapshot": false,
      "lucene_version": "4.10.2"
    },
    "tagline": "You Know, for Search"
  }
Elasticsearch requires Java
You have 5 seconds to whine about it and then shut up
Some Use-Cases
Elasticsearch for Centralized Logs
• Logstash + Elasticsearch + Kibana (ELK)
• Well… and then there's Loggly
"Netflix is a Log generating company that happens to stream movies"
Elasticsearch at Predikto
• Write From Spark to Elasticsearch
• Query from Spark to Elasticsearch
• Visualize
Widely Used
Based on Apache Lucene
• Free and Open Source
• Started in 1999
• Created by Doug Cutting
• What's it do?
- Tokenizing
- Locations
- Relevance scoring
- Filtering
- Text search
- Date Parsing
Elasticsearch is a Document Store
Document Store
• Like MongoDB and CouchDB
• Document DBs
- JSON documents
- Collections of key-value collections
- Nesting
- Versioned
What is a document?
  {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
Modeled in JSON
  {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "u17o8zy9RcKg6SjQZqQ4Ow",
    "_version": 1,
    "_source": {
      "genre": "Crime",
      "language": "English",
      "country": "USA",
      "runtime": 170,
      "title": "Scarface",
      "year": 1983
    }
  }
Schema-Free
• Dynamic Mapping
- Elasticsearch guesses the data-types (string, int, float…)
  {
    "imdb": {
      "movie": {
        "properties": {
          "country": {
            "type": "string",
            "store": true,
            "index": false
          },
          "genre": {
            "type": "string",
            "null_value": "na",
            "store": false,
            "index": true
          },
          "year": {
            "type": "long"
          }
        }
      }
    }
  }
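As a rough picture of what "guessing the data-types" means, the sketch below infers a type name from each parsed JSON value. It is a simplification under stated assumptions: the type names follow the 1.x-era mapping shown above, date detection is reduced to a single assumed format, and `guess_type` is a hypothetical helper, not an Elasticsearch API.

```python
import json
from datetime import datetime


def guess_type(value):
    """Guess a mapping type from a parsed JSON value (simplified)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            datetime.strptime(value, "%Y-%m-%d")  # assumed single date format
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, dict):
        return "object"
    return "unknown"


doc = json.loads(
    '{"genre": "Crime", "runtime": 170, "title": "Scarface", "year": 1983}'
)
mapping = {field: {"type": guess_type(value)} for field, value in doc.items()}
print(json.dumps(mapping, indent=2))
```

The first document to introduce a field fixes that field's guessed type, which is a good reason to declare mappings explicitly for fields you care about.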
Elasticsearch is Distributed
Terminology
  MySQL       Elasticsearch
  Database    Index
  Table       Type
  Row         Document
  Column      Field
  Schema      Mapping
  Index       (Everything is indexed)
  SQL         Query DSL
• Cluster: 1..N Nodes w/ same Cluster Name
• Node: One Elasticsearch instance (1 java proc)
• Shard = One Lucene instance
- 0 or more replicas
High Availability
• No need for load balancer
• Different Node Types
• Indices are Sharded
• Replica shards on different Nodes
• Automatic Master election & failover
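About Indices / Shards
  $ curl -XPUT 'http://localhost:9200/twitter' -d '
  {
    "settings": {
      "index": {
        "number_of_shards": 3,
        "number_of_replicas": 2
      }
    }
  }'
Cluster Topology
[Figure: 4 Node Cluster with the shards of Index A (A1, A2) and Index B (B1, B2, B3) and their replicas spread across the nodes]
• 4 Node Cluster
• Index A: 2 Shards & 1 Replica
• Index B: 3 Shards & 1 Replica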
Discovery
• Nodes discover each other using multicast
- Unicast is an option
• Each cluster has an elected master node
- Beware of split-brain
  discovery.zen.ping.multicast.enabled: false
  discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
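A common guard against split-brain with Zen discovery is to require a quorum of master-eligible nodes through the `discovery.zen.minimum_master_nodes` setting. The slide does not mention that setting, so treat it as an addition; the helper below is a hypothetical sketch of the usual "more than half" arithmetic.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum: strictly more than half of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3, 5):
    print(f"{n} master-eligible nodes -> "
          f"discovery.zen.minimum_master_nodes: {minimum_master_nodes(n)}")
```

With a quorum in place, two halves of a partitioned cluster cannot both elect a master, because at most one half can hold a majority.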
Nodes
• Master node handles cluster-wide (Meta-API) events
- Node participation
- New indices create/delete
- Re-Allocation of shards
• Data Nodes
- Indexing / Searching operations
• Client Nodes
- REST calls
- Light-weight load balancers
  node.data | node.master
The Basics - Shards
• Primary Shard
- First time Indexing
- Index has 1..N primary shards (default: 5)
- Not changeable once index created
• Replica Shard
- Copy of the primary shard
- Can be changed later
- Each primary has 0..N replicas
- HA
  • Promoted to primary if primary fails
  • Get/Search handled by primary || replica
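Why the primary shard count is fixed at index creation: a document is routed to a shard by hashing its routing value (by default the document id) modulo the number of primary shards. The sketch below illustrates the effect of changing that number; `zlib.crc32` is only a stand-in hash chosen for the example, not the function Elasticsearch uses, and the ids are made up.

```python
import zlib


def shard_for(doc_id, number_of_primary_shards):
    """Pick a primary shard as hash(routing value) mod primary shard count.

    zlib.crc32 is a stand-in hash for illustration only.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


# Hypothetical document ids.
ids = [f"doc-{i}" for i in range(100)]
moved = [d for d in ids if shard_for(d, 5) != shard_for(d, 6)]
print(f"{len(moved)} of {len(ids)} documents would map to a different shard")
```

Going from 5 to 6 primaries sends many existing ids to a different shard, so lookups would miss documents already indexed. Replicas are plain copies and do not take part in the formula, which is why their count can change later.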
Shard Auto-Allocation
• Shard Phases
- Unassigned
- Initializing
- Started
- Relocating
[Figure: Node 1 holding 0P and 1R, Node 2 holding 1P and 0R, and a further node holding 0R]
Add a Node: Shards Relocate
Allocation Awareness
• Shard Allocation Awareness
- cluster.routing.allocation.awareness.attributes: rack
- Shards RELOCATE to even distribution
- Primary & Replica will NOT be on the same rack value
Cluster State
• Cluster State
- Node Membership
- Indices Settings
- Shard Allocation Table
- Shard State
  curl -XGET 'http://localhost:9200/_cluster/state?pretty=1'
  {
    "cluster_name": "elasticsearch_royrusso",
    "version": 27,
    "master_node": "s3fpXfPKSFeUqo1MYZxSng",
    "blocks": {},
    "nodes": {
      "s3fpXfPKSFeUqo1MYZxSng": {
        "name": "Bulldozer",
        "transport_address": "inet[localhost/127.0.0.1:9300]",
        "attributes": {}
      }
    },
    "metadata": {
      "templates": {
        "logging_index_all": {
          "template": "logstash-09-*",
          "order": 1,
          "settings": {
            "index": {
              "number_of_shards": 2,
              "number_of_replicas": 1
            }
          },
          "mappings": {
            "date": {
              "store": false
            }
          }
        },
        "logging_index": {
          "template": "logstash-*",
          "order": 0,
          "settings": {
            "index": {
              "number_of_shards": 2,
              "number_of_replicas": 1
            }
          },
          "mappings": {
            "date": {
              "store": true
            }
          }
        }
      }
    }
  }
Talking to Elasticsearch
REST
• HTTP Verbs: GET, POST, PUT, DELETE
• JSON
• _cat API
  curl '192.168.56.10:9200/_cat/health?v&ts=0'
  cluster status nodeTotal nodeData shards pri relo init unassign
  foo     green  3         3        3      3   0    0    0
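The `_cat` output is plain whitespace-separated text, so it is easy to consume from a script. A small sketch, assuming the header row is requested with `?v` and that no column value is empty or contains spaces; `parse_cat` is a hypothetical helper, and the sample text is the output shown above.

```python
CAT_HEALTH = """\
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
"""


def parse_cat(text):
    """Turn whitespace-separated _cat output (with ?v headers) into dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


for row in parse_cat(CAT_HEALTH):
    print(row["cluster"], row["status"], "unassigned shards:", row["unassign"])
```

All values come back as strings; convert the numeric columns explicitly if you need to compare them.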
The API
• Document
• Cluster
- Node
• Index
• Search
Create a Document
  curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '
  {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }'
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 1,
    "created": true
  }
Of note… (for the create request above)
• Auto-creates Index & Type
• Auto-Gen ID
• Auto-Version
Get a Document
  curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 1,
    "found": true,
    "_source": {
      "genre": "Crime",
      "language": "English",
      "country": "USA",
      "runtime": 170,
      "title": "Scarface",
      "year": 1983
    }
  }
Update a Document
  curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d '
  {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 180,
    "title": "Scarface",
    "year": 1983
  }'
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 2,
    "created": false
  }
More like an Upsert
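The "upsert" behaviour of a PUT to a known id can be mimicked with a tiny in-memory model: the whole document is replaced, the version is bumped, and `created` reports whether the id was new. `TinyIndex` is a hypothetical stand-in written for this example, not an Elasticsearch client.

```python
class TinyIndex:
    """In-memory stand-in for one index; not an Elasticsearch client."""

    def __init__(self):
        self._docs = {}  # doc id -> (version, source)

    def put(self, doc_id, source):
        """Create the document, or replace it and bump its version."""
        version, _ = self._docs.get(doc_id, (0, None))
        self._docs[doc_id] = (version + 1, dict(source))
        return {"_id": doc_id, "_version": version + 1, "created": version == 0}


index = TinyIndex()
movie = {"title": "Scarface", "year": 1983, "runtime": 170}
first = index.put("AUwGeWib1u4mCngDYT7y", movie)
second = index.put("AUwGeWib1u4mCngDYT7y", {**movie, "runtime": 180})
print(first)
print(second)
```

Note that the second call sends the full document again; changing a single field without resending everything is what partial updates are for.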
Delete a Document
  curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
  {
    "found": true,
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 2
  }
You can also…
• Partial document updating
• Specify Version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API
How Searching Works
• How it works
- Search request hits a node
- Node broadcasts to every shard in the index
- Each shard performs query
- Each shard returns metadata about results
- Node merges results and scores them
- Node requests documents from shards
- Results merged, sorted, and returned to client
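The steps above amount to a scatter-gather in two phases: a query phase that collects only ids and scores, then a fetch phase for the winners. The following sketch simulates that with in-memory "shards"; the data, the scores, and the function names are invented for illustration.

```python
import heapq

# Hypothetical per-shard data: doc id -> (score, stored document).
SHARDS = [
    {"a": (0.9, {"title": "Scarface"}), "b": (0.2, {"title": "Star Trek"})},
    {"c": (0.7, {"title": "Star Wars"})},
    {"d": (0.4, {"title": "Stardust"}), "e": (0.8, {"title": "Scarface (1932)"})},
]


def query_phase(shard_no, shard, size):
    """Each shard returns only lightweight metadata: (score, shard, id)."""
    hits = [(score, shard_no, doc_id) for doc_id, (score, _) in shard.items()]
    return heapq.nlargest(size, hits)


def search(shards, size):
    """Coordinate a search across all shards and return the top documents."""
    # Scatter: ask every shard for its local top hits.
    candidates = []
    for shard_no, shard in enumerate(shards):
        candidates.extend(query_phase(shard_no, shard, size))
    # Gather: merge the candidates and keep the global top hits.
    top = heapq.nlargest(size, candidates)
    # Fetch: request the full documents only for the winners.
    return [(doc_id, score, shards[shard_no][doc_id][1])
            for score, shard_no, doc_id in top]


for doc_id, score, doc in search(SHARDS, size=3):
    print(doc_id, score, doc["title"])
```

Each shard has to return its own top `size` hits because the coordinating node cannot know in advance which shard holds the best matches.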
REST API - Search
• Free Text Search
- URL Request
• Complex Query
  http://localhost:9200/imdb/movie/_search?q=scar
  http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
  http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
REST API - Query DSL
  curl -XPOST 'localhost:9200/_search?pretty' -d '
  {
    "query": {
      "bool": {
        "must": [
          { "query_string": { "query": "scarface or star" } },
          { "range": { "year": { "gte": 1931 } } }
        ]
      }
    }
  }'
REST API - Query DSL
• Boolean Query
  "bool": {
    "must": [
      { "match": { "color": "blue" } },
      { "match": { "title": "shirt" } }
    ],
    "must_not": [
      { "match": { "size": "xxl" } }
    ],
    "should": [
      { "match": { "textile": "cotton" } }
    ]
  }
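The semantics of the three clause types can be illustrated without a cluster: `must` and `must_not` decide whether a document is a hit at all, while `should` clauses (when a `must` is present) only influence ranking. This is a toy model; the catalog entries, the word-containment `matches` helper, and the "count of should matches" score are assumptions, not how Elasticsearch computes relevance.

```python
def matches(doc, clause):
    """A toy 'match': the field contains the word (case-insensitive)."""
    (field, text), = clause.items()
    return text.lower() in str(doc.get(field, "")).lower().split()


def bool_query(docs, must=(), must_not=(), should=()):
    """Filter on must / must_not, then rank by number of should matches."""
    hits = []
    for doc in docs:
        if not all(matches(doc, c) for c in must):
            continue
        if any(matches(doc, c) for c in must_not):
            continue
        bonus = sum(matches(doc, c) for c in should)
        hits.append((bonus, doc))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [doc for _, doc in hits]


# Hypothetical catalog entries.
docs = [
    {"title": "polo shirt", "color": "blue", "size": "m", "textile": "linen"},
    {"title": "tee shirt", "color": "blue", "size": "xxl", "textile": "cotton"},
    {"title": "dress shirt", "color": "blue", "size": "l", "textile": "cotton"},
    {"title": "shirt", "color": "red", "size": "m", "textile": "cotton"},
]
results = bool_query(
    docs,
    must=[{"color": "blue"}, {"title": "shirt"}],
    must_not=[{"size": "xxl"}],
    should=[{"textile": "cotton"}],
)
print([doc["title"] for doc in results])
```

The linen polo shirt is still returned, just ranked below the cotton one, because it satisfies every `must` and no `must_not`.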
REST API - Query DSL
• Range Query
- Numeric / Date Types
• Prefix/Wildcard Query
- Match on partial terms
• RegExp Query
• Geo_bbox
- Bounding box filter
• Geo_distance
- Geo_distance_range
  "range": {
    "founded_year": {
      "gte": 1990,
      "lt": 2000
    }
  }
Filters
• Filters recommended over Queries
- Better cache support
  curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
  {
    "from": 0,
    "size": 0,
    "query": {
      "terms": {
        "message": ["apples"],
        "minimum_should_match": 1
      }
    },
    "post_filter": {
      "terms": {
        "userId": ["25476c6788ce", "g20d5470d7b4"],
        "execution": "or"
      }
    },
    "sort": { "eventDate": { "order": "desc" } },
    "explain": false
  }'
Analyzers / Tokenizers
  curl -XPUT 'http://localhost:9200/my_index' -d '
  {
    "settings": {
      "analysis": {
        "analyzer": {
          "str_search_analyzer": {
            "tokenizer": "keyword",
            "filter": ["lowercase"]
          },
          "str_index_analyzer": {
            "tokenizer": "substring",
            "filter": ["lowercase", "stop"]
          }
        },
        "tokenizer": {
          "substring": {
            "type": "edgeNGram",
            "min_gram": 3,
            "max_gram": 42,
            "token_chars": ["letter", "digit"]
          }
        }
      }
    }
  }'
  curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
  {
    "events": {
      "properties": {
        "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
        "userId": { "type": "string", "store": false, "index": "not_analyzed" },
        "message": {
          "type": "string",
          "store": false,
          "search_analyzer": "str_search_analyzer",
          "index_analyzer": "str_index_analyzer"
        }
      }
    }
  }'
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
- Splits on non-letters
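To see what an edge n-gram tokenizer with `min_gram` 3, `max_gram` 42, and `token_chars` of letter and digit produces, here is an approximation in Python. It is a sketch rather than Lucene's implementation: the regular expression stands in for the `token_chars` setting, lowercasing mimics the `lowercase` filter, and the `stop` filter is not modeled.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=42):
    """Emit prefixes of each letter/digit run, from min_gram to max_gram long."""
    grams = []
    # token_chars [letter, digit]: any other character splits the input.
    for word in re.findall(r"[^\W_]+", text):
        for size in range(min_gram, min(len(word), max_gram) + 1):
            grams.append(word[:size].lower())  # mimic the lowercase filter
    return grams


print(edge_ngrams("Scarface 1983"))
```

Indexing these prefixes while analyzing the search text with a non-splitting analyzer is what lets a typed prefix such as "scar" match the full word, which is the point of using different index and search analyzers.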
Clients
• Client list: http://www.elasticsearch.org/guide/clients
- Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
- Uses TransportClient
- Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions?
Inverted Index
bullTake some documents
bullTokenize them
bullFind unique tokens
bullMap tokens to documents
7
apple oranges peach
Document 1
Document 2
Document 3
Document 4
Document 5
Document 6
Inverted Index
8
apples oranges peach
Document 1
Document 2
Document 3
Document 4
Document 5
Document 6
Search for ldquoapple peachrdquo
Relevance
bullHow many tokens per document
bullHow many tokens relative to the number of total
tokens in the document
bullWhat is the frequency of token across all
documents
9
Relevance in Elasticsearch
bullAt Search Time
bullAt Index Time
bullTerm Frequency
-Term Document
bullInverse Document Frequency (IDF)
-Term All Documents in the collection
bullField-Length Norm
10
What is Elasticsearch
11
Elasticsearch ishellip
bullSearch and Analytics engine
bullDocument Store
-Every field is indexedsearchable
bullDistributed
12
What Elasticsearch is not
bullKey-Value Store
-Redis Riak
bullColumn Family Store
-C HBase
bullGraph Database
-Neo4J
13
ElasticSearch in a Nutshell
bullBased on Apache Lucene
bullDistributed
bullDocument-Oriented
bullSchema free
bullHTTP + JSON
bull(Near) Real-time search
bullEcosystem
-Hosting Monitoring Apps Clients (SDK)
14
Where can I get it
bullFree and Open Source
bullhttpswwwelasticco
bullhttpsgithubcomelasticelasticsearch
bullBacked by a Company Elastic
-Training
-Support
-AuthAuthZ
-Marvel for Monitoring
15
How do I run it
bullDownload it
- httpswwwelasticcodownloads
bullbinelasticsearch
bullhttplocalhost9200
16
status 200
name Tesla
cluster_name elasticsearch_royrusso
version
number 142
build_hash 927caff6f05403e936c20bf4529f144f0c89fd8c
build_timestamp 2014-12-16T141112Z
build_snapshot false
lucene_version 4102
tagline You Know for Search
Elasticsearch requires Java
17
You have 5 seconds to whine about it and then shutup
Some Use-Cases
18
ElasticSearch for Centralized
LogsbullLogstash + ElasticSearch + Kibana (ELK)
bullWellhellip and then therersquos Loggly
19
ldquoNetflix is a Log generating company that happens to stream moviesrdquo
Elasticsearch at Predikto
20
Elasticsearch at Predikto
21
bullWrite From Spark to Elasticsearch
bullQuery from Spark to Elasticsearch
bullVisualize
Widely Used
22
Based on Apache Lucene
bullFree and Open Source
bullStarted in 1999
bullCreated by Doug Cutting
bullWhatrsquos it do
-Tokenizing
- Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
23
Elasticsearch is a Document
Store
24
Document Store
bullLike MongoDB and CouchDB
bullDocument DBs
- JSON documents
-Collections of key-value collections
-Nesting
-Versioned
25
What is a document
26
genre Crime
ldquolanguage English
country USA
runtime 170
title Scarface
year 1983
Modeled in JSON
27
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_index imdb
_type movie
_id u17o8zy9RcKg6SjQZqQ4Ow
_version 1
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Schema-Free
bullDynamic Mapping
-Elasticsearch guesses the data-types (string int
floathellip)
28
imdb
movie
properties
country
type stringldquo
ldquostorerdquotrue
ldquoindexrdquofalse
genre
type stringldquo
null_value naldquo
ldquostorerdquofalse
ldquoindextrue
year
type long
Elasticsearch is Distributed
29
Terminology
30
MySQL Elasticsearch
Database Index
Table Type
Row Document
Column Field
Schema Mapping
Index (Everything is indexed)
SQL Query DSL
bullCluster 1N Nodes w same Cluster Name
bullNode One ElasticSearch instance (1 java proc)
bullShard = One Lucene instance
- 0 or more replicas
High Availability
bullNo need for load balancer
bullDifferent Node Types
bullIndices are Sharded
bullReplica shards on different Nodes
bullAutomatic Master election amp failover
31
About Indices Shards
32
$ curl -XPUT httplocalhost9200twitter -d
settings
index
number_of_shards 3
number_of_replicas 2
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A 2 Shards amp 1 Replica
Index B 3 Shards amp 1 Replica
Discovery
• Nodes discover each other using multicast
  - Unicast is an option
• Each cluster has an elected master node
  - Beware of split-brain
  discovery.zen.ping.multicast.enabled: false
  discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
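Split-brain is usually guarded against by requiring a majority of master-eligible nodes before a master can be elected; in Zen discovery of this era that is the `discovery.zen.minimum_master_nodes` setting. The slide does not give a value, so the sketch below assumes the commonly recommended majority rule, (N / 2) + 1, with `minimum_master_nodes` as a hypothetical helper name.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Majority quorum: strictly more than half of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


quorum_for_three = minimum_master_nodes(3)
quorum_for_four = minimum_master_nodes(4)
```

With three master-eligible nodes the quorum is two, so a single isolated node cannot elect itself master while the other two keep running.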
Nodes
• Master node handles cluster-wide (Meta-API) events
  - Node participation
  - New indices create/delete
  - Re-allocation of shards
• Data Nodes
  - Indexing / searching operations
• Client Nodes
  - REST calls
  - Lightweight load balancers
  node.data | node.master
The Basics - Shards
• Primary Shard
  - First time indexing
  - Index has 1..N primary shards (default 5)
  - Not changeable once index created
• Replica Shard
  - Copy of the primary shard
  - Can be changed later
  - Each primary has 0..N replicas
  - HA
    • Promoted to primary if primary fails
    • Get/Search handled by primary || replica
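The reason the primary shard count is fixed can be sketched with the routing rule: a document lands on shard `hash(routing) % number_of_primary_shards`, so changing the divisor would send existing IDs to different shards. In the sketch below `pick_shard` is a hypothetical helper and `zlib.crc32` is only a stand-in hash chosen for illustration; Elasticsearch uses its own hash function.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value (by default the document ID) to a primary shard number."""
    # crc32 is an assumption for this sketch, not the hash Elasticsearch uses
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


doc_id = "AUwGeWib1u4mCngDYT7y"
shard_with_5 = pick_shard(doc_id, 5)
shard_with_6 = pick_shard(doc_id, 6)
```

The same ID always maps to the same shard for a given shard count, which is what lets a Get request go straight to the right shard. Replicas do not affect the formula, which is why their number can change later.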
Shard Auto-Allocation
• Shard Phases
  - Unassigned
  - Initializing
  - Started
  - Relocating
[Figure: Node 1 holds 0P and 1R, Node 2 holds 1P and 0R; add a node and shards relocate (0R moves to the added node)]
Allocation Awareness
• Shard Allocation Awareness
  - cluster.routing.allocation.awareness.attributes: rack
  - Shards RELOCATE to even distribution
  - Primary & replica will NOT be on the same rack value
Cluster State
• Cluster State
  - Node membership
  - Indices settings
  - Shard allocation table
  - Shard state
  curl -XGET 'http://localhost:9200/_cluster/state?pretty=1'
  {
    "cluster_name": "elasticsearch_royrusso",
    "version": 27,
    "master_node": "s3fpXfPKSFeUqo1MYZxSng",
    "blocks": {},
    "nodes": {
      "s3fpXfPKSFeUqo1MYZxSng": {
        "name": "Bulldozer",
        "transport_address": "inet[localhost/127.0.0.1:9300]",
        "attributes": {}
      }
    },
    "metadata": {
      "templates": {
        "logging_index_all": {
          "template": "logstash-09-*",
          "order": 1,
          "settings": {
            "index": {
              "number_of_shards": 2,
              "number_of_replicas": 1
            }
          },
          "mappings": {
            "date": {
              "store": false
            }
          }
        },
        "logging_index": {
          "template": "logstash-*",
          "order": 0,
          "settings": {
            "index": {
              "number_of_shards": 2,
              "number_of_replicas": 1
            }
          },
          "mappings": {
            "date": {
              "store": true
            }
          }
        }
      }
    }
  }
Talking to Elasticsearch
REST
• HTTP verbs: GET, POST, PUT, DELETE
• JSON
• _cat API
  curl '192.168.56.10:9200/_cat/health?v&ts=0'
  cluster status nodeTotal nodeData shards pri relo init unassign
  foo     green  3         3        3      3   0    0    0
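The _cat output is plain whitespace-aligned text with a header row (the `v` parameter), which makes it easy to script against. A minimal parsing sketch follows; `parse_cat_table` is a hypothetical helper, and it assumes no column value contains spaces.

```python
def parse_cat_table(text):
    """Parse whitespace-aligned _cat output (header row first) into a list of dicts."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
"""
rows = parse_cat_table(sample)
```

All values come back as strings, so a health check would compare `rows[0]["status"]` against `"green"` and convert counts with `int()` where needed.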
The API
• Document
• Cluster
  - Node
• Index
• Search
Create a Document
  curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '{
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }'
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 1,
    "created": true
  }
Of note…
• Auto-creates index & type
• Auto-generates ID
• Auto-versions
Get a Document
  curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 1,
    "found": true,
    "_source": {
      "genre": "Crime",
      "language": "English",
      "country": "USA",
      "runtime": 170,
      "title": "Scarface",
      "year": 1983
    }
  }
Update a Document
  curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d '{
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 180,
    "title": "Scarface",
    "year": 1983
  }'
  {
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 2,
    "created": false
  }
More like an upsert
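The create and update responses differ only in `_version` and `created`, which is what makes the PUT behave like an upsert. The toy in-memory class below imitates that bookkeeping; `ToyIndex` is a hypothetical stand-in written for this sketch, not a real Elasticsearch client, and its generated IDs do not follow Elasticsearch's ID format.

```python
import uuid


class ToyIndex:
    """In-memory stand-in for one index/type (illustrative only)."""

    def __init__(self):
        self._docs = {}  # _id -> (version, source)

    def index(self, source, doc_id=None):
        """Store a document; without an ID one is generated, like POST without an ID."""
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        old_version = self._docs.get(doc_id, (0, None))[0]
        version = old_version + 1
        self._docs[doc_id] = (version, dict(source))
        # created is True only when no earlier version existed
        return {"_id": doc_id, "_version": version, "created": old_version == 0}

    def get(self, doc_id):
        """Return the stored source with its version, or found=False."""
        if doc_id not in self._docs:
            return {"_id": doc_id, "found": False}
        version, source = self._docs[doc_id]
        return {"_id": doc_id, "_version": version, "found": True, "_source": source}


movies = ToyIndex()
first = movies.index({"title": "Scarface", "runtime": 170})
second = movies.index({"title": "Scarface", "runtime": 180}, doc_id=first["_id"])
```

The second call replaces the whole document and bumps the version, mirroring the `_version: 2, created: false` response on the slide.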
Delete a Document
  curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
  {
    "found": true,
    "_index": "imdb",
    "_type": "movie",
    "_id": "AUwGeWib1u4mCngDYT7y",
    "_version": 2
  }
You can also…
• Partial document updating
• Specify version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API
How Searching Works
• How it works
  - Search request hits a node
  - Node broadcasts to every shard in the index
  - Each shard performs the query
  - Each shard returns metadata about results
  - Node merges results and scores them
  - Node requests documents from shards
  - Results merged, sorted, and returned to client
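The steps above amount to a scatter-gather in two phases: shards return only lightweight hit metadata, and the coordinating node fetches full documents for the winners. The sketch below is a toy model under stated assumptions: shards are plain dicts, the score is a raw term count rather than Lucene's relevance formula, and `search_shard` and `search` are hypothetical names.

```python
import heapq


def search_shard(shard_docs, term, size):
    """Query phase on one shard: return only (score, doc_id) for the local top hits."""
    hits = []
    for doc_id, text in shard_docs.items():
        score = text.lower().split().count(term)  # toy score: raw term count
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(size, hits)


def search(shards, term, size=2):
    """Coordinating node: broadcast, merge per-shard hits, then fetch documents."""
    candidates = []
    for shard_no, shard_docs in enumerate(shards):
        for score, doc_id in search_shard(shard_docs, term, size):
            candidates.append((score, doc_id, shard_no))
    top = heapq.nlargest(size, candidates)
    # Fetch phase: only the winning documents are requested from their shards
    return [{"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id]}
            for score, doc_id, shard_no in top]


shards = [
    {"a": "apple peach", "b": "oranges"},
    {"c": "apple apple pie", "d": "peach"},
]
results = search(shards, "apple")
```

Each shard has to return its own top `size` hits because the coordinating node cannot know in advance which shard holds the best matches.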
REST API - Search
• Free Text Search
  - URL Request
• Complex Query
  http://localhost:9200/imdb/movie/_search?q=scar
  http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
  http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
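When such URLs are built in code rather than typed into a browser, the query string should be escaped so that spaces, brackets, and colons survive the trip. A minimal sketch using only the Python standard library follows; `search_url` is a hypothetical helper, and the host, index, and type are taken from the examples above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def search_url(base, index, doc_type, query):
    """Build a URI-search URL; urlencode escapes spaces, brackets and colons in q."""
    return f"{base}/{index}/{doc_type}/_search?{urlencode({'q': query})}"


url = search_url("http://localhost:9200", "imdb", "movie",
                 "(scarface OR star) AND year:[1981 TO 1984]")
# Decoding the query string again recovers the original expression
decoded = parse_qs(urlsplit(url).query)["q"][0]
```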
REST API – Query DSL
  curl -XPOST 'localhost:9200/_search?pretty' -d '{
    "query": {
      "bool": {
        "must": [
          { "query_string": { "query": "scarface or star" } },
          { "range": { "year": { "gte": 1931 } } }
        ]
      }
    }
  }'
REST API – Query DSL
• Boolean Query
  "bool": {
    "must": [
      { "match": { "color": "blue" } },
      { "match": { "title": "shirt" } }
    ],
    "must_not": [
      { "match": { "size": "xxl" } }
    ],
    "should": [
      { "match": { "textile": "cotton" } }
    ]
  }
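The semantics of the three clause types can be sketched with a toy evaluator: `must` clauses all have to match, `must_not` clauses exclude, and `should` clauses only raise the score when `must` is present. This is an illustration, not Elasticsearch's implementation; `matches` and `bool_filter` are hypothetical names, and "match" is reduced to a case-insensitive whole-word test.

```python
def matches(doc, clause):
    """Toy 'match': true if the field contains the term as a whitespace token."""
    (field, term), = clause["match"].items()
    return term.lower() in str(doc.get(field, "")).lower().split()


def bool_filter(doc, query):
    """Return None if the doc is excluded, else a toy score from 'should' clauses."""
    if not all(matches(doc, c) for c in query.get("must", [])):
        return None
    if any(matches(doc, c) for c in query.get("must_not", [])):
        return None
    return sum(matches(doc, c) for c in query.get("should", []))


query = {
    "must": [{"match": {"color": "blue"}}, {"match": {"title": "shirt"}}],
    "must_not": [{"match": {"size": "xxl"}}],
    "should": [{"match": {"textile": "cotton"}}],
}
cotton = {"color": "blue", "title": "Oxford shirt", "size": "m", "textile": "cotton"}
linen = {"color": "blue", "title": "Oxford shirt", "size": "m", "textile": "linen"}
too_big = {"color": "blue", "title": "Oxford shirt", "size": "xxl", "textile": "cotton"}
```

Both the cotton and the linen shirt pass the query, but the cotton one ranks higher; the xxl shirt is dropped regardless of its other fields.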
REST API – Query DSL
• Range Query
  - Numeric / date types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range
  "range": {
    "founded_year": {
      "gte": 1990,
      "lt": 2000
    }
  }
Filters
• Filters recommended over queries
  - Better cache support
  curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '{
    "from": 0,
    "size": 0,
    "query": {
      "terms": {
        "message": ["apples"],
        "minimum_should_match": 1
      }
    },
    "post_filter": {
      "terms": {
        "userId": ["25476c6788ce", "g20d5470d7b4"],
        "execution": "or"
      }
    },
    "sort": { "eventDate": { "order": "desc" } },
    "explain": false
  }'
Analyzers Tokenizers
  curl -XPUT 'http://localhost:9200/my_index' -d '{
    "settings": {
      "analysis": {
        "analyzer": {
          "str_search_analyzer": {
            "tokenizer": "keyword",
            "filter": ["lowercase"]
          },
          "str_index_analyzer": {
            "tokenizer": "substring",
            "filter": ["lowercase", "stop"]
          }
        },
        "tokenizer": {
          "substring": {
            "type": "edgeNgram",
            "min_gram": 3,
            "max_gram": 42,
            "token_chars": ["letter", "digit"]
          }
        }
      }
    }
  }'
  curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '{
    "events": {
      "properties": {
        "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
        "userId": { "type": "string", "store": false, "index": "not_analyzed" },
        "message": {
          "type": "string", "store": false,
          "search_analyzer": "str_search_analyzer",
          "index_analyzer": "str_index_analyzer"
        }
      }
    }
  }'
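What the index-time analyzer produces can be sketched in a few lines: each run of letters and digits is cut into prefixes between `min_gram` and `max_gram` characters long, then lowercased. The function below is an approximation written for illustration (`edge_ngrams` is a hypothetical name, and the stop filter is left out), not the Lucene tokenizer itself.

```python
def edge_ngrams(text, min_gram=3, max_gram=42):
    """Emit prefixes of each letter/digit run, from min_gram up to max_gram characters."""
    tokens, word = [], ""
    for ch in text + " ":  # the trailing space flushes the last word
        if ch.isalnum():
            word += ch
            continue
        for n in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:n].lower())  # lowercase mirrors the 'lowercase' filter
        word = ""
    return tokens


index_terms = edge_ngrams("Scarface 1983")
```

Because the search analyzer uses the keyword tokenizer, a search for "scar" is kept as a single lowercase term and matches the indexed prefix "scar", which gives prefix-style matching without a wildcard query. Words shorter than `min_gram` produce no terms at all.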
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits on non-letters
Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions?
Relevance in Elasticsearch
bullAt Search Time
bullAt Index Time
bullTerm Frequency
-Term Document
bullInverse Document Frequency (IDF)
-Term All Documents in the collection
bullField-Length Norm
10
What is Elasticsearch
11
Elasticsearch ishellip
bullSearch and Analytics engine
bullDocument Store
-Every field is indexedsearchable
bullDistributed
12
What Elasticsearch is not
bullKey-Value Store
-Redis Riak
bullColumn Family Store
-C HBase
bullGraph Database
-Neo4J
13
ElasticSearch in a Nutshell
bullBased on Apache Lucene
bullDistributed
bullDocument-Oriented
bullSchema free
bullHTTP + JSON
bull(Near) Real-time search
bullEcosystem
-Hosting Monitoring Apps Clients (SDK)
14
Where can I get it
bullFree and Open Source
bullhttpswwwelasticco
bullhttpsgithubcomelasticelasticsearch
bullBacked by a Company Elastic
-Training
-Support
-AuthAuthZ
-Marvel for Monitoring
15
How do I run it
bullDownload it
- httpswwwelasticcodownloads
bullbinelasticsearch
bullhttplocalhost9200
16
status 200
name Tesla
cluster_name elasticsearch_royrusso
version
number 142
build_hash 927caff6f05403e936c20bf4529f144f0c89fd8c
build_timestamp 2014-12-16T141112Z
build_snapshot false
lucene_version 4102
tagline You Know for Search
Elasticsearch requires Java
17
You have 5 seconds to whine about it and then shutup
Some Use-Cases
18
ElasticSearch for Centralized
LogsbullLogstash + ElasticSearch + Kibana (ELK)
bullWellhellip and then therersquos Loggly
19
ldquoNetflix is a Log generating company that happens to stream moviesrdquo
Elasticsearch at Predikto
20
Elasticsearch at Predikto
21
bullWrite From Spark to Elasticsearch
bullQuery from Spark to Elasticsearch
bullVisualize
Widely Used
22
Based on Apache Lucene
bullFree and Open Source
bullStarted in 1999
bullCreated by Doug Cutting
bullWhatrsquos it do
-Tokenizing
- Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
23
Elasticsearch is a Document
Store
24
Document Store
bullLike MongoDB and CouchDB
bullDocument DBs
- JSON documents
-Collections of key-value collections
-Nesting
-Versioned
25
What is a document
26
genre Crime
ldquolanguage English
country USA
runtime 170
title Scarface
year 1983
Modeled in JSON
27
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_index imdb
_type movie
_id u17o8zy9RcKg6SjQZqQ4Ow
_version 1
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Schema-Free
bullDynamic Mapping
-Elasticsearch guesses the data-types (string int
floathellip)
28
imdb
movie
properties
country
type stringldquo
ldquostorerdquotrue
ldquoindexrdquofalse
genre
type stringldquo
null_value naldquo
ldquostorerdquofalse
ldquoindextrue
year
type long
Elasticsearch is Distributed
29
Terminology
30
MySQL Elasticsearch
Database Index
Table Type
Row Document
Column Field
Schema Mapping
Index (Everything is indexed)
SQL Query DSL
bullCluster 1N Nodes w same Cluster Name
bullNode One ElasticSearch instance (1 java proc)
bullShard = One Lucene instance
- 0 or more replicas
High Availability
bullNo need for load balancer
bullDifferent Node Types
bullIndices are Sharded
bullReplica shards on different Nodes
bullAutomatic Master election amp failover
31
About Indices Shards
32
$ curl -XPUT httplocalhost9200twitter -d
settings
index
number_of_shards 3
number_of_replicas 2
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A 2 Shards amp 1 Replica
Index B 3 Shards amp 1 Replica
Discovery
bullNodes discover each other using multicast
-Unicast is an option
bullEach cluster has an elected master node
-Beware of split-brain
discoveryzenpingmulticastenabled false
discoveryzenpingunicasthosts [host1 host2port host3]
34
Nodes
bullMaster node handles cluster-wide (Meta-API)
events
-Node participation
-New indices createdelete
-Re-Allocation of shards
bullData Nodes
- Indexing Searching operations
bullClient Nodes
-REST calls
- Light-weight load balancers
35
nodedata | nodemaster
The Basics - Shards
bullPrimary Shard
-First time Indexing
- Index has 1N primary shards (default 5)
- Not changeable once index created
bullReplica Shard
-Copy of the primary shard
- Can be changed later
-Each primary has 0N replicas
- HA
bull Promoted to primary if primary fails
bull GetSearch handled by primary||replica36
Shard Auto-Allocation
bullShard Phases
-Unassigned
- Initializing
-Started
-Relocating37
Node 1
0P
1R
Node 2
1P
0R
Node 2
0R
Add a Node Shards Relocate
Allocation Awareness
bullShard Allocation Awareness
- clusterroutingallocationawarenessattributes rack
-Shards RELOCATE to even distribution
-Primary amp Replica will NOT be on the same rack
value
38
Cluster State
bullCluster State
-Node Membership
- Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET httplocalhost9200_clusterstatepretty=1
cluster_name elasticsearch_royrusso
version 27
master_node s3fpXfPKSFeUqo1MYZxSng
blocks
nodes
s3fpXfPKSFeUqo1MYZxSng
name Bulldozer
transport_address inet[localhost1270019300]
attributes
metadata
templates
logging_index_all
template logstash-09-
order 1
settings
index
number_of_shards 2
number_of_replicas 1
mappings
date
store false
logging_index
template logstash-
order 0
settings
index
number_of_shards 2
ldquonumber_of_replicasrdquo 1
mappings
ldquodaterdquo
ldquostorerdquo true
Talking to Elasticsearch
40
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API - Query DSL
• Range Query
  - Numeric / Date Types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range

"range": {
  "founded_year": {
    "gte": 1990,
    "lt": 2000
  }
}
Filters
• Filters recommended over Queries
  - Better cache support

curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers / Tokenizers

curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNgram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string",
        "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
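
What the index-time analyzer above produces can be approximated in a few lines. This is a rough sketch, not Lucene's implementation: `edge_ngrams` is a hypothetical function that lowercases the text, splits it into letter/digit runs, and emits each prefix between `min_gram` and `max_gram` characters. The stop filter is not modeled.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=42):
    """Emit lowercase prefixes of each letter/digit run, from min_gram up to max_gram characters."""
    tokens = []
    for word in re.findall(r"[^\W_]+", text.lower()):  # runs of letters and digits
        for size in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:size])
    return tokens


indexed = edge_ngrams("Scarface 1983")
print(indexed)
# A search for "Scar" is only lowercased (keyword tokenizer), so it hits the stored prefix.
print("scar" in indexed)
```

Because prefixes are stored at index time, a partial term such as "scar" becomes an exact token lookup at search time.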
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits text at non-letters
Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM

Questions?
What is Elasticsearch?

Elasticsearch is…
• Search and Analytics engine
• Document Store
  - Every field is indexed/searchable
• Distributed
What Elasticsearch is not
• Key-Value Store
  - Redis, Riak
• Column Family Store
  - C* (Cassandra), HBase
• Graph Database
  - Neo4J
ElasticSearch in a Nutshell
• Based on Apache Lucene
• Distributed
• Document-Oriented
• Schema free
• HTTP + JSON
• (Near) Real-time search
• Ecosystem
  - Hosting, Monitoring, Apps, Clients (SDK)
Where can I get it?
• Free and Open Source
• https://www.elastic.co
• https://github.com/elastic/elasticsearch
• Backed by a Company: Elastic
  - Training
  - Support
  - Auth/AuthZ
  - Marvel for Monitoring
How do I run it?
• Download it
  - https://www.elastic.co/downloads
• bin/elasticsearch
• http://localhost:9200

{
  "status": 200,
  "name": "Tesla",
  "cluster_name": "elasticsearch_royrusso",
  "version": {
    "number": "1.4.2",
    "build_hash": "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp": "2014-12-16T14:11:12Z",
    "build_snapshot": false,
    "lucene_version": "4.10.2"
  },
  "tagline": "You Know, for Search"
}
Elasticsearch requires Java
You have 5 seconds to whine about it, and then shut up.
Some Use-Cases

ElasticSearch for Centralized Logs
• Logstash + ElasticSearch + Kibana (ELK)
• Well… and then there's Loggly

"Netflix is a Log generating company that happens to stream movies"
Elasticsearch at Predikto
• Write From Spark to Elasticsearch
• Query from Spark to Elasticsearch
• Visualize

Widely Used
Based on Apache Lucene
• Free and Open Source
• Started in 1999
• Created by Doug Cutting
• What's it do?
  - Tokenizing
  - Locations
  - Relevance scoring
  - Filtering
  - Text search
  - Date Parsing
Elasticsearch is a Document Store

Document Store
• Like MongoDB and CouchDB
• Document DBs
  - JSON documents
  - Collections of key-value collections
  - Nesting
  - Versioned
What is a document?

{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}
Modeled in JSON

{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "u17o8zy9RcKg6SjQZqQ4Ow",
  "_version": 1,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Schema-Free
• Dynamic Mapping
  - Elasticsearch guesses the data-types (string, int, float…)

{
  "imdb": {
    "movie": {
      "properties": {
        "country": {
          "type": "string",
          "store": true,
          "index": false
        },
        "genre": {
          "type": "string",
          "null_value": "na",
          "store": false,
          "index": true
        },
        "year": {
          "type": "long"
        }
      }
    }
  }
}
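
The "guessing" done by dynamic mapping can be pictured with a very rough sketch. The real rules are richer (date detection, nested objects, arrays); `guess_type` and `guess_mapping` are hypothetical helpers, and the `rating` field is invented for the example.

```python
def guess_type(value):
    """Rough guess of a field type from a JSON value; real dynamic mapping handles more cases."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int in Python
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"


def guess_mapping(doc):
    """Build a mapping-like dict with one guessed type per top-level field."""
    return {"properties": {field: {"type": guess_type(v)} for field, v in doc.items()}}


mapping = guess_mapping({"title": "Scarface", "year": 1983, "rating": 8.3})
print(mapping)
```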
Elasticsearch is Distributed

Terminology

MySQL       Elasticsearch
Database    Index
Table       Type
Row         Document
Column      Field
Schema      Mapping
Index       (Everything is indexed)
SQL         Query DSL

• Cluster: 1..N Nodes w/ same Cluster Name
• Node: One ElasticSearch instance (1 java proc)
• Shard = One Lucene instance
  - 0 or more replicas
High Availability
• No need for load balancer
• Different Node Types
• Indices are Sharded
• Replica shards on different Nodes
• Automatic Master election & failover
About Indices / Shards

$ curl -XPUT 'http://localhost:9200/twitter' -d '
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
}'
Cluster Topology

[Diagram: 4 Node Cluster holding shards A1, A2 and B1, B2, B3 plus their replicas]
• Index A: 2 Shards & 1 Replica
• Index B: 3 Shards & 1 Replica
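
The shard arithmetic behind this topology is simple: an index holds primaries × (1 + replicas) shard copies, so Index A has 4 and Index B has 6. The sketch below also shows a naive placement that keeps copies of the same shard on different nodes. It is an assumption-laden toy (`place` is hypothetical, and the real allocator balances by other criteria too).

```python
def total_shard_copies(primaries, replicas):
    """Each primary has `replicas` copies, so an index holds primaries * (1 + replicas) shards."""
    return primaries * (1 + replicas)


def place(primaries, replicas, nodes):
    """Naive round-robin placement that never puts two copies of one shard on the same node."""
    if replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to separate every copy")
    layout = {node: [] for node in nodes}
    for shard in range(primaries):
        for copy in range(replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            layout[node].append(f"{shard}{'P' if copy == 0 else 'R'}")
    return layout


print(total_shard_copies(2, 1), total_shard_copies(3, 1))
layout = place(3, 1, ["node1", "node2", "node3", "node4"])
print(layout)
```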
Discovery
• Nodes discover each other using multicast
  - Unicast is an option
• Each cluster has an elected master node
  - Beware of split-brain

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
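
On the split-brain warning: the usual guard is to require a majority of master-eligible nodes before a master can be elected. In Elasticsearch 1.x that value is set with `discovery.zen.minimum_master_nodes`, a setting the slides do not show. The quorum arithmetic is sketched below; `minimum_master_nodes` here is just a helper name chosen to match the setting.

```python
def minimum_master_nodes(master_eligible):
    """Majority quorum: more than half of the master-eligible nodes must agree on a master."""
    return master_eligible // 2 + 1


for n in (1, 2, 3, 5):
    print(n, minimum_master_nodes(n))
```

With two master-eligible nodes the quorum is also two, which is why an odd number of master-eligible nodes is commonly preferred.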
Nodes
• Master node handles cluster-wide (Meta-API) events
  - Node participation
  - New indices create/delete
  - Re-Allocation of shards
• Data Nodes
  - Indexing / Searching operations
• Client Nodes
  - REST calls
  - Light-weight load balancers

node.data | node.master
The Basics - Shards
• Primary Shard
  - First time Indexing
  - Index has 1..N primary shards (default: 5)
  - Not changeable once index created
• Replica Shard
  - Copy of the primary shard
  - Replica count can be changed later
  - Each primary has 0..N replicas
  - HA
    • Promoted to primary if primary fails
    • Get/Search handled by primary||replica
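
Why the primary shard count is fixed at index creation: a document is routed to a shard by hashing its routing value (the ID by default) modulo the number of primary shards. Change the count and most existing documents would be looked up on the wrong shard. A minimal sketch follows, with CRC32 standing in for the real hash function, which is an assumption.

```python
import zlib


def shard_for(doc_id, number_of_primary_shards):
    """Pick a primary shard from a stable hash of the document ID (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


doc_id = "AUwGeWib1u4mCngDYT7y"
print(shard_for(doc_id, 5), shard_for(doc_id, 6))

# Count how many of 1000 sample IDs would land on a different shard with 6 primaries instead of 5.
moved = sum(shard_for(str(i), 5) != shard_for(str(i), 6) for i in range(1000))
print(moved)
```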
Shard Auto-Allocation
• Shard Phases
  - Unassigned
  - Initializing
  - Started
  - Relocating

[Diagram: Node 1 holds 0P and 1R; Node 2 holds 1P and 0R. Add a node: shards relocate, and 0R appears on the added node.]
Allocation Awareness
• Shard Allocation Awareness
  - cluster.routing.allocation.awareness.attributes: rack
  - Shards RELOCATE to even distribution
  - Primary & Replica will NOT be on the same rack value
Cluster State
• Cluster State
  - Node Membership
  - Indices Settings
  - Shard Allocation Table
  - Shard State
curl -XGET 'http://localhost:9200/_cluster/state?pretty=1'

{
  "cluster_name": "elasticsearch_royrusso",
  "version": 27,
  "master_node": "s3fpXfPKSFeUqo1MYZxSng",
  "blocks": {},
  "nodes": {
    "s3fpXfPKSFeUqo1MYZxSng": {
      "name": "Bulldozer",
      "transport_address": "inet[localhost/127.0.0.1:9300]",
      "attributes": {}
    }
  },
  "metadata": {
    "templates": {
      "logging_index_all": {
        "template": "logstash-09-*",
        "order": 1,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": false
          }
        }
      },
      "logging_index": {
        "template": "logstash-*",
        "order": 0,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": true
          }
        }
      }
    }
  }
}
Talking to Elasticsearch

REST
• HTTP Verbs: GET, POST, PUT, DELETE
• JSON
• _cat API

curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green          3        3      3   3    0    0        0
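
The _cat output is whitespace-aligned text meant for humans, but with the header row present (the `v` parameter) it is easy to turn into records. A small sketch follows, with the response from the slide pasted in as a string; `parse_cat` is a hypothetical helper.

```python
CAT_HEALTH = """\
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green          3        3      3   3    0    0        0
"""


def parse_cat(text):
    """Turn whitespace-aligned _cat output (with a header row) into a list of dicts."""
    header, *rows = [line.split() for line in text.strip().splitlines()]
    return [dict(zip(header, row)) for row in rows]


health = parse_cat(CAT_HEALTH)
print(health[0]["status"], health[0]["unassign"])
```

This simple split only works while no column value contains spaces, which holds for the health output shown here.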
The API
• Document
• Cluster
  - Node
• Index
• Search
Create a Document

curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
ElasticSearch in a Nutshell
bullBased on Apache Lucene
bullDistributed
bullDocument-Oriented
bullSchema free
bullHTTP + JSON
bull(Near) Real-time search
bullEcosystem
-Hosting Monitoring Apps Clients (SDK)
14
Where can I get it
bullFree and Open Source
bullhttpswwwelasticco
bullhttpsgithubcomelasticelasticsearch
bullBacked by a Company Elastic
-Training
-Support
-AuthAuthZ
-Marvel for Monitoring
15
How do I run it
bullDownload it
- httpswwwelasticcodownloads
bullbinelasticsearch
bullhttplocalhost9200
16
status 200
name Tesla
cluster_name elasticsearch_royrusso
version
number 142
build_hash 927caff6f05403e936c20bf4529f144f0c89fd8c
build_timestamp 2014-12-16T141112Z
build_snapshot false
lucene_version 4102
tagline You Know for Search
Elasticsearch requires Java
17
You have 5 seconds to whine about it and then shutup
Some Use-Cases
18
ElasticSearch for Centralized
LogsbullLogstash + ElasticSearch + Kibana (ELK)
bullWellhellip and then therersquos Loggly
19
ldquoNetflix is a Log generating company that happens to stream moviesrdquo
Elasticsearch at Predikto
20
Elasticsearch at Predikto
21
bullWrite From Spark to Elasticsearch
bullQuery from Spark to Elasticsearch
bullVisualize
Widely Used
22
Based on Apache Lucene
bullFree and Open Source
bullStarted in 1999
bullCreated by Doug Cutting
bullWhatrsquos it do
-Tokenizing
- Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
23
Elasticsearch is a Document
Store
24
Document Store
bullLike MongoDB and CouchDB
bullDocument DBs
- JSON documents
-Collections of key-value collections
-Nesting
-Versioned
25
What is a document
26
genre Crime
ldquolanguage English
country USA
runtime 170
title Scarface
year 1983
Modeled in JSON
27
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_index imdb
_type movie
_id u17o8zy9RcKg6SjQZqQ4Ow
_version 1
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Schema-Free
bullDynamic Mapping
-Elasticsearch guesses the data-types (string int
floathellip)
28
imdb
movie
properties
country
type stringldquo
ldquostorerdquotrue
ldquoindexrdquofalse
genre
type stringldquo
null_value naldquo
ldquostorerdquofalse
ldquoindextrue
year
type long
Elasticsearch is Distributed
29
Terminology
30
MySQL Elasticsearch
Database Index
Table Type
Row Document
Column Field
Schema Mapping
Index (Everything is indexed)
SQL Query DSL
bullCluster 1N Nodes w same Cluster Name
bullNode One ElasticSearch instance (1 java proc)
bullShard = One Lucene instance
- 0 or more replicas
High Availability
bullNo need for load balancer
bullDifferent Node Types
bullIndices are Sharded
bullReplica shards on different Nodes
bullAutomatic Master election amp failover
31
About Indices Shards
32
$ curl -XPUT httplocalhost9200twitter -d
settings
index
number_of_shards 3
number_of_replicas 2
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A 2 Shards amp 1 Replica
Index B 3 Shards amp 1 Replica
Discovery
bullNodes discover each other using multicast
-Unicast is an option
bullEach cluster has an elected master node
-Beware of split-brain
discoveryzenpingmulticastenabled false
discoveryzenpingunicasthosts [host1 host2port host3]
34
Nodes
bullMaster node handles cluster-wide (Meta-API)
events
-Node participation
-New indices createdelete
-Re-Allocation of shards
bullData Nodes
- Indexing Searching operations
bullClient Nodes
-REST calls
- Light-weight load balancers
35
nodedata | nodemaster
The Basics - Shards
bullPrimary Shard
-First time Indexing
- Index has 1N primary shards (default 5)
- Not changeable once index created
bullReplica Shard
-Copy of the primary shard
- Can be changed later
-Each primary has 0N replicas
- HA
bull Promoted to primary if primary fails
bull GetSearch handled by primary||replica36
Shard Auto-Allocation
bullShard Phases
-Unassigned
- Initializing
-Started
-Relocating37
Node 1
0P
1R
Node 2
1P
0R
Node 2
0R
Add a Node Shards Relocate
Allocation Awareness
•Shard Allocation Awareness
- cluster.routing.allocation.awareness.attributes: rack
-Shards RELOCATE to even distribution
-Primary & Replica will NOT be on the same rack value
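A toy Python check of the rule "primary and replica not on the same rack value". The data layout (shard name to list of nodes, node to rack) and the function name are invented for this sketch; it only verifies a placement, it does not model how Elasticsearch chooses one.

```python
def shards_violating_awareness(shard_copies, node_rack):
    """Return shards whose copies (primary + replicas) share a rack value.

    shard_copies: {shard_name: [node_name, ...]}
    node_rack:    {node_name: rack_value}
    """
    bad = []
    for shard, nodes in shard_copies.items():
        racks = [node_rack[node] for node in nodes]
        if len(set(racks)) < len(racks):  # some rack value appears twice
            bad.append(shard)
    return bad


copies = {"0": ["node1", "node2"], "1": ["node2", "node3"]}
racks = {"node1": "rack_a", "node2": "rack_b", "node3": "rack_b"}
violations = shards_violating_awareness(copies, racks)
print(violations)
```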
Cluster State
•Cluster State
-Node Membership
- Indices Settings
-Shard Allocation Table
-Shard State
curl -XGET 'http://localhost:9200/_cluster/state?pretty=1'
{
  "cluster_name": "elasticsearch_royrusso",
  "version": 27,
  "master_node": "s3fpXfPKSFeUqo1MYZxSng",
  "blocks": {},
  "nodes": {
    "s3fpXfPKSFeUqo1MYZxSng": {
      "name": "Bulldozer",
      "transport_address": "inet[localhost/127.0.0.1:9300]",
      "attributes": {}
    }
  },
  "metadata": {
    "templates": {
      "logging_index_all": {
        "template": "logstash-09-*",
        "order": 1,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": false
          }
        }
      },
      "logging_index": {
        "template": "logstash-*",
        "order": 0,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": true
          }
        }
      }
    }
  }
}
Talking to Elasticsearch
REST
•HTTP Verbs: GET, POST, PUT, DELETE
•JSON
•_cat API
curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
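The _cat output is plain whitespace-separated columns, so it is easy to script against. A minimal Python sketch that turns the sample above into a dictionary; `parse_cat` is a made-up helper, and it assumes no column value contains spaces.

```python
def parse_cat(text):
    """Parse `_cat` output that was requested with ?v (header row included)."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


sample = """cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0"""

health = parse_cat(sample)[0]
print(health["cluster"], health["status"], health["unassign"])
```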
The API
•Document
•Cluster
-Node
•Index
•Search
Create a Document
curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
Of note…
curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
•Auto-creates Index & Type
•Auto-Gen ID
•Auto-Version
Get a Document
curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "found": true,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Update a Document
curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 180,
  "title": "Scarface",
  "year": 1983
}'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2,
  "created": false
}
More like an Upsert
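The "upsert" behaviour can be modelled in a few lines of Python. This is an in-memory toy, not a client for Elasticsearch: the class name is invented, and `uuid4` merely stands in for the auto-generated ID.

```python
import uuid


class TinyIndex:
    """In-memory toy that mimics the created/_version fields of the index call."""

    def __init__(self):
        self._docs = {}  # _id -> (version, source)

    def index(self, source, doc_id=None):
        if doc_id is None:
            doc_id = uuid.uuid4().hex  # stand-in for the auto-generated ID
        previous_version = self._docs.get(doc_id, (0, None))[0]
        version = previous_version + 1
        # the whole document is replaced; nothing is merged with the old source
        self._docs[doc_id] = (version, dict(source))
        return {"_id": doc_id, "_version": version, "created": previous_version == 0}

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return {"_id": doc_id, "_version": version, "found": True, "_source": source}


idx = TinyIndex()
first = idx.index({"title": "Scarface", "runtime": 170})
second = idx.index({"title": "Scarface", "runtime": 180}, doc_id=first["_id"])
print(first["created"], second["created"], second["_version"])
```

The same call creates the document the first time and replaces it afterwards; only the `created` flag and the version number tell the two cases apart.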
Delete a Document
curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
{
  "found": true,
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2
}
You can also…
•Partial document updating
•Specify Version
•Specify ID
•Multi-Get API
•Exists API
•Bulk API
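For the Bulk API, the request body is newline-delimited JSON: one action line, then one source line per document, with a trailing newline. A minimal Python sketch that builds such a body; the function name and the sample documents are illustrative, and nothing is sent over the network.

```python
import json


def bulk_index_body(index, doc_type, docs):
    """Build a bulk body from (doc_id, source) pairs using the "index" action."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action and metadata line
        lines.append(json.dumps(source))  # document source line
    return "\n".join(lines) + "\n"        # the body must end with a newline


docs = [
    ("1", {"title": "Scarface", "year": 1983}),
    ("2", {"title": "Heat", "year": 1995}),
]
body = bulk_index_body("imdb", "movie", docs)
print(body)
```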
How Searching Works
•How it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged, sorted and returned to client
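The steps above can be sketched as a two-phase scatter/gather in Python. Everything here is a toy: shards are plain dictionaries, the score is a raw term count instead of real relevance scoring, and the function names are invented.

```python
import heapq


def query_phase(shard, term, size):
    # each shard scores its own documents and returns only (score, doc_id) metadata
    hits = []
    for doc_id, source in shard.items():
        score = source["title"].lower().split().count(term)
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(size, hits)


def search(shards, term, size=2):
    # scatter: ask every shard; gather: merge the per-shard top hits
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, term, size):
            candidates.append((score, doc_id, shard_no))
    top = heapq.nlargest(size, candidates)
    # fetch phase: only now are the winning documents pulled from their shards
    return [{"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id]}
            for score, doc_id, shard_no in top]


shards = [
    {"1": {"title": "Scarface"}, "2": {"title": "Star Wars"}},
    {"3": {"title": "Star Trek Star Fleet"}, "4": {"title": "Heat"}},
]
results = search(shards, "star")
print([hit["_id"] for hit in results])
```

Splitting query and fetch keeps the first round cheap: shards ship IDs and scores, and full documents travel only for the final hits.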
REST API - Search
•Free Text Search
-URL Request
•Complex Query
http://localhost:9200/imdb/movie/_search?q=scar
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
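When building these URLs in code, the query string needs escaping. A small Python sketch using only the standard library; the `search_url` helper is made up, and the host, index and type are the ones from the slide.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(base, index, doc_type, query):
    # urlencode turns spaces into '+' and escapes ':', '[', ']', '(' and ')'
    return "%s/%s/%s/_search?%s" % (base, index, doc_type, urlencode({"q": query}))


query = "(scarface OR star) AND year:[1981 TO 1984]"
url = search_url("http://localhost:9200", "imdb", "movie", query)
print(url)

# decoding the query string gives the original expression back
decoded = parse_qs(urlsplit(url).query)["q"][0]
```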
REST API – Query DSL
curl -XPOST 'localhost:9200/_search?pretty' -d '{
  "query": {
    "bool": {
      "must": [
        { "query_string": { "query": "scarface or star" } },
        { "range": { "year": { "gte": 1931 } } }
      ]
    }
  }
}'
REST API – Query DSL
•Boolean Query
"bool": {
  "must": [
    { "match": { "color": "blue" } },
    { "match": { "title": "shirt" } }
  ],
  "must_not": [
    { "match": { "size": "xxl" } }
  ],
  "should": [
    { "match": { "textile": "cotton" } }
  ]
}
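A toy evaluator for the three clause types, in Python, to make the semantics concrete: every must clause has to match, no must_not clause may match, and should clauses are optional here and only raise the score. The matching rule (whole word, case-insensitive) and the scoring (count of matching should clauses) are simplifications invented for this sketch.

```python
def matches(clause, doc):
    # toy "match": the query text must appear as a whole word in the field
    field, text = next(iter(clause["match"].items()))
    return text.lower() in str(doc.get(field, "")).lower().split()


def bool_score(query, doc):
    """Return None if the document is excluded, else its toy score."""
    if not all(matches(clause, doc) for clause in query.get("must", [])):
        return None
    if any(matches(clause, doc) for clause in query.get("must_not", [])):
        return None
    return sum(matches(clause, doc) for clause in query.get("should", []))


query = {
    "must": [{"match": {"color": "blue"}}, {"match": {"title": "shirt"}}],
    "must_not": [{"match": {"size": "xxl"}}],
    "should": [{"match": {"textile": "cotton"}}],
}

cotton = {"color": "blue", "title": "Blue shirt", "size": "m", "textile": "cotton"}
wool = {"color": "blue", "title": "Blue shirt", "size": "m", "textile": "wool"}
huge = {"color": "blue", "title": "Blue shirt", "size": "xxl", "textile": "cotton"}

print(bool_score(query, cotton), bool_score(query, wool), bool_score(query, huge))
```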
REST API – Query DSL
•Range Query
-Numeric / Date Types
•Prefix/Wildcard Query
-Match on partial terms
•RegExp Query
•Geo_bbox
-Bounding box filter
•Geo_distance
-Geo_distance_range
"range": {
  "founded_year": {
    "gte": 1990,
    "lt": 2000
  }
}
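The range bounds read as: gte and lte include the boundary, gt and lt exclude it. A tiny Python sketch of that check; the function is hypothetical and only mirrors the four operator names.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Apply range-query style bounds to a single value."""
    if gte is not None and value < gte:
        return False
    if gt is not None and value <= gt:
        return False
    if lte is not None and value > lte:
        return False
    if lt is not None and value >= lt:
        return False
    return True


founded = [1989, 1990, 1999, 2000]
nineties = [year for year in founded if in_range(year, gte=1990, lt=2000)]
print(nineties)
```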
Filters
•Filters recommended over Queries
-Better cache support
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers Tokenizers
curl -XPUT 'http://localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNGram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'
curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string",
        "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
Tokenizers
•Whitespace
•NGram
•Edge NGram
•Letter
- Splits on non-letters
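What an edge n-gram tokenizer with min_gram 3, max_gram 42 and token_chars [letter, digit] produces can be approximated in a few lines of Python. This is a sketch of the idea only: the regular expression stands in for the token_chars setting, lowercasing stands in for the lowercase filter, and the stop filter is not modelled.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=42):
    tokens = []
    # runs of letters and digits form words; any other character is a separator
    for word in re.findall(r"[^\W_]+", text):
        word = word.lower()  # models the lowercase token filter
        # emit prefixes of the word, from min_gram up to max_gram characters
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:size])
    return tokens


indexed = edge_ngrams("Scarface 1983")
print(indexed)

# at search time the keyword tokenizer plus lowercase keeps the input whole,
# so a partial term such as "Scar" is looked up as the single token "scar"
search_token = "Scar".lower()
```

Indexing prefixes while searching with the whole lowercased input is what makes "starts with" matching work without wildcard queries.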
Clients
•Client list: http://www.elasticsearch.org/guide/clients
- Java (Node) Client, JS, PHP, Perl, Python, Ruby
•Spring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
•BigDesk
•Kopf
•Head
•ElasticHQ
•Marvel
•Sematext SPM
Questions
Where can I get it?
•Free and Open Source
•https://www.elastic.co
•https://github.com/elastic/elasticsearch
•Backed by a Company: Elastic
-Training
-Support
-Auth/AuthZ
-Marvel for Monitoring
How do I run it?
•Download it
- https://www.elastic.co/downloads
•bin/elasticsearch
•http://localhost:9200
{
  "status": 200,
  "name": "Tesla",
  "cluster_name": "elasticsearch_royrusso",
  "version": {
    "number": "1.4.2",
    "build_hash": "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp": "2014-12-16T14:11:12Z",
    "build_snapshot": false,
    "lucene_version": "4.10.2"
  },
  "tagline": "You Know, for Search"
}
Elasticsearch requires Java
You have 5 seconds to whine about it and then shut up
Some Use-Cases
ElasticSearch for Centralized Logs
•Logstash + ElasticSearch + Kibana (ELK)
•Well… and then there’s Loggly
“Netflix is a Log generating company that happens to stream movies”
Elasticsearch at Predikto
•Write From Spark to Elasticsearch
•Query from Spark to Elasticsearch
•Visualize
Widely Used
Based on Apache Lucene
•Free and Open Source
•Started in 1999
•Created by Doug Cutting
•What’s it do?
-Tokenizing
- Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
Elasticsearch is a Document Store
Document Store
•Like MongoDB and CouchDB
•Document DBs
- JSON documents
-Collections of key-value collections
-Nesting
-Versioned
What is a document?
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}
Modeled in JSON
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "u17o8zy9RcKg6SjQZqQ4Ow",
  "_version": 1,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
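The relationship between the two JSON shapes, sketched in Python: the application's document travels unchanged inside `_source`, and the metadata fields sit around it. The `wrap` helper is invented for illustration; the field names and values are the ones from the slide.

```python
import json


def wrap(source, index, doc_type, doc_id, version=1):
    # metadata envelope around the untouched application document
    return {"_index": index, "_type": doc_type, "_id": doc_id,
            "_version": version, "_source": source}


movie = {"genre": "Crime", "language": "English", "country": "USA",
         "runtime": 170, "title": "Scarface", "year": 1983}

stored = wrap(movie, "imdb", "movie", "u17o8zy9RcKg6SjQZqQ4Ow")
text = json.dumps(stored, indent=2)
print(text)
```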
Some Use-Cases
18
ElasticSearch for Centralized
LogsbullLogstash + ElasticSearch + Kibana (ELK)
bullWellhellip and then therersquos Loggly
19
ldquoNetflix is a Log generating company that happens to stream moviesrdquo
Elasticsearch at Predikto
20
Elasticsearch at Predikto
21
bullWrite From Spark to Elasticsearch
bullQuery from Spark to Elasticsearch
bullVisualize
Widely Used
22
Based on Apache Lucene
bullFree and Open Source
bullStarted in 1999
bullCreated by Doug Cutting
bullWhatrsquos it do
-Tokenizing
- Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
23
Elasticsearch is a Document
Store
24
Document Store
bullLike MongoDB and CouchDB
bullDocument DBs
- JSON documents
-Collections of key-value collections
-Nesting
-Versioned
25
What is a document
26
genre Crime
ldquolanguage English
country USA
runtime 170
title Scarface
year 1983
Modeled in JSON
27
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_index imdb
_type movie
_id u17o8zy9RcKg6SjQZqQ4Ow
_version 1
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Schema-Free
bullDynamic Mapping
-Elasticsearch guesses the data-types (string int
floathellip)
28
imdb
movie
properties
country
type stringldquo
ldquostorerdquotrue
ldquoindexrdquofalse
genre
type stringldquo
null_value naldquo
ldquostorerdquofalse
ldquoindextrue
year
type long
Elasticsearch is Distributed
29
Terminology
30
MySQL Elasticsearch
Database Index
Table Type
Row Document
Column Field
Schema Mapping
Index (Everything is indexed)
SQL Query DSL
bullCluster 1N Nodes w same Cluster Name
bullNode One ElasticSearch instance (1 java proc)
bullShard = One Lucene instance
- 0 or more replicas
High Availability
bullNo need for load balancer
bullDifferent Node Types
bullIndices are Sharded
bullReplica shards on different Nodes
bullAutomatic Master election amp failover
31
About Indices Shards
32
$ curl -XPUT httplocalhost9200twitter -d
settings
index
number_of_shards 3
number_of_replicas 2
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A 2 Shards amp 1 Replica
Index B 3 Shards amp 1 Replica
Discovery
bullNodes discover each other using multicast
-Unicast is an option
bullEach cluster has an elected master node
-Beware of split-brain
discoveryzenpingmulticastenabled false
discoveryzenpingunicasthosts [host1 host2port host3]
34
Nodes
bullMaster node handles cluster-wide (Meta-API)
events
-Node participation
-New indices createdelete
-Re-Allocation of shards
bullData Nodes
- Indexing Searching operations
bullClient Nodes
-REST calls
- Light-weight load balancers
35
nodedata | nodemaster
The Basics - Shards
• Primary Shard
  - First-time indexing
  - Index has 1..N primary shards (default 5)
  - Not changeable once the index is created
• Replica Shard
  - Copy of the primary shard
  - Can be changed later
  - Each primary has 0..N replicas
  - HA
    • Promoted to primary if the primary fails
    • Get/Search handled by primary || replica
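Why the primary shard count cannot change after index creation: a document is routed to a shard by hashing its routing value (by default the ID) modulo the number of primaries. The sketch below illustrates the idea; `zlib.crc32` is a stand-in hash chosen for the example, not the function Elasticsearch uses, and `pick_shard` is a hypothetical name.

```python
import zlib


def pick_shard(routing_key, number_of_primary_shards):
    """Route a document to a primary shard: hash(routing) modulo the primary count."""
    # zlib.crc32 is only a stand-in hash for illustration.
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_primary_shards


doc_id = "AUwGeWib1u4mCngDYT7y"
print("with 5 primaries:", pick_shard(doc_id, 5))
print("with 6 primaries:", pick_shard(doc_id, 6))
```

If the modulus changed, existing documents could map to a different shard than the one that holds them, so lookups by ID would miss. Replicas do not take part in this formula, which is why their count can be changed later.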
Shard Auto-Allocation
• Shard Phases
  - Unassigned
  - Initializing
  - Started
  - Relocating
Node 1: 0P, 1R
Node 2: 1P, 0R
Node 2: 0R
Add a node: shards relocate
Allocation Awareness
• Shard Allocation Awareness
  - cluster.routing.allocation.awareness.attributes: rack
  - Shards RELOCATE to even distribution
  - Primary & replica will NOT be on the same rack value
Cluster State
• Cluster State
  - Node membership
  - Indices settings
  - Shard allocation table
  - Shard state
curl -XGET 'http://localhost:9200/_cluster/state?pretty=1'
{
  "cluster_name": "elasticsearch_royrusso",
  "version": 27,
  "master_node": "s3fpXfPKSFeUqo1MYZxSng",
  "blocks": {},
  "nodes": {
    "s3fpXfPKSFeUqo1MYZxSng": {
      "name": "Bulldozer",
      "transport_address": "inet[localhost/127.0.0.1:9300]",
      "attributes": {}
    }
  },
  "metadata": {
    "templates": {
      "logging_index_all": {
        "template": "logstash-09-*",
        "order": 1,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": false
          }
        }
      },
      "logging_index": {
        "template": "logstash-*",
        "order": 0,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": true
          }
        }
      }
    }
  }
}
Talking to Elasticsearch
REST
• HTTP verbs: GET, POST, PUT, DELETE
• JSON
• _cat API
curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status node.total node.data shards pri relo init unassign
foo     green           3         3      3   3    0    0        0
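The `_cat` output is plain whitespace-separated text, which makes it easy to script against. A small sketch that turns output with a header row (the `v` flag) into dictionaries; `parse_cat_table` is a hypothetical helper and the sample text mirrors the health line on this slide.

```python
def parse_cat_table(text):
    """Parse whitespace-separated _cat output (with a header row) into dicts."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


sample = """
cluster status node.total node.data shards pri relo init unassign
foo     green           3         3      3   3    0    0        0
"""

health = parse_cat_table(sample)[0]
print(health["cluster"], health["status"], "unassigned shards:", health["unassign"])
```

All values come back as strings; convert with `int()` where a number is needed.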
The API
• Document
• Cluster
  - Node
• Index
• Search
Create a Document
curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
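The same create request can be built from Python's standard library. This is a sketch that only constructs the request and does not send it; `build_index_request` is a hypothetical helper, and the `Content-Type` header is an addition that the curl call on the slide does not set.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, document):
    """Build (but do not send) a POST that indexes a document with an auto-generated ID."""
    url = f"{base_url}/{index}/{doc_type}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption, not on the slide
    )


request = build_index_request(
    "http://127.0.0.1:9200",
    "imdb",
    "movie",
    {
        "genre": "Crime",
        "language": "English",
        "country": "USA",
        "runtime": 170,
        "title": "Scarface",
        "year": 1983,
    },
)
print(request.get_method(), request.full_url)
# Against a running node, urllib.request.urlopen(request) would send it.
```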
Of note…
• Auto-creates index & type
• Auto-generates the ID
• Auto-versions the document
Get a Document
curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "found": true,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Update a Document
curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 180,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2,
  "created": false
}
More like an upsert
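The "more like an upsert" behaviour can be mimicked in a few lines: a PUT to an ID creates the document if it is missing, otherwise replaces it and bumps the version. `TinyIndex` is a hypothetical in-memory stand-in, not a client for a real cluster.

```python
class TinyIndex:
    """In-memory stand-in for one index, only to illustrate versioning on PUT."""

    def __init__(self):
        self._docs = {}  # doc id -> (version, source)

    def put(self, doc_id, source):
        created = doc_id not in self._docs
        version = 1 if created else self._docs[doc_id][0] + 1
        # The whole document is replaced; nothing is merged field by field.
        self._docs[doc_id] = (version, source)
        return {"_id": doc_id, "_version": version, "created": created}


index = TinyIndex()
first = index.put("AUwGeWib1u4mCngDYT7y", {"title": "Scarface", "runtime": 170})
second = index.put("AUwGeWib1u4mCngDYT7y", {"title": "Scarface", "runtime": 180})
print(first)
print(second)
```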
Delete a Document
curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'

{
  "found": true,
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2
}
You can also…
• Partial document updating
• Specify version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API
How Searching Works
• How it works
  - Search request hits a node
  - Node broadcasts to every shard in the index
  - Each shard performs the query
  - Each shard returns metadata about results
  - Node merges results and scores them
  - Node requests documents from shards
  - Results merged, sorted, and returned to client
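The steps on this slide amount to a scatter-gather in two phases (query, then fetch). A toy simulation under stated assumptions: the shard contents and scores in `SHARDS` are made up, and real shards compute scores from the query rather than storing them.

```python
import heapq

# Hypothetical shard contents: doc id -> (score for the query, stored document).
SHARDS = [
    {"a": (2.1, {"title": "Scarface"}), "b": (0.4, {"title": "Star Wars"})},
    {"c": (1.7, {"title": "Stardust"})},
    {"d": (0.9, {"title": "Scarecrow"}), "e": (3.0, {"title": "Star Trek"})},
]


def query_phase(shard_no, size):
    """Each shard returns only lightweight metadata: (score, shard number, doc id)."""
    hits = [(score, shard_no, doc_id) for doc_id, (score, _) in SHARDS[shard_no].items()]
    return heapq.nlargest(size, hits)


def search(size):
    # Scatter: ask every shard for its local top hits.
    per_shard = [query_phase(n, size) for n in range(len(SHARDS))]
    # Gather: merge the per-shard lists into a global top `size`.
    top = heapq.nlargest(size, (hit for hits in per_shard for hit in hits))
    # Fetch: request full documents only for the winning hits.
    return [(doc_id, score, SHARDS[shard_no][doc_id][1]) for score, shard_no, doc_id in top]


results = search(2)
for doc_id, score, source in results:
    print(doc_id, score, source["title"])
```

Fetching documents only for the merged winners keeps the first phase cheap, since shards ship IDs and scores rather than full documents.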
REST API - Search
• Free Text Search
  - URL request
• Complex Query
http://localhost:9200/imdb/movie/_search?q=scar
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
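When building such URLs in code, the query string should be percent-encoded rather than concatenated by hand. A sketch using only the standard library; `search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def search_url(base_url, index, doc_type, query):
    """Build a URI search request; urlencode escapes spaces, brackets and colons."""
    return f"{base_url}/{index}/{doc_type}/_search?{urlencode({'q': query})}"


url = search_url(
    "http://localhost:9200",
    "imdb",
    "movie",
    "(scarface OR star) AND year:[1981 TO 1984]",
)
print(url)
```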
REST API – Query DSL
curl -XPOST 'localhost:9200/_search?pretty' -d '{
  "query": {
    "bool": {
      "must": [
        { "query_string": { "query": "scarface or star" } },
        { "range": { "year": { "gte": 1931 } } }
      ]
    }
  }
}'
REST API – Query DSL
• Boolean Query
"bool": {
  "must": [
    { "match": { "color": "blue" } },
    { "match": { "title": "shirt" } }
  ],
  "must_not": [
    { "match": { "size": "xxl" } }
  ],
  "should": [
    { "match": { "textile": "cotton" } }
  ]
}
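The semantics of the three clause types can be sketched with a toy evaluator: `must` clauses all have to match, `must_not` clauses exclude, and `should` clauses only raise the score. The scoring here is a simple count, an assumption for illustration and not Elasticsearch's relevance formula; the sample documents are made up.

```python
def matches(doc, clause):
    """Toy `match`: true when the field contains the term as a whole word."""
    field, term = next(iter(clause["match"].items()))
    return term.lower() in str(doc.get(field, "")).lower().split()


def bool_query(doc, must=(), must_not=(), should=()):
    """Return None if the document is excluded, otherwise a toy score."""
    if not all(matches(doc, c) for c in must):
        return None
    if any(matches(doc, c) for c in must_not):
        return None
    # Each satisfied `should` clause adds to the score but is not required.
    return len(must) + sum(matches(doc, c) for c in should)


query = {
    "must": [{"match": {"color": "blue"}}, {"match": {"title": "shirt"}}],
    "must_not": [{"match": {"size": "xxl"}}],
    "should": [{"match": {"textile": "cotton"}}],
}

docs = [
    {"color": "blue", "title": "Oxford shirt", "size": "m", "textile": "cotton"},
    {"color": "blue", "title": "Polo shirt", "size": "xxl", "textile": "cotton"},
    {"color": "blue", "title": "Dress shirt", "size": "l", "textile": "linen"},
]
scores = [bool_query(d, **query) for d in docs]
print(scores)
```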
REST API – Query DSL
• Range Query
  - Numeric / date types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range
"range": {
  "founded_year": {
    "gte": 1990,
    "lt": 2000
  }
}
Filters
• Filters recommended over queries
  - Better cache support
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers / Tokenizers
curl -XPUT 'http://localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNGram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string",
        "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
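To see what the index-time analyzer on this slide produces, here is an approximation of an edge n-gram tokenizer followed by a lowercase filter. It is a sketch, not Lucene's implementation, and it leaves out the stop filter; `edge_ngrams` is a hypothetical name.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=42):
    """Emit lowercase prefixes of each letter/digit run, min_gram..max_gram characters long."""
    grams = []
    # token_chars [letter, digit]: any other character acts as a token boundary.
    for token in re.findall(r"[^\W_]+", text):
        for size in range(min_gram, min(len(token), max_gram) + 1):
            grams.append(token[:size].lower())
    return grams


indexed = edge_ngrams("Apples 42")
print(indexed)  # ['app', 'appl', 'apple', 'apples']

# Search side: keyword tokenizer + lowercase keeps the input as one lowercase term.
search_term = "App".lower()
print(search_term in indexed)  # True
```

Indexing prefixes while searching with the whole lowercase term is what gives prefix-style matching without a wildcard query. Tokens shorter than `min_gram`, such as "42" here, produce no grams.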
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits at non-letters
Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions
Widely Used
22
Based on Apache Lucene
bullFree and Open Source
bullStarted in 1999
bullCreated by Doug Cutting
bullWhatrsquos it do
-Tokenizing
- Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
23
Elasticsearch is a Document
Store
24
Document Store
bullLike MongoDB and CouchDB
bullDocument DBs
- JSON documents
-Collections of key-value collections
-Nesting
-Versioned
25
What is a document
26
genre Crime
ldquolanguage English
country USA
runtime 170
title Scarface
year 1983
Modeled in JSON
27
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_index imdb
_type movie
_id u17o8zy9RcKg6SjQZqQ4Ow
_version 1
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Schema-Free
bullDynamic Mapping
-Elasticsearch guesses the data-types (string int
floathellip)
28
imdb
movie
properties
country
type stringldquo
ldquostorerdquotrue
ldquoindexrdquofalse
genre
type stringldquo
null_value naldquo
ldquostorerdquofalse
ldquoindextrue
year
type long
Elasticsearch is Distributed
29
Terminology
30
MySQL Elasticsearch
Database Index
Table Type
Row Document
Column Field
Schema Mapping
Index (Everything is indexed)
SQL Query DSL
bullCluster 1N Nodes w same Cluster Name
bullNode One ElasticSearch instance (1 java proc)
bullShard = One Lucene instance
- 0 or more replicas
High Availability
bullNo need for load balancer
bullDifferent Node Types
bullIndices are Sharded
bullReplica shards on different Nodes
bullAutomatic Master election amp failover
31
About Indices Shards
32
$ curl -XPUT httplocalhost9200twitter -d
settings
index
number_of_shards 3
number_of_replicas 2
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A 2 Shards amp 1 Replica
Index B 3 Shards amp 1 Replica
Discovery
bullNodes discover each other using multicast
-Unicast is an option
bullEach cluster has an elected master node
-Beware of split-brain
discoveryzenpingmulticastenabled false
discoveryzenpingunicasthosts [host1 host2port host3]
34
Nodes
bullMaster node handles cluster-wide (Meta-API)
events
-Node participation
-New indices createdelete
-Re-Allocation of shards
bullData Nodes
- Indexing Searching operations
bullClient Nodes
-REST calls
- Light-weight load balancers
35
nodedata | nodemaster
The Basics - Shards
bullPrimary Shard
-First time Indexing
- Index has 1N primary shards (default 5)
- Not changeable once index created
bullReplica Shard
-Copy of the primary shard
- Can be changed later
-Each primary has 0N replicas
- HA
bull Promoted to primary if primary fails
bull GetSearch handled by primary||replica36
Shard Auto-Allocation
bullShard Phases
-Unassigned
- Initializing
-Started
-Relocating37
Node 1
0P
1R
Node 2
1P
0R
Node 2
0R
Add a Node Shards Relocate
Allocation Awareness
bullShard Allocation Awareness
- clusterroutingallocationawarenessattributes rack
-Shards RELOCATE to even distribution
-Primary amp Replica will NOT be on the same rack
value
38
Cluster State
bullCluster State
-Node Membership
- Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET httplocalhost9200_clusterstatepretty=1
cluster_name elasticsearch_royrusso
version 27
master_node s3fpXfPKSFeUqo1MYZxSng
blocks
nodes
s3fpXfPKSFeUqo1MYZxSng
name Bulldozer
transport_address inet[localhost1270019300]
attributes
metadata
templates
logging_index_all
template logstash-09-
order 1
settings
index
number_of_shards 2
number_of_replicas 1
mappings
date
store false
logging_index
template logstash-
order 0
settings
index
number_of_shards 2
ldquonumber_of_replicasrdquo 1
mappings
ldquodaterdquo
ldquostorerdquo true
Talking to Elasticsearch
40
REST
• HTTP Verbs: GET, POST, PUT, DELETE
• JSON
• _cat API
curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green          3        3      3   3    0    0        0
The API
• Document
• Cluster
  - Node
• Index
• Search
Create a Document
curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
Of note…
• Auto-creates Index & Type
• Auto-Gen ID
• Auto-Version
Get a Document
curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "found": true,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Update a Document
curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d '
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 180,
  "title": "Scarface",
  "year": 1983
}'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2,
  "created": false
}
More like an Upsert
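The "upsert" behavior can be pictured with a small in-memory model: a PUT to an ID replaces the whole document, bumps _version, and reports created only on the first write. This is a sketch of the semantics shown in the responses above, not Elasticsearch code. TinyStore and the "scarface" ID are hypothetical names for this example.

```python
class TinyStore:
    """In-memory stand-in for one index: _id -> (version, source)."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, source):
        # Replace the full document; the version counts writes to this ID.
        version = self._docs.get(doc_id, (0, None))[0] + 1
        self._docs[doc_id] = (version, dict(source))
        return {"_id": doc_id, "_version": version, "created": version == 1}


store = TinyStore()
first = store.put("scarface", {"title": "Scarface", "runtime": 170})
second = store.put("scarface", {"title": "Scarface", "runtime": 180})
print(first)
print(second)
```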
Delete a Document
curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
{
  "found": true,
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2
}
You can also…
• Partial document updating
• Specify Version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API
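As an illustration of the Bulk API mentioned above, the request body is newline-delimited JSON: one action line, then one source line per document, with a trailing newline. The sketch below only builds such a body. bulk_body is a hypothetical helper, and the IDs and the second movie are sample data invented for this example.

```python
import json


def bulk_body(index, doc_type, docs):
    """Build a newline-delimited bulk body of 'index' actions."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))  # action and metadata line
        lines.append(json.dumps(source))  # document source line
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body("imdb", "movie", [
    ("1", {"title": "Scarface", "year": 1983}),
    ("2", {"title": "Star Wars", "year": 1977}),
])
print(body, end="")
```

The resulting string would be sent as the body of a POST to the _bulk endpoint.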
How Searching Works
• How it works
  - Search request hits a node
  - Node broadcasts to every shard in the index
  - Each shard performs query
  - Each shard returns metadata about results
  - Node merges results and scores them
  - Node requests documents from shards
  - Results merged, sorted and returned to client
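The steps above can be sketched as a two-phase scatter/gather. In this toy model each shard is a dict of ID to text, and the score is simply how often the term occurs, which stands in for real relevance scoring. The function names and sample titles are assumptions of this example.

```python
import heapq


def query_shard(shard, term, size):
    """Query phase: return (score, doc_id) pairs only, no document bodies."""
    hits = [(text.lower().split().count(term), doc_id)
            for doc_id, text in shard.items()]
    return heapq.nlargest(size, [h for h in hits if h[0] > 0])


def search(shards, term, size=2):
    # Scatter: every shard runs the query and reports lightweight metadata.
    partial = [hit for shard in shards for hit in query_shard(shard, term, size)]
    # Gather: merge the per-shard lists into one global top-N.
    top = heapq.nlargest(size, partial)
    # Fetch phase: ask the owning shard for each winning document.
    owner = {doc_id: shard for shard in shards for doc_id in shard}
    return [(doc_id, score, owner[doc_id][doc_id]) for score, doc_id in top]


shards = [
    {"a": "star wars", "b": "scarface"},
    {"c": "star star trek", "d": "a star is born"},
]
results = search(shards, "star")
print(results)
```

Only IDs and scores travel in the first phase. Full documents are fetched for the final top-N alone, which keeps the merge step cheap.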
REST API - Search
• Free Text Search
  - URL Request
• Complex Query
http://localhost:9200/imdb/movie/_search?q=scar
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
REST API - Query DSL
curl -XPOST 'localhost:9200/_search?pretty' -d '
{
  "query": {
    "bool": {
      "must": [
        { "query_string": { "query": "scarface or star" } },
        { "range": { "year": { "gte": 1931 } } }
      ]
    }
  }
}'
REST API - Query DSL
• Boolean Query
"bool": {
  "must": [
    { "match": { "color": "blue" } },
    { "match": { "title": "shirt" } }
  ],
  "must_not": [
    { "match": { "size": "xxl" } }
  ],
  "should": [
    { "match": { "textile": "cotton" } }
  ]
}
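To make the must / must_not / should semantics concrete, here is a toy evaluator over plain dicts. It is a sketch under two simplifying assumptions: exact field equality stands in for an analyzed match query, and the score is just the count of satisfied clauses. The function names and sample documents are hypothetical.

```python
def matches(doc, clause):
    """Exact field equality stands in for an analyzed `match` query."""
    (field, value), = clause["match"].items()
    return doc.get(field) == value


def bool_score(doc, query):
    """Return None if the doc is excluded, else the number of satisfied clauses."""
    if not all(matches(doc, c) for c in query.get("must", [])):
        return None  # every must clause is required
    if any(matches(doc, c) for c in query.get("must_not", [])):
        return None  # any must_not clause excludes the doc
    # should clauses are optional and only raise the score
    optional = sum(matches(doc, c) for c in query.get("should", []))
    return len(query.get("must", [])) + optional


query = {
    "must": [{"match": {"color": "blue"}}, {"match": {"title": "shirt"}}],
    "must_not": [{"match": {"size": "xxl"}}],
    "should": [{"match": {"textile": "cotton"}}],
}
cotton = {"color": "blue", "title": "shirt", "size": "m", "textile": "cotton"}
linen = {"color": "blue", "title": "shirt", "size": "m", "textile": "linen"}
too_big = {"color": "blue", "title": "shirt", "size": "xxl", "textile": "cotton"}
for doc in (cotton, linen, too_big):
    print(doc["textile"], doc["size"], "->", bool_score(doc, query))
```

Both blue shirts in size m match, but the cotton one ranks higher because it also satisfies the should clause. The xxl shirt is excluded outright.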
REST API - Query DSL
• Range Query
  - Numeric, Date Types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range
"range": {
  "founded_year": {
    "gte": 1990,
    "lt": 2000
  }
}
Filters
• Filters recommended over Queries
  - Better cache support
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers Tokenizers
curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNGram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'
curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string",
        "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits on non-letters
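The four tokenizers can be approximated in a few lines to show what each one emits. These are simplified stand-ins written for this example, not the Lucene implementations, and the order of tokens from the real NGram tokenizer may differ.

```python
import re


def whitespace(text):
    """Split on runs of whitespace."""
    return text.split()


def letter(text):
    """Split wherever a non-letter appears."""
    return re.findall(r"[^\W\d_]+", text)


def ngram(token, min_gram, max_gram):
    """All substrings with length between min_gram and max_gram."""
    return [token[i:i + n]
            for i in range(len(token))
            for n in range(min_gram, max_gram + 1)
            if i + n <= len(token)]


def edge_ngram(token, min_gram, max_gram):
    """Only the substrings anchored at the start of the token."""
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]


print(whitespace("Scarface-1983 rocks"))
print(letter("Scarface-1983 rocks"))
print(ngram("star", 2, 3))
print(edge_ngram("scarface", 3, 42))
```

With min_gram 3 and max_gram 42, as in the settings above, "scarface" is indexed under every prefix of three or more characters, so a search for "scar" finds it without a wildcard query.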
Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions?
Based on Apache Lucene
• Free and Open Source
• Started in 1999
• Created by Doug Cutting
• What's it do?
  - Tokenizing
  - Locations
  - Relevance scoring
  - Filtering
  - Text search
  - Date Parsing
Elasticsearch is a Document Store
Document Store
• Like MongoDB and CouchDB
• Document DBs
  - JSON documents
  - Collections of key-value collections
  - Nesting
  - Versioned
What is a document?
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}
Modeled in JSON
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "u17o8zy9RcKg6SjQZqQ4Ow",
  "_version": 1,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}
Schema-Free
• Dynamic Mapping
  - Elasticsearch guesses the data-types (string, int, float…)
{
  "imdb": {
    "movie": {
      "properties": {
        "country": {
          "type": "string",
          "store": true,
          "index": false
        },
        "genre": {
          "type": "string",
          "null_value": "na",
          "store": false,
          "index": true
        },
        "year": {
          "type": "long"
        }
      }
    }
  }
}
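Dynamic mapping can be pictured as a function from JSON values to field types. The sketch below mirrors the common cases only: whole numbers become long, decimals become double, text becomes string. Date detection and arrays are left out. The function names are hypothetical and this is not how Elasticsearch implements it.

```python
def guess_type(value):
    """Guess a field type from a parsed JSON value."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def guess_mapping(doc):
    """Build a mapping-like dict for every top-level field of a document."""
    return {"properties": {field: {"type": guess_type(value)}
                           for field, value in doc.items()}}


movie = {"genre": "Crime", "runtime": 170, "title": "Scarface", "year": 1983}
print(guess_mapping(movie))
```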
Elasticsearch is Distributed
Terminology
MySQL      Elasticsearch
Database   Index
Table      Type
Row        Document
Column     Field
Schema     Mapping
Index      (Everything is indexed)
SQL        Query DSL
• Cluster: 1..N Nodes w/ same Cluster Name
• Node: One ElasticSearch instance (1 java proc)
• Shard = One Lucene instance
  - 0 or more replicas
High Availability
• No need for load balancer
• Different Node Types
• Indices are Sharded
• Replica shards on different Nodes
• Automatic Master election & failover
About Indices Shards
$ curl -XPUT 'http://localhost:9200/twitter/' -d '
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
}'
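A quick way to read these two settings: number_of_replicas is per primary, so the index above has 3 primaries plus 2 copies of each. The arithmetic is sketched below with hypothetical helper names. The node-count rule relies on the fact that copies of the same shard are never placed on the same node.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Primaries plus all replica copies for one index."""
    return number_of_shards * (1 + number_of_replicas)


def min_nodes_for_all_copies(number_of_replicas: int) -> int:
    """Each copy of a shard needs its own node, so replicas + 1 nodes."""
    return number_of_replicas + 1


print(total_shard_copies(3, 2), "shard copies in total")
print(min_nodes_for_all_copies(2), "nodes needed to assign every copy")
```

On a single node the six replica copies would stay unassigned until more nodes join.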
Cluster Topology
[Diagram: shards A1, A2, B1, B2, B3 and their replicas spread across a 4 Node Cluster]
Index A: 2 Shards & 1 Replica
Index B: 3 Shards & 1 Replica
Modeled in JSON

{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "u17o8zy9RcKg6SjQZqQ4Ow",
  "_version": 1,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}

Schema-Free
• Dynamic Mapping
  - Elasticsearch guesses the data types (string, int, float…)

{
  "imdb": {
    "movie": {
      "properties": {
        "country": {
          "type": "string",
          "store": true,
          "index": false
        },
        "genre": {
          "type": "string",
          "null_value": "na",
          "store": false,
          "index": true
        },
        "year": {
          "type": "long"
        }
      }
    }
  }
}

Elasticsearch is Distributed

Terminology

MySQL       Elasticsearch
--------    -----------------------
Database    Index
Table       Type
Row         Document
Column      Field
Schema      Mapping
Index       (Everything is indexed)
SQL         Query DSL

• Cluster: 1..N Nodes w/ same Cluster Name
• Node: One Elasticsearch instance (1 Java proc)
• Shard = One Lucene instance
  - 0 or more replicas

High Availability
• No need for load balancer
• Different Node Types
• Indices are Sharded
• Replica shards on different Nodes
• Automatic Master election & failover

About Indices / Shards

$ curl -XPUT http://localhost:9200/twitter -d '{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
}'
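
To make the shard/replica settings concrete, here is a minimal Python sketch of how many shard copies the cluster has to place for an index. The helper name total_shard_copies is an assumption for illustration, not part of any Elasticsearch client.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Each primary shard exists once, plus one copy per configured replica."""
    return number_of_shards * (1 + number_of_replicas)


# Same values as the "twitter" index settings above.
settings = {"index": {"number_of_shards": 3, "number_of_replicas": 2}}
total = total_shard_copies(**settings["index"])
print(total)  # 3 primaries + 6 replica copies = 9
```

With 3 primaries and 2 replicas, at least 3 nodes are needed before every copy can be assigned, since a replica is not placed on the same node as its primary.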
Cluster Topology
[Figure: 4-node cluster with the shards of two indices spread across the nodes. Index A: 2 Shards & 1 Replica. Index B: 3 Shards & 1 Replica.]

Discovery
• Nodes discover each other using multicast
  - Unicast is an option
• Each cluster has an elected master node
  - Beware of split-brain

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
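
The usual guard against split-brain in this generation of Elasticsearch is to require a quorum of master-eligible nodes before a master can be elected (the discovery.zen.minimum_master_nodes setting, which the slide does not show). A small sketch of the quorum arithmetic, with a hypothetical helper name:

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum: strictly more than half of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


for nodes in (1, 2, 3, 4, 5):
    print(nodes, "master-eligible ->", minimum_master_nodes(nodes))
```

With 3 master-eligible nodes the quorum is 2, so a single isolated node can never elect itself master.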
Nodes
• Master node handles cluster-wide (Meta-API) events
  - Node participation
  - New indices create/delete
  - Re-Allocation of shards
• Data Nodes
  - Indexing / Searching operations
• Client Nodes
  - REST calls
  - Light-weight load balancers

node.data | node.master

The Basics - Shards
• Primary Shard
  - First time Indexing
  - Index has 1..N primary shards (default 5)
  - Not changeable once index created
• Replica Shard
  - Copy of the primary shard
  - Can be changed later
  - Each primary has 0..N replicas
  - HA
    • Promoted to primary if primary fails
    • Get/Search handled by primary || replica
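
Why the primary shard count is fixed: a document is routed to a shard by hashing its routing value (the ID by default) modulo the number of primaries. The sketch below illustrates the idea only; zlib.crc32 is a stand-in for the hash Elasticsearch actually uses, and route_to_shard is a hypothetical name.

```python
import zlib


def route_to_shard(doc_id: str, number_of_primary_shards: int) -> int:
    """Pick a shard as hash(routing value) modulo the primary count.

    crc32 is only a stand-in hash for this illustration.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


doc_ids = [f"doc-{i}" for i in range(1000)]
with_5 = {d: route_to_shard(d, 5) for d in doc_ids}
with_6 = {d: route_to_shard(d, 6) for d in doc_ids}

# Documents that would be looked up on the wrong shard after a resize.
moved = sum(1 for d in doc_ids if with_5[d] != with_6[d])
print(f"{moved} of {len(doc_ids)} documents would change shard")
```

Changing the modulus sends most documents to a different shard, which is why a resize means reindexing, while replicas (plain copies) can be added or removed at any time.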
Shard Auto-Allocation
• Shard Phases
  - Unassigned
  - Initializing
  - Started
  - Relocating

[Figure: Node 1 (0P, 1R), Node 2 (1P, 0R), and an added node holding 0R. Add a Node: Shards Relocate.]

Allocation Awareness
• Shard Allocation Awareness
  - cluster.routing.allocation.awareness.attributes: rack
  - Shards RELOCATE to even distribution
  - Primary & Replica will NOT be on the same rack value

Cluster State
• Cluster State
  - Node Membership
  - Indices Settings
  - Shard Allocation Table
  - Shard State

curl -XGET http://localhost:9200/_cluster/state?pretty=1

{
  "cluster_name": "elasticsearch_royrusso",
  "version": 27,
  "master_node": "s3fpXfPKSFeUqo1MYZxSng",
  "blocks": {},
  "nodes": {
    "s3fpXfPKSFeUqo1MYZxSng": {
      "name": "Bulldozer",
      "transport_address": "inet[localhost/127.0.0.1:9300]",
      "attributes": {}
    }
  },
  "metadata": {
    "templates": {
      "logging_index_all": {
        "template": "logstash-09-*",
        "order": 1,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": false
          }
        }
      },
      "logging_index": {
        "template": "logstash-*",
        "order": 0,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": true
          }
        }
      }
    }
  }
}

Talking to Elasticsearch

REST
• HTTP Verbs: GET, POST, PUT, DELETE
• JSON
• _cat API

curl '192.168.56.10:9200/_cat/health?v&ts=0'

cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
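
Because _cat output is a whitespace-separated table with a header row, it is easy to consume from a script. A minimal Python sketch that parses the sample output above (the text is pasted in as a constant; parse_cat_table is a hypothetical helper, and no live cluster is contacted):

```python
from typing import Dict, List

# Sample output copied from the slide.
CAT_HEALTH_OUTPUT = """\
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
"""


def parse_cat_table(text: str) -> List[Dict[str, str]]:
    """Turn a headed, whitespace-separated _cat table into one dict per row."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


health = parse_cat_table(CAT_HEALTH_OUTPUT)[0]
is_healthy = health["status"] == "green" and health["unassign"] == "0"
print(health["cluster"], "healthy:", is_healthy)
```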
The API
• Document
• Cluster
  - Node
• Index
• Search

Create a Document

curl -XPOST http://127.0.0.1:9200/imdb/movie -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}

Of note…

curl -XPOST http://127.0.0.1:9200/imdb/movie -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}

• Auto-creates Index & Type
• Auto-Gen ID
• Auto-Version

Get a Document

curl -XGET http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "found": true,
  "_source": {
    "genre": "Crime",
    "language": "English",
    "country": "USA",
    "runtime": 170,
    "title": "Scarface",
    "year": 1983
  }
}

Update a Document

curl -XPUT http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y -d '{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 180,
  "title": "Scarface",
  "year": 1983
}'

{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2,
  "created": false
}

More like an Upsert
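
The create and update responses above follow one rule: indexing under an unknown ID creates version 1, and indexing again under the same ID replaces the whole document and bumps the version. A toy in-memory model of that behaviour (TinyDocStore and its generated IDs are purely illustrative, not how Elasticsearch is implemented):

```python
import itertools


class TinyDocStore:
    """Toy model of index-by-ID semantics: create if absent, else replace."""

    def __init__(self):
        self._docs = {}
        self._ids = itertools.count(1)

    def index(self, source, doc_id=None):
        if doc_id is None:
            # Stand-in for the auto-generated ID returned on POST.
            doc_id = f"auto-{next(self._ids)}"
        previous = self._docs.get(doc_id)
        version = 1 if previous is None else previous["_version"] + 1
        self._docs[doc_id] = {"_version": version, "_source": dict(source)}
        return {"_id": doc_id, "_version": version, "created": previous is None}


store = TinyDocStore()
first = store.index({"title": "Scarface", "runtime": 170})
second = store.index({"title": "Scarface", "runtime": 180}, doc_id=first["_id"])
print(first)
print(second)
```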
Delete a Document

curl -XDELETE http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y

{
  "found": true,
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 2
}

You can also…
• Partial document updating
• Specify Version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API

How Searching Works
• How it works
  - Search request hits a node
  - Node broadcasts to every shard in the index
  - Each shard performs query
  - Each shard returns metadata about results
  - Node merges results and scores them
  - Node requests documents from shards
  - Results merged, sorted, and returned to client
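
The steps above are a two-phase scatter/gather. A minimal Python simulation, assuming three hypothetical shards whose documents already carry a score; shard contents and function names are invented for illustration:

```python
import heapq

# Hypothetical shard contents: document id -> precomputed relevance score.
SHARDS = [
    {"a": 0.9, "b": 0.2},
    {"c": 0.7, "d": 0.4},
    {"e": 0.8},
]
DOCUMENTS = {doc_id: {"id": doc_id} for shard in SHARDS for doc_id in shard}


def query_phase(shard_no, shard, size):
    """Each shard returns only metadata (score, id) for its local top hits."""
    top = heapq.nlargest(size, shard.items(), key=lambda kv: kv[1])
    return [(score, doc_id, shard_no) for doc_id, score in top]


def search(size):
    # Scatter: the receiving node asks every shard for its local top hits.
    candidates = []
    for shard_no, shard in enumerate(SHARDS):
        candidates.extend(query_phase(shard_no, shard, size))
    # Gather: merge and sort globally, then fetch only the winning documents.
    winners = sorted(candidates, reverse=True)[:size]
    return [DOCUMENTS[doc_id] for _score, doc_id, _shard_no in winners]


top_three = [doc["id"] for doc in search(3)]
print(top_three)  # ['a', 'e', 'c']
```

Only IDs and scores travel in the first phase; full documents are fetched just for the final winners.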
REST API - Search
• Free Text Search
  - URL Request
• Complex Query

http://localhost:9200/imdb/movie/_search?q=scar
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
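
When building such URLs from code, the query string should be percent-encoded rather than concatenated by hand. A short sketch using only the Python standard library (search_url is a hypothetical helper; the base URL matches the examples above):

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(base: str, lucene_query: str) -> str:
    """Build a URI search request, encoding the q parameter safely."""
    return f"{base}/_search?{urlencode({'q': lucene_query})}"


url = search_url(
    "http://localhost:9200/imdb/movie",
    "(scarface OR star) AND year:[1981 TO 1984]",
)
print(url)

# Decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["q"][0]
```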
REST API – Query DSL

curl -XPOST 'localhost:9200/_search?pretty' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "scarface or star"
          }
        },
        {
          "range": {
            "year": { "gte": 1931 }
          }
        }
      ]
    }
  }
}'
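
Since the Query DSL is plain JSON, request bodies can be assembled as ordinary data structures and serialized. A minimal sketch that reproduces the body above (movie_query is a hypothetical helper; sending the request is left out):

```python
import json


def movie_query(text: str, min_year: int) -> dict:
    """bool/must query: free-text match AND a lower bound on the year."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": text}},
                    {"range": {"year": {"gte": min_year}}},
                ]
            }
        }
    }


body = json.dumps(movie_query("scarface or star", 1931), indent=2)
print(body)
```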
REST API – Query DSL
• Boolean Query

{
  "bool": {
    "must": [
      { "match": { "color": "blue" } },
      { "match": { "title": "shirt" } }
    ],
    "must_not": [
      { "match": { "size": "xxl" } }
    ],
    "should": [
      { "match": { "textile": "cotton" } }
    ]
  }
}

REST API – Query DSL
• Range Query
  - Numeric / Date Types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range

{
  "range": {
    "founded_year": {
      "gte": 1990,
      "lt": 2000
    }
  }
}

Filters
• Filters recommended over Queries
  - Better cache support

curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'

Analyzers / Tokenizers

curl -XPUT 'http://localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNgram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string",
        "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
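
To see what the index-time analyzer above produces, here is a rough Python approximation of an edge n-gram tokenizer with token_chars of letter and digit, followed by lowercasing. It is a sketch of the idea, not Lucene's implementation, and the stop filter is left out:

```python
import re


def edge_ngrams(text: str, min_gram: int = 3, max_gram: int = 42) -> list:
    """Emit lowercased prefixes of each letter/digit run, min_gram..max_gram long."""
    tokens = []
    # Runs of letters and digits; any other character acts as a separator.
    for word in re.findall(r"[^\W_]+", text):
        for length in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:length].lower())
    return tokens


index_terms = edge_ngrams("Apples 2015")
print(index_terms)  # ['app', 'appl', 'apple', 'apples', '201', '2015']

# Search side: keyword tokenizer + lowercase keeps the user input as one term,
# so a partial input such as "App" matches one of the indexed prefixes.
search_term = "App".lower()
```

Indexing prefixes while searching with the whole lowercased input is what gives search-as-you-type matching without wildcard queries.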
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits on non-letters

Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces

Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM

Questions
High Availability
bullNo need for load balancer
bullDifferent Node Types
bullIndices are Sharded
bullReplica shards on different Nodes
bullAutomatic Master election amp failover
31
About Indices Shards
32
$ curl -XPUT httplocalhost9200twitter -d
settings
index
number_of_shards 3
number_of_replicas 2
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A 2 Shards amp 1 Replica
Index B 3 Shards amp 1 Replica
Discovery
bullNodes discover each other using multicast
-Unicast is an option
bullEach cluster has an elected master node
-Beware of split-brain
discoveryzenpingmulticastenabled false
discoveryzenpingunicasthosts [host1 host2port host3]
34
Nodes
bullMaster node handles cluster-wide (Meta-API)
events
-Node participation
-New indices createdelete
-Re-Allocation of shards
bullData Nodes
- Indexing Searching operations
bullClient Nodes
-REST calls
- Light-weight load balancers
35
nodedata | nodemaster
The Basics - Shards
bullPrimary Shard
-First time Indexing
- Index has 1N primary shards (default 5)
- Not changeable once index created
bullReplica Shard
-Copy of the primary shard
- Can be changed later
-Each primary has 0N replicas
- HA
bull Promoted to primary if primary fails
bull GetSearch handled by primary||replica36
Shard Auto-Allocation
bullShard Phases
-Unassigned
- Initializing
-Started
-Relocating37
Node 1
0P
1R
Node 2
1P
0R
Node 2
0R
Add a Node Shards Relocate
Allocation Awareness
bullShard Allocation Awareness
- clusterroutingallocationawarenessattributes rack
-Shards RELOCATE to even distribution
-Primary amp Replica will NOT be on the same rack
value
38
Cluster State
bullCluster State
-Node Membership
- Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET httplocalhost9200_clusterstatepretty=1
cluster_name elasticsearch_royrusso
version 27
master_node s3fpXfPKSFeUqo1MYZxSng
blocks
nodes
s3fpXfPKSFeUqo1MYZxSng
name Bulldozer
transport_address inet[localhost1270019300]
attributes
metadata
templates
logging_index_all
template logstash-09-
order 1
settings
index
number_of_shards 2
number_of_replicas 1
mappings
date
store false
logging_index
template logstash-
order 0
settings
index
number_of_shards 2
ldquonumber_of_replicasrdquo 1
mappings
ldquodaterdquo
ldquostorerdquo true
Talking to Elasticsearch
40
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API - Query DSL
• Range Query
  - Numeric, Date Types
• Prefix/Wildcard Query
  - Match on partial terms
• RegExp Query
• Geo_bbox
  - Bounding box filter
• Geo_distance
  - Geo_distance_range
"range": {
  "founded_year": {
    "gte": 1990,
    "lt": 2000
  }
}
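The `gte`/`lt` pair in the range example describes a half-open interval: 1990 is included, 2000 is not. A minimal Python sketch of that bound handling; `in_range` is a hypothetical helper, not part of any client library.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Apply the bounds that are present; missing bounds are left open."""
    if gte is not None and not value >= gte:
        return False
    if gt is not None and not value > gt:
        return False
    if lte is not None and not value <= lte:
        return False
    if lt is not None and not value < lt:
        return False
    return True


bounds = {"gte": 1990, "lt": 2000}
print([year for year in (1989, 1990, 1999, 2000) if in_range(year, **bounds)])
```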
Filters
• Filters recommended over Queries
  - Better cache support
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers, Tokenizers
curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNgram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'
curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string", "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
  - Splits at non-letters
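To see why the index and search analyzers in the settings differ, it helps to look at the tokens each one would produce. A rough Python approximation, assuming the edge n-gram tokenizer splits on anything that is not a letter or digit and emits prefixes from `min_gram` up to `max_gram` characters; stop-word removal is left out, and the function names are illustrative only.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=42):
    """Split on anything that is not a letter or digit, then emit prefixes of each word."""
    tokens = []
    for word in re.findall(r"[^\W_]+", text):
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:size])
    return tokens


def index_analyzer(text):
    # Approximates str_index_analyzer: edge n-grams, then lowercase (stop filter omitted).
    return [token.lower() for token in edge_ngrams(text)]


def search_analyzer(text):
    # Approximates str_search_analyzer: the keyword tokenizer keeps the input whole, then lowercase.
    return [text.lower()]


indexed = index_analyzer("Apple pie")
print(indexed)
print(search_analyzer("APP")[0] in indexed)
```

Indexing prefixes while searching with the whole lowercased input is what lets a partial term such as "app" find "Apple" without a wildcard query.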
Clients
• Client list: http://www.elasticsearch.org/guide/clients
  - Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
  - Uses TransportClient
  - Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions
About Indices, Shards
$ curl -XPUT http://localhost:9200/twitter -d '
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
}'
Cluster Topology
[Diagram: primary and replica copies of shards A1, A2, B1, B2 and B3 spread across the nodes]
4 Node Cluster
Index A: 2 Shards & 1 Replica
Index B: 3 Shards & 1 Replica
Discovery
• Nodes discover each other using multicast
  - Unicast is an option
• Each cluster has an elected master node
  - Beware of split-brain
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
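The usual guard against split-brain is to require a majority (quorum) of master-eligible nodes before a master can be elected. The slide does not name a setting; in Elasticsearch 1.x this is `discovery.zen.minimum_master_nodes`. A small Python sketch of the arithmetic, with hypothetical helper names:

```python
def minimum_master_nodes(master_eligible):
    """Quorum size: a strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def can_elect_master(visible_nodes, master_eligible):
    # A partition may elect a master only if it still sees a majority.
    return visible_nodes >= minimum_master_nodes(master_eligible)


for total in (2, 3, 5):
    print(total, minimum_master_nodes(total))
# A 3-node cluster split 2/1: only the side with two nodes keeps a master.
print(can_elect_master(2, 3), can_elect_master(1, 3))
```

With two master-eligible nodes the quorum is also two, so a two-node cluster cannot tolerate losing either node without giving up this protection.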
Nodes
• Master node handles cluster-wide (Meta-API) events
  - Node participation
  - New indices create/delete
  - Re-allocation of shards
• Data Nodes
  - Indexing, Searching operations
• Client Nodes
  - REST calls
  - Light-weight load balancers
node.data | node.master
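The two settings combine into the roles listed above. A Python sketch of the mapping as I understand it for Elasticsearch 1.x; the role labels are descriptive wording, not official names.

```python
def node_role(master, data):
    """Map the node.master / node.data booleans to a role description."""
    if master and data:
        return "master-eligible data node"
    if master:
        return "dedicated master-eligible node"
    if data:
        return "data node"
    return "client node"


for master in (True, False):
    for data in (True, False):
        print(f"node.master={master} node.data={data} -> {node_role(master, data)}")
```

A node with both settings disabled holds no data and cannot become master, which is why it can act as a light-weight load balancer for REST calls.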
The Basics - Shards
• Primary Shard
  - First time Indexing
  - Index has 1..N primary shards (default 5)
  - Not changeable once index created
• Replica Shard
  - Copy of the primary shard
  - Can be changed later
  - Each primary has 0..N replicas
  - HA
    • Promoted to primary if primary fails
    • Get/Search handled by primary||replica
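The primary shard count is fixed because a document's shard is derived from a hash of its routing value (by default its ID) modulo the number of primaries. A Python sketch of the consequence, using CRC32 purely as a stand-in for Elasticsearch's own hash function; the document IDs are invented.

```python
import zlib


def shard_for(doc_id, number_of_primary_shards):
    """Pick a primary shard from a stable hash of the document ID (CRC32 is a stand-in)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


doc_ids = [f"doc-{n}" for n in range(1000)]
before = {doc_id: shard_for(doc_id, 5) for doc_id in doc_ids}
after = {doc_id: shard_for(doc_id, 6) for doc_id in doc_ids}
moved = sum(before[doc_id] != after[doc_id] for doc_id in doc_ids)
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```

Changing the divisor sends most documents to a different shard, so lookups by ID would miss them. Replicas do not take part in this calculation, which is why their count can be changed later.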
Shard Auto-Allocation
• Shard Phases
  - Unassigned
  - Initializing
  - Started
  - Relocating
[Diagram: Node 1 holds 0P and 1R; Node 2 holds 1P and 0R]
Add a Node: Shards Relocate
Allocation Awareness
• Shard Allocation Awareness
  - cluster.routing.allocation.awareness.attributes: rack
  - Shards RELOCATE to even distribution
  - Primary & Replica will NOT be on the same rack value
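The last point can be expressed as a check over a shard layout: no two copies of the same shard may sit on nodes with the same rack value. A Python sketch with made-up node names, rack values and layouts; it only validates a layout and does not model how Elasticsearch chooses one.

```python
from collections import defaultdict


def same_rack_violations(allocation, node_rack):
    """Return the shard ids whose copies share a rack value."""
    racks_by_shard = defaultdict(list)
    for node, shards in allocation.items():
        for shard_id, _role in shards:
            racks_by_shard[shard_id].append(node_rack[node])
    return sorted(s for s, racks in racks_by_shard.items() if len(racks) != len(set(racks)))


node_rack = {"node1": "rack_a", "node2": "rack_a", "node3": "rack_b"}
good = {"node1": [(0, "P")], "node2": [(1, "P")], "node3": [(0, "R"), (1, "R")]}
bad = {"node1": [(0, "P")], "node2": [(0, "R"), (1, "P")], "node3": [(1, "R")]}
print(same_rack_violations(good, node_rack), same_rack_violations(bad, node_rack))
```

In the `bad` layout both copies of shard 0 live on `rack_a`, so losing that rack would lose the shard.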
Cluster State
• Cluster State
  - Node Membership
  - Indices Settings
  - Shard Allocation Table
  - Shard State
curl -XGET 'http://localhost:9200/_cluster/state?pretty=1'
{
  "cluster_name": "elasticsearch_royrusso",
  "version": 27,
  "master_node": "s3fpXfPKSFeUqo1MYZxSng",
  "blocks": {},
  "nodes": {
    "s3fpXfPKSFeUqo1MYZxSng": {
      "name": "Bulldozer",
      "transport_address": "inet[localhost/127.0.0.1:9300]",
      "attributes": {}
    }
  },
  "metadata": {
    "templates": {
      "logging_index_all": {
        "template": "logstash-09-*",
        "order": 1,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": false
          }
        }
      },
      "logging_index": {
        "template": "logstash-*",
        "order": 0,
        "settings": {
          "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
          }
        },
        "mappings": {
          "date": {
            "store": true
          }
        }
      }
    }
  }
}
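The two templates in this cluster state overlap, and `order` decides which one wins: matching templates are applied from lowest to highest order, so later ones override earlier ones. A Python sketch of that merge, assuming the template patterns end in a `*` wildcard and using `fnmatch` as a stand-in for Elasticsearch's pattern matching; the index names are invented.

```python
from fnmatch import fnmatch

# The two templates from the cluster state, reduced to the fields that differ.
TEMPLATES = {
    "logging_index_all": {"template": "logstash-09-*", "order": 1,
                          "mappings": {"date": {"store": False}}},
    "logging_index": {"template": "logstash-*", "order": 0,
                      "mappings": {"date": {"store": True}}},
}


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_mappings(index_name):
    # Apply matching templates from lowest to highest order so higher orders win.
    matching = sorted((t for t in TEMPLATES.values() if fnmatch(index_name, t["template"])),
                      key=lambda t: t["order"])
    mappings = {}
    for template in matching:
        mappings = deep_merge(mappings, template["mappings"])
    return mappings


print(resolve_mappings("logstash-09-01"))
print(resolve_mappings("logstash-10-01"))
```

An index named `logstash-09-01` matches both patterns and ends up with `store: false` from the order-1 template, while `logstash-10-01` matches only the general template.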
Talking to Elasticsearch
REST
• HTTP Verbs: GET, POST, PUT, DELETE
• JSON
• _cat API
curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
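The `_cat` output is meant for humans, but because it is whitespace-aligned with a header row (the `v` flag), it is also easy to script against. A minimal Python sketch that parses the sample output above; `parse_cat` is a hypothetical helper and all values stay strings.

```python
CAT_HEALTH = """\
cluster status nodeTotal nodeData shards pri relo init unassign
foo     green  3         3        3      3   0    0    0
"""


def parse_cat(text):
    """Turn whitespace-aligned _cat output (with the `v` header row) into dicts."""
    header, *rows = [line.split() for line in text.strip().splitlines()]
    return [dict(zip(header, row)) for row in rows]


health = parse_cat(CAT_HEALTH)[0]
print(health["status"], health["unassign"])
```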
The API
• Document
• Cluster
  - Node
• Index
• Search
Create a Document
curl -XPOST http://127.0.0.1:9200/imdb/movie -d '
{
  "genre": "Crime",
  "language": "English",
  "country": "USA",
  "runtime": 170,
  "title": "Scarface",
  "year": 1983
}'
{
  "_index": "imdb",
  "_type": "movie",
  "_id": "AUwGeWib1u4mCngDYT7y",
  "_version": 1,
  "created": true
}
Shard Auto-Allocation
bullShard Phases
-Unassigned
- Initializing
-Started
-Relocating37
Node 1
0P
1R
Node 2
1P
0R
Node 2
0R
Add a Node Shards Relocate
Allocation Awareness
bullShard Allocation Awareness
- clusterroutingallocationawarenessattributes rack
-Shards RELOCATE to even distribution
-Primary amp Replica will NOT be on the same rack
value
38
Cluster State
bullCluster State
-Node Membership
- Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET httplocalhost9200_clusterstatepretty=1
cluster_name elasticsearch_royrusso
version 27
master_node s3fpXfPKSFeUqo1MYZxSng
blocks
nodes
s3fpXfPKSFeUqo1MYZxSng
name Bulldozer
transport_address inet[localhost1270019300]
attributes
metadata
templates
logging_index_all
template logstash-09-
order 1
settings
index
number_of_shards 2
number_of_replicas 1
mappings
date
store false
logging_index
template logstash-
order 0
settings
index
number_of_shards 2
ldquonumber_of_replicasrdquo 1
mappings
ldquodaterdquo
ldquostorerdquo true
Talking to Elasticsearch
40
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can also…
• Partial document updating
• Specify Version
• Specify ID
• Multi-Get API
• Exists API
• Bulk API
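Of the options listed above, the Bulk API has the least obvious request format: the body is newline-delimited JSON, with one action line followed by one source line per document, and it must end with a newline. A minimal Python sketch of building such a body follows; `build_bulk_body` is a hypothetical helper and the sample documents are made up for illustration.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build a newline-delimited JSON body for "index" actions.

    docs is an iterable of (doc_id, source) pairs; a doc_id of None
    leaves the id out so the server can generate one.
    """
    lines = []
    for doc_id, source in docs:
        meta = {"_index": index, "_type": doc_type}
        if doc_id is not None:
            meta["_id"] = doc_id
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(source))           # source line
    # The bulk body has to be terminated by a final newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body("imdb", "movie", [
        ("1", {"title": "Scarface", "year": 1983}),
        (None, {"title": "Heat", "year": 1995}),
    ])
    print(body, end="")
```

The resulting string would be sent as the body of a POST to the `_bulk` endpoint; sending it is left out here so the sketch runs without a cluster.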
How Searching Works
• How it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged, sorted, and returned to client
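The steps above amount to a scatter-gather search in two phases: a query phase that collects lightweight hits from every shard, and a fetch phase that loads full documents only for the winners. The Python sketch below mimics that flow with dictionaries standing in for shards; the term-frequency score, the sample data, and the function names are all assumptions for illustration.

```python
import heapq


def query_phase(shard, term, size):
    """Each shard scores its own docs and returns only (score, doc_id) pairs."""
    hits = []
    for doc_id, text in shard.items():
        score = text.lower().split().count(term)  # toy score: term frequency
        if score:
            hits.append((score, doc_id))
    return heapq.nlargest(size, hits)


def search(shards, term, size=2):
    """Coordinate a search across all shards and return the top documents."""
    # Query phase: broadcast to every shard, collect metadata only.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, term, size):
            candidates.append((score, doc_id, shard_no))
    # Merge: keep the global top `size` hits by score.
    top = heapq.nlargest(size, candidates)
    # Fetch phase: load full documents only from shards that own a winner.
    return [{"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id]}
            for score, doc_id, shard_no in top]


if __name__ == "__main__":
    shards = [
        {"a": "scarface scarface", "b": "star wars"},
        {"c": "scarface remake", "d": "heat"},
    ]
    for hit in search(shards, "scarface"):
        print(hit["_id"], hit["_score"])
```

Each shard returns at most `size` hits, so the coordinating node never merges more than `size` times the number of shards, whatever the index size.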
REST API - Search
• Free Text Search
-URL Request
• Complex Query
http://localhost:9200/imdb/movie/_search?q=scar
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
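When such URLs are built in code rather than typed, the query text needs URL encoding so that spaces, brackets, and colons survive the trip. A short Python sketch using only the standard library follows; `search_url` is a hypothetical helper, and the host and index names are taken from the examples above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(base, index, doc_type, query):
    """Build a URI search URL with the query text safely encoded."""
    return f"{base}/{index}/{doc_type}/_search?{urlencode({'q': query})}"


if __name__ == "__main__":
    url = search_url("http://localhost:9200", "imdb", "movie",
                     "(scarface OR star) AND year:[1981 TO 1984]")
    print(url)
    # Decoding the query string gives back the original text.
    print(parse_qs(urlsplit(url).query)["q"][0])
```

`urlencode` turns spaces into `+` and percent-encodes the brackets and the colon, so the encoded URL looks noisier than the hand-written ones but decodes to the same query.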
REST API - Query DSL
curl -XPOST 'localhost:9200/_search?pretty' -d '{
  "query": {
    "bool": {
      "must": [
        { "query_string": { "query": "scarface OR star" } },
        { "range": { "year": { "gte": 1931 } } }
      ]
    }
  }
}'
REST API - Query DSL
• Boolean Query
"bool": {
  "must": [
    { "match": { "color": "blue" } },
    { "match": { "title": "shirt" } }
  ],
  "must_not": [
    { "match": { "size": "xxl" } }
  ],
  "should": [
    { "match": { "textile": "cotton" } }
  ]
}
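The three clause types play different roles: `must` and `must_not` decide whether a document matches at all, while `should` only raises the score of documents that already match (when a `must` clause is present). The toy Python evaluator below illustrates that split on plain dictionaries; the word-containment `matches` function, the scoring, and the sample products are assumptions, not how Elasticsearch analyzes or scores text.

```python
QUERY = {
    "must": [{"match": {"color": "blue"}}, {"match": {"title": "shirt"}}],
    "must_not": [{"match": {"size": "xxl"}}],
    "should": [{"match": {"textile": "cotton"}}],
}


def matches(doc, clause):
    """Toy 'match': the field contains the word, ignoring case."""
    field, text = next(iter(clause["match"].items()))
    return text.lower() in str(doc.get(field, "")).lower().split()


def is_hit(doc, query):
    """must clauses are all required; must_not clauses exclude the document."""
    required = all(matches(doc, c) for c in query.get("must", []))
    excluded = any(matches(doc, c) for c in query.get("must_not", []))
    return required and not excluded


def should_score(doc, query):
    """should clauses are optional here and only add to the score."""
    return sum(matches(doc, c) for c in query.get("should", []))


def run(docs, query):
    hits = [d for d in docs if is_hit(d, query)]
    return sorted(hits, key=lambda d: -should_score(d, query))


if __name__ == "__main__":
    products = [
        {"title": "Blue cotton shirt", "color": "blue", "size": "m", "textile": "cotton"},
        {"title": "Blue shirt", "color": "blue", "size": "xxl", "textile": "cotton"},
        {"title": "Polyester shirt", "color": "blue", "size": "l", "textile": "polyester"},
        {"title": "Red shirt", "color": "red", "size": "m", "textile": "cotton"},
    ]
    for product in run(products, QUERY):
        print(product["title"])
```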
REST API - Query DSL
• Range Query
-Numeric, Date Types
• Prefix/Wildcard Query
-Match on partial terms
• RegExp Query
• Geo_bbox
-Bounding box filter
• Geo_distance
-Geo_distance_range
"range": {
  "founded_year": { "gte": 1990, "lt": 2000 }
}
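The range clause above selects the half-open interval from 1990 (inclusive) to 2000 (exclusive). A small Python sketch of how the four bound keywords combine follows; `in_range` and `matches_range` are hypothetical helpers that operate on plain dictionaries.

```python
RANGE_QUERY = {"range": {"founded_year": {"gte": 1990, "lt": 2000}}}


def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Return True when value satisfies every bound that is set."""
    if gte is not None and value < gte:
        return False
    if gt is not None and value <= gt:
        return False
    if lte is not None and value > lte:
        return False
    if lt is not None and value >= lt:
        return False
    return True


def matches_range(doc, query):
    """Apply a single-field range clause to a document dictionary."""
    field, bounds = next(iter(query["range"].items()))
    return field in doc and in_range(doc[field], **bounds)


if __name__ == "__main__":
    for year in (1989, 1990, 1999, 2000):
        print(year, matches_range({"founded_year": year}, RANGE_QUERY))
```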
Filters
• Filters recommended over Queries
-Better cache support
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
Analyzers, Tokenizers
curl -XPUT 'http://localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNGram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'
curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string", "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
-Splits text at non-letters
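To make the differences between these tokenizers concrete, here are rough Python equivalents of the four. They are simplified sketches rather than the Lucene implementations: the function names are hypothetical, the n-gram variants take a single word as input, and `min_gram`/`max_gram` mirror the setting names used in the analyzer configuration.

```python
import re


def whitespace_tokenize(text):
    """Split on runs of whitespace; punctuation stays attached to tokens."""
    return text.split()


def letter_tokenize(text):
    """Split at every non-letter character, keeping runs of letters."""
    return re.findall(r"[^\W\d_]+", text)


def ngram_tokenize(word, min_gram, max_gram):
    """Emit every substring whose length is between min_gram and max_gram."""
    return [word[i:i + n]
            for i in range(len(word))
            for n in range(min_gram, max_gram + 1)
            if i + n <= len(word)]


def edge_ngram_tokenize(word, min_gram, max_gram):
    """Emit only the prefixes whose length is between min_gram and max_gram."""
    return [word[:n] for n in range(min_gram, min(max_gram, len(word)) + 1)]


if __name__ == "__main__":
    print(whitespace_tokenize("Say hello, Tony!"))
    print(letter_tokenize("Say hello, Tony!"))
    print(ngram_tokenize("abc", 1, 2))
    # With min_gram 3, a lowercased search for "scar" finds the indexed prefix.
    grams = edge_ngram_tokenize("scarface", 3, 42)
    print(grams, "scar" in grams)
```

Edge n-grams index only prefixes, which suits search-as-you-type: the search side keeps the typed text as a single lowercased token and looks it up among the indexed prefixes.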
Clients
• Client list: http://www.elasticsearch.org/guide/clients
-Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
-Uses TransportClient
-Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM
Questions?
Allocation Awareness
• Shard Allocation Awareness
-cluster.routing.allocation.awareness.attributes: rack
-Shards RELOCATE to even distribution
-Primary & Replica will NOT be on the same rack value
Cluster State
bullCluster State
-Node Membership
- Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET httplocalhost9200_clusterstatepretty=1
cluster_name elasticsearch_royrusso
version 27
master_node s3fpXfPKSFeUqo1MYZxSng
blocks
nodes
s3fpXfPKSFeUqo1MYZxSng
name Bulldozer
transport_address inet[localhost1270019300]
attributes
metadata
templates
logging_index_all
template logstash-09-
order 1
settings
index
number_of_shards 2
number_of_replicas 1
mappings
date
store false
logging_index
template logstash-
order 0
settings
index
number_of_shards 2
ldquonumber_of_replicasrdquo 1
mappings
ldquodaterdquo
ldquostorerdquo true
Talking to Elasticsearch
40
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
Cluster State
bullCluster State
-Node Membership
- Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET httplocalhost9200_clusterstatepretty=1
cluster_name elasticsearch_royrusso
version 27
master_node s3fpXfPKSFeUqo1MYZxSng
blocks
nodes
s3fpXfPKSFeUqo1MYZxSng
name Bulldozer
transport_address inet[localhost1270019300]
attributes
metadata
templates
logging_index_all
template logstash-09-
order 1
settings
index
number_of_shards 2
number_of_replicas 1
mappings
date
store false
logging_index
template logstash-
order 0
settings
index
number_of_shards 2
ldquonumber_of_replicasrdquo 1
mappings
ldquodaterdquo
ldquostorerdquo true
Talking to Elasticsearch
40
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
Talking to Elasticsearch
40
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
REST
bullHTTP Verbs GET POST PUT DELETE
bullJSON
bull_cat API
41
curl 19216856109200_cathealthvampts=0
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
The API
bullDocument
bullCluster
-Node
bullIndex
bullSearch
42
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
Create a Document
43
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
Of notehellip
44
curl -XPOST http1270019200imdbmovie -d
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
createdtrue
Auto-creates Index amp Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
bullFilters recommended over Queries
-Better cache support
54
curl -XGET httplocalhost9200my_indexevents_searchpretty=1 -d
from 0
size 0
query
terms
message [ apples]
minimum_should_match 1
post_filter
terms
userId [ 25476c6788ce g20d5470d7b4 ]
execution or
sort eventDate order desc
explain false
Analyzers Tokenizers
55
curl -XPUT lsquohttplocalhost9200my_index -d
settings
analysis
analyzer
str_search_analyzer
tokenizer keyword
filter [lowercase]
str_index_analyzer
tokenizer substring
filter [lowercase stop]
tokenizer
substring
type edgeNgram
min_gram 3
max_gram 42
token_chars [letter digit]
curl -XPUT httplocalhost9200my_indexevents_mapping -d
events
properties
eventId type string store true index not_analyzed
userId type string store false index not_analyzed
message
type string store false
search_analyzer str_search_analyzer
index_analyzer str_index_analyzer
Tokenizers
bullWhitespace
bullNGram
bullEdge NGram
bullLetter
- non-letters
56
Clients
bullClient list
httpwwwelasticsearchorgguideclients
- Java (Node) Client JS PHP Perl Python Ruby
bullSpring Data
-Uses TransportClient
- Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
bullBigDesk
bullKopf
bullHead
bullElasticHQ
bullMarvel
bullSematext SPM
58
Questions
59
Get a Document
45
curl -XGET http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version1
foundtrue
_source
genre Crime
language English
country USA
runtime 170
title Scarface
year 1983
Update a Document
46
curl -XPUT http1270019200imdbmovieAUwGeWib1u4mCngDYT7y -d
genre Crime
language English
country USA
runtime 180
title Scarface
year 1983
_indeximdb
_typemovie
_idAUwGeWib1u4mCngDYT7y
_version2
createdfalse
More like an Upsert
Delete a Document
47
curl -XDELETE http1270019200imdbmovieAUwGeWib1u4mCngDYT7y
ldquofoundtrue
ldquo_indeximdb
ldquo_typemovie
ldquo_idAUwGeWib1u4mCngDYT7y
ldquo_version2
You can alsohellip
bullPartial document updating
bullSpecify Version
bullSpecify ID
bullMulti-Get API
bullExists API
bullBulk API
48
How Searching Works
bullHow it works
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged sorted and returned to client
49
REST API - Search
bullFree Text Search
-URL Request
bullComplex Query
50
httplocalhost9200imdbmovie_searchq=scar
httplocalhost9200imdbmovie_searchq=scarface+OR+star
httplocalhost9200imdbmovie_searchq=(scarface+OR+star)+AND+year[1981+TO+1984]
REST API ndash Query DSL
curl -XPOST localhost9200_searchpretty -d
query
bool
must [
query_string
query scarface or star
range
year gte 1931
]
51
REST API ndash Query DSL
bullBoolean Querybool
must[
match
colorblue
match
titleshirt
]
must_not[
match
sizexxl
]
should[
match
textilecotton
52
REST API ndash Query DSL
bullRange Query
-Numeric Date Types
bullPrefixWildcard Query
-Match on partial terms
bullRegExp Query
bullGeo_bbox
-Bounding box filter
bullGeo_distance
-Geo_distance_range
range
founded_year
gte1990
lt2000
53
Filters
• Filters recommended over Queries when relevance scoring is not needed
- Better cache support

curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
{
  "from": 0,
  "size": 0,
  "query": {
    "terms": {
      "message": ["apples"],
      "minimum_should_match": 1
    }
  },
  "post_filter": {
    "terms": {
      "userId": ["25476c6788ce", "g20d5470d7b4"],
      "execution": "or"
    }
  },
  "sort": { "eventDate": { "order": "desc" } },
  "explain": false
}'
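Why filters cache well: a filter is a yes/no test with no score attached, so the set of matching documents can be stored once and reused by later requests. The sketch below imitates that idea loosely with Python's lru_cache; the DOCS data, the function names, and the cache itself are illustrative assumptions, not how Elasticsearch implements its filter cache.

```python
from functools import lru_cache

# Hypothetical sample documents, keyed by document id.
DOCS = {
    1: {"userId": "25476c6788ce", "message": "apples"},
    2: {"userId": "g20d5470d7b4", "message": "pears"},
    3: {"userId": "ffffffffffff", "message": "apples"},
}


@lru_cache(maxsize=None)
def term_filter(field, value):
    """Ids of documents whose field equals value; computed once per (field, value)."""
    return frozenset(i for i, doc in DOCS.items() if doc.get(field) == value)


def terms_filter(field, values):
    """OR together the cached per-term results, like execution 'or' above."""
    return frozenset().union(*(term_filter(field, value) for value in values))


users = ["25476c6788ce", "g20d5470d7b4"]
first = terms_filter("userId", users)   # scans the documents
second = terms_filter("userId", users)  # served from the cache
print(sorted(first), term_filter.cache_info())
```

A scored query cannot be reused this way as easily, because its result depends on per-document relevance rather than a fixed set of matches.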
Analyzers / Tokenizers

curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "str_search_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "str_index_analyzer": {
          "tokenizer": "substring",
          "filter": ["lowercase", "stop"]
        }
      },
      "tokenizer": {
        "substring": {
          "type": "edgeNgram",
          "min_gram": 3,
          "max_gram": 42,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'

curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
{
  "events": {
    "properties": {
      "eventId": { "type": "string", "store": true, "index": "not_analyzed" },
      "userId": { "type": "string", "store": false, "index": "not_analyzed" },
      "message": {
        "type": "string",
        "store": false,
        "search_analyzer": "str_search_analyzer",
        "index_analyzer": "str_index_analyzer"
      }
    }
  }
}'
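What the index-time analyzer above produces can be approximated in plain Python. This sketch covers the edge n-gram tokenizer (min_gram 3, max_gram 42, letters and digits only) plus the lowercase filter; it leaves out the stop filter, and the edge_ngrams function is a made-up name for the illustration.

```python
import re


def edge_ngrams(text, min_gram=3, max_gram=42):
    """Lowercased prefixes of each run of letters/digits, min_gram to max_gram long."""
    tokens = []
    # [^\W_]+ matches runs of letters and digits, mirroring token_chars above.
    for word in re.findall(r"[^\W_]+", text):
        word = word.lower()
        for length in range(min_gram, min(len(word), max_gram) + 1):
            tokens.append(word[:length])
    return tokens


indexed = edge_ngrams("Apples 42")
print(indexed)

# Search side: the keyword tokenizer keeps the whole input as one token,
# and the lowercase filter normalizes it before lookup.
search_token = "App".lower()
print(search_token in indexed)
```

Indexing every prefix is what lets a partial search term such as "App" match "Apples" without a wildcard query; words shorter than min_gram (here "42") produce no tokens at all.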
Tokenizers
• Whitespace
• NGram
• Edge NGram
• Letter
- Splits on non-letters
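The difference between these tokenizers is easiest to see on the same input. The functions below are rough Python approximations written for this comparison, not the Lucene implementations; the n-gram defaults of 1 and 2 are an assumption chosen to keep the output short.

```python
import re


def whitespace_tokenizer(text):
    """Split on whitespace only; punctuation stays attached to the tokens."""
    return text.split()


def letter_tokenizer(text):
    """Split on anything that is not a letter (digits and punctuation break tokens)."""
    return re.findall(r"[^\W\d_]+", text)


def ngram_tokenizer(text, min_gram=1, max_gram=2):
    """Every substring of length min_gram..max_gram, at every start position."""
    return [
        text[start:start + length]
        for start in range(len(text))
        for length in range(min_gram, max_gram + 1)
        if start + length <= len(text)
    ]


sample = "O'Neil's R2-D2"
print(whitespace_tokenizer(sample))
print(letter_tokenizer(sample))
print(ngram_tokenizer("abc"))
```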
Clients
• Client list: http://www.elasticsearch.org/guide/clients
- Java (Node) Client, JS, PHP, Perl, Python, Ruby
• Spring Data
- Uses TransportClient
- Implementation of ElasticsearchRepository aligns with generic Repository interfaces
Monitoring
• BigDesk
• Kopf
• Head
• ElasticHQ
• Marvel
• Sematext SPM

Questions