
Elasticsearch

Date post: 13-Jan-2017
Upload: ricardo-peres

Ricardo Peres (@rjperes75)

Introduction

NoSQL database for indexing JSON contents
Documents are indexed as they are added (< 1s)
Schema-less (kind of…)
Distributed
High performance
REST semantics
Graph capabilities
Based on Lucene
Part of the ELK stack
Open source!

Cluster

A collection of servers (nodes) running Elasticsearch

Single master
Multicast-based discovery (can be explicit)

Shards

Indexes are distributed across shards; the default is 5 primary shards and 1 replica of each, spread across the cluster

Defined at index creation time
Transparent to the user
It is possible to define a hashing function
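
A rough sketch of why the shard count is fixed at index creation time: the target shard is derived from a hash of a routing value (the document id by default) modulo the number of primary shards. Elasticsearch uses its own internal hash; `zlib.crc32` below is only a stand-in, and `pick_shard` is a hypothetical name.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int = 5) -> int:
    """Return a shard number for a routing value (stand-in hash, not Elasticsearch's)."""
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Assumption: any stable hash is enough to illustrate the idea.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


for doc_id in ("1", "2", "3"):
    print(doc_id, "->", pick_shard(doc_id))
```

Changing the number of primary shards changes the modulo, so existing documents would no longer be found on the shard the formula points to.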

Indexes

A collection of types
Similar to a database

Types

Collection of documents
Has a schema (implicit or explicit)
Similar to a table

Documents

Self-contained data
Exist in a type
Have an id
Have a version
Have a schema
Can have expiration

Fields

Documents are structured in fields
Special fields: _id, _uid, _index, _type
_timestamp, _all, _source, _ttl, _meta, _parent, _routing are optional

Data Types

string
long, integer, short, byte, double, float
date
boolean
binary
geo_point, geo_shape
object
nested
ip
completion
token_count
arrays

Creating a Document

Auto Id

POST /website/blog
{
  "title": "My Blog",
  "url": "http://my/blog",
  "tags": [ "development" ]
}

Explicit Id

POST /website/blog/1
{
  "title": "My Blog",
  "url": "http://my/blog",
  "tags": [ "development" ]
}

Updating a Document

Partial (only the given fields change)

POST /website/blog/1/_update
{
  "doc": {
    "tags": [ "testing" ],
    "views": 1
  }
}

Full (the whole document is replaced by indexing it again under the same id)

PUT /website/blog/1
{
  "title": "My Blog",
  "url": "http://my/blog",
  "tags": [ "testing" ]
}
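
A local sketch of what a partial update does to the stored document, assuming the usual merge behaviour: objects are merged recursively, while scalars and arrays are replaced. `apply_partial_update` is a hypothetical helper that runs in memory; it does not call Elasticsearch.

```python
import copy


def apply_partial_update(source: dict, doc: dict) -> dict:
    """Merge a partial `doc` into a document source and return a new dict."""
    merged = copy.deepcopy(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Objects are merged field by field.
            merged[key] = apply_partial_update(merged[key], value)
        else:
            # Scalars and arrays are replaced as a whole.
            merged[key] = copy.deepcopy(value)
    return merged


existing = {"title": "My Blog", "url": "http://my/blog", "tags": ["development"]}
updated = apply_partial_update(existing, {"tags": ["testing"], "views": 1})
print(updated)
```

Note that "tags" ends up as [ "testing" ] only: the array is replaced, not appended to.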

Deleting a Document

Single
DELETE /website/blog/1

Index
DELETE /website

Mappings

Created at index or type level, implicitly or explicitly
Cannot modify, only add
Can enforce schema or not

PUT /website
{
  "mappings": {
    "blog": {
      "_timestamp": { "enabled": true },
      "dynamic": "strict",
      "properties": {
        "title": { "type": "string", "analyzer": "standard" }
      }
    }
  }
}

Mapping Templates

Automatically apply mappings to dynamically added fields

PUT /website
{
  "mappings": {
    "post": {
      "date_detection": true,
      "dynamic_date_formats": [ "yyyy-MM-dd HH:mm", "yyyy-MM-dd" ],
      "dynamic_templates": [
        {
          "timestamp": {
            "match": "timestamp",
            "match_mapping_type": "date",
            "mapping": { "type": "date", "format": "yyyy-MM-dd HH:mm" }
          }
        }
      ]
    }
  }
}

Query and Filter Context

Queries: determine how the results are scored

Filters: determine what appears in the results
Filters are cached
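
A minimal sketch of the two contexts in one request body, assuming a version where the bool query accepts a "filter" clause (older releases used a separate filtered query). The field names come from the blog examples; `build_search_body` is a hypothetical helper.

```python
import json


def build_search_body(text: str, tag: str) -> dict:
    """Build a search body mixing query context and filter context."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [{"match": {"title": text}}],
                # Filter context: yes/no match only, no scoring.
                "filter": [{"term": {"tags": tag}}],
            }
        }
    }


body = build_search_body("blog", "testing")
print(json.dumps(body, indent=2))
```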

Querying

Search API
Uses the URL
Starting with <index> and <type> is optional
/<index>/<type>/_search?q=something
/<index>/<type1>,<type2>/_search?q=something
/_search?q=something
/_search?q=field:value
/_search?q=+firstname:(john mary) -surname:smith

Query DSL
Query and filter context
simple_query_string, query_string, match, term, terms, range, multi_match, match_phrase, missing, exists, regexp, fuzzy, prefix, ids
bool, dis_max
more_like_this, script, template
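
The URL forms of the Search API need percent-encoding when sent over HTTP ("+" in particular would otherwise be read as a space). A small sketch using only the standard library; `build_uri_search` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_uri_search(query: str, index: str = "", doc_type: str = "") -> str:
    """Build the path and query string for a URI search."""
    if doc_type and not index:
        raise ValueError("a type can only be given together with an index")
    parts = [part for part in (index, doc_type) if part]
    path = "/" + "/".join(parts + ["_search"])
    # urlencode escapes '+', ':' and parentheses and turns spaces into '+'.
    return path + "?" + urlencode({"q": query})


print(build_uri_search("something", "website", "blog"))
print(build_uri_search("+firstname:(john mary) -surname:smith"))
```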

Pagination, Sorting and Projection

size, from
sort
fields

POST /website/post/_search
{
  "size": 10,
  "from": 0,
  "sort": {
    "timestamp": { "order": "desc" }
  },
  "fields": [ "title", "_id" ]
}
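
A short sketch of how a page number maps onto size and from, reusing the sort and fields from the request above. `page_request` is a hypothetical helper; pages are assumed to start at 1.

```python
def page_request(page: int, page_size: int = 10) -> dict:
    """Build a search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "size": page_size,
        "from": (page - 1) * page_size,  # number of hits to skip
        "sort": {"timestamp": {"order": "desc"}},
        "fields": ["title", "_id"],
    }


print(page_request(1))
print(page_request(3))
```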

Percolator

Search in reverse: first define the query, then add documents to it

Querying a document gives all percolator queries that it matches
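
The "search in reverse" idea can be illustrated with a toy in-memory model: queries are stored first, then a document is checked against all of them. This is only a conceptual sketch with simple term queries; `ToyPercolator` is a hypothetical class and not the Elasticsearch API.

```python
class ToyPercolator:
    """Stores term queries and reports which ones a document matches."""

    def __init__(self):
        self._queries = {}

    def register(self, query_id: str, field: str, term: str) -> None:
        """Store a query of the form field == term under an id."""
        self._queries[query_id] = (field, term)

    def percolate(self, document: dict) -> list:
        """Return the ids of all stored queries that match the document."""
        matches = []
        for query_id, (field, term) in self._queries.items():
            value = document.get(field)
            values = value if isinstance(value, list) else [value]
            if term in values:
                matches.append(query_id)
        return sorted(matches)


percolator = ToyPercolator()
percolator.register("testing-posts", "tags", "testing")
percolator.register("dev-posts", "tags", "development")
print(percolator.percolate({"title": "My Blog", "tags": ["testing"]}))
```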

Relations

No joins, but some alternatives

Parent/child: has_child, has_parent

Nested objects

Terms filter lookup: terms with type and id

Relevance

Term Frequency / Inverse Document Frequency / Field Length Norm
Custom scores
A match hit/miss can be explained
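
A sketch of the three factors, using the formulas as commonly documented for Lucene's classic TF/IDF similarity. The real score includes further factors (boosts, query normalization, coordination) and stores the norm with reduced precision, so treat this as an approximation; the function names are hypothetical.

```python
import math


def term_frequency(freq: int) -> float:
    """More occurrences of the term in the field means a higher weight."""
    return math.sqrt(freq)


def inverse_document_frequency(num_docs: int, doc_freq: int) -> float:
    """Terms that appear in fewer documents weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_length_norm(num_terms: int) -> float:
    """A match in a shorter field weighs more."""
    return 1.0 / math.sqrt(num_terms)


def field_weight(freq: int, num_docs: int, doc_freq: int, num_terms: int) -> float:
    """Combine the three factors for one term in one field."""
    return (term_frequency(freq)
            * inverse_document_frequency(num_docs, doc_freq)
            * field_length_norm(num_terms))


print(field_weight(freq=4, num_docs=100, doc_freq=9, num_terms=16))
```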

Indexing

Stemming
Tokenization
Normalization
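
A toy pipeline showing the three steps in the order they are usually applied: tokenization, then normalization, then stemming. The suffix-stripping stemmer is a deliberately naive assumption, not one of the analyzers that ship with Elasticsearch.

```python
import re


def tokenize(text: str) -> list:
    """Split the text into word tokens."""
    return re.findall(r"\w+", text)


def normalize(tokens: list) -> list:
    """Lowercase every token."""
    return [token.lower() for token in tokens]


def stem(token: str) -> str:
    """Strip a few common suffixes (very naive)."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> list:
    """Run the full toy analysis chain."""
    return [stem(token) for token in normalize(tokenize(text))]


print(analyze("Indexing Documents"))
```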

Index Aliases

Used to refer to one or more indexes, one or more types, possibly with a filter
Useful for "moving indexes" (month, year, country, etc.)

POST /_aliases
{
  "actions": [
    {
      "add": {
        "indices": [ "social-2015", "social-2016" ],
        "alias": "social-testing",
        "filter": {
          "term": { "tag": "testing" }
        }
      }
    }
  ]
}

Alias Templates

Creates an alias when an index matching the template pattern is created

POST /_template/social
{
  "order": 0,
  "template": "social-*",
  "settings": { "index": { "refresh_interval": "5s" } },
  "mappings": {},
  "aliases": { "social": {} }
}

Bulk Operations

Perform multiple operations (index, update, delete) at once

POST /bulk/data/_bulk
{ "index" : { "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_id" : "2" } }
{ "field1" : "value1" }
{ "index" : { "_id" : "3" } }
{ "field1" : "value1" }
{ "update" : { "_id" : "2" } }
{ "doc": { "field2": "value2" } }
{ "delete" : { "_id" : "3" } }
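
The bulk body is newline-delimited JSON: one action line, optionally followed by one payload line, and a newline after the last line. A sketch of building it with the standard library; `build_bulk_body` is a hypothetical helper.

```python
import json


def build_bulk_body(operations: list) -> str:
    """Serialize (action, metadata, payload) tuples into a bulk request body."""
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if payload is not None:  # delete has no payload line
            lines.append(json.dumps(payload))
    return "\n".join(lines) + "\n"  # the body must end with a newline


operations = [
    ("index", {"_id": "1"}, {"field1": "value1"}),
    ("update", {"_id": "2"}, {"doc": {"field2": "value2"}}),
    ("delete", {"_id": "3"}, None),
]
body = build_bulk_body(operations)
print(body, end="")
```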

Analytics

Aggregations
Can be nested
Can use scripts

GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "feature" },
      "aggs": {
        "average_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
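
What a nested aggregation computes can be sketched in memory: bucket the documents by one field (terms), then compute a metric inside each bucket (average). The sample documents are made up for illustration, and `terms_with_average` is a hypothetical helper, not an Elasticsearch call.

```python
from collections import defaultdict


def terms_with_average(documents: list, terms_field: str, value_field: str) -> dict:
    """Group documents by terms_field and average value_field per group."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[terms_field]].append(doc[value_field])
    return {
        key: {"doc_count": len(values), "average": sum(values) / len(values)}
        for key, values in groups.items()
    }


# Hypothetical sample data using the field names from the request above.
docs = [
    {"feature": "wifi", "price": 10.0},
    {"feature": "wifi", "price": 20.0},
    {"feature": "gps", "price": 30.0},
]
result = terms_with_average(docs, "feature", "price")
print(result)
```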

APIs

REST (native)
.NET
JavaScript/Node.js
Python
Java
Groovy
PHP
Perl
Ruby

Plugins

Marvel
Sense
Watcher
Graph
Shield
ES-Hadoop
Head
Kopf
Elasticsearch-SQL

Kibana

Reporting
Dashboards

Logstash

Collect and transform data
Input – Filters – Outputs
Sources/destinations: Elasticsearch, File, Syslog, Windows Eventlog, Redis, RabbitMQ, GitHub, HTTP, Beats, Twitter, WebSocket, …

Thank you!

Thank you for attending!

@rjperes75
weblogs.asp.net/ricardoperes

