Date posted: 12-Nov-2014
Category: Technology
Uploaded by: luiz-rocha
ElasticSearch
Friday, March 8, 13
Links
• https://bitbucket.org/lsdr/es/overview
• https://confluence.abril.com.br/x/J5I_AQ
Installation
• Mac OS X:
  • brew install elasticsearch
• CentOS 6.x:
  • no official RPM :-(
  • https://gist.github.com/lsdr/5117589
Setup
cluster.name: buffalo
node.name: "Bruce Smith"
path.data: /usr/local/var/elasticsearch/
path.logs: /usr/local/var/log/elasticsearch/
path.plugins: /usr/local/var/lib/elasticsearch/plugins
network.host: 127.0.0.1
Enough to bring up a local server!
Setup++
• Node configuration:
              Master TRUE    Master FALSE
Data TRUE     Development    Workhorse
Data FALSE    Coordinator    Load Balancer
http://elasticsearch.org/guide/reference/modules/node.html
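As a minimal sketch of the node configuration table, the Python snippet below maps the two flags to the role names used on this slide. `node.master` and `node.data` are the settings described on the page linked above; the function names are hypothetical and only illustrate the four combinations.

```python
# Role names are taken from the slide; the helper names are hypothetical.
ROLES = {
    (True, True): "Development",      # master-eligible and holds data
    (False, True): "Workhorse",       # holds data, never elected master
    (True, False): "Coordinator",     # master-eligible, holds no data
    (False, False): "Load Balancer",  # neither master nor data
}


def node_role(master: bool, data: bool) -> str:
    """Return the slide's name for a master/data flag combination."""
    return ROLES[(master, data)]


def node_settings(master: bool, data: bool) -> str:
    """Render the two flags as lines for elasticsearch.yml."""
    return (
        f"node.master: {str(master).lower()}\n"
        f"node.data: {str(data).lower()}"
    )


if __name__ == "__main__":
    print(node_role(False, True))
    print(node_settings(False, True))
```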
Setup++
• Number of shards and number of replicas
  • replicas can be increased at runtime, shards cannot
• “Mandatory” plugins
  • the node only starts if they are present
• JVM tuning
• Other modules: Thrift, JMX
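To illustrate the shard and replica point, here is a hedged Python sketch that only builds the JSON bodies involved: one for creating an index with a fixed shard count, and one for changing the replica count later through the index `_settings` endpoint. The function names and the index name in the comments are hypothetical; nothing is sent to a server.

```python
import json


def create_index_body(shards: int, replicas: int) -> str:
    """Body for creating an index, e.g. PUT /videos (index name is illustrative)."""
    return json.dumps(
        {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}
    )


def update_replicas_body(replicas: int) -> str:
    """Body for PUT /videos/_settings; only the replica count is changed."""
    return json.dumps({"index": {"number_of_replicas": replicas}})


if __name__ == "__main__":
    print(create_index_body(3, 1))
    print(update_replicas_body(2))
```

The shard count appears only in the creation body because it is fixed once the index exists.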
“Hello World!”
$ curl -XGET 'localhost:9200/world'
No handler found for uri [/world] and method [GET]
“Hello World!”
curl -XPOST "localhost:9200/world/hello" -d'{ "text": "hello world", "lang": "en"}'
“Hello World!”
{ "ok": true, "_index": "world", "_type": "hello", "_id": "A5HX8IhTR0CzMNWHBPhEqA", "_version": 1}
POST: auto-generated id | PUT: “manual” id
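The POST/PUT distinction can be sketched as a small Python helper that picks the HTTP method and path depending on whether an id is supplied. The helper name and return shape are hypothetical; it only builds the request pieces and does not contact a server.

```python
import json
from typing import Optional, Tuple


def index_request(
    index: str, doc_type: str, doc: dict, doc_id: Optional[str] = None
) -> Tuple[str, str, str]:
    """Return (method, path, body) for indexing one document."""
    body = json.dumps(doc)
    if doc_id is None:
        # No id given: POST to the type and let ES generate the id.
        return ("POST", f"/{index}/{doc_type}", body)
    # Id chosen by the caller: PUT to the full document path.
    return ("PUT", f"/{index}/{doc_type}/{doc_id}", body)


if __name__ == "__main__":
    doc = {"text": "hello world", "lang": "en"}
    print(index_request("world", "hello", doc))
    print(index_request("world", "hello", doc, doc_id="1"))
```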
“Hello World!”
$ curl -XGET 'localhost:9200/world/_count'
{ "count": 1, "_shards": { "total": 3, "successful": 3, "failed": 0 }}
“Hello World!”
$ curl -XGET 'localhost:9200/world/_search'
"hits": [ { "_index": "world", "_type": "hello", "_id": "A5HX8IhTR0CzMNWHBPhEqA", "_score": 1.0, "_source": {"text": "hello world", "lang": "en"} }]
“Hello World!”
$ curl -XGET 'localhost:9200/world/hello/_mapping'
{ "hello": { "properties": { "lang": { "type": "string" }, "text": { "type": "string" } } }}
Mapping
Mapping is the process of defining how a document should be mapped to the Search Engine, including its searchable characteristics such as which fields
are searchable and if/how they are tokenized.
http://elasticsearch.org/guide/reference/mapping/
Querying
• URI Request
  • Does not expose all ES features
  • /guide/reference/api/search/uri-request.html
• Query DSL
  • POST-based (no cache!)
  • /guide/reference/query-dsl/
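The two query styles can be compared with a short Python sketch that builds a URI request and a Query DSL request for the same field and value. The function names are hypothetical and nothing is sent over the network. The two forms are not strictly equivalent: the `q` parameter is parsed as a query string, while the DSL body here uses a `term` query.

```python
import json
from typing import Tuple
from urllib.parse import urlencode


def uri_search(index: str, field: str, value: str) -> str:
    """URI request: the whole query travels in the q parameter of a GET."""
    return f"/{index}/_search?" + urlencode({"q": f"{field}:{value}"})


def dsl_search(index: str, field: str, value: str) -> Tuple[str, str]:
    """Query DSL: the query is a JSON body sent to the _search endpoint."""
    body = {"query": {"term": {field: value}}}
    return (f"/{index}/_search", json.dumps(body))


if __name__ == "__main__":
    print(uri_search("world", "lang", "en"))
    print(dsl_search("world", "lang", "en"))
```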
Querying
• Play around with queries against articles!
• Simple queries work, but...
• facets break?
Mapping
By default, there isn’t a need to define an explicit mapping, (...) Only when the defaults need to be
overridden must a mapping definition be provided.
http://elasticsearch.org/guide/reference/mapping/
Mapping
• Overriding is not trivial
• Possibly involves reindexing
• This is the team's job
• “data masseurs”
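One hedged illustration of what an override can look like: the sketch below builds the body of an explicit mapping that marks a string field as `not_analyzed`, so its value is indexed as a single term. It assumes the string mapping syntax of the ES releases current when these slides were written. The field name `category` and the function name are hypothetical, and whether this addresses the facet problem mentioned earlier is an assumption.

```python
import json


def not_analyzed_mapping(doc_type: str, field: str) -> str:
    """Build an explicit mapping body for one untokenized string field."""
    mapping = {
        doc_type: {
            "properties": {
                # "not_analyzed" keeps the original value as one term.
                field: {"type": "string", "index": "not_analyzed"}
            }
        }
    }
    return json.dumps(mapping)


if __name__ == "__main__":
    # "category" is a made-up field name used only for illustration.
    print(not_analyzed_mapping("hello", "category"))
```

Under these assumptions the JSON would be sent with PUT to `/world/hello/_mapping`.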
Analyzer
curl -XGET 'localhost:9200/_analyze?analyzer=standard' -d'Esporte::Futebol'
curl -XGET 'localhost:9200/_analyze?analyzer=keyword' -d'Esporte::Futebol'
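To show why the two analyzers give different results for the same input, here is a rough Python stand-in. It is not Lucene's tokenizer: the "standard" version simply splits on non-word characters and lowercases, which is an approximation, and the "keyword" version returns the input untouched as a single token. Both function names are hypothetical.

```python
import re
from typing import List


def keyword_analyze(text: str) -> List[str]:
    """Keyword-style analysis: the whole input is one token."""
    return [text]


def standard_analyze_approx(text: str) -> List[str]:
    """Approximation of standard analysis: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


if __name__ == "__main__":
    print(standard_analyze_approx("Esporte::Futebol"))  # ['esporte', 'futebol']
    print(keyword_analyze("Esporte::Futebol"))          # ['Esporte::Futebol']
```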
River
River
• Creates indices/mappings if they do not exist
  • keep the limitations in mind
• Pulling
• elasticsearch-river-mongo
  • Blows up if mongo is down
  • Takes a while (gets lost?) when it comes back
River
• Install the River (plugin)
• Create the River
• mongorestore
River
$ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.6.0
$ES_HOME/bin/plugin -url https://github.com/downloads/richardwilly98/elasticsearch-river-mongodb/elasticsearch-river-mongodb-1.6.1.zip -install river-mongodb
River
curl -XPUT "localhost:9200/_river/v/_meta" -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "alx_midia",
    "collection": "videos",
    "servers": [ { "host": "localhost", "port": "27017" } ]
  },
  "index": { "name": "videos", "type": "documents" }
}'
source (mongodb) - destination (index)
River
curl -XGET "localhost:9200/_river/v/_meta"
{
  "_index": "_river",
  "_id": "_meta",
  "exists": true,
  "_source": {
    "type": "mongodb",
    "mongodb": {
      "db": "alx_midia",
      "collection": "videos",
      "servers": [ { "host": "localhost", "port": "27017" } ]
    },
    "index": { "name": "videos", "type": "documents" }
  }
}
River
$ mongorestore --host localhost --port 27017 --noIndexRestore alx_midia
River
[videos] creating index, cause [api], shards [3]/[1], mappings []
[_river] update_mapping [v] (dynamic)
[mongodb][v] No known previous slurping time for this collection
[_river] update_mapping [v] (dynamic)
[videos] update_mapping [documents] (dynamic)
Indexed 100 insertions 0, updates, 0 deletions, 100 documents per second
Indexed 100 insertions 0, updates, 0 deletions, 100 documents per second
Indexed 81 insertions 0, updates, 0 deletions, 81 documents per second
Indexed 100 insertions 0, updates, 0 deletions, 100 documents per second
Indexed 15 insertions 0, updates, 0 deletions, 15 documents per second
River
• In standard operation there is no “restore” in case of failure
• A “recrawling” solution needs to be thought through