Date posted: 02-Dec-2014
Category: Data & Analytics
Uploaded by: jodok-batlogg
SQL on Elasticsearch?
How it all started
You know, for search
Querying 24 000 000 000 records in 900 ms
@jodok
6 ES master nodes: c1.xlarge
40 ES nodes per zone: m1.large, 8 EBS volumes
6-node Hadoop cluster + Spot Instances
3 AP server / MC: c1.xlarge
Elasticsearch as Primary Storage?
NoSQL Roadshow 2013 Jodok Batlogg
• Security Model?
• Transactions?
• Data security?
• Toolsets?
• Larger Computations?
• Availability?
DISTRIBUTED DATA STORE WITH SQL. SIMPLE. RELIABLE. SCALABLE.
Open Source (Apache 2.0)
shared nothing
is highly available and cheap to operate.
not NOSQL but SQL
NOFS but distributed BLOBs
CRATE DATA – Module overview
[Diagram: module stack, each module marked as either a 3rd party Open Source module or a CRATE module]
Layers: Client, Query, Data Aggregation, Storage, Network/Cluster
Modules: CRATE Dashboard, CRATE Shell, Python, Java, Ruby, DB-API, SQLAlchemy, ES native, Transport, FB Presto SQL Parser, Query planner, Distributed SQL, Bulk import/export, BLOB streaming, BLOB streaming support, ES Scatter/Gather, Distributed reduce, Data transformation and reindex support, Lucene, BLOB storage, ES Sharding, ES Transport protocol, ES Discovery and state, Netty
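The module overview lists "ES Scatter/Gather" and "Distributed reduce" as separate parts of the data aggregation layer. The general pattern can be sketched as follows. This is a minimal illustration in plain Python, not Crate's or Elasticsearch's actual code: the `SHARDS` data, the function names, and the use of a thread pool to stand in for remote nodes are all assumptions made for the example. Each shard computes a partial state (count and sum) per group, and a coordinating step merges those states, which keeps an average exact instead of averaging per-shard averages.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards"; each holds rows of (group_key, value).
SHARDS = [
    [("a", 1.0), ("b", 4.0), ("a", 3.0)],
    [("b", 2.0), ("c", 5.0)],
    [("a", 2.0), ("c", 1.0), ("c", 3.0)],
]


def shard_partial(rows):
    """Runs once per shard: build a partial [count, sum] state per group."""
    partial = defaultdict(lambda: [0, 0.0])
    for key, value in rows:
        partial[key][0] += 1
        partial[key][1] += value
    return dict(partial)


def merge_partials(partials):
    """Runs on the coordinating node: combine the partial states."""
    merged = defaultdict(lambda: [0, 0.0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    # Only now is the average derived, from the global count and sum.
    return {k: {"count": c, "avg": s / c} for k, (c, s) in merged.items()}


def scatter_gather(shards):
    # Scatter: one task per shard (threads stand in for remote nodes).
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(shard_partial, shards))
    # Gather and reduce.
    return merge_partials(partials)


result = scatter_gather(SHARDS)
print(result)
```

Carrying (count, sum) instead of a finished average is the design choice that matters here: partial states can be merged in any order, which is what makes the reduce step distributable.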
START A CLUSTER IN 1 MIN
https://crate.io
BLOB Storage
Distributed Accurate Aggregations
Partitioned Tables
Import/Export
Update by Query
Insert by Query
Integrated Admin-UI
How is Crate Data different than Elasticsearch?
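"Insert by Query" and "Update by Query" from the feature list correspond to familiar SQL shapes: filling a table from a SELECT, and changing every row that matches a WHERE clause. The sketch below shows those two shapes through the Python DB-API, which the module overview names as a client interface. It deliberately uses the standard library's `sqlite3` as a stand-in so it runs without a cluster; it is not Crate, the table and column names are invented for the example, and Crate's own SQL dialect and parameter style may differ.

```python
import sqlite3

# sqlite3 is only a stand-in that speaks the Python DB-API; it is not Crate.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables for the example.
cur.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, amount REAL)")
cur.execute("CREATE TABLE big_events (id INTEGER PRIMARY KEY, kind TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO events (id, kind, amount) VALUES (?, ?, ?)",
    [(1, "click", 0.5), (2, "purchase", 20.0), (3, "purchase", 35.0), (4, "click", 0.1)],
)

# "Insert by query": fill a table from the result of a SELECT.
cur.execute(
    "INSERT INTO big_events SELECT id, kind, amount FROM events WHERE amount >= 10"
)

# "Update by query": change every row that matches a WHERE clause.
cur.execute("UPDATE events SET kind = 'view' WHERE kind = 'click'")
conn.commit()

copied = cur.execute("SELECT id FROM big_events ORDER BY id").fetchall()
kinds = cur.execute(
    "SELECT kind, COUNT(*) FROM events GROUP BY kind ORDER BY kind"
).fetchall()
print(copied, kinds)
```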
Demo Video
http://bigdatanerd.files.wordpress.com/2011/12/cap-theorem.jpg
• Basically Available - you always get a response
• Soft State - it’s not consistent all the time.
• Eventually Consistent - it becomes consistent at a later point in time
BASE & CAP
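The three BASE properties above can be made concrete with a toy model. The following is a minimal sketch, assuming a single primary with asynchronously updated replicas; the `Cluster` and `Replica` classes and the explicit `replicate()` step are hypothetical and do not describe how Crate or Elasticsearch replicate data.

```python
class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """Toy store: writes go to a primary, replicas catch up later."""

    def __init__(self, n_replicas=2):
        self.primary = Replica()
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.pending = []  # writes not yet applied to the replicas

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica=0):
        # Basically available: a replica always answers,
        # even if its answer is stale (soft state).
        return self.replicas[replica].data.get(key)

    def replicate(self):
        # Eventually consistent: once pending writes are applied,
        # every replica agrees with the primary.
        for key, value in self.pending:
            for r in self.replicas:
                r.data[key] = value
        self.pending.clear()


cluster = Cluster()
cluster.write("user:1", "alice")
stale = cluster.read("user:1")   # replica has not seen the write yet
cluster.replicate()
fresh = cluster.read("user:1")   # replica has caught up
print(stale, fresh)
```

The read between `write` and `replicate` is the window in which the system is available but not consistent.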