Date post: 03-Jun-2020
Inside the InfluxDB Storage Engine

Gianluca Arbezzano
gianluca@influxdb.com
@gianarb

© 2017 InfluxData. All rights reserved.


What is time series data?

• Stock trades and quotes
• Metrics
• Analytics
• Events
• Sensor data
• Traces


Two kinds of time series data…

Regular time series

Samples at regular intervals

Irregular time series

Events whenever they come in

Why would you want a database for time series data?

Scale


Example from server monitoring

• 2,000 servers, VMs, containers, or sensor units

• 1,000 measurements per server/unit

• every 10 seconds

• = 17,280,000,000 distinct points per day
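The arithmetic behind that total can be checked in a few lines. This is only a worked example of the slide's figures; the function and constant names are illustrative.

```python
# Figures taken from the slide; names are illustrative.
SERVERS = 2_000
MEASUREMENTS_PER_SERVER = 1_000
SAMPLE_INTERVAL_SECONDS = 10
SECONDS_PER_DAY = 24 * 60 * 60


def points_per_day(servers, measurements, interval_seconds):
    """Distinct points written per day at a fixed sampling interval."""
    samples_per_series = SECONDS_PER_DAY // interval_seconds  # 8,640 at 10 s
    return servers * measurements * samples_per_series


total = points_per_day(SERVERS, MEASUREMENTS_PER_SERVER, SAMPLE_INTERVAL_SECONDS)
print(f"{total:,}")  # 17,280,000,000
```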

Compression

Aging out data

Downsampling

Fast range queries


Two Databases…

TSDB

Inverted Index

preliminary intro materials…


Everything is indexed by time and series

Shards

10/10/2015 | 10/11/2015 | 10/12/2015 | 10/13/2015

Data is organized into shards of time. Each shard is an underlying DB, which makes it efficient to drop old data.
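A minimal sketch of why time-based shards make retention cheap: dropping old data removes whole shards rather than individual points. The one-day shard duration mirrors the dates on the slide; the `ShardedStore` class and its methods are hypothetical and not InfluxDB's API.

```python
from datetime import datetime, timedelta, timezone

SHARD_DURATION = timedelta(days=1)  # assumption: one shard per day, as on the slide


def shard_start(ts):
    """Return the start of the daily shard that contains ts."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


class ShardedStore:
    """Hypothetical store: each shard is an independent container of points."""

    def __init__(self):
        self.shards = {}  # shard start -> list of (timestamp, value)

    def write(self, ts, value):
        self.shards.setdefault(shard_start(ts), []).append((ts, value))

    def drop_older_than(self, cutoff):
        """Drop every shard that ends at or before cutoff; no per-point deletes."""
        expired = [s for s in self.shards if s + SHARD_DURATION <= cutoff]
        for s in expired:
            del self.shards[s]
        return expired


shard_store = ShardedStore()
for day in (10, 11, 12, 13):
    shard_store.write(datetime(2015, 10, day, 12, 0, tzinfo=timezone.utc), 1.0)

dropped = shard_store.drop_older_than(datetime(2015, 10, 12, tzinfo=timezone.utc))
print(len(dropped), "shards dropped,", len(shard_store.shards), "kept")
```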

InfluxDB data

temperature,device=dev1,building=b1 internal=80,external=18 1443782126

Measurement: temperature
Tags (tagset all together): device=dev1,building=b1
Fields: internal=80,external=18
Timestamp: 1443782126

We actually store up to ns scale timestamps, but I couldn’t fit that on the slide.
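The split into measurement, tags, fields, and timestamp can be sketched as a small parser. This is a simplified illustration, not InfluxDB's parser: it assumes no escaped spaces or commas, no quoted string fields, and numeric field values only. The name `parse_point` is hypothetical.

```python
def parse_point(line):
    """Split one point into (measurement, tags, fields, timestamp).

    Simplified: no escaping, no quoted strings, numeric fields only.
    """
    series_part, fields_part, ts_part = line.split(" ")
    measurement, *tag_pairs = series_part.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in fields_part.split(","):
        key, value = pair.split("=", 1)
        fields[key] = float(value)
    return measurement, tags, fields, int(ts_part)


measurement, tags, fields, timestamp = parse_point(
    "temperature,device=dev1,building=b1 internal=80,external=18 1443782126"
)
print(measurement, tags, fields, timestamp)
```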

Each series and field maps to a unique ID

temperature,device=dev1,building=b1#internal -> 1
temperature,device=dev1,building=b1#external -> 2

Data per ID is tuples ordered by time

1 -> (1443782126,80)
2 -> (1443782126,18)
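A rough sketch of that mapping: each "series#field" key gets an integer ID, and each ID owns a time-ordered list of (timestamp, value) tuples. The `SeriesStore` class is hypothetical, and assigning IDs sequentially from 1 is an assumption made to match the slide.

```python
class SeriesStore:
    """Hypothetical in-memory model of the series-key-to-ID mapping."""

    def __init__(self):
        self.ids = {}   # "series#field" key -> integer ID
        self.data = {}  # ID -> list of (timestamp, value), sorted by time

    def id_for(self, series, field):
        key = f"{series}#{field}"
        if key not in self.ids:
            self.ids[key] = len(self.ids) + 1  # IDs start at 1, as on the slide
        return self.ids[key]

    def write(self, series, fields, timestamp):
        for field, value in fields.items():
            points = self.data.setdefault(self.id_for(series, field), [])
            points.append((timestamp, value))
            points.sort()  # tuples sort by timestamp first


series_store = SeriesStore()
series_store.write(
    "temperature,device=dev1,building=b1",
    {"internal": 80, "external": 18},
    1443782126,
)
print(series_store.ids)
print(series_store.data)
```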

Arranging in Key/Value Stores

Key (ID,Time)    Value
1,1443782126     80
1,1443782127     81
2,1443782126     18
2,1443782130     17
2,1443782256     15
3,1443700126     18

The key space is ordered, so new data (for example 1,1443782127 -> 81) sits next to the existing points of the same ID.
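An ordered key space is what makes range scans over one ID cheap. The following toy store keeps (ID, time) keys sorted with the standard `bisect` module; `OrderedKV` is a hypothetical name and stands in for whatever ordered key/value engine is used underneath.

```python
import bisect


class OrderedKV:
    """Toy ordered key/value store with keys of the form (series_id, timestamp)."""

    def __init__(self):
        self.keys = []    # sorted list of (id, time)
        self.values = {}  # (id, time) -> value

    def put(self, series_id, timestamp, value):
        key = (series_id, timestamp)
        if key not in self.values:
            bisect.insort(self.keys, key)  # keep the key space ordered
        self.values[key] = value

    def range(self, series_id, start, end):
        """Return [(time, value)] for one ID with start <= time < end."""
        lo = bisect.bisect_left(self.keys, (series_id, start))
        hi = bisect.bisect_left(self.keys, (series_id, end))
        return [(t, self.values[(i, t)]) for i, t in self.keys[lo:hi]]


kv = OrderedKV()
for sid, ts, val in [
    (1, 1443782126, 80), (2, 1443782126, 18), (1, 1443782127, 81),
    (2, 1443782256, 15), (2, 1443782130, 17), (3, 1443700126, 18),
]:
    kv.put(sid, ts, val)

print(kv.keys)
print(kv.range(2, 1443782126, 1443782200))
```

Because all points for an ID are contiguous, a time-range query is two binary searches followed by a sequential read.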


Many existing storage engines have this model


New Storage Engine?!

First we used LSM Trees

• deletes expensive
• too many open file handles

Then mmap COW B+Trees

• write throughput
• compression


met our requirements

• High write throughput
• Awesome read performance
• Better compression
• Writes can’t block reads
• Reads can’t block writes
• Write multiple ranges simultaneously
• Hot backups
• Many databases open in a single process

Enter InfluxDB’s Time Structured Merge Tree (TSM Tree)

like LSM, but different

Components

Similar to LSM Trees:

• WAL: same
• In memory cache: like MemTables
• Index files: like SSTables

awesome time series data

• WAL (an append only file)
• in memory index
• on disk index (periodic flushes), memory mapped!
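The write path named on this slide (append to the WAL, update the in-memory structure, flush to disk periodically) can be sketched as follows. This is a toy under loud assumptions: `TinyEngine` is hypothetical, it flushes after a fixed number of points rather than on a timer or size limit, and it writes JSON files instead of memory-mapped TSM files.

```python
import json
import os
import tempfile


class TinyEngine:
    """Toy write path: WAL append, in-memory cache, flush to a file on disk."""

    def __init__(self, directory, flush_threshold=3):
        self.directory = directory
        self.wal_path = os.path.join(directory, "wal.log")
        self.cache = {}  # key -> list of (timestamp, value)
        self.cached_points = 0
        self.flush_threshold = flush_threshold  # assumption: flush by point count
        self.flushed_files = []

    def write(self, key, timestamp, value):
        # 1. Append to the WAL so the write survives a crash.
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps([key, timestamp, value]) + "\n")
        # 2. Make the point queryable from memory.
        self.cache.setdefault(key, []).append((timestamp, value))
        self.cached_points += 1
        # 3. Flush once enough points have accumulated.
        if self.cached_points >= self.flush_threshold:
            self.flush()

    def flush(self):
        path = os.path.join(self.directory, f"{len(self.flushed_files):06d}.json")
        snapshot = {k: sorted(v) for k, v in sorted(self.cache.items())}
        with open(path, "w") as f:
            json.dump(snapshot, f)
        self.flushed_files.append(path)
        self.cache.clear()
        self.cached_points = 0
        open(self.wal_path, "w").close()  # WAL entries are now redundant


with tempfile.TemporaryDirectory() as directory:
    engine = TinyEngine(directory)
    engine.write("cpu#idle", 1443782126, 80)
    engine.write("cpu#idle", 1443782127, 81)
    engine.write("mem#used", 1443782126, 18)  # third point triggers a flush
    print(len(engine.flushed_files), "file(s) flushed, cache:", engine.cache)
```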

TSM File

Compression

Timestamps: encoding based on precision and deltas

• Best case: run length encoding (deltas are all the same for a block)
• Good case: Simple8B (Anh and Moffat in "Index compression using 64-bit words")
• Worst case: raw values (nanosecond timestamps with large deltas)
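The three timestamp cases can be sketched as a small decision function. This is an illustration of the idea rather than InfluxDB's encoder: the function name, the power-of-ten precision scaling capped at 10^12, and the returned payload shapes are assumptions, and the Simple8B bit packing itself is not implemented (the function only reports that the scaled deltas fit in its 60 data bits).

```python
MAX_SCALE = 10**12  # assumption: largest power of ten divided out of the deltas


def choose_timestamp_encoding(timestamps):
    """Pick an encoding for a block of sorted integer timestamps.

    Returns (name, payload). For "simple8b" the payload holds the scaled
    deltas that would be packed; the packing is not implemented here.
    """
    if len(timestamps) < 2:
        return "raw", list(timestamps)
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]

    # Precision: divide out the largest power of ten shared by all deltas.
    scale = 1
    while scale < MAX_SCALE and all(d % (scale * 10) == 0 for d in deltas):
        scale *= 10
    scaled = [d // scale for d in deltas]

    if all(d == scaled[0] for d in scaled):
        # Best case: first timestamp, one delta, and a repeat count.
        return "rle", (timestamps[0], deltas[0], len(deltas))
    if max(scaled) < (1 << 60):
        # Good case: small values can share a 64-bit word.
        return "simple8b", (timestamps[0], scale, scaled)
    # Worst case: store the values as they are.
    return "raw", list(timestamps)


print(choose_timestamp_encoding([0, 10, 20, 30]))
print(choose_timestamp_encoding([0, 1_000_000_000, 3_000_000_000]))
```

A series sampled every 10 seconds collapses to three numbers per block, which is why regular time series compress so well.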

float64: double delta

Facebook’s Gorilla (google: gorilla time series facebook)

https://github.com/dgryski/go-tsz
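The value compression described in the Gorilla paper XORs each float with its predecessor, so repeated or slowly changing values produce words that are mostly zero bits. The sketch below shows only that XOR step and the leading/trailing zero counts it exposes; the compact bit-level encoding of those counts is not implemented, and the function names are illustrative.

```python
import struct


def float_bits(x):
    """Return the IEEE 754 bit pattern of a float64 as an unsigned 64-bit int."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """XOR each value's bits with the previous value's bits.

    The first element is the raw bit pattern of the first value.
    """
    out = []
    prev = 0
    for v in values:
        bits = float_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out


def leading_trailing_zeros(word):
    """Count leading and trailing zero bits of a 64-bit word."""
    if word == 0:
        return 64, 64
    leading = 64 - word.bit_length()
    trailing = (word & -word).bit_length() - 1
    return leading, trailing


xored = xor_stream([80.0, 80.0, 81.0])
for word in xored:
    print(f"{word:016x}", leading_trailing_zeros(word))
```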


booleans are bits!
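Storing one bit per boolean means eight values fit in a byte. A minimal sketch of such packing follows; the bit order (most significant bit first) and the function names are assumptions made for illustration.

```python
def pack_bools(values):
    """Pack booleans into bytes, 8 per byte, most significant bit first."""
    packed = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


def unpack_bools(packed, count):
    """Recover the first `count` booleans from packed bytes."""
    return [bool(packed[i // 8] & (0x80 >> (i % 8))) for i in range(count)]


flags = [True, False, True, True, False, False, False, False, True]
packed_flags = pack_bools(flags)
print(packed_flags.hex(), unpack_bools(packed_flags, len(flags)) == flags)
```

The count has to be stored alongside the bytes, because the last byte may be only partly used.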

int64 uses double delta, zig-zag

zig-zag same as from Protobufs
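Zig-zag encoding maps signed integers to unsigned ones so that values near zero, positive or negative, stay small: 0, -1, 1, -2 become 0, 1, 2, 3. The sketch below pairs it with a single delta step for brevity, which is a simplification of the double delta named on the slide; the function names are illustrative.

```python
def zigzag_encode(n):
    """Map a signed 64-bit integer to an unsigned one (Protobuf-style zig-zag)."""
    return (n << 1) ^ (n >> 63)


def zigzag_decode(z):
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def encode_ints(values):
    """Delta-encode, then zig-zag each difference (first delta is from 0)."""
    out = []
    prev = 0
    for v in values:
        out.append(zigzag_encode(v - prev))
        prev = v
    return out


def decode_ints(encoded):
    """Undo encode_ints."""
    out = []
    prev = 0
    for z in encoded:
        prev += zigzag_decode(z)
        out.append(prev)
    return out


readings = [80, 81, 79, 79]
encoded_readings = encode_ints(readings)
print(encoded_readings, decode_ints(encoded_readings) == readings)
```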

string uses Snappy, the same compression LevelDB uses

(might add dictionary compression)

Updates: write, resolve at query

Deletes: tombstone, resolve at query & compaction


Compactions

• Combine multiple TSM files

• Put all series points into same file

• Series points in 1k blocks

• Multiple levels

• Full compaction when cold for writes
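A compaction pass can be sketched as a merge in which newer files win on duplicate timestamps and tombstoned ranges are dropped. This is a toy model under stated assumptions: files are plain dictionaries rather than TSM files, `compact` and `to_blocks` are hypothetical names, tombstones are inclusive (key, start, end) ranges, and "1k blocks" is read as 1,000 points per block.

```python
def compact(files, tombstones=()):
    """Merge TSM-like files (oldest first) into one.

    files: list of dicts mapping series key -> list of (timestamp, value).
    tombstones: iterable of (series key, start, end), inclusive time ranges.
    """
    merged = {}
    for f in files:  # later files overwrite earlier ones (updates)
        for key, points in f.items():
            merged.setdefault(key, {}).update(dict(points))
    for key, start, end in tombstones:  # deletes are resolved here
        series = merged.get(key, {})
        for t in [t for t in series if start <= t <= end]:
            del series[t]
    # All points of a series end up together, sorted by time.
    return {k: sorted(v.items()) for k, v in sorted(merged.items()) if v}


def to_blocks(points, size=1000):
    """Split a series' points into blocks of at most `size` points."""
    return [points[i:i + size] for i in range(0, len(points), size)]


older = {"cpu#idle": [(1, 10.0), (2, 11.0)]}
newer = {"cpu#idle": [(2, 99.0), (3, 12.0)], "mem#used": [(1, 5.0)]}
compacted = compact([older, newer], tombstones=[("mem#used", 0, 10)])
print(compacted)
```

Until compaction runs, a query has to do the same resolution on the fly, which is what "resolve at query" refers to.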

Example Query

select percentile(90, value) from cpu
where time > now() - 12h and "region" = 'west'
group by time(10m), host

How to map to series?

Inverted Index!

series to ID:
cpu,host=A,region=west#idle -> 1
cpu,host=B,region=west#idle -> 2

measurement to fields:
cpu -> [idle]

host to values:
host -> [A, B]

region to values:
region -> [west]

postings lists:
cpu -> [1, 2]
host=A -> [1]
host=B -> [2]
region=west -> [1, 2]
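Resolving a query such as `from cpu where "region" = 'west'` then amounts to intersecting postings lists. The sketch below builds the series-to-ID map and the postings lists and performs that intersection; the `InvertedIndex` class and its methods are hypothetical and far simpler than the real index.

```python
class InvertedIndex:
    """Toy inverted index: measurement and tag terms map to series IDs."""

    def __init__(self):
        self.series_ids = {}  # series key -> ID
        self.postings = {}    # "cpu" or "host=A" -> list of series IDs

    def add(self, measurement, tags, field):
        tagset = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
        key = f"{measurement},{tagset}#{field}"
        if key in self.series_ids:
            return self.series_ids[key]
        sid = len(self.series_ids) + 1
        self.series_ids[key] = sid
        for term in [measurement] + [f"{k}={v}" for k, v in tags.items()]:
            self.postings.setdefault(term, []).append(sid)
        return sid

    def lookup(self, measurement, **tag_filters):
        """Intersect the postings lists of the measurement and each tag filter."""
        result = set(self.postings.get(measurement, []))
        for k, v in tag_filters.items():
            result &= set(self.postings.get(f"{k}={v}", []))
        return sorted(result)


index = InvertedIndex()
index.add("cpu", {"host": "A", "region": "west"}, "idle")
index.add("cpu", {"host": "B", "region": "west"}, "idle")
print(index.postings)
print(index.lookup("cpu", region="west"))
```

The resulting IDs are what the storage engine then reads for the requested time range.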


Index V1

• In-memory

• Load on boot

• Memory constrained

• Slower boot times with high cardinality

Index V2

time series meta data

• in memory index, on disk index (do we already have? nope)
• WAL (an append only file)
• on disk indices (periodic flushes)
• on disk index (compactions)

Index File Layout

Example Key Exists Lookup

[ 76, 234, 129, 352 ] File locations

cpu,host=serverA,region=west#idle


Robin Hood Hashing

• Can fully load table

• No linked lists for lookup

• Perfect for read-only hashes

Page 107: Inside the InfluxDB Storage Engine · © 2017 InfluxData. All rights reserved. 1 Inside the InfluxDB Storage Engine Gianluca Arbezzano gianluca@influxdb.com @gianarb

© 2017 InfluxData. All rights reserved. 107

[ , , , , ]

[ 0, 1, 2, 3, 4 ] Positions

[ 0, 0, 0, 0, 0 ]

Keys

Probe Lengths

Page 108: Inside the InfluxDB Storage Engine · © 2017 InfluxData. All rights reserved. 1 Inside the InfluxDB Storage Engine Gianluca Arbezzano gianluca@influxdb.com @gianarb

© 2017 InfluxData. All rights reserved. 108

[ , , , , ]

[ 0, 1, 2, 3, 4 ] Positions

[ 0, 0, 0, 0, 0 ]

Keys

Probe Lengths

A -> 0

Page 109: Inside the InfluxDB Storage Engine · © 2017 InfluxData. All rights reserved. 1 Inside the InfluxDB Storage Engine Gianluca Arbezzano gianluca@influxdb.com @gianarb

© 2017 InfluxData. All rights reserved. 109

[ A, , , , ]

[ 0, 1, 2, 3, 4 ] Positions

[ 0, 0, 0, 0, 0 ]

Keys

Probe Lengths

A -> 0

Page 110: Inside the InfluxDB Storage Engine · © 2017 InfluxData. All rights reserved. 1 Inside the InfluxDB Storage Engine Gianluca Arbezzano gianluca@influxdb.com @gianarb

© 2017 InfluxData. All rights reserved. 110

[ A, , , , ]

[ 0, 1, 2, 3, 4 ] Positions

[ 0, 0, 0, 0, 0 ]

Keys

Probe Lengths

B -> 1

Page 111: Inside the InfluxDB Storage Engine · © 2017 InfluxData. All rights reserved. 1 Inside the InfluxDB Storage Engine Gianluca Arbezzano gianluca@influxdb.com @gianarb

© 2017 InfluxData. All rights reserved. 111

[ A, B, , , ]

[ 0, 1, 2, 3, 4 ] Positions

[ 0, 0, 0, 0, 0 ]

Keys

Probe Lengths

B -> 1

Page 112: Inside the InfluxDB Storage Engine · © 2017 InfluxData. All rights reserved. 1 Inside the InfluxDB Storage Engine Gianluca Arbezzano gianluca@influxdb.com @gianarb

© 2017 InfluxData. All rights reserved. 112

[ A, B, , , ]

[ 0, 1, 2, 3, 4 ] Positions

[ 0, 0, 0, 0, 0 ]

Keys

Probe Lengths

C -> 1

Page 113

Keys: [ A, B, C, , ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 0, 0, 0, 0 ]

C -> 2

Page 114

Keys: [ A, B, C, , ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 0, 1, 0, 0 ]

C -> probe 1

Page 115

Keys: [ A, B, C, , ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 0, 1, 0, 0 ]

D -> 0

Page 116

Keys: [ A, B, C, , ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 0, 1, 0, 0 ]

D -> probe 1

Page 117

Keys: [ A, D, C, , ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 0, 0 ]

B -> probe 1

Page 118

Keys: [ A, D, C, , ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 0, 0 ]

B -> probe 2

Page 119

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

B -> probe 2

Page 120

Rob probe rich, give to probe poor
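
The insertion walk-through on the preceding slides can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not InfluxDB's implementation: `rh_insert` and the `HOME` table (which hard-codes the slot each key hashes to, matching the slides) are hypothetical names, keys are assumed to be distinct, and the table never resizes.

```python
def rh_insert(keys, probes, key, home):
    """Insert `key`, whose hash maps to slot `home`, with Robin Hood probing."""
    n = len(keys)
    pos, dist = home % n, 0
    for _ in range(n):
        if keys[pos] is None:
            # Free slot: store the key and how far it sits from its home slot.
            keys[pos], probes[pos] = key, dist
            return
        if probes[pos] < dist:
            # The resident is "probe rich" (closer to home than we are):
            # take its slot and carry the evicted key onward instead.
            keys[pos], key = key, keys[pos]
            probes[pos], dist = dist, probes[pos]
        pos = (pos + 1) % n
        dist += 1
    raise RuntimeError("table is full")


# Home slots taken from the slides: A -> 0, B -> 1, C -> 1, D -> 0.
HOME = {"A": 0, "B": 1, "C": 1, "D": 0}
keys = [None] * 5
probes = [0] * 5
for k in "ABCD":
    rh_insert(keys, probes, k, HOME[k])

print(keys)
print(probes)
```

Inserting D evicts B from position 1 because B sat at its home slot (probe length 0) while D had already probed once; B then continues probing until it finds the free slot at position 3.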

Page 121

Refinement: average probe

Page 122

Cache Hit

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

Average: 1

Page 123

Cache Hit

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

Average: 1

D -> hashes to 0 + 1
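
One way to read the average-probe refinement is that a lookup starts at the home slot plus the average probe length and then fans outward. The sketch below assumes that interpretation; `probe_order` and `lookup_from_average` are hypothetical names, and a real table would track the average and maximum incrementally rather than recomputing them on every lookup.

```python
def probe_order(avg, max_probe):
    """Offsets from the home slot, starting at `avg` and alternating outward."""
    order = [avg]
    for step in range(1, max_probe + 1):
        for off in (avg - step, avg + step):
            if 0 <= off <= max_probe:
                order.append(off)
    return order


def lookup_from_average(keys, probes, key, home):
    """Return (slot, slots_checked); slot is None when the key is absent."""
    n = len(keys)
    stored = [p for k, p in zip(keys, probes) if k is not None]
    if not stored:
        return None, 0
    avg = round(sum(stored) / len(stored))
    checked = 0
    for off in probe_order(avg, max(stored)):
        checked += 1
        pos = (home + off) % n
        if keys[pos] == key:
            return pos, checked
    return None, checked


# Table state from the slide above.
keys = ["A", "D", "C", "B", None]
probes = [0, 1, 1, 2, 0]

print(lookup_from_average(keys, probes, "D", 0))
```

With an average probe length of 1, D (home slot 0) is found on the first slot checked, whereas a scan starting at the home slot would have checked A first.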

Page 124

Cache Miss

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

Z -> hashes to 0

Page 125

Cache Miss

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

Z -> move probe 1

Page 126

Cache Miss

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

Z -> move probe 2

Page 127

Cache Miss

Keys: [ A, D, C, B, ]

Positions: [ 0, 1, 2, 3, 4 ]

Probe Lengths: [ 0, 1, 1, 2, 0 ]

Max Probe 2, so Z not present
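
The miss case can be sketched as a lookup bounded by the longest stored probe length: once a key has been probed that many times without a match, it cannot be in the table. The function name `rh_lookup` is hypothetical, and the sketch assumes a table without tombstones, so an empty slot also ends the search.

```python
def rh_lookup(keys, probes, key, home):
    """Return the slot holding `key`, or None if it is not in the table."""
    n = len(keys)
    # Empty slots record 0, so this is the longest probe of any stored key.
    max_probe = max(probes)
    for dist in range(max_probe + 1):
        pos = (home + dist) % n
        if keys[pos] is None:
            return None  # a free slot means the key was never pushed further
        if keys[pos] == key:
            return pos
    return None  # probed past the maximum probe length: key is absent


# Table state from the slides above.
keys = ["A", "D", "C", "B", None]
probes = [0, 1, 1, 2, 0]

print(rh_lookup(keys, probes, "Z", 0))
print(rh_lookup(keys, probes, "D", 0))
```

For Z (home slot 0) the loop checks positions 0, 1 and 2 and then stops, because no stored key has a probe length above 2.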

Page 128

Cardinality Estimation

Page 129

HyperLogLog++
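
To give a feel for the idea, here is a minimal sketch of plain HyperLogLog, not the HyperLogLog++ variant named on the slide (which adds a sparse representation and bias correction) and not InfluxDB's code. The class name, the use of SHA-1 as the hash, and the default of 2^12 registers are assumptions chosen for illustration.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p=12):
        self.p = p                # number of hash bits used to pick a register
        self.m = 1 << p           # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Take 64 bits of a SHA-1 digest as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        width = 64 - self.p
        idx = h >> width                      # top p bits select the register
        rest = h & ((1 << width) - 1)         # remaining bits
        rank = width - rest.bit_length() + 1  # position of the leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            est = self.m * math.log(self.m / zeros)
        return est


hll = HyperLogLog()
for i in range(10000):
    hll.add(f"series-{i}")
print(round(hll.count()))
```

The estimate uses a fixed amount of memory (one small counter per register) regardless of how many distinct items are added, and adding the same item twice never changes a register.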

Page 130

Gianluca Arbezzano gianluca@influxdb.com

@gianarb

Thank you.

