Dennis Kafura – CS5204 – Operating Systems
Bigtable: A Distributed Storage System for Structured Data
Sergejs Melderis
BigTable
Unstructured Data vs. Structured Data
Unstructured data is computerized information that does not have a data model
  examples: plain text, audio
Structured data can be described by a data model
  Flat
  Hierarchical
  Network
  Relational
  Dimensional
  Object-relational
Relational Model and RDBMS
most popular model for organizing structured data
model based on first-order predicate logic
provides a declarative method for specifying data and queries via SQL
data is organized in tables of fixed-length records
variety of open source and commercial implementations
provides ACID properties
NoSQL
not a relational database
no fixed table schemas
no join operations
no SQL
flexible data model, or no data model at all
usually do not provide ACID properties
scale horizontally
BigTable
distributed, high-performance, fault-tolerant NoSQL storage system built on top of the Google File System (GFS)
designed to scale to a very large size on low-cost commodity hardware
designed by Google and used in various projects (e.g. web indexing)
the paper was published in 2006
related implementations
  HBase
  Hypertable
  Apache Cassandra
  Neptune
BigTable Data Model
sparse, distributed, persistent multi-dimensional sorted map
map is indexed by a row key, column family, column key, and a timestamp
{ row : {
    column_family : {
      column : { timestamp : value }
    }
  }
}
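The nested map can be sketched in Python as dictionaries keyed by row, column family, column, and timestamp. This only illustrates the indexing scheme, not how Bigtable stores data. The sample values come from the Webtable example; the integer timestamps, the empty qualifier for "contents", and the helper name `latest` are assumptions made for the sketch.

```python
# Illustrative only: a cell is addressed by
# (row key, column family, column qualifier, timestamp).
table = {
    "com.cnn.www": {
        "anchor": {
            "cnnsi.com": {9: "CNN"},
        },
        "contents": {
            "": {6: "<html>..."},  # assumed: family used with an empty qualifier
        },
    },
}


def latest(table, row, family, column):
    """Return the value with the highest timestamp, or None if the cell is absent."""
    versions = table.get(row, {}).get(family, {}).get(column, {})
    if not versions:
        return None
    return versions[max(versions)]
```

Missing rows, families, or columns simply yield no value, which is what "sparse" means here: absent cells take no space.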
Webtable
Row key: "com.cnn.www"
  Column                Timestamp   Value
  "contents"            t6          "<html>..."
  "anchor:cnnsi.com"    t9          "CNN"
  "anchor:my.look.ca"   t9          "CNN.com"
Relational Data Model
Student
  student_id (PK)
  first_name
  last_name
  birthday
  major
  academic_level
Course
  crn (PK)
  course
  title
  type
  instructor_id
  seats
StudentCourse
  student_id
  crn
Student table
Row Key: student_id
Column Family: info
  Column Qualifiers: last_name, first_name, birthday, major, academic_level
Column Family: courses
  Column Qualifier: <crn>
Course table
Row Key: crn
Column Family: info
  Column Qualifiers: course, title, type, instructor_id, seats
Column Family: students
  Column Qualifier: <student_id>
Example
Student row, row key "905514"
  info:first_name      "Sergejs"
  info:last_name       "Melderis"
  info:major           "Computer Science"
  courses:96322        "YES"
  courses:96320        "NO"
Course row, row key "96322"
  info:course          "CS5204"
  info:title           "Operating Systems"
  info:instructor_id   "1983943"
  students:905514      "YES"
  students:905520      "YES"
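Because each student row carries its course numbers as column qualifiers, "which courses is this student taking?" needs no join and no StudentCourse table. A minimal Python sketch using the values from the example, with timestamps omitted. Reading "YES" as "enrolled" is an assumption about what the slide's values mean, and the function name is hypothetical.

```python
# Rows from the example; timestamps omitted for brevity.
students = {
    "905514": {
        "info": {"first_name": "Sergejs", "last_name": "Melderis",
                 "major": "Computer Science"},
        "courses": {"96322": "YES", "96320": "NO"},
    },
}
courses = {
    "96322": {
        "info": {"course": "CS5204", "title": "Operating Systems",
                 "instructor_id": "1983943"},
        "students": {"905514": "YES", "905520": "YES"},
    },
}


def enrolled_course_titles(student_id):
    """Titles of courses whose cell in the 'courses' family is "YES" (assumed: enrolled)."""
    titles = []
    for crn, flag in sorted(students[student_id]["courses"].items()):
        if flag == "YES" and crn in courses:
            titles.append(courses[crn]["info"]["title"])
    return titles
```

The relationship is stored twice (once per direction), which trades extra writes for join-free reads.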
Students data view in JSON
{ "905514": {
    "info": {
      "first_name": { "t1": "Sergejs" }, "last_name": { "t1": "Melderis" },
      "major": { "t1": "Computer Science" }
    },
    "courses": { "96322": { "t1": "YES" }, "96320": { "t2": "NO" } }
  }
}
Rows
row keys are arbitrary strings up to 64 KB
reads and writes of data under a single row key are atomic
rows are kept in lexicographic order by row key
the row range is dynamically partitioned into blocks called tablets
tablets are the units of distribution and load balancing
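Since rows are kept sorted and each tablet holds a contiguous key range, finding the tablet for a row amounts to a binary search over tablet boundaries. The sketch below assumes each tablet is identified by its last (inclusive) row key; the boundary keys and tablet names are made up, and Python compares strings by code point rather than by raw bytes.

```python
import bisect

# Hypothetical tablets: tablet i covers row keys up to and including
# tablet_end_keys[i]; the last tablet has no upper bound.
tablet_end_keys = ["com.cnn.www", "com.google.www", "org.apache.www"]
tablet_names = ["tablet-0", "tablet-1", "tablet-2", "tablet-3"]


def tablet_for(row_key):
    """Return the name of the tablet whose row range contains row_key."""
    index = bisect.bisect_left(tablet_end_keys, row_key)
    return tablet_names[index]
```

Rows with nearby keys land in the same tablet, so short range scans usually touch only one or two servers.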
Columns
Column keys are grouped into column families
A column family is the basic unit of access control
All data stored in a column family is usually of the same type
The number of column families should be small
There can be an unlimited number of columns
A column key is named using family:qualifier
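A family:qualifier column key can be split at the first colon, and an access check then only needs the family part. The sketch below is an illustration, not Bigtable's mechanism; the permission table and user names are hypothetical.

```python
def split_column_key(column_key):
    """Split 'family:qualifier' into (family, qualifier); the qualifier may be empty."""
    family, _, qualifier = column_key.partition(":")
    return family, qualifier


# Hypothetical read permissions, granted per column family.
readers = {"info": {"registrar", "advisor"}, "courses": {"registrar"}}


def can_read(user, column_key):
    """Access is decided by the family alone, never by the qualifier."""
    family, _ = split_column_key(column_key)
    return user in readers.get(family, set())
```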
Timestamps
Bigtable can contain multiple versions of the same data
timestamps are 64-bit integers assigned by Bigtable or by the client
the client can specify that only the last n versions of the data are kept
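The "keep up to n versions" rule can be sketched as pruning the oldest timestamps after each write. This is a simplification: the sketch prunes eagerly on every write, which is an assumption made for clarity, and the function name is hypothetical.

```python
def put_version(versions, timestamp, value, max_versions):
    """Store value under timestamp, then keep only the newest max_versions entries."""
    versions[timestamp] = value
    for old in sorted(versions, reverse=True)[max_versions:]:
        del versions[old]
    return versions


cell = {}
for ts, value in [(1, "a"), (2, "b"), (3, "c"), (4, "d")]:
    put_version(cell, ts, value, max_versions=3)
# cell now holds timestamps 2, 3 and 4; the version written at 1 was dropped.
```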
Implementation
client library
one master server
distributed lock service called Chubby
many tablet servers, each containing several tablets
tablet server
  handles read and write requests
  automatically splits tablets that have grown too large (100-200 MB)
client data goes directly to the tablet server
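The split rule can be sketched as: once a tablet passes a size threshold, cut its sorted row range in two. The 200 MB threshold is the upper end of the range given on the slide; splitting at the median row key and tracking sizes per row are assumptions made for illustration.

```python
SPLIT_THRESHOLD_BYTES = 200 * 1024 * 1024  # upper end of the 100-200 MB range


def maybe_split(rows, threshold=SPLIT_THRESHOLD_BYTES):
    """rows maps row key -> size in bytes. Return a list of one or two tablets."""
    if sum(rows.values()) <= threshold or len(rows) < 2:
        return [rows]
    keys = sorted(rows)
    middle = len(keys) // 2
    left = {key: rows[key] for key in keys[:middle]}
    right = {key: rows[key] for key in keys[middle:]}
    return [left, right]
```

Both halves still cover contiguous row ranges, so the sorted-map property is preserved.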
Tablet Location
three-level hierarchy to store tablet locations
the first level is stored in the lock service (Chubby)
the root tablet contains the locations of the METADATA tablets
METADATA tablets contain the locations of user tablets
Lock Service -> Root tablet -> METADATA tablets -> UserTable1, UserTable2
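Resolving a row to a tablet server walks the three levels in order. The sketch below models each level as a dictionary or a sorted list of ranges. All server names, tablet names, and end keys are hypothetical, the "\xff" end key stands in for "no upper bound", and caching of lookups is left out.

```python
import bisect

# Level 1: the lock service records where the root tablet lives.
lock_service = {"root_tablet_location": "server-a"}

# Level 2: the root tablet maps key ranges to METADATA tablets.
# Level 3: each METADATA tablet maps (table, end row key) to a tablet server.
# Each map is a sorted list of (end_key, target) pairs.
root_tablet = [(("UserTable1", "\xff"), "meta-1"), (("UserTable2", "\xff"), "meta-2")]
metadata_tablets = {
    "meta-1": [(("UserTable1", "m"), "server-b"), (("UserTable1", "\xff"), "server-c")],
    "meta-2": [(("UserTable2", "\xff"), "server-d")],
}


def find(sorted_ranges, key):
    """Return the target of the first range whose end key is >= key."""
    end_keys = [end for end, _ in sorted_ranges]
    return sorted_ranges[bisect.bisect_left(end_keys, key)][1]


def locate(table, row_key):
    """Return the (hypothetical) server holding the tablet for this row."""
    _root_server = lock_service["root_tablet_location"]      # step 1: lock service
    meta_name = find(root_tablet, (table, row_key))           # step 2: root tablet
    return find(metadata_tablets[meta_name], (table, row_key))  # step 3: METADATA
```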
Distribution of data
One master server
Chubby distributed lock service
Hundreds or thousands of tablet servers
Each tablet contains a contiguous range of rows
The master distributes tablets across the servers
Each tablet server contains tablets with different ranges
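One simple way to picture the master's job is a least-loaded assignment: each tablet goes to whichever server currently holds the fewest. This policy is an assumption chosen for illustration; the slides do not say how the master decides, and the names below are hypothetical.

```python
def assign_tablets(tablets, servers):
    """Assign each tablet to the server currently holding the fewest tablets."""
    assignment = {server: [] for server in servers}
    for tablet in tablets:
        # Ties go to the server listed first.
        target = min(servers, key=lambda server: len(assignment[server]))
        assignment[target].append(tablet)
    return assignment
```

With five tablets and two servers, consecutive row ranges end up on alternating servers, so one server holds tablets with several different ranges.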
Tablet Representation
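[Figure: tablet representation. In memory: the memtable. In GFS: the tablet log and the SSTable files. A write op goes to the tablet log and then to the memtable; a read op sees a merged view of the memtable and the SSTables.]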
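The read and write paths in the figure can be sketched with plain Python containers. This is a toy model: the log and SSTables are ordinary in-memory objects here rather than GFS files, and the class and method names are hypothetical.

```python
class Tablet:
    def __init__(self):
        self.log = []        # commit log (stored in GFS in Bigtable)
        self.memtable = {}   # recent writes, held in memory
        self.sstables = []   # immutable maps, oldest first (GFS files in Bigtable)

    def write(self, key, value):
        self.log.append((key, value))   # log first, so the write can be replayed
        self.memtable[key] = value

    def read(self, key):
        if key in self.memtable:        # newest data wins
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # then newest SSTable first
            if key in sstable:
                return sstable[key]
        return None
```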
Compactions
compaction is the process of writing the memtable out to an SSTable and of merging existing SSTables
minor compaction
  writes the memtable to a new SSTable
  shrinks the memory usage of the tablet server
  reduces the amount of commit log to replay during recovery
merging compaction merges several SSTables and the memtable into one new SSTable
major compaction rewrites all SSTables into exactly one SSTable
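The compaction kinds can be sketched over dictionaries standing in for the memtable and SSTables. Deletion markers and on-disk formats are left out, and the function names are hypothetical.

```python
def minor_compaction(memtable, sstables):
    """Freeze the memtable into a new SSTable and return an empty memtable."""
    if memtable:
        sstables.append(dict(sorted(memtable.items())))
    return {}, sstables


def merge_sstables(sstables):
    """Merge SSTables (oldest first) into one; newer values overwrite older ones."""
    merged = {}
    for sstable in sstables:
        merged.update(sstable)
    return dict(sorted(merged.items()))


def major_compaction(sstables):
    """Rewrite all SSTables into exactly one."""
    return [merge_sstables(sstables)]
```

A merging compaction would call `merge_sstables` on just a few of the SSTables and leave the rest untouched.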
API
create and delete tables and column families
write or delete values
look up values from individual rows
scan over a subset of the data in a table
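The four kinds of operation can be pictured with a small in-memory stand-in. The class and method names are hypothetical and are not Bigtable's client API; timestamps are omitted, and the sketch only shows the shape of each operation.

```python
class ToyTable:
    """In-memory stand-in for one table; not Bigtable's actual API."""

    def __init__(self, families):
        self.families = set(families)   # column families are fixed at creation
        self.rows = {}

    def write(self, row, column, value):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row, {})[column] = value

    def delete(self, row, column):
        self.rows.get(row, {}).pop(column, None)

    def lookup(self, row):
        return dict(self.rows.get(row, {}))

    def scan(self, start_row, end_row):
        """Yield (row, columns) for start_row <= row < end_row, in row-key order."""
        for row in sorted(self.rows):
            if start_row <= row < end_row:
                yield row, dict(self.rows[row])
```

For example, `ToyTable(["info", "courses"])` accepts a write to `courses:96322` but rejects one to a family that was never created.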