Lecture 23: SSD, Data Integrity and Protection, Distributed Systems

Transcript
Page 1

Lecture 23: SSD, Data Integrity and Protection, Distributed Systems

Page 2

SSD
• Basic flash operations: read, erase, and program
• Bit, page, block, and bank
• Wear out
• Flash translation layer (FTL): control logic to turn client reads and writes into flash operations
• Direct mapped is bad. Why?

Page 3

Garbage Collection
• Garbage example (the figure has a bug)
  • “VVii” should be “VVEE”
• Determine liveness:
  • Within each block, store information about which logical blocks are stored within each page
  • Check the mapping table for the logical block

Page 4

Garbage Collection Steps
• Read live data (pages 2 and 3) from block 0
• Write live data to the end of the log
• Erase block 0 (freeing it for later use)
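
A minimal sketch of these steps in C. Every helper here (is_live, read_page, program_page, remap_logical, and the log-head functions) is a hypothetical name, not real FTL code; the point is only the order of operations:

    // Hypothetical garbage-collection loop (all helper names are assumptions).
    void gc_block(int victim_block) {
        char buf[PAGE_SIZE];
        for (int p = 0; p < PAGES_PER_BLOCK; p++) {
            if (!is_live(victim_block, p))              // dead page: skip it
                continue;
            read_page(victim_block, p, buf);            // step 1: read live data
            program_page(log_head(), buf);              // step 2: append to the log
            remap_logical(victim_block, p, log_head()); // keep the mapping table current
            advance_log_head();
        }
        erase_block(victim_block);                      // step 3: block is free again
    }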

Page 5

Block-Based Mapping to Reduce Mapping Table Size
• Logical address: the least significant two bits are the offset
• Page mapping: 2000→4, 2001→5, 2002→6, 2003→7

Before

After
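
The address split itself is just shift-and-mask arithmetic. A small C sketch, assuming four pages per block as in the example (so two offset bits); block_map is a hypothetical table that, for the mapping above, would send logical chunk 500 to physical block 1:

    // Block-based translation: the 2 low bits are the page offset.
    int translate(int logical_page, const int *block_map) {
        int chunk  = logical_page >> 2;        // 2000 >> 2 = 500
        int offset = logical_page & 0x3;       // 2000 & 3  = 0
        return block_map[chunk] * 4 + offset;  // 1 * 4 + 0 = 4, matching 2000 -> 4
    }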

Page 6

Problem with Block-Based Mapping
• Small write
  • The FTL must read a large amount of live data from the old block and copy it into a new one
• What might be a good solution?
  • Page-based mapping is good at …, but bad at …
  • Block-based mapping is bad at …, but good at …

Page 7

Hybrid Mapping
• Log blocks: a few blocks that are per-page mapped
  • Call the per-page mapping the log table
• Data blocks: blocks that are per-block mapped
  • Call the per-block mapping the data table
• How to read and write?
• How to switch between per-page mapping and per-block mapping?

Page 8

Hybrid Mapping Example
• Overwrite each page

Page 9

Switch Merge
• Before and After

Page 10

Partial Merge
• Before and After

Page 11

Full Merge
• The FTL must pull together pages from many other blocks to perform cleaning
• Imagine that pages 0, 4, 8, and 12 are written to log block A

Page 12

Wear Leveling
• The FTL should try its best to spread the erase/program work evenly across all the blocks of the device
• The log-structuring approach does a good initial job
• What if a block is filled with long-lived data that never gets overwritten?
  • Periodically read all the live data out of such blocks and rewrite it elsewhere

Page 13

SSD Performance
• Fast but expensive
  • An SSD costs 60 cents per GB
  • A typical hard drive costs 5 cents per GB

Page 14

Data Integrity and Protection
• Ensure that the data you put into your storage system is the same when the storage system returns it to you

Page 15

Disk Failure Modes
• Fail-stop: as assumed by RAID
• Fail-partial:
  • Latent-sector errors (LSEs): a disk sector (or group of sectors) has been damaged in some way, e.g., by a head crash or cosmic rays
  • Silent faults, e.g., block corruption caused by buggy firmware or a faulty bus

Page 16

Findings about LSEs
• Costly drives with more than one LSE are as likely to develop additional errors as cheaper drives
• For most drives, the annual error rate increases in year two
• LSEs increase with disk size
• Most disks with LSEs have fewer than 50
• Disks with LSEs are more likely to develop additional LSEs
• There is a significant amount of spatial and temporal locality
• Disk scrubbing is useful (most LSEs were found this way)

Page 17

Findings about Corruption
• The chance of corruption varies greatly across different drive models within the same drive class
• Age effects differ across models
• Workload and disk size have little impact on corruption
• Most disks with corruption have only a few corruptions
• Corruption is not independent within a disk or across disks in RAID
• There is spatial locality, and some temporal locality
• There is a weak correlation with LSEs

Page 18

Latent Sector Errors
• Many causes:
  • head crash
  • cosmic rays can also flip bits
• How to detect:
  • A storage system tries to access a block, and the disk returns an error (via its ECC)
• How to fix:
  • Use whatever redundancy mechanism the system has to return the correct data

Page 19

Detecting Corruption: The Checksum
• Common checksum functions:
  • XOR, addition
  • Fletcher checksum
  • Cyclic redundancy check (CRC)
• Collisions are possible
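
As a concrete example, here is a minimal C sketch of a Fletcher checksum. It keeps two running sums, which makes it sensitive to byte order, something XOR and plain addition miss:

    #include <stddef.h>
    #include <stdint.h>

    // Fletcher checksum: s1 sums the bytes, s2 sums the running s1,
    // both mod 255; swapping two bytes changes s2 even though s1 stays put.
    uint16_t fletcher(const uint8_t *data, size_t len) {
        uint16_t s1 = 0, s2 = 0;
        for (size_t i = 0; i < len; i++) {
            s1 = (s1 + data[i]) % 255;
            s2 = (s2 + s1) % 255;
        }
        return (uint16_t)((s2 << 8) | s1);
    }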

Page 20

Misdirected Writes
• Arise in disk and RAID controllers that write the data to disk correctly, except in the wrong location
• Physical identifier (physical ID): see the sketch below
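
A sketch of how a physical ID pairs with a checksum (the struct and field names are assumptions). On a misdirected write the contents, and therefore the checksum, are fine; what fails is the stored disk/sector identity against where the block was actually read from:

    #include <stddef.h>
    #include <stdint.h>

    // Hypothetical per-block metadata: checksum plus physical identity.
    struct block_info {
        uint16_t checksum;   // detects corrupted contents
        uint32_t disk_id;    // disk this block should live on
        uint64_t sector;     // sector it should live at
    };

    // Verify on read: contents must match AND the block must be where we expect.
    int verify_block(const struct block_info *bi, uint32_t disk,
                     uint64_t sector, const uint8_t *data, size_t len) {
        return bi->checksum == fletcher(data, len)   // fletcher() from the sketch above
            && bi->disk_id  == disk
            && bi->sector   == sector;
    }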

Page 21

Lost Writes
• Occur when the device informs the upper layer that a write has completed, but in fact it is never persisted
• Do any of our strategies from above (e.g., basic checksums, or physical ID) help to detect lost writes?
• Solutions:
  • Perform a write verify, or read-after-write (see the sketch below)
  • Some systems add a checksum elsewhere in the system to detect lost writes
  • ZFS includes a checksum in each file system inode and indirect block for every block included within a file
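
A minimal read-after-write sketch (disk_write/disk_read and SECTOR_SIZE are hypothetical). Every write becomes a write plus a read, which is exactly why many systems prefer the extra-checksum approach instead:

    #include <string.h>

    // Write, read straight back, and compare: catches a write the
    // device acknowledged but never actually persisted.
    int write_verify(int disk, uint64_t sector, const uint8_t *buf) {
        uint8_t check[SECTOR_SIZE];
        disk_write(disk, sector, buf);
        disk_read(disk, sector, check);
        return memcmp(buf, check, SECTOR_SIZE) == 0;   // 1 if the data truly landed
    }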

Page 22

Scrubbing
• When do these checksums actually get checked?
• Many systems utilize disk scrubbing:
  • Periodically read through every block of the system
  • Check whether checksums are still valid
  • Schedule scans on a nightly or weekly basis
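
A scrub pass is essentially that loop spelled out; a sketch with hypothetical helpers, reusing the fletcher() sketch from above:

    // Periodic scrub: recompute each block's checksum and repair from
    // redundancy (e.g., a RAID mirror or parity) on any mismatch.
    void scrub_disk(int disk) {
        uint8_t buf[BLOCK_SIZE];
        for (uint64_t b = 0; b < num_blocks(disk); b++) {
            disk_read(disk, b, buf);
            if (fletcher(buf, BLOCK_SIZE) != stored_checksum(disk, b))
                repair_from_redundancy(disk, b);
        }
    }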

Page 23

Overhead of Checksumming
• Space
  • Small
• Time
  • Noticeable
  • CPU overhead
  • I/O overhead

Page 24

Distributed Systems

Page 25

OSTEP Definition
• Def: more than 1 machine
• Examples:
  • client/server: web server and web client
  • cluster: PageRank computation
• Other courses:
  • Networking
  • Distributed Systems

Page 26

Why Go Distributed?
• More compute power
• More storage capacity
• Fault tolerance
• Data sharing

Page 27

New Challenges
• System failure: need to worry about partial failure
• Communication failure: links are unreliable
• Performance
• Security

Page 28

Communication
• All communication is inherently unreliable.
• Need to worry about:
  • bit errors
  • packet loss
  • node/link failure

Page 29

Overview
• Raw messages
• Reliable messages
• OS abstractions
  • virtual memory
  • global file system
• Programming-language abstractions
  • remote procedure call

Page 30

Raw Messages: UDP
• API:
  • reads and writes over socket file descriptors
  • messages sent from/to ports to target a process on a machine
• Provides minimal reliability features:
  • messages may be lost
  • messages may be reordered
  • messages may be duplicated
  • the only protection is a checksum
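
A minimal C sketch of the sending half of this API, using the standard BSD socket calls (the port number is an arbitrary example). Nothing in it guarantees delivery:

    #include <netinet/in.h>
    #include <stddef.h>
    #include <string.h>
    #include <sys/socket.h>

    // Send one UDP datagram to port 10000 on the local machine. It may be
    // lost, duplicated, or reordered; only a checksum protects the contents.
    int udp_send(int sd, const char *msg, size_t len) {
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family      = AF_INET;
        addr.sin_port        = htons(10000);            // arbitrary example port
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // 127.0.0.1
        return (int)sendto(sd, msg, len, 0,
                           (struct sockaddr *)&addr, sizeof(addr));
    }

The socket descriptor would come from socket(AF_INET, SOCK_DGRAM, 0); the receiving process binds a port and reads with recvfrom().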

Page 31

Raw Messages: UDP
• Advantages
  • lightweight
  • some applications make better reliability decisions themselves (e.g., video conferencing programs)
• Disadvantages
  • more difficult to write applications correctly

Page 32

Reliable Messages Strategy
• Using software, build reliable, logical connections over unreliable connections.
• Strategies:
  • acknowledgment

Page 33

ACK
• Sender knows the message was received.

  Sender              Receiver
  [send message]
                      [recv message]
                      [send ack]
  [recv ack]

Page 34

ACK
• Sender misses the ACK... What to do?

  Sender              Receiver
  [send message]
  ... no ack arrives ...

Page 35

Reliable Messages Strategy
• Using software, build reliable, logical connections over unreliable connections.
• Strategies:
  • acknowledgment
  • timeout

Page 36

ACK

  Sender                    Receiver
  [send message]
  [start timer]
  ... waiting for ack ...
  [timer goes off]
  [send message]
                            [recv message]
                            [send ack]
  [recv ack]

Page 37

Timeout: Issue 1
• How long to wait?
  • Too long: the system feels unresponsive!
  • Too short: messages are needlessly re-sent!
• Messages may have been dropped due to an overloaded server; aggressive clients make this worse.
• One strategy: be adaptive!
  • Adjust the timeout based on how long acks usually take.
  • For each missing ack, wait longer between retries (see the sketch below).
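
A sketch combining both ideas, with hypothetical helpers (udp_send_msg, wait_for_ack, estimated_rtt_ms): seed the timeout from typical ack latency, then back off exponentially on every miss:

    // Adaptive retry: the timeout doubles after each missing ack, so an
    // already-overloaded server is not hammered by aggressive retries.
    int send_with_retry(const char *msg, size_t len) {
        int timeout_ms = 2 * estimated_rtt_ms();   // based on how long acks usually take
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            udp_send_msg(msg, len);
            if (wait_for_ack(timeout_ms))
                return 0;                          // acked in time
            timeout_ms *= 2;                       // wait longer between retries
        }
        return -1;                                 // give up and report failure
    }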

Page 38

Timeout: Issue 2
• What does a lost ack really mean?
  • Maybe the receiver never got the message
  • Maybe the receiver got the message, but the ack was not delivered successfully
• ACK: message received exactly once
• No ACK: message received at most once
• Proposed solution:
  • The sender could send an AckAck so the receiver knows whether to retry sending an Ack
  • Sound good?

Page 39

Reliable Messages Strategy
• Using software, build reliable, logical connections over unreliable connections.
• Strategies:
  • acknowledgment
  • timeout
  • remember sent messages

Page 40

Receiver Remembers Messages

  Sender              Receiver
  [send message]
                      [recv message]
                      [send ack]        (ack lost or delayed)
  [timeout]
  [send message]
                      [ignore message]
                      [send ack]
  [recv ack]

Page 41

Solutions
• Solution 1: remember every message ever sent.
• Solution 2: sequence numbers
  • give each message a seq number
  • receiver knows all messages before N have been seen
  • receiver remembers messages sent after N (see the sketch below)
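
A receiver-side sketch of Solution 2, assuming each message carries a sequence number and, for simplicity, arrives in order; deliver() and send_ack() are hypothetical:

    #include <stddef.h>
    #include <stdint.h>

    static uint64_t highest_seen = 0;   // every seq <= this was already delivered

    // Duplicates get re-acked but not re-delivered, so the application
    // sees each message at most once even under retransmission.
    void on_message(uint64_t seq, const char *msg, size_t len) {
        if (seq <= highest_seen) {
            send_ack(seq);              // duplicate: ack again, drop the payload
            return;
        }
        deliver(msg, len);              // first copy: hand it to the application
        highest_seen = seq;
        send_ack(seq);
    }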

Page 42

TCP
• Most popular protocol based on seq nums.
• Also buffers messages so they arrive in order.
• Timeouts are adaptive.

Page 43

Overview
• Raw messages
• Reliable messages
• OS abstractions
  • virtual memory
  • global file system
• Programming-language abstractions
  • remote procedure call

Page 44

Virtual Memory
• Inspiration: threads share memory
• Idea: processes on different machines share memory
• Strategy:
  • a bit like the swapping we saw before
  • instead of swapping to disk, swap to another machine
  • sometimes multiple copies may be in memory on different machines

Page 45

Virtual Memory Problems
• What if a machine crashes?
  • mapping disappears on other machines
  • how to handle?
• Performance?
  • when to prefetch?
  • loads/stores are expected to be fast
• DSM (distributed shared memory) is not used today.

Page 46

Global File System
• Advantages
  • file access is already expected to be slow
  • use a common API
  • no need to modify applications (sorta true)
• Disadvantages
  • doesn’t always make sense, e.g., for a video app

Page 47

RPC: Remote Procedure Call
• What could be easier than calling a function?
• Strategy: create wrappers so calling a function on another machine feels just like calling a local function.
• This abstraction is very common in industry.

Page 48

RPC

Machine A:

    int main(...) { int x = foo("hi"); }

    // client wrapper
    int foo(char *msg) { send msg to B; recv reply from B }

Machine B:

    int foo(char *msg) { ... }

    // server wrapper
    void foo_listener() { while (1) { recv, call foo } }

Page 49

RPC Tools
• RPC packages help with two components:
• (1) Stub generation
  • create wrappers automatically
• (2) Runtime library
  • thread pool
  • socket listeners call functions on the server

Page 50

Client Stub Steps
• Create a message buffer
• Pack the needed information into the message buffer
• Send the message to the destination RPC server
• Wait for the reply
• Unpack the return code and other arguments
• Return to the caller
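
The six steps sketched in C for a hypothetical int foo(int x); pack_int/unpack_int, the send/recv helpers, FOO_ID, and MSG_SIZE are all assumptions about what a generated stub might look like:

    // Client stub sketch: the caller just writes int x = foo(5);
    int foo(int x) {
        char buf[MSG_SIZE];                  // 1. create a message buffer
        int off = 0;
        pack_int(buf, &off, FOO_ID);         // 2. pack the function id...
        pack_int(buf, &off, x);              //    ...and the argument
        send_to_server(buf, off);            // 3. send to the RPC server
        recv_from_server(buf, sizeof(buf));  // 4. wait for the reply
        off = 0;
        return unpack_int(buf, &off);        // 5-6. unpack and return to caller
    }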

Page 51

Server Stub Steps
• Unpack the message
• Call into the actual function
• Package the results
• Send the reply
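
The matching server-side sketch, with the same hypothetical helpers:

    // Server stub sketch: one loop iteration per incoming call.
    void foo_listener(void) {
        char buf[MSG_SIZE];
        while (1) {
            recv_from_client(buf, sizeof(buf));
            int off = 0;
            int id = unpack_int(buf, &off);  // 1. unpack the message
            if (id != FOO_ID) continue;      //    (real stubs dispatch on id)
            int x = unpack_int(buf, &off);
            int ret = foo(x);                // 2. call into the actual function
            off = 0;
            pack_int(buf, &off, ret);        // 3. package the results
            send_to_client(buf, off);        // 4. send the reply
        }
    }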

Page 52

Wrapper Generation
• Wrappers must do conversions:
  • client arguments to message
  • message to server arguments
  • server return to message
  • message to client return
• Need uniform endianness (wrappers do this; see the sketch below).
• Conversion is called marshaling/unmarshaling, or serializing/deserializing.
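
A sketch of how the hypothetical pack_int/unpack_int helpers from the stub sketches might do this conversion, using the standard htonl/ntohl calls so both ends agree on big-endian wire order:

    #include <arpa/inet.h>
    #include <stdint.h>
    #include <string.h>

    // Marshal: convert to network (big-endian) order, then copy into the buffer.
    void pack_int(char *buf, int *off, int v) {
        uint32_t wire = htonl((uint32_t)v);
        memcpy(buf + *off, &wire, sizeof(wire));
        *off += (int)sizeof(wire);
    }

    // Unmarshal: copy out of the buffer, then convert back to host order.
    int unpack_int(const char *buf, int *off) {
        uint32_t wire;
        memcpy(&wire, buf + *off, sizeof(wire));
        *off += (int)sizeof(wire);
        return (int)ntohl(wire);
    }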

Page 53

Stub Generation
• Many tools will automatically generate wrappers:
  • rpcgen
  • thrift
  • protobufs
• The programmer fills in the generated stubs.

Page 54

Wrapper Generation: Pointers
• Why are pointers problematic?
  • The addr passed from the client will not be valid on the server.
• Solutions?
  • smart RPC package: follow pointers
  • distribute generic data structs with the RPC package

Page 55

Runtime Library
• Naming: how to locate a remote service
• How to serve calls?
  • usually with a thread pool
• What underlying protocol to use?
  • usually UDP
• Some RPC packages enable asynchronous RPC

Page 56

RPC over UDP
• Strategy: use the function return as an implicit ACK.
• Piggybacking technique.
• What if the function takes a long time?
  • then send a separate ACK

Page 57

Conclusion
• Many communication abstractions are possible:
  • Raw messages (UDP)
  • Reliable messages (TCP)
  • Virtual memory (OS)
  • Global file system (OS)
  • Function calls (RPC)

Page 58

Next
• NFS
• AFS

