Date post: 15-Jul-2018
Achieving UFS Host Throughput For System Performance Yifei-Liu CAE Manager, Synopsys Copyright © 2013 Synopsys Mobile Forum 2013
Transcript
Page 1: Achieving UFS Host Throughput For System Performance · • Burst transfers to maximize DMA throughput and keep ... and low latency ... HS-Gear1 B Large Amplitude

Achieving UFS Host Throughput For System Performance

Yifei-Liu, CAE Manager, Synopsys

Copyright © 2013 Synopsys

Mobile Forum 2013

Page 2

Agenda

• UFS Throughput Considerations to Meet Performance Objectives

• UFS Host Controller IP Required Features for Implementation Success

• Meeting UFS Throughput Requirements with UFS Host Controller IP Designed to Maximize UFS Performance

• UFS Host Solution to Future Proof Your Design

Page 3

UFS Throughput Considerations to Meet Performance Objectives

Page 4

UFS Subsystem & Its Components

• UFS standard defines the throughput

• Achieving these throughput goals can be challenging

• Requires best practices to maximize throughput

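[Figure: UFS subsystem block diagram. SoC side: Core, System bus, UFS Host Controller, UniPro, M-PHY (Tx/Rx). Device side: M-PHY (Rx/Tx), UniPro, UFS Device, Storage. The two sides are connected in a point-to-point topology.]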

Page 5

UFS 1.1/2.0 Host Controller Throughput Requirements

                              UFS 1.1                    UFS 2.0
Requirement                   Gear2 Lane1 Rate B         Gear3 Lane1 Rate B
                              2915.2 Mbps ~= 3 Gbps      5836.8 Mbps ~= 6 Gbps
                              ~= 365 MBps                ~= 730 MBps
After 20% 8b10b encoding
overhead                      ~= 292 MBps                ~= 584 MBps
After ~6% UniPro overhead     ~= 275 MBps                ~= 550 MBps
Target (unidirectional)       300 MBps                   600 MBps
Target (bidirectional)        600 MBps                   1200 MBps

Total Available Bandwidth at System Bus

• Depends on bandwidth allocation on the SoC

• Typically can vary from 5% to 30%

System Bandwidth Requirement

• UFS 1.1: 800 – 1600 MBps

• UFS 2.0: 800 – 2400 MBps
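The overhead arithmetic on this slide can be reproduced with a short calculation. The following is a minimal sketch, assuming the 20% 8b10b and roughly 6% UniPro overheads are applied one after the other to the raw line rate; the function name is illustrative and not part of any UFS software.

```python
# Hedged sketch: reproduce the slide's per-lane overhead arithmetic.
# The function name and the exact 6% UniPro figure are assumptions for illustration.

def usable_throughput_mbytes(line_rate_mbps, encoding_overhead=0.20, unipro_overhead=0.06):
    """Return (raw, after 8b10b, after UniPro) throughput in MBps for one lane."""
    raw = line_rate_mbps / 8.0                        # bits -> bytes
    after_encoding = raw * (1.0 - encoding_overhead)  # 8b10b: 8 payload bits per 10 line bits
    after_unipro = after_encoding * (1.0 - unipro_overhead)
    return raw, after_encoding, after_unipro

# Line rates as quoted on the slide.
for name, rate in (("Gear2 Rate B", 2915.2), ("Gear3 Rate B", 5836.8)):
    raw, enc, uni = usable_throughput_mbytes(rate)
    print(f"{name}: raw {raw:.1f} MBps, after 8b10b {enc:.1f} MBps, after UniPro {uni:.1f} MBps")
```

The results land within about 1 MBps of the rounded figures in the table (for example about 274 MBps against the quoted ~275 MBps for Gear2).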

Page 6

UFS 1.1/2.0 Host Controller E2E Throughput Requirements

• End to end raw data throughput

– Doorbell set to doorbell clear

– For Writes include

• CMD transmission, CMD processing by device, RTT receipt from device, DataOut transmission, response receipt and processing with system update

– For Reads include

• CMD transmission, CMD processing by device, response receipt from device, DataIn transmission time, response receipt and processing with system update

Page 7

UFS 1.1/2.0 Host Controller E2E Throughput Requirements

• End to end raw data throughput

– Excluding UPIU overhead

– Excluding device Flash read and write times

– Including UniPro and M-PHY latencies

– For single command writes

• 90MBps onwards (assumed conditions of UFS setup)

– For single command reads

• 90MBps onwards (assumed conditions of UFS setup)
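As an illustration of the doorbell-set to doorbell-clear definition, the sketch below derives an end-to-end throughput figure from a list of phase durations. The phase names follow the write sequence listed on the previous slide, but every duration is a hypothetical placeholder, not a measurement, and the function name is an assumption.

```python
# Hedged sketch: end-to-end throughput measured from doorbell set to doorbell clear.
# All durations below are hypothetical placeholders, not measured values.

def e2e_throughput_mbytes_per_s(transfer_bytes, elapsed_ns):
    """Throughput in MBps (1 MB = 10**6 bytes) for a transfer that took elapsed_ns."""
    if elapsed_ns <= 0:
        raise ValueError("elapsed time must be positive")
    return transfer_bytes * 1000.0 / elapsed_ns   # bytes/ns * 1e9 / 1e6

# Hypothetical breakdown of one write command, in nanoseconds.
write_phases_ns = {
    "cmd_transmission": 2_000,
    "device_cmd_processing": 20_000,
    "rtt_receipt": 2_000,
    "data_out_transmission": 120_000,
    "response_receipt_and_system_update": 6_000,
}

elapsed = sum(write_phases_ns.values())
print(f"elapsed: {elapsed} ns")
print(f"E2E throughput: {e2e_throughput_mbytes_per_s(32768, elapsed):.1f} MBps")
```

Because the fixed phases (command, RTT, response) do not shrink with the payload, the single-command figure sits well below the link rate, which is consistent with the slide quoting 90MBps onwards rather than the 300MBps target.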

Page 8

UFS Host Controller Throughput Considerations

• System bus side

– System clock frequency

– System fabric configuration

• Bus width, burst, size, etc.

• DMA

• Buffer/FIFO size

• Outstanding request number

– System memory bandwidth, efficiency and access latency

• UFS subsystem side

– UniPro setup

• C-Port width, FIFO sizes, group ack, timeouts, etc.

– Device response

• Type of LUN, number and size of RTTs, attributes, capabilities

– Device/flash read and write times

– Transaction/transfer setup

• Read/write sequences, read only, write only, sizes, etc.

• Application software or device drivers

[Figure: SoC block diagram showing SW, Core, System Bus, UFS Host Controller, UniPro and M-PHY (Tx/Rx).]

Page 9

UFS Subsystem and IP Design Considerations

• A typical system may have:

– Multiple CPUs

– System memory interface

– Several bus bridges (AXI, AHB...)

– Network-on-chip

– Several high bandwidth peripherals

• UFSHC might have no direct path into system memory

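[Figure: example system with CPU, System Memory, DSP, Graphics and GB Ethernet on the system bus; a System Bus Bridge and a System (Interface) Bus Bridge lead to the UFS HC (with UFS Dev.) and the USB HC (with USB Dev.).]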

Page 10

UFS Host Controller IP Required Features for Implementation Success

Page 11

UFS Host Controller Features for Implementation Success

• Scatter/gather DMA to transfer large data blocks

• Burst transfers to maximize DMA throughput and keep system impact minimal

• Pre-configured for up to 32 task requests

• Pre-configured for up to 8 task management requests

• Ability to perform commands without system host intervention

• Support for the full range of UPIU packets, from 32 bytes to 64 KB

• UniPro and M-PHY compliant stack

• DFT and clock gating ready design
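To make the "up to 32 task requests" point concrete, here is a minimal sketch of how a driver might track request slots with a one-bit-per-slot doorbell value. It is a software model only: the class and method names are hypothetical, and the real register layout is defined by the JEDEC UFS host controller interface specification, not by this code.

```python
# Hedged sketch: a software model of a 32-slot doorbell, one bit per outstanding request.
# Class and method names are hypothetical; this is not a driver or register definition.

NUM_SLOTS = 32

class Doorbell:
    def __init__(self, num_slots=NUM_SLOTS):
        self.num_slots = num_slots
        self.value = 0          # bit n set means slot n is in flight

    def ring(self):
        """Claim the lowest free slot and set its bit; return None if all slots are busy."""
        for slot in range(self.num_slots):
            if not self.value & (1 << slot):
                self.value |= 1 << slot
                return slot
        return None

    def complete(self, slot):
        """Clear the bit for a finished request (models 'doorbell clear')."""
        self.value &= ~(1 << slot)

db = Doorbell()
first = db.ring()
second = db.ring()
db.complete(first)
print(first, second, bin(db.value))
```

Keeping several slots in flight is what lets the controller overlap command, data and response phases instead of serializing them.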

Page 12

UFS Host Controller Hardware & Software Features for Implementation Success

• Software Features (UFS Host Driver)

– UPIUs setup & processing

• NOP, CMD, TM, Query, Response, Reject & DataIn

– UTP TM / TR (descriptors) and Data Buffers formation

– UFS Interconnect (Link & PHY) Control

• Hardware Features (UFS Host Controller)

– UPIUs processing

• DATAOUT, DATAIN & RTT

– Task and command management

– Data transfer through DMA

– Host Controller register interface (MMIO)

– Interrupt generation

– Vendor-specific features

Page 13

AXI Features Can Assist in Meeting Throughput

• 32/64-bit address bus and 64-bit data bus

• Support for INCR burst reads and writes

• Burst length is a power of 2 (1, 2, …, 16, 32) and aligned to that boundary

• Configurable buffer sizes

• Supports a configurable number of outstanding read and write transactions

• Byte enables supported

• Uses posted writes for data accesses and non-posted writes for certain descriptor accesses to maximize AXI bus efficiency
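The burst rule above (power-of-2 length, aligned to that boundary) can be sketched as a greedy splitter. This is an illustration under stated assumptions: a 64-bit (8-byte) data bus, addresses and lengths that are multiples of the bus width, and a hypothetical function name. It is not the controller's actual DMA logic.

```python
# Hedged sketch: split a transfer into power-of-2 bursts aligned to their own size.
# Assumes an 8-byte (64-bit) data bus and bus-aligned address and length.

def split_into_bursts(addr, length, bus_bytes=8, max_beats=32):
    """Return a list of (start_address, beats) covering [addr, addr + length)."""
    if addr % bus_bytes or length % bus_bytes:
        raise ValueError("address and length must be multiples of the bus width")
    bursts = []
    while length > 0:
        beats = max_beats
        # Halve the burst until it is both aligned to its size and fits in what is left.
        while beats > 1 and (addr % (beats * bus_bytes) or beats * bus_bytes > length):
            beats //= 2
        bursts.append((addr, beats))
        addr += beats * bus_bytes
        length -= beats * bus_bytes
    return bursts

print(split_into_bursts(0x1000, 512))   # aligned buffer: two 32-beat bursts
print(split_into_bursts(0x1008, 24))    # unaligned start: short bursts first
```

An aligned buffer is covered entirely by maximum-length bursts, while an unaligned start forces shorter bursts until alignment is reached, which is one reason buffer alignment affects DMA efficiency.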

Page 14

Meeting Throughput Requirements with UFS Host Controller IP Designed to Maximize UFS Performance

Page 15

System Latency Requirements to Maximize System Performance

• 300 MBps, 64-bit C-Port width

=> 300 MBps / 8 B = 37.5M transactions/s

• One transaction every 1 / (37.5M transactions/s) ~= 26.7 ns

• For a 16-beat burst (128 bytes to transfer)

– Time taken to complete one burst at 100 MHz

– ~= 10 ns x 16 = 160 ns

• For a 16-beat burst (128 bytes to transfer)

– Time available between two bursts (TBB)

– ~= 16 x 26.7 ns ~= 427 ns
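The latency budget above can be recomputed for other operating points. The sketch below is a minimal calculator with the slide's inputs as defaults; the function and key names are assumptions. Intermediate values are kept unrounded, so results can differ by a few percent from hand-rounded figures.

```python
# Hedged sketch: burst time versus time between bursts (TBB) for a target throughput.
# Function and key names are illustrative; defaults follow the slide's example.

def latency_budget(throughput_mbytes_per_s=300.0, port_bytes=8,
                   burst_beats=16, bus_clock_mhz=100.0):
    transactions_per_s = throughput_mbytes_per_s * 1e6 / port_bytes
    ns_per_transaction = 1e9 / transactions_per_s
    return {
        "transactions_per_s": transactions_per_s,
        "ns_per_transaction": ns_per_transaction,
        # One beat per bus clock cycle.
        "burst_time_ns": burst_beats * 1e3 / bus_clock_mhz,
        # The link consumes one port-width word per transaction period.
        "tbb_ns": burst_beats * ns_per_transaction,
    }

budget = latency_budget()
for key, value in budget.items():
    print(f"{key}: {value:.1f}")
```

With the default inputs, the burst occupies 160 ns of a window of roughly 427 ns, so the remaining time is the margin available to absorb system bus and memory latency before the link is starved.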

[Figure: SoC block diagram showing SW, Core, System Bus, UFS Host Controller, UniPro and M-PHY (Tx/Rx).]

Page 16

Configurable and Scalable Parameters Assist in Meeting Throughput

• To achieve required throughput

– Burst length (2 to 32)

– FIFOs / buffer sizes

– Data bus widths (currently 64, scalable to 128)

– Number of outstanding requests

– Group acknowledge

[Figure: SoC block diagram showing SW, Core, System Bus, UFS Host Controller, UniPro and M-PHY (Tx/Rx).]

Page 17

Sample UFS Setup for Throughput Measurement

• AXI latency = 100-200 ns

• FIFO sizes = 96

• Outstanding requests = 4

• Hclk = 200 MHz; SymbolClk = 150 MHz

• Transfer size in bytes = 32768

• Number of RTTs = 1

• Vary the number of PRD entries: 1, 8, 16, 32, 64

• Performance is measured from doorbell set to doorbell clear for Gear2 Lane1
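To show what varying the number of PRD entries means for the 32768-byte transfer, the sketch below splits one buffer into equal-sized entries. It is a simplification under stated assumptions: the entries are contiguous and equal in size, the base address is made up, and the (address, byte count) tuple is an illustrative stand-in for a real PRD table entry.

```python
# Hedged sketch: divide one transfer into N equal PRD (scatter/gather) entries.
# The base address and the (address, byte_count) tuple layout are hypothetical.

def build_prd_table(base_addr, transfer_bytes, num_entries):
    """Return a list of (address, byte_count) pairs that together cover the transfer."""
    if num_entries <= 0 or transfer_bytes % num_entries:
        raise ValueError("transfer size must divide evenly into the entries")
    chunk = transfer_bytes // num_entries
    return [(base_addr + i * chunk, chunk) for i in range(num_entries)]

for n in (1, 8, 16, 32, 64):
    table = build_prd_table(0x8000_0000, 32768, n)
    print(f"{n:2d} PRD entries -> {table[0][1]} bytes per entry")
```

More entries mean smaller DMA segments and more descriptor fetches for the same payload, which is the variable the measurement sweeps.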

Page 18

UFS E2E Throughput Results: Multiple PRD Entries

[Figure: chart of end-to-end throughput in MBps (vertical axis, 0 to 180) against number of PRD entries (horizontal axis: 1, 8, 16, 32, 64), with one series for Read and one for Write.]

Page 19

UFS Host Solution to Future Proof Your Design

Page 20

Future Speeds Demand Scalable IP Features

• Future versions of the UFS Specification can demand higher throughput

– G3L2, G4L1

– G3L4, G4L2

• Scalable UFS Host Controller IP features

– Without compromising on latency and operating frequency

• 128/256-bit data bus width

• Higher number of outstanding RTTs

• Out-of-order execution granularity on system bus side
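A rough way to see why wider data buses matter is to compare peak bus bandwidth at a fixed clock. The sketch below assumes one data beat per clock cycle and uses the 200 MHz Hclk from the sample setup; the function name is illustrative, and the result is a theoretical peak that ignores arbitration, latency and the share of the bus actually allocated to UFS.

```python
# Hedged sketch: theoretical peak bandwidth of a data bus, one beat per clock cycle.
# Real sustained bandwidth is lower; this ignores arbitration and latency.

def peak_bus_mbytes_per_s(width_bits, clock_mhz):
    """Peak bandwidth in MBps for a bus of width_bits clocked at clock_mhz."""
    return width_bits / 8 * clock_mhz

for width in (64, 128, 256):
    print(f"{width}-bit bus at 200 MHz: {peak_bus_mbytes_per_s(width, 200):.0f} MBps peak")
```

Doubling the width doubles the peak without raising the clock, which matches the slide's goal of scaling without compromising operating frequency.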

Page 21

Future Proofing UFS Host Controller IP Design

• UFS host solution compliant with latest JEDEC Universal Flash Storage (UFS) standard and JEDEC UFS host controller interface specification

• Integrated with UniPro controller, compliant with latest MIPI Alliance UniPro specification

• Single traffic class

• Supports M-PHY v3.0 and access to attributes

• Low-power operation, small area, and low latency

• Synopsys Solution is deployed in UFS Host and Device ICs

Page 22

Close Collaboration Between Companies Developing UFS is Key

UFS Host and MIPI UniPro IP Interoperability Demo

Video at: http://www.synopsys.com/IP/Pages/designware-ip-mipi-videos.aspx

Page 23

28-nm High-Speed Gear3 M-PHY

[Figure panels: HS-Gear1 B Large Amplitude, HS-Gear2 B Large Amplitude, HS-Gear3 B Large Amplitude.]

Page 24

Achieving High UFS Host Throughput for System Performance

[Figure: System-Level Interoperability, with elements labeled Software, Verification, Controllers, PHYs and Boards.]

• Meet end-to-end throughput requirements

• Understand IP design considerations

• Meet system latency requirements

• Set configurable and scalable parameters

• Collaborate with proven UFS IP supplier

Page 25

THANK YOU

www.synopsys.com/mipi

