
High Availability for OPNFV May. 2015 1. Agenda Project introduction Scenarios discussion...

Date post: 22-Dec-2015
Page 1: High Availability for OPNFV May. 2015 1. Agenda Project introduction Scenarios discussion Requirement doc Gap analysis Blue prints Open questions: Storage.

1

High Availability for OPNFV

May. 2015

Page 2:

Agenda

• Project introduction
• Scenarios discussion
• Requirement doc
• Gap analysis
• Blue prints
• Open questions: Storage HA

Page 3:

Project Introduction

Fu Qiao

Page 4:

Project Detail

• Project Page:
  • https://wiki.opnfv.org/high_availability_for_opnfv
• Weekly meeting:
  • Wednesday, 13:00-14:00 UTC
  • https://wiki.opnfv.org/high_availability_project_meetings
• Mailing list:
  • Opnfv-tech-discussion [availability]
• Participants:
  • Hui Deng
  • Ian Jolliffe
  • Maria Toeroe
  • Qiao Fu
  • Xue Yifei
  • Yuan Yue
  • Yao Cheng LIANG
  • Sean Winn
  • Joe Huang
  • Georg Kunz
  • Basavaprabhu Badami
  • …

Page 5:

Project progress

• The ongoing work on the first OPNFV release already includes some HA schemes, e.g. OpenStack HA and active/active or active/passive configurations of RabbitMQ and MySQL, which are described in section 5 of the requirement doc.
• In this project, we further discuss the scenarios, the framework, the detailed requirements, and the API definition of HA in the OPNFV platform.
• Project Outputs:
  – Service HA scenario analysis
  – Requirement Document
    • https://etherpad.opnfv.org/p/High_Availabiltiy_Requirement_for_OPNFV
  – Gap Analysis of OpenStack HA scheme
    • https://etherpad.opnfv.org/p/ha_gap_analysis
  – Blue Prints:
    • https://etherpad.opnfv.org/p/Blue_Print_From_HA_project
    • https://blueprints.launchpad.net/keystone/+spec/keystone-ha-multisite
  – HA API description

Page 6:

Scenarios Discussion

Fu Qiao

Page 7:

Service Availability Levels for Carrier Grade VNFs

SAL 1
  Recovery time: e.g. 5 – 6 seconds
  Customer type: Network Operator Control Traffic; Government/Regulatory Emergency Services
  Recommendations: Redundant resources to be made available on-site to ensure fast recovery.

SAL 2
  Recovery time: e.g. 10 – 15 seconds
  Customer type: Enterprise and/or large scale Customers; Network Operators service traffic
  Recommendations: Redundant resources to be available as a mix of on-site and off-site as appropriate. On-site resources to be utilized for recovery of real-time services. Off-site resources to be utilized for recovery of data services.

SAL 3
  Recovery time: e.g. 20 – 25 seconds
  Customer type: General Consumer Public and ISP Traffic
  Recommendations: Redundant resources to be mostly available off-site. Real-time services should be recovered before data services.

Source: ETSI GS NFV-REL 001 V1.1.1
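
As an illustration only, the recovery-time column can be read as a lookup from a measured recovery time to the most demanding SAL it still satisfies. The sketch below assumes the upper end of each "e.g." range as the bound; the dictionary and function names are hypothetical and not part of the ETSI document or the requirement doc.

```python
# Example upper bounds in seconds, taken from the "e.g." ranges on the slide.
# These are illustrative values, not normative limits.
SAL_MAX_RECOVERY_SECONDS = {1: 6, 2: 15, 3: 25}


def strictest_sal_met(recovery_seconds: float):
    """Return the most demanding SAL whose example bound is met, or None."""
    for level in sorted(SAL_MAX_RECOVERY_SECONDS):
        if recovery_seconds <= SAL_MAX_RECOVERY_SECONDS[level]:
            return level
    return None


if __name__ == "__main__":
    for measured in (5.5, 12.0, 30.0):
        print(measured, "->", strictest_sal_met(measured))
```

A recovery that takes longer than the SAL 3 example bound maps to no level at all, which is the case the rest of the deck tries to avoid.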

Page 8:

Scenarios

VNF state    Redundancy in VNF    Failure detection    Use Case
Stateful     yes                  VNF only             UC1
Stateful     yes                  VNF & NFVI           UC2
Stateful     no                   VNF only             UC3
Stateful     no                   VNF & NFVI           UC4
Stateless    yes                  VNF only             UC5
Stateless    yes                  VNF & NFVI           UC6
Stateless    no                   VNF only             UC7
Stateless    no                   VNF & NFVI           UC8

UC9: Repeated failure in VNF
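
The matrix above is regular enough to be computed rather than looked up. A minimal sketch, assuming the three columns are independent yes/no choices; the function name and parameters are hypothetical, and UC9 (repeated failure) is outside this scheme.

```python
def use_case(stateful: bool, redundant: bool, nfvi_detects: bool) -> str:
    """Map one row of the scenario matrix to its use-case label (UC1..UC8)."""
    index = 1
    if not stateful:
        index += 4      # stateless rows are UC5..UC8
    if not redundant:
        index += 2      # "no redundancy" rows come after the redundant ones
    if nfvi_detects:
        index += 1      # "VNF & NFVI" detection follows "VNF only"
    return f"UC{index}"


if __name__ == "__main__":
    print(use_case(stateful=True, redundant=False, nfvi_detects=True))
```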

Page 9:

UC1: Stateful VNF with Redundancy

Failure detection: VNF only

[Diagram: NFVO, VNFM, VIM; VNF with ACT and STB VNFCs; four VMs on NFVI; NFVI's Services; VNF's Services; NF; Recovery time]

1. VNFC fails
2. NF fails*
3. VNF detects the failure
4. VNF isolates VNFC
5. VNF fails over
6. NF recovers
7. VNF repairs VNFC

Nothing new in this scenario

*Steps 1 and 2 are simultaneous; they are separated for clarity.
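
The UC1 sequence can be sketched as a small state change inside the VNF. This is an illustration under assumptions, not an OPNFV API: the class and method names are hypothetical, and the choice to bring the repaired VNFC back as the new standby is an assumption the slide does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Vnfc:
    name: str
    role: str            # "ACT", "STB" or "ISOLATED"
    healthy: bool = True


@dataclass
class Vnf:
    active: Vnfc
    standby: Vnfc
    log: list = field(default_factory=list)

    def handle_active_failure(self) -> None:
        failed = self.active
        failed.healthy = False                            # steps 1-2: VNFC and NF fail
        self.log.append("VNF detects the failure")        # step 3
        failed.role = "ISOLATED"                          # step 4
        self.log.append(f"VNF isolates {failed.name}")
        self.active, self.standby = self.standby, failed  # step 5: fail over
        self.active.role = "ACT"                          # step 6: NF recovers
        self.log.append(f"VNF fails over to {self.active.name}")
        failed.healthy, failed.role = True, "STB"         # step 7 (assumed: rejoins as standby)
        self.log.append(f"VNF repairs {failed.name}")


if __name__ == "__main__":
    vnf = Vnf(Vnfc("vnfc-a", "ACT"), Vnfc("vnfc-b", "STB"))
    vnf.handle_active_failure()
    print(vnf.active.name, vnf.log)
```

Everything happens inside the VNF; neither the VNFM nor the VIM is involved, which is why the slide notes that nothing new is needed for this scenario.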

Page 10:

UC2: Stateful VNF with Redundancy

Failure detection: VNF & NFVI

[Diagram: NFVO, VNFM, VIM; VNF with ACT and STB VNFCs; four VMs on NFVI; NFVI's Services; VNF's Services; NF; Recovery time]

1. VM fails
2. VM Service fails
3. VNFC fails
4. NF fails*

VNF path:
5a. VNF detects the failure
6a. VNF fails over
7a. NF recovers

NFVI path:
5b. NFVI detects the failure
6b. NFVI reports to VIM
7b. VIM reports to VNFM

8. VNFM gives OK to VIM
9. VIM repairs VM
10. VM Service recovers
11. VNF repairs VNFC

*Steps 1-4 are simultaneous; they are separated for clarity.

Page 11:

UC3: Stateful VNF with No Redundancy

Failure detection: VNF only

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; state; NFVI's Services; VNF's Services; NF; Recovery time]

1. VNFC fails
2. NF fails*
3. VNF detects the failure
4. VNF isolates VNFC
5. VNF repairs VNFC
6. VNFC gets state
7. NF recovers

VNFC checkpoints its state to VD, which is HA

*Steps 1 and 2 are simultaneous; they are separated for clarity.
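
The checkpoint-and-restore idea in UC3 can be sketched as follows. This is a toy model under assumptions: the VD is represented by an in-memory dictionary, the VNFC checkpoints on every state change, and all class and method names are hypothetical.

```python
class VirtualDisk:
    """Stands in for the HA virtual disk (VD); here just an in-memory store."""

    def __init__(self):
        self._store = {}

    def write(self, key, state):
        self._store[key] = dict(state)    # copy, so later changes do not leak in

    def read(self, key):
        return dict(self._store.get(key, {}))


class Vnfc:
    def __init__(self, name, vd):
        self.name = name
        self.vd = vd
        self.state = {}

    def update(self, key, value):
        self.state[key] = value
        self.vd.write(self.name, self.state)   # checkpoint on every change (assumed policy)

    def repair(self):
        # A repaired instance starts empty and then fetches its last checkpoint
        # (step 6: "VNFC gets state").
        self.state = self.vd.read(self.name)


if __name__ == "__main__":
    vd = VirtualDisk()
    vnfc = Vnfc("vnfc-a", vd)
    vnfc.update("sessions", 3)
    vnfc.state.clear()        # simulate the failure wiping in-memory state
    vnfc.repair()
    print(vnfc.state)
```

The sketch only works because the VD survives the VNFC failure, which is the reason the slide insists that the VD itself is HA.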

Page 12:

UC4: Stateful VNF with No Redundancy

Failure detection: VNF & NFVI

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; state; NFVI's Services; VNF's Services; NF; Recovery time]

1. VM fails
2. VM Service fails
3. VNFC fails
4. NF fails*

VNF path:
5a. VNF detects the failure
6a. VNF reports to VNFM

NFVI path:
5b. NFVI detects the failure
6b. NFVI reports to VIM

7. VIM reports to VNFM
8. VNFM gives OK to VIM
9. VIM repairs VM
10. VM Service recovers
11. VIM informs VNFM
12. VNFM repairs VNFC
13. VNFC gets state
14. NF recovers

VNFC checkpoints its state to VD, which is HA

*Steps 1-4 are simultaneous; they are separated for clarity.

Page 13:

High Availability Flow Chart

[Flow chart; boxes and labels as extracted]

• Service failure happens (may be caused by failure of VNF or NFVI)
• Failure detection (by service heartbeat loss / NFVI report of failure)
• Step 1 - Service recovery (time constraint: a carrier-grade VNF should be recovered within seconds)
  – Service failover
• Step 2 - NFVI recovery or repair (less time constraint)
  – VNFC repair/restart
  – VM recovery and VNFC recovery
• Branch labels: VNF failure only, NFVI failure, repeated failure
• Timeline labels: Service is unavailable, recovery time
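
A minimal sketch of the two-step flow, under assumptions: heartbeat-based detection uses a simple timeout, and the branch mapping ("VNF failure only" leads to VNFC repair/restart, "NFVI failure" leads to VM recovery and VNFC recovery) is one reading of the flattened chart, not something the slide states explicitly. The "repeated failure" branch is not modelled. All names are hypothetical.

```python
def heartbeat_lost(last_heartbeat: float, now: float, timeout: float) -> bool:
    """Failure detection by service heartbeat loss (timestamps in seconds)."""
    return (now - last_heartbeat) > timeout


def recovery_plan(nfvi_failed: bool) -> list:
    """Return the ordered recovery actions for a detected service failure."""
    plan = ["service failover"]                       # step 1: time constrained
    if nfvi_failed:
        plan.append("VM recovery and VNFC recovery")  # step 2: NFVI failure branch
    else:
        plan.append("VNFC repair/restart")            # step 2: VNF failure only branch
    return plan


if __name__ == "__main__":
    if heartbeat_lost(last_heartbeat=10.0, now=13.5, timeout=2.0):
        print(recovery_plan(nfvi_failed=False))
```

The point of the ordering is that service failover always comes first, so the service is restored before any slower repair of the failed layer begins.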

Page 14:

Requirement doc & Gap Analysis doc.

Ian Jolliffe

Page 15:

Requirement Doc. Details

• Framework
  – The ultimate goal is to provide high availability for the upper-layer service.
  – Service high availability is provided by recovery of the service (including service restart and failover) within seconds, following the SAL.
  – Repair or recovery of the failed layer should happen afterwards.
  – Ensure that no failure in one layer causes a cascading failure at other layers.
  – A single layer can detect failures in other layers and help recover failed components.

[Diagram: layer stack]
  Service layer: Service
  Application/VM layer: VNF/VNFC, VNFM
  NFVI/VIM layer: NFVI, VIM
  Hardware layer: Hardware

Page 16:

Requirement doc. outline

• 1 Overall Principle for High Availability in NFV
  – 1.1 Framework for High Availability in NFV
  – 1.2 Definitions
  – 1.3 Overall requirements
  – 1.4 Time requirement
• 2 Hardware HA
• 3 Virtualization Facilities (Host OS, Hypervisor)
• 4 Virtual Infrastructure HA – Requirements:
  – 4.1 Virtual Compute
  – 4.2 Virtual Storage
  – 4.3 Virtual Network
• 5 VIM High Availability
  – 5.1 Architecture requirement of VIM HA
  – 5.2 Fault detection and alarm requirement of VIM
  – 5.3 HA mechanism of VIM provided for NFV
  – 5.4 SDN controller
• 6 VNF HA
  – 6.1 Service Availability
  – 6.2 Service Continuity
• 7 Storage

Page 17:

Gap Analysis

14+ HA related gaps have been discovered

Nova: 6 gaps in Nova covering scheduler, consoleauth and health status of compute node.

Neutron: 2 gaps in Neutron covering L3 agent and DHCP agent.

Cinder: 2 gaps in Cinder covering HA configuration and multi-attachment.

VIM NBI: 1 gap for error reporting

QoS: 1 gap for QoS management

References:
https://etherpad.openstack.org/p/kilo-crossproject-ha-integration
https://etherpad.openstack.org/p/kilo-summit-ops-ha
https://blueprints.launchpad.net/openstack

Page 18:

Blue Prints

Joe Huang

Page 19:

Escape from site level KeyStone failure

[Diagram: two sites, Site1 and Site2. At each site, an API Request (Nova/Cinder/Neutron…) reaches an OpenStack service (Nova/Cinder/Neutron…), whose KeyStoneMiddleware asks the site's KeyStone to validate the token (Fernet, UUID) or retrieve the RevokeList (PKI).]

• Only one KeyStone server can be configured for token validation or revoke list.
• Allow a secondary KeyStone server to be configured in case of site-level KeyStone failure.
• (Cons of DNS-based load balancing: delayed failover because of caching issues, and unpredictable routing.)
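
The behaviour the blueprint asks for can be sketched as "try the local site first, fall back to a secondary site when the first is unreachable". The code below is a generic illustration and does not use or reflect the real keystonemiddleware API; the exception class, function name, and validator callables are all hypothetical.

```python
from typing import Callable, Sequence


class KeystoneUnavailable(Exception):
    """Raised by a validator when its KeyStone site cannot be reached."""


def validate_token(token: str, validators: Sequence[Callable[[str], bool]]) -> bool:
    """Ask each configured site in order; fall back only on unavailability."""
    last_error = None
    for validate in validators:
        try:
            return validate(token)        # a definite answer ends the search
        except KeystoneUnavailable as exc:
            last_error = exc              # site down: try the next one
    raise KeystoneUnavailable("no KeyStone site reachable") from last_error


if __name__ == "__main__":
    def site1(token):                     # simulated site-level failure
        raise KeystoneUnavailable("site1 down")

    def site2(token):                     # simulated healthy secondary site
        return token == "good"

    print(validate_token("good", [site1, site2]))
```

Note that an invalid token reported by a reachable site is a final answer; only unavailability triggers the fallback, so a secondary server does not weaken validation.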

Page 20:

Open Questions: Storage HA

Georg Kunz

Page 21:

Storage Architecture

[Diagram labels: Hardware (host, host, switch, switch, storage array with Ctrl1 and Ctrl2); NFVI (distributed storage; block, file, object); storage service component (file, (object)); VNF/VNFC; VIM]

Page 22:

Storage HA – Network Failure

[Diagram labels: host, host, switch, switch, storage array with Ctrl1 and Ctrl2; distributed storage; block, file, object; VIM]

1. Storage network link fails
2. Storage network detects failure
3. Storage network switches to standby link(s)
   • iSCSI multi-pathing
   • bonding
4. Report failure to O&M

Page 23:

Storage HA – Failure in Storage Array

[Diagram labels: host, host, switch, switch, storage array with Ctrl1 and Ctrl2; distributed storage; block, file, object; VIM]

1. Component within storage array fails
2. Array-internal fail-over kicks in
   • RAID
   • Redundant controllers, NICs, …
3. Report failure to O&M

Page 24:

Storage HA – Host Failure

[Diagram labels: host, host, switch, switch, storage array with Ctrl1 and Ctrl2; distributed storage; block, file, object; VIM]

1. Storage host fails
2. Distributed storage layer detects failure
3. Distributed storage layer rebalances data
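
To make the "rebalances data" step concrete, here is a deliberately naive sketch: every replica that lived on the failed host is recreated on a surviving host that does not already hold that chunk. Real distributed storage systems use far more elaborate placement algorithms; the data layout and function name here are assumptions for illustration only.

```python
def rebalance(placement: dict, failed_host: str, hosts: list) -> dict:
    """Return a new chunk -> hosts mapping with replicas moved off failed_host."""
    survivors = [h for h in hosts if h != failed_host]
    new_placement = {}
    for chunk, replicas in placement.items():
        kept = [h for h in replicas if h != failed_host]
        missing = len(replicas) - len(kept)            # replicas lost with the host
        candidates = [h for h in survivors if h not in kept]
        new_placement[chunk] = kept + candidates[:missing]
    return new_placement


if __name__ == "__main__":
    placement = {"c1": ["h1", "h2"], "c2": ["h2", "h3"]}
    print(rebalance(placement, "h2", ["h1", "h2", "h3"]))
```

Each recreated replica implies a copy over the storage network, which is where the rebuild-versus-performance question raised later in the deck comes from.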

Page 25:

Non-HA Block Storage (legacy)

• Mirroring of block devices on VNF level

[Diagram: VNF with VNFC (active) and VNFC (passive) on NFVI; mirroring between their block devices]

Page 26:

HA Block Storage

• Active/passive configuration
  – Failover supervised by clustering software in VNF
  – Requires multi-attach capability of Cinder

[Diagram labels: NFVI; VNF; VNFC (active); VNFC (standby); VNFC (active); VNFM; VIM]

Page 27:

HA Block Storage

• Active/active configuration
  – Clustered file system enables concurrent access
  – Requires multi-attach capability of Cinder

[Diagram labels: NFVI; VNF; VNFC (active); VNFC (active); VNFM; VIM]

Page 28:

VNF level HA for Multiple Backends

• Block devices provided by multiple backends
  – Mirroring of block devices on VNF level
  – Pro-active failover possible

[Diagram labels: NFVI; VNF; VNFC (active); VNFC (passive); mirroring; backend 1; backend 2; VNFM; VIM]

Page 29:

Open Questions

• Can the NFVI storage system provide a sufficient level of HA to meet the SALs?
  – Failover/recovery times heavily depend on the deployed solution
• How much does a rebuild of data impact performance?

Page 30:

File Storage

• Legacy deployments
  – File storage service provided by VNFC
  – Layered on top of block storage services
• NFVI
  – File storage service provided by NFVI / hardware
  – OpenStack Manila

Page 31:

Ephemeral Storage

• Ephemeral Storage
  – Main use: File systems of VMs booted from image
  – Location
    • On local disks of compute host
      – Isolation of failover domains
        » VM unaffected by failure of storage system
      – Disk failure corresponds to host failure
      – Limits live migration capabilities
    • On distributed or external storage
      – Correlated failures possible
        » Failure of storage backend impacts VMs
      – Properties of respective storage backend apply

Page 32:

Appendix

Page 33:

UC3: Stateful VNF with No Redundancy

Failure detection: VNF only

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; state; NFVI's Services; VNF's Services; NF; Recovery time]

1. VNFC fails
2. NF fails*
3. VNF detects the failure
4. VNF isolates VNFC
5. VNF repairs VNFC
6. VNFC gets state
7. NF recovers

VNFC checkpoints its state to VD, which is HA

*Steps 1 and 2 are simultaneous; they are separated for clarity.

Page 34:

UC3-b: Stateful VNF with No Redundancy

Failure detection: VNF only

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; state; NFVI's Services; VNF's Services; NF; Recovery time]

1. VNFC fails
2. NF fails*
3. VNF detects the failure
4. VNF reports to VNFM
5. VNFM isolates VNFC
6. VNFM repairs VNFC
7. VNFC gets state
8. NF recovers

VNFC checkpoints its state to VD, which is HA

*Steps 1 and 2 are simultaneous; they are separated for clarity.

Page 35:

UC4: Stateful VNF with No Redundancy

Failure detection: VNF & NFVI

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; state; NFVI's Services; VNF's Services; NF; Recovery time]

1. VM fails
2. VM Service fails
3. VNFC fails
4. NF fails*

VNF path:
5a. VNF detects the failure
6a. VNF reports to VNFM

NFVI path:
5b. NFVI detects the failure
6b. NFVI reports to VIM

7. VIM reports to VNFM
8. VNFM gives OK to VIM
9. VIM repairs VM
10. VM Service recovers
11. VIM informs VNFM
12. VNFM repairs VNFC
13. VNFC gets state
14. NF recovers

VNFC checkpoints its state to VD, which is HA

*Steps 1-4 are simultaneous; they are separated for clarity.

Page 36:

UC5: Stateless VNF with Redundancy

Failure detection: VNF only

[Diagram: NFVO, VNFM, VIM; VNF with ACT and Spare VNFCs; four VMs on NFVI; NFVI's Services; VNF's Services; NF; Recovery time]

1. VNFC fails
2. NF fails*
3. VNF detects the failure
4. VNF isolates VNFC
5. VNF fails over
6. NF recovers
7. VNF restores redundancy

Nothing new in this scenario

Spare VNFC may or may not be instantiated

*Steps 1 and 2 are simultaneous; they are separated for clarity.

Page 37:

UC6: Stateless VNF with Redundancy

Failure detection: VNF & NFVI

[Diagram: NFVO, VNFM, VIM; VNF with ACT and Spare VNFCs; four VMs on NFVI; NFVI's Services; VNF's Services; NF; Recovery time]

1. VM fails
2. VM Service fails
3. VNFC fails
4. NF fails*

VNF path:
5a. VNF detects the failure
6a. VNF fails over
7a. NF recovers

NFVI path:
5b. NFVI detects the failure
6b. NFVI reports to VIM
7b. VIM reports to VNFM

8. VNFM gives OK to VIM
9. VIM repairs VM
10. VM Service recovers
11. VNF restores redundancy

Spare VNFC may or may not be instantiated

*Steps 1-4 are simultaneous; they are separated for clarity.

Page 38:

UC7: Stateless VNF with No Redundancy

Failure detection: VNF only

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; NFVI's Services; VNF's Services; NF; Recovery time]

1. VNFC fails
2. NF fails*
3. VNF detects the failure
4. VNF reports to VNFM
5. VNF isolates VNFC
6. VNF repairs VNFC
7. NF recovers

*Steps 1 and 2 are simultaneous; they are separated for clarity.

Page 39:

UC8: Stateless VNF with No Redundancy

Failure detection: VNF & NFVI

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; NFVI's Services; VNF's Services; NF; Recovery time]

1. VM fails
2. VM Service fails
3. VNFC fails
4. NF fails*

VNF path:
5a. VNF detects the failure
6a. VNF reports to VNFM

NFVI path:
5b. NFVI detects the failure
6b. NFVI reports to VIM

7. VIM reports to VNFM
8. VNFM gives OK to VIM
9. VIM repairs VM
10. VM Service recovers
11. VIM informs VNFM
12. VNF repairs VNFC
13. NF recovers

*Steps 1-4 are simultaneous; they are separated for clarity.

Page 40:

UC9: Stateless VNF with No Redundancy

Failure detection: VNF only – BUT Repeatedly

[Diagram: NFVO, VNFM, VIM; VNF with ACT VNFC; three VMs and one VD on NFVI; NFVI's Services; VNF's Services; NF; UC7]

1. VNFC fails
2. NF fails
3. VNF detects the failure and counts
4. VNF isolates VNFC
5. VNF repairs VNFC
6. NF recovers

The sequence repeats: the VNFC fails again and the count goes to 2, then 3, then 4, …

Fault is not in the VNFC!

Page 41:

UC9: Stateless VNF with No Redundancy

Failure detection: VNF only – BUT Repeatedly

[Diagram: NFVO, VNFM, VIM; VNF; three VMs and one VD on NFVI; NFVI's Services; VNF's Services; NF]

1. VNFC fails
2. NF fails
3. VNF detects the failure and counts
4. VNF isolates VNFC
5. VNF repairs VNFC
6. NF recovers

The sequence repeats: the VNFC fails again and the count goes to 2, then 3, then 4, …

N. VNF reports to VNFM
N+1. VNFM reports to VIM
N+2. VIM isolates VM
N+3. VIM repairs VM
N+4. VM Service recovers
N+5. VNF repairs VNFC
N+6. NF recovers
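
The counting-and-escalation logic of UC9 can be sketched as a small tracker. This is an illustration under assumptions: the slide shows the count reaching 4 but does not define a threshold, so the threshold is a hypothetical parameter, as are the class and method names and the decision to reset the count after escalation.

```python
class RepeatedFailureTracker:
    """Counts VNFC failures and escalates once a threshold is reached."""

    def __init__(self, threshold: int):
        self.threshold = threshold    # hypothetical; the slide does not fix a value
        self.count = 0

    def on_vnfc_failure(self) -> str:
        self.count += 1               # step 3: "VNF detects the failure and counts"
        if self.count >= self.threshold:
            self.count = 0            # assumed: start counting afresh after escalation
            return "report to VNFM"   # step N: the fault is likely below the VNFC
        return "repair VNFC"          # steps 4-6: local isolate, repair, recover


if __name__ == "__main__":
    tracker = RepeatedFailureTracker(threshold=4)
    print([tracker.on_vnfc_failure() for _ in range(4)])
```

Local repair keeps each recovery fast, while the escalation path lets the VIM repair the VM when repeated failures suggest the fault is not in the VNFC.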

Page 42:

• Scenario chart
• Scenario 1, 2, 5, 6
• Add all the scenarios as appendix
• Should the NFVI provide an HA API to the VNF? OpenSAF is a PaaS; it actually acts as HA middleware.
• Stateful and stateless VNFs may require different schemes in the NFVI. If the VNF is not redundant, we may need VM redundancy; in this case the VNF problem may not be solved.

