Aaron Bosley, Architect, Solutions Engineering
Marc Mercuri, Sr. Director, Applied Incubation
Architecting FailSafe Data Services
ARC303
Presented in 2014
Session Objectives And Takeaways
Session Objective(s):
• Describe what failsafe data services are composed of, as well as their capabilities
• Recognize and describe the failsafe architectural attributes and patterns as they relate to data services
In the beginning there was…
A million different complex implementations of Remote Procedure Calls (RPC)
• XML-RPC
• SOAP
• Java RMI
• JAX-RPC
• CORBA
REST: Guiding Principles
• Simple
• Discoverable
• Meaningful
• Navigable
But there are some issues:
• No convention for URL/URI structure
• Level of detail varies greatly across implementations
• Discoverability quality varies dramatically
• Retrieval and search support is weak
Example HTTP Request
GET http://services.odata.org/OData/OData.svc/Products HTTP/1.1
Accept: application/json
User-Agent: Fiddler
Host: services.odata.org
• HTTP Verb: GET
• API End Point: http://services.odata.org/OData/OData.svc/Products
• Data Format for Response: Accept: application/json
Example Response
HTTP/1.1 200 OK
Cache-Control: no-transform, public, max-age=300, s-maxage=600
Content-Length: 750
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
ETag: "686897696a7c876b7e"
Server: Microsoft-IIS/8.0
X-Content-Type-Options: nosniff
DataServiceVersion: 3.0;
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Set-Cookie: ARRAffinity=4b98c8c7832a0f11db30cc5be0c0d64bbd90359f8a87b799460af4623a8a0aaf;Path=/;Domain=services.odata.org
Set-Cookie: WAWebSiteSID=2754cab394374e0c890fdcc569e94616; Path=/; HttpOnly
Date: Fri, 24 Jan 2014 02:52:04 GMT
<!--OData Payload-->
• HTTP Code: 200 OK
• Caching Capabilities: Cache-Control
• Response Format: Content-Type
• OData Custom Header Tag: DataServiceVersion
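The callouts on the example response can be checked in code. The following is a minimal Python sketch, not part of the original session, that splits a response head into status code and headers and reads the max-age directive from Cache-Control. The function names parse_response_head and cache_max_age are hypothetical, and the raw text is a shortened copy of the response shown on the slide.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status_code, reason, headers)."""
    lines = raw.strip().splitlines()
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        # Header names are case-insensitive, and a header may repeat.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return int(code), reason, headers


def cache_max_age(headers):
    """Return the Cache-Control max-age in seconds, or None if absent."""
    for value in headers.get("cache-control", []):
        for directive in value.split(","):
            key, _, arg = directive.strip().partition("=")
            if key.lower() == "max-age" and arg.isdigit():
                return int(arg)
    return None


# Shortened copy of the slide's example response (assumption: payload omitted).
RAW = """HTTP/1.1 200 OK
Cache-Control: no-transform, public, max-age=300, s-maxage=600
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
ETag: "686897696a7c876b7e"
DataServiceVersion: 3.0;
"""

status, reason, headers = parse_response_head(RAW)
media_type = headers["content-type"][0].split(";")[0]
```

With this input, status is 200, media_type is application/json, and cache_max_age(headers) reads the 300-second lifetime that a client may cache the response for.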
Modern Authentication Protocols
Standard, HTTP-based protocols for maximum platform reach
[Diagram: Browser, Native app, and Server app clients connecting to a Web application and a Web service API. Protocol labels: WS-Fed, SAML 2.0, OpenID Connect; OAuth 2.0]
Challenges with OAuth's evolution
Challenges
• Spec identifies OAuth as a framework rather than a protocol
• 60+ optional aspects
• Commercial implementations delivered at different spec levels
• Potential for man-in-the-middle attacks in some scenarios
• Vendor-specific compensation approaches to deal with gaps / concerns
The Result
• OAuth provider portability can be a challenge
• Understanding of different vendor implementation nuances required
• Custom code can be required when supporting different vendors
Service API Design
The Basic L0 Approach
Task               Operation  URI
Create an Order    POST       http://api.contoso.com/CreateOrder?OrderID=1
Approve an Order   POST       http://api.contoso.com/ApproveOrder?OrderID=1
Delete the Order   POST       http://api.contoso.com/DeleteOrder?Order=1
Cancel the Order   POST       http://api.contoso.com/CancelOrder?Order=1
The Mature Approach
Task               Operation  URI
Create an Order    POST       http://api.contoso.com/Order/1?OrderName="Contoso Reorder"
Approve the Order  PUT        http://api.contoso.com/Order/1/approval
Delete the Order   DELETE     http://api.contoso.com/Order/1
Cancel the Order   PUT        http://api.contoso.com/Order/1/cancellation
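In the mature approach the HTTP verb carries the operation and the URI names only the resource. A minimal Python sketch of how a service might dispatch on that pair is shown below. The route table mirrors the four mature URIs from the slide; the ROUTES table, the resolve function, and the action names are assumptions made for illustration, not an API from the session.

```python
import re

# (verb, path pattern, action) triples mirroring the mature URI table.
# The action names are hypothetical labels for handler functions.
ROUTES = [
    ("POST", r"^/Order/(?P<order_id>\d+)$", "create"),
    ("PUT", r"^/Order/(?P<order_id>\d+)/approval$", "approve"),
    ("DELETE", r"^/Order/(?P<order_id>\d+)$", "delete"),
    ("PUT", r"^/Order/(?P<order_id>\d+)/cancellation$", "cancel"),
]


def resolve(method, path):
    """Return (action, order_id), or (None, None) when nothing matches."""
    for verb, pattern, action in ROUTES:
        match = re.match(pattern, path)
        if verb == method.upper() and match:
            return action, int(match.group("order_id"))
    return None, None
```

The same path, /Order/1, resolves to different actions for POST and DELETE, which is the point of the design: the URI stays stable while the verb changes.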
The World Is Changing!
• Enterprises used to provide data services to a Small Number of Known Users (SNKU)
• The new paradigm for the enterprise is serving a Large Number of Unknown Users (LNUU)
• Enterprises need to understand who their customers are and design the data service to support them
Service Design Considerations
• Scenario and Persona Based Modeling
• Focus on Resources, Not URLs
• Versioning evolvable API Design
• Information Hiding & Security
• HTTP Features
• Header
• HTTP Codes
• Network Bandwidth
• Management
An example of what NOT to do
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Mon, 06 Jan 2014 15:22:04 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=[omitted]; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.ycombinator.com; HttpOnly
Last-Modified: Mon, 06 Jan 2014 13:14:48 GMT
Vary: Accept-Encoding
Expires: Thu, 04 Jan 2024 13:14:48 GMT
Cache-Control: max-age=315352364
Cache-Control: public
CF-RAY: [omitted]

<html>
  <head>
    <link rel="stylesheet" type="text/css" href="/news.css">
    <link rel="shortcut icon" href="/favicon.ico">
    <title>Hacker News</title>
  </head>
  <body>
    <center>
      <table border="0" cellpadding="0" cellspacing="0" width="85%" bgcolor="#f6f6ef">
        <tr>
          <td bgcolor="#ff6600">
            <table border="0" cellpadding="0" cellspacing="0" width="100%" style="padding:2px">
              <tr>
                <td style="width:18px;padding-right:4px">
                  <a href="http://ycombinator.com">
                    <img src="/y18.gif" width="18" height="18" style="border:1px #ffffff solid;" />
                  </a>
                </td>
                <td style="line-height:12pt; height:10px;">
                  <span class="pagetop"> <b><a href="/news">Hacker News</a></b> </span>
                </td>
              </tr>
            </table>
          </td>
        </tr>
        <tr style="height:10px"></tr>
        <tr>
          <td> Sorry for the downtime. We hope to be back soon. </td>
        </tr>
        <tr>
          <td>
            <img src="s.gif" height="10" width="0" />
            <table width="100%" cellspacing="0" cellpadding="1">
              <tr>
                <td bgcolor="#ff6600"></td>
              </tr>
            </table>
            <br />
          </td>
        </tr>
      </table>
    </center>
  </body>
</html>
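The response above reports 200 OK with a cache lifetime of roughly ten years for what is actually a downtime page, so clients and intermediaries may treat the outage notice as valid, cacheable content. One possible alternative, sketched in Python under the assumption that the service can emit its own status and headers, is a 503 with Retry-After and a no-store cache directive. The function name outage_response and the JSON body shape are hypothetical.

```python
def outage_response(retry_after_seconds=120):
    """Build (status_line, headers, body) for an outage.

    Assumptions for illustration: a JSON error body and a two-minute
    default retry hint. Neither comes from the original session.
    """
    body = '{"error": "service_unavailable", "message": "Please try again later."}'
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body.encode("utf-8"))),
        # Tells well-behaved clients when a retry is worthwhile.
        "Retry-After": str(retry_after_seconds),
        # Keeps caches from storing the outage notice.
        "Cache-Control": "no-store",
    }
    return "HTTP/1.1 503 Service Unavailable", headers, body
```

A client that checks the status code can now tell an outage from a successful response without parsing the body.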
SharePoint Online
3 Different Topologies
• Outbound
• Inbound
• Bi-Directional
Key Drivers
• Identity Federation
• BCS Connectivity Requirements
• SharePoint workloads used
Handle Transient Faults
Retry logic
• Fast retries often fail again. Exponential back-off is useful.
• Error codes should provide insight.
Idempotency
• Allows restart of requests which may have partially or fully succeeded.
Protect calls with timeouts on outbound requests
• Don't retry on timeouts.
• Queue work for slow retry later.
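The retry guidance above can be sketched in a few lines. The following Python example is an illustration only, not the session's implementation: it retries faults marked as transient with exponential back-off, and on a timeout it does not retry but places the work on a queue for a slower retry later. TransientError, slow_retry_queue, and call_with_retry are hypothetical names, and the default delays are arbitrary.

```python
import time
from collections import deque


class TransientError(Exception):
    """Hypothetical marker for faults that are worth retrying."""


# Work that timed out waits here for a slower, out-of-band retry.
slow_retry_queue = deque()


def call_with_retry(operation, max_attempts=4, base_delay=0.5,
                    max_delay=30.0, sleep=time.sleep):
    """Run operation, retrying transient faults with exponential back-off."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            # Do not retry on timeouts: queue the work and surface the error.
            slow_retry_queue.append(operation)
            raise
        except TransientError:
            if attempt == max_attempts:
                raise  # all retries failed
            # Delay doubles on each attempt: 0.5s, 1s, 2s, ... up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

Retrying like this is only safe when the operation is idempotent, since a request that appeared to fail may have partially or fully succeeded on the server.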
Graceful Degradation
Compensating behavior
• Example: Serve stale data from cache or switch to read-only mode.
Last resort
• Please try again
• Never show:
Alternate path
• Example: Return a message that says the transaction is in process and the charge confirmation will come later by e-mail.
Omission
• Example: Browsing items might not include inventory count.
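Serving stale data from cache is one of the simpler degradations to implement. The Python sketch below is a minimal illustration under stated assumptions: an in-memory dictionary stands in for the cache, and the StaleCache class and its get method are hypothetical names. When the backing fetch fails and an older value exists, the caller receives that value along with a flag saying it is stale.

```python
import time


class StaleCache:
    """In-memory cache that serves expired entries when the source fails."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, stored_at)

    def get(self, key, fetch):
        """Return (value, is_stale). fetch is called only when needed."""
        entry = self.entries.get(key)
        now = self.clock()
        if entry and now - entry[1] < self.ttl:
            return entry[0], False  # still fresh, no call to the source
        try:
            value = fetch()
        except Exception:
            if entry:
                return entry[0], True  # degrade: serve the stale value
            raise  # nothing cached, so there is no fallback
        self.entries[key] = (value, now)
        return value, False
```

The is_stale flag lets the caller decide how to present the result, for example by omitting fields such as inventory count that are misleading when out of date.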
Transient Fault Handling Application Block
using System;
using Microsoft.Practices.TransientFaultHandling;
using Microsoft.Practices.EnterpriseLibrary.Common.Configuration;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling;
...
// Get an instance of the RetryManager class.
var retryManager = EnterpriseLibraryContainer.Current.GetInstance<RetryManager>();

// Create a retry policy that uses a retry strategy from the configuration.
var retryPolicy = retryManager.GetRetryPolicy<StorageTransientErrorDetectionStrategy>("Incremental Retry Strategy");

// Receive notifications about retries.
retryPolicy.Retrying += (sender, args) =>
{
    // Log details of the retry.
    var msg = String.Format("Retry - Count:{0}, Delay:{1}, Exception:{2}",
        args.CurrentRetryCount, args.Delay, args.LastException);
    // Pass msg to your logging handler of choice…. And choose it wisely!
};

try
{
    // Do some work that may result in a transient fault.
    var blobs = retryPolicy.ExecuteAction(() =>
    {
        // Call a method that uses Windows Azure storage and which may
        // throw a transient exception. Returning the result lets the
        // call bind to the overload that yields a value for blobs.
        return this.container.ListBlobs();
    });
}
catch (Exception)
{
    // All the retries failed.
}
<RetryPolicyConfiguration
    defaultRetryStrategy="Fixed Interval Retry Strategy"
    defaultSqlConnectionRetryStrategy="Backoff Retry Strategy"
    defaultSqlCommandRetryStrategy="Incremental Retry Strategy"
    defaultAzureStorageRetryStrategy="Fixed Interval Retry Strategy"
    defaultAzureServiceBusRetryStrategy="Fixed Interval Retry Strategy">
  <incremental name="Incremental Retry Strategy" retryIncrement="00:00:01" retryInterval="00:00:01" maxRetryCount="10" />
  <fixedInterval name="Fixed Interval Retry Strategy" retryInterval="00:00:01" maxRetryCount="10" />
  <exponentialBackoff name="Backoff Retry Strategy" minBackoff="00:00:01" maxBackoff="00:00:30" deltaBackoff="00:00:10" maxRetryCount="10" fastFirstRetry="false" />
</RetryPolicyConfiguration>
Circuit Breaker Pattern
Used in conjunction with timeouts
• Counter based action
• Often activates admission control with metering to allow recovery
• Often activates alternative pathway
Always alert
• Mitigations should have monitored counters too
Used to combat slow responses
• Instrument all calls with timers
Circuit Breaker Pattern
Fallbacks
Custom Fallback
• Client library can provide an invocable callback method
• Can also use locally available data on API server (cookie or cache) to generate a fallback response
Fail Fast
• When data is required and there's no good fallback
• Negative UX impact, but keeps API healthy
Fail Silent
• Return a null value. Useful if the data is optional
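The counter-based breaker and the three fallback styles can be combined in one small class. The Python sketch below is a simplified illustration, not the session's implementation: CircuitBreaker, CircuitOpenError, and the default thresholds are hypothetical. After a set number of consecutive failures the breaker opens and skips the call. A supplied fallback covers the custom fallback case, a fallback that returns None is fail silent, and omitting the fallback is fail fast.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and no fallback was supplied."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0      # consecutive failure counter
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation, fallback=None):
        is_open = (self.opened_at is not None
                   and self.clock() - self.opened_at < self.reset_timeout)
        if is_open:
            # Open: skip the call so the struggling dependency can recover.
            if fallback is not None:
                return fallback()
            raise CircuitOpenError("circuit is open")
        try:
            # Closed, or half-open after reset_timeout: try the real call.
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            if fallback is not None:
                return fallback()
            raise
        self.failures = 0
        self.opened_at = None
        return result
```

A production breaker would also raise an alert whenever opened_at is set and expose its counters to monitoring; both are omitted here to keep the sketch short.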
Throttling – Considerations
• Design the strategy early on
• Perform quickly
• Return a specific error code
• Can be used together with auto-scaling
• Consider aggressive auto-scaling if demand grows very quickly
Throttling – When to use
• Ensure a system meets SLA
• Handle burst activity
• Prevent a single tenant from monopolizing
• Help cost-optimize a system by limiting the maximum resource levels
• Combine with auto-scaling
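The considerations above (decide quickly, return a specific error code, limit each tenant) can be illustrated with a fixed-window counter. The Python sketch below is an assumption-laden illustration rather than any library's API: FixedWindowThrottle and its check method are hypothetical names, the in-memory dictionary stands in for a shared store, and 429 Too Many Requests with a Retry-After header is one reasonable choice of specific error code.

```python
import time


class FixedWindowThrottle:
    """Per-identifier request counter over a fixed time window."""

    def __init__(self, limit_for, period_seconds, clock=time.monotonic):
        self.limit_for = limit_for  # callable: identifier -> allowed requests
        self.period = period_seconds
        self.clock = clock
        self.windows = {}  # identifier -> (window_start, count)

    def check(self, identifier):
        """Return (status_code, headers) for one incoming request."""
        now = self.clock()
        start, count = self.windows.get(identifier, (now, 0))
        if now - start >= self.period:
            start, count = now, 0  # the previous window has expired
        if count >= self.limit_for(identifier):
            retry_after = max(1, int(start + self.period - now))
            self.windows[identifier] = (start, count)
            return 429, {"Retry-After": str(retry_after)}
        self.windows[identifier] = (start, count + 1)
        return 200, {}
```

Because limit_for is a callable, one tenant or IP address can be given a larger allowance than the default without changing the throttle itself.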
WebApiContrib Throttling Handler
Allow 60 requests per hour for all users:
config.MessageHandlers.Add(new ThrottlingHandler(
    new InMemoryThrottleStore(),
    id => 60,
    TimeSpan.FromHours(1)));

Allow 5000 requests per hour for a given IP and 60 for everyone else:
config.MessageHandlers.Add(new ThrottlingHandler(
    new InMemoryThrottleStore(),
    id =>
    {
        if (id == "10.0.0.1")
        {
            return 5000;
        }
        return 60;
    },
    TimeSpan.FromHours(1)));
Source: http://blog.maartenballiauw.be/post/2013/05/28/Throttling-ASPNET-Web-API-calls.aspx
WebApiContrib Throttling Handler
Override to tailor to your needs:
public class MyThrottlingHandler : ThrottlingHandler
{
    // ...
    protected override string GetUserIdentifier(HttpRequestMessage request)
    {
        // your user id generation logic here
    }
}
Additional Resources
• Designing Evolvable Web APIs with ASP.NET, Glenn Block, et al.
• Roy Fielding's Dissertation on REST
• Richardson Maturity Model, Martin Fowler
• RESTful Web APIs, Leonard Richardson, et al.
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.