
6/21/2000 Edward Chow Advanced Load Balancing/Web Systems Edward Chow Department of Computer Science...

Date post: 21-Dec-2015
Transcript
  • Slide 1
  • 6/21/2000 Edward Chow Advanced Load Balancing/Web Systems Edward Chow Department of Computer Science University of Colorado at Colorado Springs
  • Slide 2
  • Load Balancing/Web System, 6/21/2000, Edward Chow. Outline of the Talk
      • Trends in web systems
      • Web switches and their support for advanced web system features
      • Load balancing research
          • Load balancing algorithms research
          • Network bandwidth measurement research
          • Web server status research
  • Slide 3
  • Readings/References
      • Application-level solution: Apache/Jserv/Servlet
      • Kernel-level load balancing solution: http://www.linuxvirtualserver.org/
          • Joseph Marks presentation
          • LVS-NAT (Network Address Translation) web page
          • LVS-IP Tunnel web page
          • LVS-DR (Direct Routing) web page
      • Hardware solution: Foundry ServerIron Installation and Configuration Guide, May 2000.
  • Slide 4
  • Trends in Web Systems
      • Increasing the availability, performance, and manageability of web sites.
      • High performance through multiple servers connected by high-speed networks.
      • High availability (HA): 7x24 network services.
      • Reliable/efficient content routing and content distribution.
      • Emerging network-centric storage networks.
      • Emerging Linux Virtual Server library for low-cost HA web systems.
  • Slide 5
  • Networkshop's Prediction: "Already, load-balancers are overcoming the inherent one-to-one nature of the network and distributing queries across tuned servers -- GIFs to a machine with a huge RAM cache, processing to servers with fibre-channel-attached databases. I suspect we'll see content routing as a full-fledged concept in Las Vegas next spring." (Networkshop News, 10/1999)
  • Slide 6
  • Load Balancing Systems
      • Cheap solution: Linux/LVS as a load balancer distributing requests to back-end real servers.
      • Medium-price solution: Microsoft Server Cluster; Zeus Load Balancer.
      • High performance: web switches (special hardware) from ArrowPoint (Cisco), Foundry ServerIron, Alteon WebSystems, Intel XML distributor.
  • Slide 7
  • Virtual Resource Management
      • Also called server load balancing or Internet traffic management.
      • Goal: increasing the availability, performance, and manageability of web sites.
      • [Figure: April 2000 Acuitive report on 1999 VRM market share]
  • Slide 8
  • VRM Market Prediction [figure]
  • Slide 9
  • F5 VRM Solution [Diagram labels: user (london.domain.com), local DNS, Internet, router, 3-DNS, GLOBAL-SITE, BIG-IP, server array, webmaster; Site I newyork.domain.com, Site II losangeles.domain.com, Site III tokyo.domain.com]
  • Slide 10
  • BIG/ip - Delivers High Availability
      • E-commerce: ensures sites are not only up and running, but taking orders
      • Fault tolerance: eliminates single points of failure
      • Content availability: verifies servers are responding with the correct content
      • Directory & authentication: load balances multiple directory and/or authentication services (LDAP, RADIUS, and NDS)
      • Portals/search engines: using EAV, administrators perform keyword searches
      • Legacy systems: load balances services to multiple interactive services
      • Gateways: load balances gateways (SAA, SNA, etc.)
      • E-mail (POP, IMAP, SendMail): balances traffic across a large number of mail servers
  • Slide 11
  • 3DNS Intelligent Load Balancing
      • QoS load balancing: Quality of Service load balancing is the ability to select and apply different load balancing methods for different users or request types.
      • Modes of load balancing: Round Robin, Ratio, Least Connections, Random, User-defined Quality-of-Service, Round Trip Time, Completion Rate (Packet Loss), BIG/ip Packet Rate, Global Availability, HOPS, Topology Distribution, Access Control, LDNS Round Robin, Dynamic Ratio, E-Commerce
  • Slide 12
  • GLOBAL-SITE: Replicate Multiple Servers and Sites
      • File archiving engine and scheduler for automated site and server replication
      • BIG-IP controls server availability during replication and synchronization
          • Graceful shutdown for update
          • Update in a group/scheduled manner
      • FTP transfers files from GLOBAL-SITE to target servers (agent-free, scalable)
      • RCE for source control
      • No client-side software
      • Complete, turnkey system (appliance)
      • (adapted from F5 presentation)
  • Slide 13
  • Content Distribution
      • Secure, automated content/application distribution to single-site (multiple-server) or wide-area Internet sites.
      • Provides replication, synchronization, staged rollout, and rollback.
      • With revision control, transmits only updates.
      • User-defined file distribution profiles/rules.
  • Slide 14
  • Intel NetStructure
      • Routing based on XML tag (e.g., giving preferred treatment to buyers, large volume)
      • http://www.intel.com/network/solutions/xml.htm
  • Slide 15
  • [Figure] 1. Compared to Sun E450 server
  • Slide 16
  • Phobos IPXpress
      • Balances web traffic among all 4 servers.
      • Easily connects to any Ethernet network.
      • Quick setup and remote configuration.
      • Choose from six different load balancing algorithms: Round Robin, Least Connections, Weighted Least Connections, Fastest Response Time, Adaptive, Fixed.
      • Hot-standby failover port for web site uptime.
      • U.S. retail: $3495.00
  • Slide 17
  • Phobos In-Switch
      • Only load balancing switch in a PCI card form factor.
      • Plugs directly into any server PCI slot.
      • Supports up to 8,192 servers, ensuring availability and maximum performance.
      • Six different algorithms are available for optimum performance: Round Robin, Weighted Percentage, Least Connections, Fastest Response Time, Adaptive, and Fixed.
      • Provides failover to other servers for high availability of the web site.
      • U.S. retail: $1995.00
  • Slide 18
  • Foundry Networks ServerIron Internet Traffic Management Switches
      • One million concurrent connections
      • SwitchBack: also known as direct server return
      • Throughput: 64 Gbps with BigServerIron
      • Session processing: leads with 80,000 connections/sec
      • Symmetric LB: picks up the full load where the failed switch left off without losing stateful information
      • Switching capacity: BigServerIron delivers 256 Gbps of total switching capacity
  • Slide 19
  • BigServerIron
      • Supports up to 168 10/100Base-TX ports or 64 Gigabit Ethernet ports.
      • Internet IronWare supports unlimited virtual server addresses, up to 64,000 Virtual IP (VIP) addresses and 1,024 real servers.
      • Web hosting: enables network managers to define multiple VIPs and track service usage by VIP.
      • Health checks: provides Layer 3, 4, and 7 health checks, including HTTP, DNS, SMTP, POP3, IMAP4, LDAPv3, NNTP, FTP, Telnet, and RADIUS.
  • Slide 20
  • BigServerIron LB Algorithms
      • Round Robin
      • Least Connections
      • Weighted Percentage (assign a performance weight to each server)
      • Slow Start: protects the server from a surging flow of traffic at startup.
      • It can really happen!! "Ya, LVS has performed for us like a champ.. under higher volumes, I have had some problems with wlc.... for some reason LVS freaks and starts binding all traffic to one box... or at least the majority of it.. it is really wierd... but as soon as you switch to using wrr then everything worked fine... I have been using LVS for about 4 months to manage our E-Commerce cluster and I haven't had any problems other than the wlc vs wrr problem" -- Jeremy Johnson, 6/1/2000
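The selection rules named on this slide can be sketched in a few lines. This is a minimal illustration, not ServerIron's or LVS's actual implementation; the function names and the dictionaries of active-connection counts and weights are hypothetical.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in turn."""
    return cycle(servers)


def least_connections(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def weighted_least_connections(active, weight):
    """Pick the server with the lowest connections-per-weight ratio.

    A server with weight 4 is expected to carry four times the
    connections of a server with weight 1 (assumed interpretation).
    """
    return min(active, key=lambda s: active[s] / weight[s])


rr = round_robin(["web1", "web2"])
first_three = [next(rr), next(rr), next(rr)]
lc_choice = least_connections({"web1": 3, "web2": 1})
wlc_choice = weighted_least_connections(
    {"web1": 4, "web2": 3}, {"web1": 4, "web2": 1}
)
```

Note that neither connection-counting rule looks at actual server load, which is the gap the later "Load Balancing Study" slide points out.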
  • Slide 21
  • BigServerIron LB Features
      • A maximum connection limit can be set for each server.
      • Cookie Switching: directs HTTP requests to a server group based on cookie value. Used for client persistence and servlets.
      • URL Switching: directs requests based on the text of a URL string using defined policies. Different web content can be placed on different servers.
      • URL Hashing: maps the hash value of the Cookie header or the URL string to one of the real servers bound to the virtual server. This HTTP request and all future HTTP requests that contain this information then always go to the same real server.
      • URL Parsing: selects a real server by applying a pattern-matching expression to the entire URL. ServerIron supports up to 256 URL rules.
      • SSL Session ID Switching: ensures that all the traffic for an SSL transaction with a given SSL session ID always goes to the same server.
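URL hashing as described on this slide amounts to a deterministic map from a request key to one real server. The sketch below assumes SHA-256 as the hash and a simple modulo over the server list; the slide does not say which hash or mapping ServerIron uses, and `pick_server` is a hypothetical name.

```python
import hashlib


def pick_server(key, servers):
    """Map a URL string or Cookie header value to one real server.

    The same key always yields the same server as long as the
    server list is unchanged.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["real1", "real2", "real3"]
first = pick_server("/catalog/item?id=42", servers)
second = pick_server("/catalog/item?id=42", servers)
```

A plain modulo remaps most keys when a server is added or removed; that trade-off is accepted here for brevity.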
  • Slide 22
  • IronClad Security
      • NAT
      • TCP SYN attack protection: stops binding new sessions for a user-definable time frame when the rate of incoming TCP SYN packets exceeds a certain threshold.
      • Guards against Denial of Service (DoS) attacks: protects against massive numbers of uncompleted handshakes, also known as TCP SYN attacks, by monitoring and tracking unfinished connections.
      • High-performance Access Control Lists (ACLs) and extended ACLs: by using ACLs, network administrators can restrict access to specific applications from/to a given address, subnet, or port number.
      • Cisco-syntax ACLs: ServerIron supports Cisco-syntax ACLs, which enables network administrators to cut/copy/paste ACLs from their existing Cisco products.
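The SYN-rate rule on this slide (stop binding new sessions for a configurable period once the SYN rate exceeds a threshold) can be sketched with a one-second sliding window. The class name, parameters, and window length are assumptions made for illustration; the slide gives no implementation detail.

```python
from collections import deque


class SynGuard:
    """Refuse new sessions for hold_secs once SYNs exceed max_syn_per_sec."""

    def __init__(self, max_syn_per_sec, hold_secs):
        self.max_syn_per_sec = max_syn_per_sec
        self.hold_secs = hold_secs
        self.recent = deque()       # arrival times within the last second
        self.blocked_until = 0.0

    def allow(self, now):
        """Return True if a SYN arriving at time `now` may bind a session."""
        # Forget SYNs that are at least one second old.
        while self.recent and now - self.recent[0] >= 1.0:
            self.recent.popleft()
        if now < self.blocked_until:
            return False
        self.recent.append(now)
        if len(self.recent) > self.max_syn_per_sec:
            self.blocked_until = now + self.hold_secs
            return False
        return True


guard = SynGuard(max_syn_per_sec=2, hold_secs=5.0)
decisions = [guard.allow(t) for t in (0.0, 0.1, 0.2, 1.0, 6.0)]
```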
  • Slide 23
  • Session Persistence for eCommerce Transactions
      • Port Tracking: some web applications define a lead port (HTTP) and follower (SSL) ports. ServerIron ensures connections to the follower ports arrive at the same server.
      • Sticky Ports: ServerIron supports a wide variety of "sticky" connections: a client's requests for the next port or for all ports go to the same server.
      • Supports a large range of user-programmable options.
      • Mega Proxy Server Persistence: treats a range of source IP addresses as a single source to solve the persistence problem caused by certain mega proxy sites on the Internet.
      • Uses the source IP address for session persistence when the cookie is missing.
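One way to treat a range of source addresses as a single source is to key persistence on the network prefix instead of the full address. The sketch below assumes a /24 prefix; the slide does not state how ServerIron defines the range, and `persistence_key` is a hypothetical helper.

```python
import ipaddress


def persistence_key(client_ip, prefix_len=24):
    """Collapse a client address to its network address.

    All clients behind the same proxy range then share one
    persistence entry and reach the same real server.
    """
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network.network_address)


key_a = persistence_key("10.1.2.3")
key_b = persistence_key("10.1.2.200")
key_c = persistence_key("10.1.3.1")
```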
  • Slide 24
  • High Availability Services
      • Remote Backup Servers: if no local servers or applications are available, ServerIron sends client requests to remote servers.
      • HTTP Redirect: ServerIron can also use HTTP redirect to send traffic to remote servers if the requested application is not available on the local server farm.
      • Active/Standby: when deployed in Active/Standby mode, the standby load-balancing device will assume control and preserve the state of existing sessions in the event the primary load-balancing device fails.
      • Active/Active: when deployed in Active/Active mode, both load-balancing devices work simultaneously and provide a backup for each other while supporting stateful fail-over.
      • Quality of Service: network administrators can prioritize traffic based on ports, MAC, VLAN, and 802.1p attributes, e.g., grant priority to HTTP traffic over FTP.
      • Redundant hot-swappable power supplies.
  • Slide 25
  • Linux Virtual Server (LVS)
      • A virtual server is a highly scalable and highly available server built on a cluster of real servers.
      • The architecture of the cluster is transparent to end users, who see only a single virtual server.
  • Slide 26
  • LVS-NAT Configuration
      • All return traffic goes through the load balancer.
  • Slide 27
  • LVS-Tunnel Configuration
      • Real servers need to be reconfigured to handle IP-IP packets.
      • Real servers can be geographically separated, and return traffic can go through different routes.
  • Slide 28
  • LVS-Direct Routing Configuration
      • Similar to the one implemented in IBM's NetDispatcher.
      • Real servers need to configure a non-ARP alias interface with the virtual IP address, and that interface must share the same physical segment with the load balancer.
      • The load balancer only rewrites the server MAC address; the IP packet is not changed.
  • Slide 29
  • HA-LVS Configuration [figure]
  • Slide 30
  • Persistence Handling in LVS
      • Sticky connections. Examples:
          • FTP: control (port 21), data (port 20). For passive FTP, the server tells the client the port that it listens on, and the client initiates the data connection to that port. For LVS/TUN and LVS/DR, LinuxDirector is only on the client-to-server half of the connection, so it is impossible for LinuxDirector to get the port from the packet that goes to the client directly.
          • SSL session: port 443 for secure web servers and port 465 for secure mail servers; the key for the connection must be chosen/exchanged.
      • Persistent port solution:
          • When a client first accesses the service, LinuxDirector creates a template between the given client and the selected server, then creates an entry for the connection in the hash table.
          • The template expires after a configurable time, but it won't expire until all its connections expire.
          • Connections for any port from the client are sent to the same server until the template expires.
          • The timeout of persistent templates can be configured by users; the default is 300 seconds.
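The persistent-port idea can be sketched as a table of client-to-server templates with a timeout. This is a simplified model, not LVS kernel code: the class and method names are hypothetical, and it simply refreshes the expiry on every new connection rather than tracking when each individual connection ends. The 300-second default is the one stated on the slide.

```python
class PersistenceTable:
    """Keep a client on the same real server until its template expires."""

    def __init__(self, timeout=300.0):
        self.timeout = timeout
        self.templates = {}  # client_ip -> (server, expires_at)

    def lookup(self, client_ip, now, choose_server):
        """Return the server for a new connection from client_ip.

        choose_server is called only when no live template exists.
        """
        entry = self.templates.get(client_ip)
        if entry is not None and now < entry[1]:
            server = entry[0]
        else:
            server = choose_server()
        # Simplification: each new connection renews the template.
        self.templates[client_ip] = (server, now + self.timeout)
        return server


table = PersistenceTable()
s_first = table.lookup("1.2.3.4", 0.0, lambda: "real1")
s_sticky = table.lookup("1.2.3.4", 100.0, lambda: "real2")
s_expired = table.lookup("1.2.3.4", 500.0, lambda: "real2")
```

Because the key is the client address only, connections to any port (FTP data, SSL) land on the same real server while the template is alive.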
  • Slide 31
  • Performance of LVS-based Systems
      • "We ran a very simple LVS-DR arrangement with one PII-400 (2.2.14 kernel) directing about 20,000 HTTP requests/second to a bank of about 20 Web servers answering with tiny identical dummy responses for a few minutes. Worked just fine." Jerry Glomph Black, Director, Internet & Technical Operations, RealNetworks
      • "I had basically (1024) four class-Cs of virtual servers which were loadbalanced through a LinuxDirector (two, actually -- I used redundant directors) onto four real servers which each had the four different class-Cs aliased on them." Ted Pavlic
  • Slide 32
  • What is Content Intelligence? (By Erv Johnson, ArrowPoint)
      • Layer 4 (TCP): session load balancing based on IP address and TCP port; Network Address Translation (NAT); policies based on TCP port
      • Layer 3 (IP): switching on MAC address, VLANs; IP routing; 802.1 P/Q policy
      • Layer 5-7 (content): content routing based on host tag, entire URL, dynamic cookie location, file extension; # of rules; # of services; # of services per content rule
  • Slide 33
  • ArrowPoint's Content Smart Web Switch Architecture (from CCL viewgraph) [Diagram labels: control plane (content policy services), forwarding plane, switch fabric, shared memory, flow managers, Flowwall security, content-based QoS, content location services, site & server selection, LAN I/O, mapped row cache, 4 MIPS RISC CPU & 512 MB mem, 8 Mb mem, up to 16 ports, 1B hits per day]
  • Slide 34
  • Load Balancing Study
      • Current web switches do not take server load or network bandwidth directly into consideration. How can we improve them?
      • The node with the fewest connections may have the heaviest load.
      • Current wide-area load balancing does not consider the available/bottleneck bandwidth.
      • There is a lack of simulation and network planning tools for suggesting network configurations.
  • Slide 35
  • Server Load Status Collection
      • Three basic approaches:
          • Observe the response time of requests.
          • Modify web servers to report current queue/processing speed.
          • Use a web server agent to collect system data.
      • The second approach requires access to web server code/internals.
      • We have modified the Apache code (v1.3.9) to accumulate the size of pending requests (in bytes) in active child servers and divide it by the estimated processing speed.
      • Note that it is harder to estimate CGI script or Servlet processing.
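The load estimate described here, pending request bytes divided by estimated processing speed, reduces to a single ratio. The sketch below is not the modified Apache code; the function name and the choice of bytes per second as the speed unit are assumptions.

```python
def estimated_backlog_seconds(pending_bytes_per_child, bytes_per_second):
    """Estimate how long the server needs to drain its pending requests.

    pending_bytes_per_child: pending request size, in bytes, for each
    active child server.
    bytes_per_second: estimated processing speed (assumed unit).
    """
    if bytes_per_second <= 0:
        raise ValueError("processing speed must be positive")
    return sum(pending_bytes_per_child) / bytes_per_second


backlog = estimated_backlog_seconds([1000, 3000], 2000.0)
```

A larger value means a more heavily loaded server, so a dispatcher could prefer the server with the smallest estimate.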
  • Slide 36
  • Apache Server Status Report
      • Apache Server Status for gandalf.uccs.edu
      • Current Time: Wed Dec 10 00:32:51 1997
      • Restart Time: Wed Dec 10 00:32:27 1997
      • Server uptime: 24 seconds
      • Total accesses: 0 - Total Traffic: 0 kB
      • CPU Usage: u0 s0 cu0 cs0
      • 0 requests/sec - 0 B/second
      • 1 requests currently being processed, 4 idle servers...
      • Annotations: idle servers are forked web server processes with no work; requests per second (history)
  • Slide 37
  • Collecting System Statistics
      • The web server agent collects system data:
          • Run queue (#)
          • CPU idle time (%)
          • Pages scanned by page daemon (pages/s)
      • The web server agent uses vmstat 1 2: sample every 1 second, collect 2 samples.
  • Slide 38
  • Vmstat Output and Meaning
      • r: # of processes waiting to run (extent)
      • sr: # of pages scanned by the page daemon to put back on the free list
      • id: % of CPU idle time; 100 - (us + sy) = id (discrete)
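An agent that runs `vmstat 1 2` needs to pull `r`, `sr`, and `id` out of the last output line, since the first sample reports averages since boot. The sketch below parses a captured output string by column name; the sample text is made up in a Solaris-like layout for illustration, and `parse_vmstat` is a hypothetical helper, not the agent from the talk.

```python
SAMPLE = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 500000 90000   0   1  0  0  0  0  0  0  0  0  0  400  300  200  1  1 98
 2 0 0 480000 85000   3   9  0  0  0  0  7  0  0  0  0  450  900  350 20  5 75
"""


def parse_vmstat(text):
    """Return run queue, scan rate, and CPU idle from the last sample."""
    rows = [line.split() for line in text.strip().splitlines()]
    # The column-name row is the one containing both "r" and "id".
    header = next(row for row in rows if "r" in row and "id" in row)
    latest = dict(zip(header, rows[-1]))
    return {name: int(latest[name]) for name in ("r", "sr", "id")}


stats = parse_vmstat(SAMPLE)
```

In a live agent the text could come from `subprocess.run(["vmstat", "1", "2"], capture_output=True, text=True).stdout`; column names differ between operating systems (Linux `vmstat` has no `sr` column), so the lookup would need adjusting there.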
  • Slide 39
  • Network Bandwidth Measurement
      • Bottleneck bandwidth (BBw) can be measured by sending a burst of packets (of size S) and measuring the return time gap (Tg): BBw = S/Tg if there is no interference.
      • Available bandwidth (ABw) is harder to measure.
          • Cprobe (Boston University) sends a burst of packets and measures the time gap between the first and last messages.
          • Estimate ABw based on packet round-trip time or comparison with the history of round-trip times.
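The bottleneck estimate BBw = S/Tg is a direct division once the gap between returning packets is known. A minimal sketch, assuming S is given in bytes and the result is wanted in bits per second (the slide does not fix units); the function name is hypothetical.

```python
def bottleneck_bandwidth_bps(packet_size_bytes, gap_seconds):
    """Apply BBw = S / Tg, converting bytes to bits."""
    if gap_seconds <= 0:
        raise ValueError("time gap must be positive")
    return packet_size_bytes * 8 / gap_seconds


# 1500-byte packets returning 1.2 ms apart suggest roughly a 10 Mbps bottleneck.
bbw = bottleneck_bandwidth_bps(1500, 0.0012)
```

Cross traffic stretches or compresses the gap, which is why the formula holds only when there is no interference.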
  • Slide 40
  • Smart Probe Simulation Results [figure]
  • Slide 41
  • Weight Calculation
      • Rate each web server with a weight based on statistics sent from the web server agents.
      • weight of server = (19.68*rid) + (19.58*rcpu) + (19.60*rrq) + (19.64*rrps) + (17.24*rap) + (4.23*rsr)
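The weight formula is a fixed linear combination of six per-server readings. The sketch below copies the coefficients from the slide; it assumes each `r*` value is a normalized ratio between 0 and 1, which the slide does not state explicitly, and `server_weight` is a hypothetical name.

```python
# Coefficients as given on the slide; the keys are the slide's variable names.
COEFFICIENTS = {
    "rid": 19.68,
    "rcpu": 19.58,
    "rrq": 19.60,
    "rrps": 19.64,
    "rap": 17.24,
    "rsr": 4.23,
}


def server_weight(ratios):
    """Combine the agent's readings into one weight for a server."""
    return sum(COEFFICIENTS[name] * ratios[name] for name in COEFFICIENTS)


best_case = server_weight({name: 1.0 for name in COEFFICIENTS})
```

With every ratio at 1.0 the weight is 99.97, so the coefficients behave like percentages that sum to roughly 100.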
  • Slide 42
  • Weight Calculations (Example)
      • CPU idle time had an average throughput of 51.92.
      • The sum of averages for the characteristics was 265.18.
      • The relevant percentage is 51.92/265.18 = 0.1958 = 19.58%.
      • This percentage is then multiplied by the actual CPU percent idle divided by the approximate threshold (found to be 100% during the benchmarks) to get the weight: 19.58 * (CPU % idle / 100)
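The arithmetic of this example can be checked directly. The sketch assumes the coefficient is the characteristic's share of the summed averages expressed as a percentage, as the slide's numbers suggest; the function names and the 50% idle reading are illustrative only.

```python
def coefficient(average, sum_of_averages):
    """Share of one characteristic in the total, as a percentage."""
    return round(100.0 * average / sum_of_averages, 2)


def cpu_idle_term(idle_percent, coeff=19.58, threshold=100.0):
    """Contribution of CPU idle time to the server weight."""
    return coeff * (idle_percent / threshold)


cpu_coeff = coefficient(51.92, 265.18)   # the slide's 19.58
half_idle = cpu_idle_term(50.0)          # hypothetical reading of 50% idle
```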
  • Slide 43
  • Network Design/Planning Tool
      • Need realistic (self-similar) network traffic load to exercise the simulator.
      • Need tools for specifying network topology, detecting bottlenecks in web systems, and suggesting new topologies and configurations.
  • Slide 44
  • Why is the Internet hard to model?
      • It's BIG: January 2000: > 72 million hosts (1)
      • Growing rapidly: > 67% per year
      • Constantly changing
      • Traffic patterns have high variability
      • Causes of high variability: client request rates, server responses, network topology
  • Slide 45
  • Characteristics of Client Request Rate (1)
      • Client sleep time
      • Inactive off time
      • Active off time
      • Embedded references
      • (1) Barford and Crovella, Generating Representative Web Workloads for Network and Server Performance Evaluation, Boston University, BU-CS-97-006, 1997
  • Slide 46
  • Internet Traffic Request Pattern [figure]
  • Slide 47
  • Inactive Off Time
      • Time between requests (think time)
      • Uses a Pareto distribution
          • Shape parameter: α = 1.5
          • Lower bound: k = 1.0
      • To create a random variable x: u ~ U(0,1), x = k / (1.0 - u)^(1.0/α)
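Inverse-transform sampling for the Pareto think time can be sketched as follows, using the slide's parameters (shape 1.5, lower bound 1.0) and the reconstructed formula x = k / (1 - u)^(1/α). The function name is hypothetical; passing `u` explicitly keeps the function deterministic.

```python
import random


def pareto_think_time(u, k=1.0, alpha=1.5):
    """Map a uniform draw u in [0, 1) to a Pareto-distributed think time."""
    return k / (1.0 - u) ** (1.0 / alpha)


# A reproducible batch of simulated think times.
rng = random.Random(2000)
think_times = [pareto_think_time(rng.random()) for _ in range(1000)]
```

Every sample is at least k, and the heavy tail occasionally produces very long idle periods, which is one source of the high variability discussed on the earlier slide.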
  • Slide 48
  • Inactive Off Time [figure]
  • Slide 49
  • Active Off Time
      • Time between embedded references
      • Uses a Weibull distribution
          • alpha: α = 1.46 (scale parameter)
          • beta: β = 0.382 (shape parameter)
      • To create a random variable x: u ~ U(0,1), x = α (-ln(1.0 - u))^(1.0/β)
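The Weibull active off time can be sampled the same way. The sketch assumes the standard inverse CDF x = α (-ln(1 - u))^(1/β) with α as scale and β as shape, matching the parameter roles named on the slide; the function name is hypothetical.

```python
import math
import random


def weibull_active_off_time(u, alpha=1.46, beta=0.382):
    """Map a uniform draw u in [0, 1) to a Weibull-distributed gap."""
    return alpha * (-math.log(1.0 - u)) ** (1.0 / beta)


# A reproducible batch of gaps between embedded references.
rng = random.Random(2000)
gaps = [weibull_active_off_time(rng.random()) for _ in range(1000)]
```

At u = 1 - 1/e the logarithm term equals 1, so the sample equals the scale parameter α.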
  • Slide 50
  • Active Off Time [figure]
  • Slide 51
  • Example HTML Document with Embedded References [figure: CS522 F99 Home Page]
  • Slide 52
  • Embedded References [figure]
  • Slide 53
  • Server Characteristics
      • File size distribution
          • Body: lognormal distribution
          • Tail: Pareto distribution
      • Cache size
      • Temporal locality
      • Number of connections
      • System performance: CPU speed, disk access time, memory, network interface
  • Slide 54
  • File Size Distribution - Body
      • Lognormal distribution
      • Build table with 930 values
      • Range: 92
