www.numaria.com
Numaria Ltd. 6th January '09
Server Load Balancing (SLB)

Abstract

Server load balancing has become an important function both for hosting environments and a variety of other network architectures. This paper introduces the concepts behind server load balancing and explains in detail how Numaria implements server load balancing.

Load balancing was developed to address the problem of overloaded servers. A Server Load Balancer (SLB) is placed between the client and a group of servers and configured so that, although multiple servers may be on one side of the SLB, they appear to be one very large and powerful server that never goes down. The SLB takes on the IP address that the client is trying to contact, becoming a Virtual Server that directs the client to one of the servers in the load-balanced group. The SLB may be a dedicated device that uses software to perform all traffic management decisions, or it may be a multi-port switching device, such as Numaria offers, that uses hardware to perform these functions, allowing for greater performance.

This paper highlights the three areas that must be addressed to implement a successful load-balanced environment: the load balancing algorithm, the method for checking server availability, and the method of ensuring that client requests are directed to the same server when required. We discuss these issues and the solutions that Numaria delivers.

Load Balancing Algorithms

The first issue that must be addressed is how traffic is balanced between servers. As a client comes in, the SLB must determine which server to connect the client to for the session. The goal is to allocate sessions in a relatively equal fashion so that no single server receives an inordinate amount of traffic, leaving other servers idle.

The simplest algorithm is called "round-robin." When a request arrives in this mode, it is sent to the next real server in the pool of servers. This process assumes two things: that all servers are equal in power, and that all requests require the same amount of effort for the server to fulfill. When servers of different performance levels are used, a straight round-robin algorithm gives the slower servers the same load as the faster servers, which is not a good balance. The solution to having different levels of performance in the server pool is to implement a weighted round-robin algorithm, where a fractional weight is given to each server and sessions are assigned using these ratios. While this solves the inequality of session allocation, not all sessions generate the same load. The number of sessions may be equal, but the individual sessions could generate vastly different server loads. Some may be intense database queries or high-bandwidth streaming media sessions, while others may be just minimal text downloads or a small GIF file. If a disproportionate number of sessions of one type goes to a particular server, that server may bog down, resulting in poor response time or lost sessions, while other servers in the load-balanced pool remain underutilized.
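
The two rotation schemes described above can be sketched as follows. This is a minimal illustration, not Numaria's implementation: the function names are hypothetical, the weights are assumed to be small integers rather than fractions, and the weighted variant uses a "smooth" rotation that interleaves servers instead of sending bursts to the heaviest one.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in fixed rotation."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Yield server names in proportion to their integer weights.

    `weights` maps a server name to its weight (an assumption of this
    sketch; the ratios could equally be expressed as fractions).
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    while True:
        # Every server earns credit equal to its weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # The server with the most credit takes the session and pays it back.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen
```

With weights of 3 and 1, three sessions out of every four go to the faster server, which addresses unequal server power but, as noted above, not unequal session cost.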

Server Load Balancing Diagram

One solution to this problem is to try to measure server utilization. This is done either by load balancing based on the number of open sessions a server has - which is known as the "least load" algorithm - or by keeping track of the response time of each server and balancing based on fastest response. Other options include keeping a ranking of the combination of fastest response and least load, or tracking this information over time and ranking it based on whether the values are increasing or decreasing, and using these rankings to select the server to which a client session is assigned. Some devices allow for load balancing based on the URL. Besides the obvious limitation of working only for HTTP traffic, there are concerns about the delayed binding and the maintenance of the URL-to-server binding table. Each new session then requires more software processing to handle properly. As traffic increases and requires additional software features in the SLB device, the SLB can become the bottleneck. The power and the number of servers, along with the amount of network traffic, all come into play in optimizing the system's performance levels when selecting the load-balancing algorithm.
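
As a rough illustration of the utilization-based choices just described, the sketch below selects a server by fewest open sessions, by fastest response, or by a combined ranking. The `ServerStats` record, its field names, and the way the two rankings are combined (summing each server's position in both orderings) are assumptions made for the example; the text does not specify how a real device combines them.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    open_sessions: int      # sessions currently assigned to the server
    avg_response_ms: float  # recent average response time


def least_load(servers):
    """'Least load': pick the server with the fewest open sessions."""
    return min(servers, key=lambda s: s.open_sessions)


def fastest_response(servers):
    """Pick the server with the lowest measured response time."""
    return min(servers, key=lambda s: s.avg_response_ms)


def combined_rank(servers):
    """Pick the server with the best summed position in both orderings."""
    by_load = sorted(servers, key=lambda s: s.open_sessions)
    by_time = sorted(servers, key=lambda s: s.avg_response_ms)
    rank = {s.name: by_load.index(s) + by_time.index(s) for s in servers}
    return min(servers, key=lambda s: rank[s.name])
```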

One final part of SLB algorithm selection is ensuring availability. One feature an SLB must have to perform at top levels is a Maximum Session Threshold per server. This allows the system administrator to select the maximum number of sessions that will be assigned to a given server. Once a server has reached its maximum, no new sessions are assigned to it until its number of sessions drops below the threshold. The threshold control feature ensures that a server won't receive too many sessions and become overloaded. It also allows two SLB devices to run either in an Active-Passive mode, where one device is running in a standby mode for the other, or in an Active-Active mode, where both devices are acting as backups for each other and both are load balancing sessions to the same server by using different real port numbers on the servers, without danger of overload. If either SLB were to fail, the other would have enough bandwidth available to pick up the slack without causing a cascade of failures resulting in the site being down and lost revenue.
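
A per-server threshold of this kind might be applied as a filter in front of whichever algorithm is in use. The sketch below assumes a least-sessions choice among the eligible servers; the function name and the two dictionaries are hypothetical.

```python
def pick_with_threshold(sessions, max_sessions):
    """Return the least-loaded server still below its threshold.

    `sessions` maps server name -> current session count and
    `max_sessions` maps server name -> configured maximum (both are
    assumptions of this sketch). Returns None when every server is full,
    so the caller can refuse or queue the new session.
    """
    eligible = [name for name in sessions if sessions[name] < max_sessions[name]]
    if not eligible:
        return None
    return min(eligible, key=lambda name: sessions[name])
```

Returning `None` rather than picking an overloaded server is a design choice of the sketch: it keeps the overload decision visible to the caller.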

Session Persistence

The load-balancing algorithm spreads the load and risk across multiple servers; each flow from a client is processed by the algorithm and assigned accordingly. But problems abound. For example, downloading a Web page, entering information, loading a shopping cart, and purchasing items are all considered to be part of one session for a client. But for an SLB, these are considered to be tens or hundreds of individual sessions or flows. A Web page consists of many elements or objects, each of which is requested separately. Filling a shopping cart is done by viewing multiple Web pages and entering data where desired. Making a purchase requires moving from HTTP to a secure SSL mode then back again. In addition, the shopping cart information usually is stored on the same server as the SSL session. Without session persistence, the SLB would see all these flows as distinct events to be load balanced, and the shopping cart information would be scattered over the pool of servers.

The solution is to send the client to the same server each time. In an ideal world, this would be accomplished by looking at the client's IP address and matching it to previously assigned flows: if a match exists, the client is sent to the same server; if not, the load-balancing algorithm of choice assigns the client to a server. Client-to-server bindings should have a timeout feature that enables a client to visit other sites and still return and connect to the same server, without being assigned to an entirely new server and losing previously entered data.
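
A binding table with a timeout could look something like the following. This is a sketch under stated assumptions: the class name, the idle-timeout semantics (the timer restarts on every lookup), and the injectable `clock` parameter are all illustrative choices, not details taken from any product.

```python
import time


class BindingTable:
    """Client-IP-to-server bindings that expire after an idle timeout."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the sketch can be tested
        self._entries = {}          # client_ip -> (server, last_seen)

    def lookup(self, client_ip, choose_server):
        """Return the bound server, or bind a new one via `choose_server`."""
        now = self.clock()
        entry = self._entries.get(client_ip)
        if entry is not None and now - entry[1] <= self.timeout_s:
            server = entry[0]       # binding still valid: reuse it
        else:
            server = choose_server()  # no binding or expired: rebalance
        self._entries[client_ip] = (server, now)
        return server
```

A client that returns within the timeout lands on the same server; one that stays away longer is treated as new and rebalanced.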

Most sites mix applications, using HTTP for Web pages, SSL for secure transactions, and an audio or video engine for media streaming. Because each of these sessions uses different port numbers, each is considered by an SLB to be a distinct session. With Sticky Ports, however, the SSL session will be assigned the same server as the HTTP session. Assigning it to the same server is accomplished by enabling the feature during installation of the virtual server. The intelligence of the software allows for selecting a configuration associating multiple application port numbers together. When a new session arrives, the SLB looks to see if a session binding to a real server exists for the client IP address and the virtual server IP and port number combination, or any of the other virtual server port numbers in the sticky port grouping. If a binding already exists between the client and a server, then the new session is sent to the same server. If there is no current binding, then the load balancing algorithm selects to which server the client session should be sent.
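
The Sticky Port lookup described above can be sketched as a table keyed on the client IP, the virtual server IP, and the port group. The class and method names are hypothetical, and the sketch assumes a single sticky group per table for brevity.

```python
class StickyPortTable:
    """Bind a client to one server across a group of virtual server ports."""

    def __init__(self, sticky_ports, choose_server):
        self.sticky_ports = frozenset(sticky_ports)  # e.g. {80, 443}
        self.choose_server = choose_server           # load-balancing algorithm
        self.bindings = {}

    def _key(self, client_ip, virtual_ip, port):
        # Ports in the sticky group share one key; other ports stand alone.
        if port in self.sticky_ports:
            group = self.sticky_ports
        else:
            group = frozenset([port])
        return (client_ip, virtual_ip, group)

    def server_for(self, client_ip, virtual_ip, port):
        key = self._key(client_ip, virtual_ip, port)
        if key not in self.bindings:
            # No current binding: let the algorithm pick a server.
            self.bindings[key] = self.choose_server()
        return self.bindings[key]
```

With ports 80 and 443 grouped, a client's SSL session follows its HTTP session to the same server, while a session on an ungrouped port is balanced independently.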

Another issue that must be addressed is when a client goes through a proxy server. Whether as a security precaution or as a way to save public IP address numbers, some proxy servers make all traffic coming from the network they are serving appear to be originating from the same IP address. This is done using a technique known as Network Address Translation (NAT). It is possible that a client may use one IP address for HTTP traffic and another for the SSL (or other port) traffic. The SLB would see this as traffic coming from two different clients and potentially assign the supposed clients to different servers, causing shopping cart data to be unavailable for the checkout application. This problem is solved using one of two techniques: delayed binding or Intrinsic Persistence Checking.

In a delayed binding mode, the SLB actually initiates a TCP session with each new flow request. The client thinks it's talking to the end server and starts to send data to the SLB, which reads the first packet of information and looks for client-specific information. In an HTTP mode, the SLB looks for "cookies" that it or one of the servers has inserted. In an SSL mode, by comparison, the SLB looks at the SSL session ID. In either case, the SLB compares this information with its stored table of server bindings and picks the real server to which the client should go. The SLB then initiates a session with the server, looking like the client, and connects the two together. This is an extremely software-intensive process that puts a limit on the throughput of the SLB and currently works only with SSL or HTTP sessions. In addition, the Sticky Port feature must be running to ensure that the SSL and HTTP traffic goes to the same server.
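
To make the HTTP case concrete, the sketch below reads the headers of a client's first request, looks for a persistence cookie, and maps its value to a server. The cookie name `slb_id`, the function name, and the binding dictionary are hypothetical; a real device would also have to complete the TCP handshake and splice the two connections, which is not shown.

```python
from http.cookies import SimpleCookie


def server_from_first_packet(request_text, cookie_name, bindings):
    """Return the server bound to the request's persistence cookie, if any.

    `request_text` is the decoded start of an HTTP request; `bindings`
    maps cookie values to server names (both assumptions of this sketch).
    """
    header_lines = request_text.split("\r\n")[1:]  # skip the request line
    for line in header_lines:
        if not line:
            break  # blank line marks the end of the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "cookie":
            jar = SimpleCookie()
            jar.load(value.strip())
            if cookie_name in jar:
                return bindings.get(jar[cookie_name].value)
    return None
```

Even this simplified version has to parse payload text for every new flow, which hints at why the approach is software-intensive.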

Numaria's SLB uses real-time Intrinsic Persistence Checking in conjunction with the Sticky Port feature. Instead of using extrinsic information contained in the data payload, the SLB uses the intrinsic information in the packet header to know where to send client sessions. Traffic to an SLB comes from all over the Internet, meaning that, on average, traffic is coming from hundreds or even thousands of different proxy servers. Each of these proxies uses contiguous pools of IP addresses to assign to clients as they access the Internet.

We know that a client will be given an address from a specific range. We also know that these ranges of addresses never overlap. We can use this intrinsic information contained in every packet header to make a real-time, accurate decision as to which server to connect the client. This Intrinsic Persistence Checking is accomplished by applying a netmask to the client IP address and comparing the result to existing client/server bindings. If one exists already, then the client is sent to the same server; otherwise, the selected SLB algorithm will choose the server.
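
The netmask comparison can be illustrated in a few lines. The 24-bit prefix length, the function names, and the binding dictionary are assumptions chosen for the example; the appropriate mask in practice depends on the size of the proxy address pools being served.

```python
import ipaddress


def persistence_key(client_ip, prefix_len=24):
    """Apply a netmask to the client IP and return the masked address."""
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network.network_address)


def server_for(client_ip, bindings, choose_server, prefix_len=24):
    """Reuse the binding for the client's masked address, or create one."""
    key = persistence_key(client_ip, prefix_len)
    if key not in bindings:
        bindings[key] = choose_server()  # no binding yet: run the algorithm
    return bindings[key]
```

Two addresses from the same proxy pool, such as 203.0.113.17 and 203.0.113.200, mask to the same key and therefore reach the same server, using only header information.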

Comprehensive Server Checking

The last key element for implementing a successful SLB environment is comprehensive server checking designed to verify that the server is up and functioning properly. Aliveness checking entails more than pinging the device and waiting for a response, as the server could very well be up but the application that is servicing customer requests could be down. The applications and databases on the server, the connections to a backend database server, and the ability of that server to supply data must all be checked to guarantee that customers receive the highest levels of service.

We address these requirements through Comprehensive Server Checking, where the SLB sends each server a request to execute a CGI script. The CGI script checks the server-side applications and their ability to talk to backend databases. If everything is working properly, the CGI script sends an "OK" message to the SLB; otherwise, it either sends a failure message or times out. If the server still fails to respond properly after a pre-selected number of tries, it is taken out of the queue and no new sessions are sent to it. The truly unique feature here is that, regardless of what kind of application is being load-balanced on the server, administrators can configure CGI scripts to verify that the server is up and running.
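
The retry logic can be sketched as a small tracker that counts consecutive failed checks per server. The class name and its methods are hypothetical, the network request to the CGI script is left out, and returning a server to service after a later "OK" is an assumption of the sketch, since the text does not describe how servers are reinstated.

```python
class HealthTracker:
    """Track check results and remove servers after repeated failures."""

    def __init__(self, max_failures):
        self.max_failures = max_failures  # the pre-selected number of tries
        self.failures = {}
        self.in_service = set()

    def add(self, server):
        self.failures[server] = 0
        self.in_service.add(server)

    def record(self, server, response):
        """Record one check; `response` is the reply text or None on timeout."""
        if response is not None and response.strip() == "OK":
            self.failures[server] = 0
            self.in_service.add(server)   # assumption: recovery reinstates
        else:
            self.failures[server] += 1
            if self.failures[server] >= self.max_failures:
                # Out of the queue: no new sessions are sent to it.
                self.in_service.discard(server)
```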

Comprehensive Server Checking ensures that all the paths a client request can take are checked and working. If a device only sent a ping or an open and close on a port, it would only check whether a port or stack was working, not whether the application was actually working and the database available. If a device uses an external server, it is merely checking the path that server has to the servers under test - and not the path that the client would take. Either case can potentially lead to a server being marked as up and running when it is actually performing improperly for the client.

Reliability

The final piece of the puzzle is SLB reliability. The Hitless Protection system provides primary defense against control module failures by negating the need for a reboot on a control module failure. In addition, two SLB devices can be deployed so that if one has a dual control module failure, the second is available to process client requests. This is done using the Virtual Router Redundancy Protocol (VRRP). With VRRP, the backup device monitors the primary device and takes over processing traffic if a failure occurs. Finally, stateful fail-over adds a third layer of protection. As bindings between clients and servers are created and cleared, the information is sent to the backup device, which then has all the information required to pass traffic from clients to the appropriate server if a failure should occur.
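
The binding synchronization behind stateful fail-over can be pictured as the primary forwarding every create and clear event to the backup. The classes and the in-process method call standing in for the link between devices are purely illustrative; VRRP itself and the actual transport are not modeled.

```python
class BackupSLB:
    """Standby device that mirrors the primary's binding table."""

    def __init__(self):
        self.bindings = {}

    def apply(self, event):
        operation, client, server = event
        if operation == "create":
            self.bindings[client] = server
        elif operation == "clear":
            self.bindings.pop(client, None)


class PrimarySLB:
    """Active device that reports each binding change to its backup."""

    def __init__(self, backup):
        self.bindings = {}
        self.backup = backup

    def bind(self, client, server):
        self.bindings[client] = server
        self.backup.apply(("create", client, server))

    def clear(self, client):
        self.bindings.pop(client, None)
        self.backup.apply(("clear", client, None))
```

Because the backup's table tracks the primary's at every step, it can keep sending existing clients to their bound servers the moment it takes over.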

Server Load Balancing Example

By any measure, server load balancing entails much more than simply redirecting client traffic to multiple servers. In order to implement it correctly, the SLB device must have features such as Intrinsic Persistence Checking, Comprehensive Server Checking, and Stateful Fail-over Redundancy. All these capabilities need to scale as the volume of network traffic increases so that they do not become bottlenecks or potential points of failure. In response to these needs, the Load Balancing Switch provides the performance and scalability that network administrators need to maintain optimal performance, no matter how large the network. Numaria provides its customers with enabling service provider infrastructure that yields increased network reliability, improved performance, and enhanced services.

Numaria Ltd.
© 2009
Company No. 05437524