Friday, December 08, 2006

Apache: Handling Traffic

Introduction

That low-volume Website you launched a while back is becoming quite popular these days, and now you're starting to wonder how you can make your Web server a little more robust to handle the load. What options are available to you? Everything from upgrading your current server to setting up a scalable server farm is discussed here - and while written primarily with Apache in mind, the advice applies to any Web server software. Likewise, it applies equally well across server platforms, including Solaris, Linux, BSD and Darwin.

This article takes a high-level look at the various software and hardware solutions for providing a high-availability website. The methods discussed, from upgrading existing hardware to implementing load-balancing with fail-over mechanisms, can be applied to any Web server on any platform. It also touches on the pros and cons of replicating the content of a Web site across multiple local hard drives versus using "Network Attached Storage" (NAS).

Reference Infrastructure

By "reference infrastructure," we're basically talking about your current Web serving solution. Typically this will be a single Web server, or perhaps several separate servers, each handling a different part of your site - or entire Web sites of their own.

The Web server hardware and software doesn't particularly matter in the scope of this article, since the solutions discussed typically operate at a network and/or hardware level in addition to your Web server, independent of what it's running. The exceptions are the software-based solutions, but these are usually run on their own dedicated servers anyway. See below for specifics on each type and how they differ.

Handling Traffic

"Traffic" is defined as the network interactions between your Website and its viewers, and can vary from a few hits here and there to a constant pounding. The infrastructure you are currently running more than likely has a finite limit of resources with which to deal with this traffic. How gracefully you can add resources as your Web site grows, and how available your site remains to its viewers, become increasingly important - especially in demanding situations such as E-commerce, but also for popular Web sites in general. As your traffic scales up, so must your capability to meet the demand.

Several methods of scaling have emerged, ranging from making that single server more powerful than it is today, to almost plug-and-play scaling - simply adding another server to your network, using practically any class of machine.

Upgrading current hardware - The simplest method involves simply upgrading your current server, either by adding more RAM, CPUs, faster hard drives or network interfaces, or by replacing it with something new altogether. In the case of the former, you'll eventually hit a ceiling in what you can upgrade, after which you face diminishing returns. In the case of the latter, while it would certainly work, it leaves something to be desired. First, you don't gain redundancy, as you still have only one server. Secondly, why not keep the original server in use as well? Wouldn't it be better if you could just add on additional servers, protecting your investment and going beyond depreciation?

DNS "round robin" - The next logical step, and a quick and easy hack, is what is called "round robin" name resolution: specifying multiple IP addresses for a particular hostname in your zone files. While it does work, it is a crude solution. There is a practical limit to how many machines you can add to the mix this way, and it works best only if all your machines are similar in resources. If any of those machines goes down, your viewers can experience what seem like "random failures" of your Web site as they alternately hit your working servers and the dud - that is, there's no mechanism to route around machines in a down state. This method is also very inflexible in that you cannot assign particular servers to a specific task, for example, or weight the preference for a particular server.
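In a BIND zone file, round robin amounts to nothing more than repeating the A record for a hostname. A minimal sketch - the hostname and addresses here are hypothetical:

```
; Zone file fragment for example.com (hypothetical addresses).
; Multiple A records for "www"; the name server rotates the
; order of answers, spreading viewers across the three boxes.
www    IN    A    192.0.2.10
www    IN    A    192.0.2.11
www    IN    A    192.0.2.12
```

Note that if 192.0.2.11 dies, roughly a third of your viewers will still be handed that address - which is exactly the "random failures" problem described above.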

Server load-balancing - True server load-balancing (SLB) is the current way to build a wonderfully scalable and redundant server farm that is customizable to your needs and budget. The benefits of SLB include:

* Flexibility
* High-availability
* Scaling

Flexibility comes in the numerous ways in which you can configure your infrastructure, mixing and matching servers and re-directing traffic to specific servers. High-availability, or redundancy, keeps your Web site up and running by providing failover mechanisms to route around servers that may have failed or otherwise become unavailable. Finally, scaling using SLB is as easy as adding another server to your farm, be it a state-of-the-art, latest and greatest server - or an old desktop from accounting, otherwise headed for the scrap heap.

Methods of Server Load-Balancing

There are other methods of "load-balancing" your Web site which aren't load-balancing in the true sense of the definition, but rather break up the functions of your site amongst several servers. This method of resource allocation would more accurately be described as "load-sharing." An example would be having a separate Web server, CGI or application server, database server, or ad and banner server. This spreads the load from one machine to several and does scale, but doesn't quite address the redundancy and failover issues. SLB allows you to maintain this common method of resource allocation while providing the benefits associated with SLB as well. As your Website grows, you should be splitting up tasks between dedicated servers for other reasons:

* Having a server tuned for the task it has to perform
* Spreading load among several servers over one
* Security by sandboxing exploits and intrusions
* More fine-grained control over resources and upgrades

There are two basic implementations of SLB: software and hardware. Software-based SLB is software running on a server to handle the load-balancing process. Hardware-based SLB is a dedicated piece of hardware such as a router or switch, or something designed around "Application Specific Integrated Circuit" (ASIC) chips. Indeed, software-based solutions could be thought of as "more complicated hardware solutions," since they're best run on their own dedicated server, functioning as a unit - although this isn't written in stone. You could even use combinations of both. It sounds complicated, but it's logical and should become obvious if you know your Web serving demands well.

Using SLB with the aforementioned "load-sharing" method of implementing multiple, specific-purpose servers allows you to add resources specifically to the areas where you need them most. For example, suppose your site is becoming bogged down with images which are taking increasingly longer to download. To speed up your Website, you could have one server for the HTML content and have all your images loaded from a second, dedicated server. This would give your viewers the readable content very quickly, while off-loading image serving tasks. You could tune the image server by having faster I/O or larger storage. Now apply SLB to this scenario and you could add additional HTML or image servers as necessary, and transparently.
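One simple way to implement this split in Apache is to serve images from a separate hostname, each name resolving to (or load-balanced across) its own server or pool. A sketch of the idea - the hostnames and paths below are hypothetical:

```apache
# httpd.conf fragment (hypothetical hostnames and paths).
# HTML content is served from www, images from a dedicated host;
# pages reference images as http://images.example.com/...
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/html
</VirtualHost>

<VirtualHost *:80>
    ServerName images.example.com
    DocumentRoot /var/www/images
</VirtualHost>
```

Behind an SLB device, "www.example.com" and "images.example.com" would each point at a virtual IP, and you could grow either pool independently.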

This is just the tip of the iceberg. The ways in which you can set up your infrastructure with this technology in hand are limited only by your budget and imagination. There is an initial outlay of funds to implement SLB, but the rewards down the line pay for themselves, and if you have a serious Web site, it is the only real choice. Not to mention the peace of mind for management, system administrators and your Web site viewers alike.

Load-Balancing Products

As mentioned above, there are software and hardware solutions for handling SLB. Which one you use depends on your needs, but for the most part - and at a feature level - they are identical. The devil is in the details, as they say, and here is a run-down of each:

Hardware is just that - a switch, a router or other dedicated device which has no operating system or application software as you would know it, and uses ASICs to perform SLB. These devices are usually faster and more efficient than software-based products, because the ASICs that handle the processing are more specialized than your average general-purpose CPU such as an UltraSPARC or Pentium chip, and don't involve any software layers. The downside with hardware solutions is that you are limited in how much you can upgrade the device via firmware. Since the tasks they perform revolve around standardized protocols, however, this is not that critical an issue. Perhaps the nicest features of the hardware approach are its small size and virtually maintenance-free operation.

Software solutions are more flexible than hardware, at the expense of some of that speed and another server to administer and maintain. Software such as Resonate's "Central Dispatch" is indeed nice, with its flexibility and extensibility. Performance monitoring is usually also better with software solutions. You can really tweak your setup this way, and new features are easily implemented through software upgrades.

Here are some examples of each type of product, from various vendors.

Software:

* Resonate - Central Dispatch

Hardware:

* Alteon - ACEswitch
* Foundry Networks - ServerIron/XL
* Cisco - CSS Series
* F5 - BIG-IP 2000
* Extreme Networks - SummitPx1

Design Considerations

One thing that you must consider is how you are going to make your Web site content and data available to multiple servers. Basically, you have two approaches and each has its pros and cons.

Local hard drive(s) - Local hard drives in their simplest form - that is, by themselves - are a perfectly viable option if you have multiple Web servers in your farm, since SLB already gives you some redundancy; they are also the cheapest approach. You can and should increase your I/O throughput and data redundancy with a RAID array for additional protection; how far you take that approach is up to you. Keep in mind that it doesn't need to be super-resilient, but don't take risks, either. The big problem with locally attached storage is data replication: you must synchronize any changes to your Website content or data across however many servers are serving that content, and vendors of SLB products leave replication up to you. There are several ways to handle this; my favorite - and a very powerful method - is the software approach illustrated in the article "A Tutorial on Using Rsync."

"Network Attached Storage" (NAS) - If you have a lot of content or expect a large increase in demand for storage space, NAS is a great solution. An example of this type of storage solution would be the Network Appliance Filer. Think of it as an integrated, yet highly-scalable NFS server that you hang on your network. Many machines can access it, and adding hard drives for additional storage or volumes, increased redundancy and hot spares couldn't be easier. The best way to implement this type of storage is through a high-speed fiber network that is independent of your public network, which increases security and dedicates maximum bandwidth to I/O tasks. These Filers can also be clustered, so that they become redundant amongst each other. There are also many other vendors offering this type of storage, ranging in price from $20,000 to well over $250,000 depending on "how many nines" you want in your uptime percentage (e.g. "99.99999% uptime") and how much total storage space you need.
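From each Web server's point of view, NAS is just an NFS mount - every machine in the farm mounts the same export, so the content lives in exactly one place and replication disappears as a problem. A sketch of such a mount - the filer name, export and mount point are hypothetical:

```
# /etc/fstab entry on each Web server (hypothetical names/paths).
# Every server mounts the same export from the filer, so a content
# update is visible to the whole farm at once.
filer1:/vol/www    /var/www    nfs    rw,tcp,hard,intr    0 0
```

Mounting read-only on the Web servers, with writes going through a single staging host, is one way to keep stray processes from corrupting shared content.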

A hardware-based approach to replication also exists - for example, F5's GLOBAL-SITE Controller used in conjunction with their BIG-IP load-balancing switches - and it can manage both types of storage, on a global scale.

Scaling from Here

How you scale your infrastructure from here is wholly dependent on your growth needs and the budget available. You will need to analyze your Website traffic and monitor your server performance characteristics closely to see how they all interrelate. From there you may find a particular shortage of resources - CPU power on your database server, storage on your multimedia server, or the redundancy of both. Knowing your current limitations is the key to scaling and upgrading your infrastructure wisely.

When done correctly, scaling in its simplest form could mean just adding another server to your farm for increased redundancy, higher-availability and speed. This sort of linear scaling is practically plug and play.
