
High Performance Web Sites: Rule 2 - Use a Content Delivery Network (Yahoo! Developer Network blog)



April 26, 2007

The user's proximity to your web server has an impact on response times. Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user's perspective. But where should you start?

As a first step to implementing geographically dispersed content, don't attempt to redesign your web application to work in a distributed architecture. Depending on the application, changing the architecture could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this application architecture step.

Remember that 80-90% of the end-user response time is spent downloading all the components in the page: images, stylesheets, scripts, Flash, etc. This is the Performance Golden Rule, as explained in The Importance of Front-End Performance. Rather than starting with the difficult task of redesigning your application architecture, it's better to first disperse your static content. This not only achieves a bigger reduction in response times, but it's easier thanks to content delivery networks.

A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity. For example, the server with the fewest network hops or the server with the quickest response time is chosen.
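As a rough illustration of the "quickest response time" selection described above, the sketch below picks the server with the lowest measured latency. It is only a minimal sketch: the hostnames and latency figures are hypothetical, and real CDNs make this choice with DNS and routing machinery rather than a lookup like this.

```python
from typing import Dict


def pick_nearest_server(latency_ms: Dict[str, float]) -> str:
    """Return the server name with the lowest measured response time."""
    if not latency_ms:
        raise ValueError("no candidate servers")
    # min() over the keys, ordered by each server's measured latency
    return min(latency_ms, key=latency_ms.get)


# Hypothetical measurements (milliseconds) taken from one user's network.
measurements = {
    "us-east.cdn.example.com": 95.0,
    "eu-west.cdn.example.com": 22.5,
    "ap-south.cdn.example.com": 180.0,
}

nearest = pick_nearest_server(measurements)
print(nearest)
```

Selecting by fewest network hops would look the same, with hop counts in place of latencies.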

Some large Internet companies own their own CDN, but it's cost-effective to use a CDN service provider, such as Akamai Technologies, Mirror Image Internet, or Limelight Networks. For start-up companies and private web sites, the cost of a CDN service can be prohibitive, but as your target audience grows larger and becomes more global, a CDN is necessary to achieve fast response times. At Yahoo!, properties that moved static content off their application web servers to a CDN improved end-user response times by 20% or more. Switching to a CDN is a relatively easy code change that will dramatically improve the speed of your web site.

Steve Souders

[Steve Souders is Yahoo!'s Chief Performance Yahoo!. This is one in a series of Best Practices for Speeding Up Your Web Site. This article is based on Steve's book High Performance Web Sites, published by O'Reilly.]

Posted at April 26, 2007 9:03 AM


Comments

Sounds good. You guys are hosting my site. What should I do about these CDN suggestions? TIA.

Posted by: BillyG at July 25, 2007 4:59 AM

We use a CDN, but YSlow reports that we do not, only because we CNAME the Akamai URL to a match.com URL. This is a little misleading.

Posted by: Ryan at July 25, 2007 9:16 AM

I would suggest allowing users to disable this rule. for a large percentage of sites a CDN won't be an option due to cost constraints, so this just adds noise to the results.

Posted by: Jeff at July 25, 2007 10:49 AM

You can add your own CDN hostnames to YSlow as described in the FAQ. In a future release we'll have a way for these CDN preferences to be fed back into the main YSlow source code. Using Yahoo! web hosting doesn't make it a CDN. For smaller sites that can't afford a for-profit CDN service, you can try free services such as Globule, CoDeeN, and CoralCDN.

Posted by: Steve Souders at July 25, 2007 11:05 AM

Are you guys getting a kickback from Akamai or what? You give Google's front page a 'C' purely on the basis of not using a CDN. How stupid is that? My own site which is a single 11KB HTML file also gets a 'B', purely based on not using a CDN. This rule is amazingly stupid.

Posted by: Jeffrey W. Baker at July 25, 2007 1:34 PM

FAQ does explain how to add your hostname to the CDN list, so let's move on and pay attention to all the other good suggestions this tool is producing :)

Posted by: Boris Popov at July 26, 2007 10:29 AM

Wouldn't it make more sense to talk about HTTP pipelining here?
What use is a CDN when you're hitting a single hostname?

Posted by: David Mcanulty at July 26, 2007 2:17 PM

Mileage varies on CDNs. My company tried a CDN (one of the ones mentioned in the article) for a while and we found no measurable performance difference, from multiple client networks, between serving our own static files and serving them from the CDN. And, we had multiple technical issues with the CDN, where certain customers would not be able to see our images, or something would get "stuck" in the CDN cache even though it had clearly expired and needed to be refreshed from our servers. The whole experience left a pretty bad taste and the bottom line was it didn't seem to increase performance at all.

On high-traffic sites, there's clearly a benefit to moving static files OFF of application web servers and onto servers that are optimized just for serving that static content.

And I do suspect that CDNs provide benefits for sites with truly global audiences-- latency on transcontinental internet connections can be kind of high. If this is your situation, make sure your CDN has adequate nodes in the countries where your audience lives.

Posted by: David C-L at July 26, 2007 8:58 PM

Amazon's S3 is worth a mention here: It's distributed, pay-as-you-go, and very cheap. It costs me about $3/mo to host >130,000 PNG images (including traffic costs), and it's definitely helped response times.

Posted by: Nick Johnson at August 1, 2007 2:57 PM

We're using the Cachefly CDN (http://www.cachefly.com), which starts at $15 per month and we're very happy with it. Before getting our Cachefly 'Plus' plan ($99/month) we were paying $1000+ per month to one of the bigger CDN vendors. Performance is similar. If you want to customize a lot of settings I can recommend Akamai and Mirror Image.

Posted by: Jep at August 2, 2007 2:49 PM

I agree totally that there should be the option to disable this rule... YSlow is being used by myself and my development team at the moment, working on building a large ecommerce site, but when you are testing pages from a local server, or even a virtual machine, you are not going to be running a CDN. Firebug's primary use is during the development cycle of a project, and not many people would roll out their project across a CDN until it is finished. Having the rule in there, for us, makes for misleading results, as it is an automatic F for the pages we are coding.

Posted by: David Eglin at August 31, 2007 3:48 AM

Thanks for all the great feedback.

David Mcanulty: HTTP pipelining is discussed in the book. Unfortunately, the lack of support in IE and off-by-default in FF preclude it from being a viable factor.

David Eglin: That's an interesting problem. Development at Yahoo! is similar - for some components moving them to the CDN happens later in the process. But that's also a common problem: people forget to move components to the CDN. Since the page works fine, they overlook that last step of the push process. I would suggest leaving the rule on. It helps remind folks to complete the push to CDN.

Posted by: Steve Souders at September 5, 2007 10:46 AM

I've been looking into the results of YSlow on a project I'm currently working on, in which I seem to have some trouble getting the CDN hosts configured properly. The site we're talking about is definitely using a CDN (Savvis), but for some reason YSlow seems to ignore the hostname I configured for that CDN.
I have tried entering the CDN hostname in various ways, but YSlow still keeps complaining that none of the files are on a CDN.
Could anyone tell me how YSlow checks whether files are being retrieved from a CDN or not? Or help me out with how to configure the CDN hostname properly?

Posted by: Martijn at September 27, 2007 2:31 PM

A way to get around the push problem is to replace references to assets on the fly. This can be easily accomplished in Apache's httpd 2.0+ using the mod_ext_filter module and sed. Set up a regular expression to filter the web server output and replace references to assets with references that will be fetched from the CDN. Using this method there was a minimal impact on the server load on a dynamic site doing 40-60 hits/s spread across 3 nodes, and the process was completely transparent to developers. Additionally, when we needed to take the CDN off the site, it took less than a minute to comment out the line enabling the filter in httpd.conf and restart the servers. With a load-balanced site you won't even have downtime.
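The rewriting step can be sketched as follows. This is a Python illustration of the regular-expression idea only, not the mod_ext_filter and sed configuration itself; the CDN hostname, the list of file extensions, and the function name are all assumptions made for the example.

```python
import re

# Hypothetical CDN hostname; substitute the one your provider assigns.
CDN_HOST = "http://cdn.example.com"

# Matches src="/path/file.ext" or href='/path/file.ext' for a few static
# asset types. Only site-relative paths (a single leading slash) are matched.
ASSET_REF = re.compile(
    r'''(?P<attr>\b(?:src|href)=)(?P<q>["'])'''
    r'''(?P<path>/(?!/)[^"']+\.(?:png|gif|jpe?g|css|js))(?P=q)''',
    re.IGNORECASE,
)


def rewrite_asset_refs(html: str, cdn_host: str = CDN_HOST) -> str:
    """Prefix site-relative static asset references with the CDN host."""
    def repl(match):
        quote = match.group("q")
        return f"{match.group('attr')}{quote}{cdn_host}{match.group('path')}{quote}"
    return ASSET_REF.sub(repl, html)


page = '<img src="/img/logo.png"><a href="/about.html">About</a>'
print(rewrite_asset_refs(page))
```

Because the pages on disk are never changed, removing the filter restores the original references, which is what makes the approach easy to switch off.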

One thing to note is that you still have to judge the results of YSlow intelligently based on the traffic characteristics of your own site. When in doubt of YSlow, use curl to verify some of the results.

Posted by: Marcin Depinski at October 17, 2007 3:10 PM

I love YSlow and Steve's book, but I find it funny that akamai.com scores an F for the CDN rule.

Posted by: Mike at October 23, 2007 7:34 PM

Yes, we pre-load YSlow with just a few well known CDNs (Yahoo's, S3). I'll add akamai.com. A future feature will be a way for people to "nominate" their CDN so that it's automatically included in YSlow (as opposed to having to set it in your FF about:config preferences).

Posted by: Steve Souders at October 24, 2007 9:09 AM

We decided to use Amazon S3 as our CDN, which I know is not a true CDN, but it at least helps with storage, bandwidth, and parallel downloading. The problem is that now my YSlow score is awful, mainly because S3 cannot gzip items.

At the moment we have all js, css, and images on S3. Is it recommended to only host certain files (images, media) and let the server handle the js and css?

I also wanted to know if I can just use alias domains to handle the parallel download issue for JS and CSS, but keep them on the same server.

Thanks.

Posted by: Chris at October 30, 2007 7:04 AM

You should host all your static content (including scripts and stylesheets) on your CDN. Ideally S3 would gzip your scripts and stylesheets. I expect they'll add that soon.

Until then, it is possible to set headers when you store an object (resource) on S3 (see http://docs.amazonwebservices.com/AmazonS3/2006-03-01/gsg/writing-an-object.html). This could be used to achieve your goal, but you'll have to do the work. You could, for example, push a gzipped and non-gzipped version of your file to S3 (eg, module1-gzip.js and module1.js). Make sure to add the Content-Encoding: gzip header per the documentation in the previous URL. Then, when generating your HTML page serve the appropriate SCRIPT SRC value based on the Accept-Encoding request header.
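A minimal sketch of that last step, choosing the SCRIPT SRC value from the Accept-Encoding request header, might look like this. The bucket URL and function names are hypothetical, and the header parsing is deliberately simplified; the `-gzip` suffix follows the module1-gzip.js naming used above.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value allows gzip (q > 0)."""
    for item in accept_encoding.split(","):
        parts = item.strip().split(";")
        if parts[0].strip().lower() != "gzip":
            continue
        quality = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not accepted"
        return quality > 0
    return False


def script_src(base_name: str, accept_encoding: str,
               s3_base: str = "http://example-bucket.s3.amazonaws.com") -> str:
    """Build the script URL, using the pre-gzipped copy when the client accepts gzip."""
    suffix = "-gzip" if accepts_gzip(accept_encoding) else ""
    return f"{s3_base}/{base_name}{suffix}.js"


print(script_src("module1", "gzip, deflate"))
print(script_src("module1", ""))
```

The gzipped object must have been stored with the Content-Encoding: gzip header, otherwise the browser will receive compressed bytes it does not know to decompress.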

Posted by: Steve Souders at October 30, 2007 8:46 AM

I would like to second the comments several others made in reference to optioning this feature in YSlow evaluations, which otherwise I find to be interesting and often very informative and useful in performance tuning. The option to deploy your site via a CDN is not pertinent to many, many sites, particularly smaller ones obviously, and it should be permitted to be disabled so as not to skew the overall evaluation.

Why not make each of the features optional, so as to enable customized performance results: compilation, analysis, and the resulting tuning efforts?

Posted by: dwight at November 14, 2007 3:17 PM

You might want to check out Panther Express (http://www.pantherexpress.net/). It's very cost effective.

Posted by: Steve Souders at November 14, 2007 10:05 PM

Umm. We use Akamai but YSlow is not seeing it. I'll look at the FAQ. Great article in Baseline Steve.

Posted by: Mike Hardy at November 15, 2007 4:56 PM

Has anyone tried CacheFly (http://www.cachefly.com)? I've had a great experience with them so far.

Posted by: Eric at November 16, 2007 8:11 AM

SECURITY! SECURITY! SECURITY!

I am surprised that I see no mention of possible security concerns associated with the use of CDNs. Clearly, I recognize the performance gains possible from CDNs (especially when page weight is high due to a richer user experience), but I think there are possible security concerns that need to be addressed here as well.

My intent is not to be alarmist but simply to foster a dialogue so these may be addressed. I am open to sound arguments that address these concerns, but let's at least recognize them as a possible concern and proactively address them.

Images and possibly CSS files are likely candidates for a CDN, but isn't it fair to say that risks vary across all the possible static files that could be offloaded to a CDN? What about JS files? Doesn't anyone else see the possible security concerns associated with offloading JS files to another network? I mean, if the CDN is compromised by either an internal attack at the CDN or an external attack, then the entire DOM on the respective site is potentially exposed. I can think of a few very clever attacks on a site once the attacker has compromised the integrity of the JS files.

Is the position that CDN is useful to everyone unless you are a bank or have other personal user data hosted on your site?

Posted by: joe at November 23, 2007 10:23 AM

Just got to agree with everyone already mentioning how this value is tainting an otherwise great tool. Maybe rather than nixing it there could be a site "profile" selection (based on traffic, bandwidth, etc.) that determines whether it belongs in the grade. Surely for most sites it should not factor.

Posted by: ben at November 30, 2007 1:31 PM

From the beginning I knew the CDN rule would be hard to apply to all web sites. That's why I provided the option to add your own CDN (see http://developer.yahoo.com/yslow/faq.html#faq_cdn ).

Although this preference setting exists, it's still problematic. Users have to manually add their CDNs. Two people won't necessarily be comparing apples to apples if they compare their YSlow grades without identical CDN preference settings. And unless you know the CDNs used by other sites (not your own), you don't necessarily know if the YSlow score is accurate.

My plan has been to allow people to "nominate" a CDN to a backend YSlow server that would validate it was a CDN (primarily that the hostname had geographically dispersed locations). Whenever YSlow started up it would ask this backend server for the list of validated CDN hostnames and apply that to Rule 2. I almost started working on that this week, but now I don't think it'll scale well (the JSON response would be huge).

Another idea would be to do it in realtime: When YSlow runs it asks a backend YSlow service whether or not the hostnames are CDNs. Results could be cached locally in the browser for better performance. Although I say "realtime", this would actually cause the score for Rule 2 to be delayed, so the overall score would be delayed or updated. This realtime check could also be made an option, so it would only be performed when the user chooses.

Finally, I want to mention that another high priority feature is customized grades: Users will be able to alter the entire scoring algorithm for YSlow - giving rules different weights and thresholds. Users who wanted to skip the CDN rule could give it a weight of zero.
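To make the idea concrete, here is a minimal sketch of a weighted overall score in which a weight of zero removes a rule from the grade. This is not YSlow's actual scoring algorithm; the rule names, scores, and weights are hypothetical.

```python
from typing import Dict


def overall_score(rule_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-rule scores; rules with weight 0 are ignored."""
    total_weight = sum(weights.get(rule, 0) for rule in rule_scores)
    if total_weight == 0:
        raise ValueError("at least one rule needs a non-zero weight")
    weighted_sum = sum(score * weights.get(rule, 0)
                       for rule, score in rule_scores.items())
    return weighted_sum / total_weight


# Hypothetical per-rule scores for a page that is not served from a CDN.
scores = {"cdn": 0, "gzip": 90, "expires": 80}

with_cdn = overall_score(scores, {"cdn": 1, "gzip": 1, "expires": 1})
without_cdn = overall_score(scores, {"cdn": 0, "gzip": 1, "expires": 1})
print(with_cdn, without_cdn)
```

With equal weights the failing CDN rule drags the average down; setting its weight to zero leaves only the other two rules in the grade.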

Thanks for all the feedback.

-Steve Souders
creator of YSlow

Posted by: Steve Souders at December 1, 2007 12:22 PM

There are many other smaller CDN companies out there offering very competitive rates. Nice post - thumbs up!

Posted by: ValueCDN at December 26, 2007 11:13 AM

I tried using CacheFly for a small image, but then the image (which was previously cached when not on a CDN) isn't cached by the browser. Can this be a problem? Is there a solution?

Posted by: Aaron at December 30, 2007 8:32 PM

There are quite a few low-cost and budget CDN providers. That's great, because a CDN not only lowers your front-end server load, it also saves some $$ in bandwidth fees.

http://www.valuecdn.com/

Posted by: Josh at January 2, 2008 7:23 AM

You can do load balancing based on IP Deny http://www.ipdeny.com/ IP address blocks in Cidr format.

Posted by: IP Deny at January 2, 2008 1:58 PM

Hi, when you use YSlow to test your sites locally, you can add as many of your own virtual domains as CDNs as you like... but don't leave spaces between the virtual domain names! :D

Posted by: Paul at January 15, 2008 11:54 AM

Want an "A" rating for CDNs? Don't want to wait for Steve's next upgrade to YSlow?

"Grade on a curve!"

In about:config add the following preference as an INTEGER with a VALUE of "0" (zero):

extensions.firebug.yslow.pointsNotCDN

You'll still get the "Use a CDN" and all of the associated noise, but it will grade "A" and appropriately bump your score. Enjoy. Now, go read my blog; I'll be posting more details on hacking the tool to ignore CDN altogether, similar to how "Rule 8" currently works.

Posted by: Tim at February 19, 2008 6:29 AM

I have been a web developer for 10+ years and this is the first that I am hearing about "CDN". I can't believe that this idea is being sold, and to be honest this seems more of a money-making ploy than anything else. I say that for two reasons: one, broadband is faster than it has ever been, and two, if you design your web site with maximum efficiency then you should not need to bother with a "CDN".
This is just my opinion and I would love to hear what others have to say.

Posted by: Nixit at June 16, 2008 10:25 AM

Nixit... do you really think the CDN industry would exist and continue to expand if this was nothing more than a "money making ploy"?

I don't doubt your expertise as a web developer, but the reason CDNs exist is not because of poor web page construction; it is because of the network conditions that exist between the source of the content and the end user. Latency has a significant impact on the protocols that are used in the TCP/IP stack, and high latency results in slow page downloads or video stream delivery. While you are correct that the last mile to the end user is certainly better than it was, thanks to broadband connections instead of dial-up, that alone doesn't overcome the issues of latency and packet loss that directly affect throughput rates. If you can deliver the web or video content from a source closer to the end user through a CDN provider, you can reduce the impact latency has on TCP throughput and deliver the content much faster and more reliably. This is the main value that CDNs provide. But they also reduce the amount of capital investment needed to deliver content to a wide audience. Rather than having to continue to purchase servers and bandwidth for the origin website, enterprises can utilize the CDN infrastructure that is already built out and can scale as web/video delivery needs increase.
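The effect of latency on throughput can be shown with the standard upper bound for a single TCP connection: at most one window of data can be in flight per round trip, so throughput cannot exceed window size divided by round-trip time. The sketch below works that arithmetic through; the window size and round-trip times are illustrative assumptions, and the bound ignores packet loss and slow start, which only make things worse.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput, in bits per second."""
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    # One full window can be delivered per round trip, at best.
    return window_bytes * 8 / rtt_seconds


WINDOW = 65535  # bytes; the largest TCP window without window scaling

nearby = max_tcp_throughput_bps(WINDOW, 0.020)   # assumed 20 ms to a nearby CDN node
distant = max_tcp_throughput_bps(WINDOW, 0.200)  # assumed 200 ms to a distant origin

print(f"nearby:  {nearby / 1e6:.1f} Mbit/s")   # about 26.2 Mbit/s
print(f"distant: {distant / 1e6:.1f} Mbit/s")  # about 2.6 Mbit/s
```

Under these assumptions a tenfold increase in round-trip time cuts the ceiling by a factor of ten, regardless of how fast the user's broadband link is.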


Posted by: Dave at June 24, 2008 5:54 PM

The cost of a CDN for the majority of websites out there would be counterproductive. I think users should be able to ignore this rule.

Posted by: Ezrad at July 11, 2008 9:32 AM

Another vote for letting users ignore this rule. I develop web GUIs for embedded devices, none of which will ever use a CDN. There are many, many standalone webserver applications in the world and the number is growing. CDNs are irrelevant to most of these applications. Please let me switch it off so I can point at a nicer letter grade and make my boss happy.

Posted by: bill at July 23, 2008 7:52 AM

A great CDN, http://www.simpleCDN.com (I am not affiliated). Affordable and easy to setup.

Posted by: Bruce at August 5, 2008 10:50 AM



Copyright © 2008 Yahoo! Inc. All rights reserved.
