CDN integration module 5.x-1.x

published on February 16, 2010

In this article, which was in fact written in January and February 2008 (just over two years ago), I explain the benefit of using a CDN and how the then-new CDN integration module 1.x for Drupal 5 could help you do that with a cheap FTP Push CDN.
This was in fact more of a proof-of-concept module, and therefore this Drupal 5 version of the CDN integration module is no longer supported. This article has been published because it would otherwise only have been gathering dust. It will give you a better view of the history of CDN support in Drupal, i.e. how hacky this solution is in comparison with its successor, the CDN integration module for Drupal 6.

A CDN (short for Content Delivery Network) is basically a load-balanced, globally distributed static file server. Why do CDNs matter to Drupal? Because they can drastically improve its page loading performance. The page loading performance of a web site is the time it takes for the end user’s browser to download all of a page’s files and then render the page.
If you’d like to learn more about how to improve Drupal’s page loading performance, see the article I wrote about it.

How it works

When a file is uploaded to a CDN, it is sent to servers all over the planet (hence “globally distributed”): in North and South America, in Europe and in South-East Asia. These servers are superfast. More importantly, because you will be downloading files from a server that’s typically much closer to you, the latency will be much lower. The exact technique used to pick a server may also affect the result: some CDNs pick a server based on its available capacity, others based on its proximity to the client. The latter is what matters most if you want fast-loading web sites; the former is more useful in case of large downloads (videos in particular). In the rest of this article, I’m going to assume you’ve opted for a CDN that picks servers based on proximity to the client.
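To illustrate the difference between the two server-picking strategies, here is a minimal Python sketch. The server names, latencies and capacity figures are made up for the example, and a real CDN decides this at the network level (through DNS or anycast, for instance) rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeServer:
    name: str             # hypothetical location label
    latency_ms: float     # round-trip time as seen from the client
    free_capacity: float  # fraction of capacity still available (0.0 to 1.0)


def pick_by_proximity(servers):
    """Pick the server with the lowest latency to the client."""
    return min(servers, key=lambda s: s.latency_ms)


def pick_by_capacity(servers):
    """Pick the server with the most capacity left, ignoring distance."""
    return max(servers, key=lambda s: s.free_capacity)


# Made-up numbers for a client located in Europe.
SERVERS = [
    EdgeServer("europe", latency_ms=25, free_capacity=0.20),
    EdgeServer("north-america", latency_ms=110, free_capacity=0.75),
    EdgeServer("south-east-asia", latency_ms=280, free_capacity=0.50),
]

if __name__ == "__main__":
    print(pick_by_proximity(SERVERS).name)  # europe
    print(pick_by_capacity(SERVERS).name)   # north-america
```

For a page made up of many small files, the proximity-based pick wins because every file pays the latency. For one large video, the capacity-based pick may well be the better choice.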

Web sites with many images will benefit from this in particular: if you have 8 images to download and 100 ms latency, that’s only 800 ms combined latency. That might go unnoticed. However, imagine there are 40 images on a web page. That would account for 4000 ms in latency, which would definitely not go unnoticed.
In reality, a browser can perform parallel downloads, so you can’t just add up the latencies linearly. While the example I just gave is scientifically worthless, it’s sufficiently correct to illustrate that latency matters: it adds up in the end.
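For those who prefer to play with the numbers, the back-of-the-envelope calculation can be written down as a small Python sketch. It assumes that files are fetched in rounds of a fixed number of parallel downloads and that every round pays the latency once; the figure of 4 parallel downloads and the 20 ms CDN latency are assumptions for the sake of the example, not measurements.

```python
import math


def combined_latency_ms(num_files, latency_ms, parallel=1):
    """Rough latency cost of downloading num_files files.

    Simplified model: files are fetched in rounds of `parallel`
    simultaneous downloads and each round costs one latency.
    Transfer time, DNS lookups and connection reuse are ignored.
    """
    if num_files < 0 or latency_ms < 0 or parallel < 1:
        raise ValueError("need num_files >= 0, latency_ms >= 0, parallel >= 1")
    rounds = math.ceil(num_files / parallel)
    return rounds * latency_ms


if __name__ == "__main__":
    print(combined_latency_ms(8, 100))               # 800
    print(combined_latency_ms(40, 100))              # 4000
    print(combined_latency_ms(40, 100, parallel=4))  # 1000
    print(combined_latency_ms(40, 20, parallel=4))   # 200
```

Even with parallel downloads, the total still scales with the latency of a single request, which is exactly the part a nearby CDN server reduces.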

It’s also worth noting that most web sites are served from the U.S.A., with the inevitable consequence that latencies quickly grow beyond reasonable proportions. The result is that web sites without CDNs, or without static file servers in Europe, become sadly slow on the other side of the Atlantic. Digg.com, for example, has terrible latencies and terrifying page loading times (6 seconds until I can see everything properly, more than 12 seconds until everything has finished loading).

CDN integration module: Drupal + CDN made simple!

I won’t discuss any implementation challenges here: this blog post is supposed to be a quick and easy introduction and how-to. Instead, I’d like to refer you to my aforementioned article, which has a separate section explaining the challenges of integrating this module with Drupal core, which is of course a necessity.

If you want to use it on relatively low-traffic websites (less than 30 GB of static file traffic per month), I would like to recommend CacheFly, which offers a 30 GB plan for only USD 15 per month. There’s a 30-day trial, so you can also just play with it first.
They use proximity-based server picking through anycast.

Installation

  1. Download the CDN integration module.
  2. Then do what you’d do for any other module: extract it and put it in sites/all/modules.
  3. Neither Drupal 5 nor Drupal 6 has the ability to alter file URLs (I’ll make sure Drupal 7 can do this). So we’ll need to apply a core patch. If you don’t know how to do that, check Drupal’s documentation. I’ve included two patches in the CDN integration module:
    • If you only want to make the truly necessary changes: patches/d5_file_url_rewrite.patch.
    • If you also want to add JS aggregation (since the patch above affects the same part of the code, you can’t apply the separate patch for that) and put JS files at the bottom: patches/d5_file_url_rewrite_and_js_aggregation.patch.
  4. If your theme uses base_path() somewhere, you’ll have to make some minor modifications to your theme as well. A patch is provided (patches/theme_cdn_integration.patch), but it will only work for Garland or for themes with a page.tpl.php identical to Garland’s. So if you’re using a different theme, this is what you’ll have to do. Change:

    <style type="text/css">@import "<?php print base_path() . path_to_theme() ?>/extra.css";</style>
    

    to

    <style type="text/css">@import "<?php print file_url(path_to_theme() .'/extra.css') ?>";</style>
    

    i.e., wrap the URL that’s being constructed in a file_url() call.

  5. Configure the $conf array in your settings.php to include your CDN synchronization filters. I won’t elaborate on that here because it’s rather complex, and chances are that you’ll only have to replace every occurrence of wimleers.com in the example configuration with your own domain. So I’ll refer you to the included README for the details.
  6. You can use either Drupal’s cron or CDN integration’s own cron for running the synchronization. If you only have a very limited number of files to synchronize, then Drupal’s cron will be sufficient. If you’ve got many, though, synchronizing to the CDN may consume too much of the cron run’s time. In that case, it’s recommended to use CDN integration’s own cron. You can configure this at admin/settings/cdn.

    Installing CDN’s cron is easy: copy cdn_cron.php from the CDN integration module to the Drupal root. Now configure cron to make sure CDN’s cron also runs.

  7. Go to admin/logs/status. Whether or not the CDN integration module is installed correctly will be reported here.
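The idea behind the file URL rewriting in steps 3 and 4 can be sketched in a few lines of Python. This is only a conceptual illustration, not the module’s code: the function name rewrite_file_url, the set of synchronized files and the cdn.example.com domain are all hypothetical, and the assumption that files which haven’t been synchronized yet keep being served from the web server itself is mine.

```python
def rewrite_file_url(path, synced_files, cdn_base_url, base_path="/"):
    """Return the URL a page should use for a static file.

    Assumption: files already pushed to the CDN are served from
    cdn_base_url; all other files fall back to the local web server.
    """
    relative = path.lstrip("/")
    if relative in synced_files:
        return cdn_base_url.rstrip("/") + "/" + relative
    return base_path + relative


if __name__ == "__main__":
    # Hypothetical state: only extra.css has been synchronized so far.
    synced = {"themes/garland/extra.css"}
    cdn = "http://cdn.example.com"

    print(rewrite_file_url("themes/garland/extra.css", synced, cdn))
    # http://cdn.example.com/themes/garland/extra.css
    print(rewrite_file_url("themes/garland/logo.png", synced, cdn))
    # /themes/garland/logo.png
```

This also shows why a theme that builds its URLs with base_path() needs changing: a URL that never passes through the rewriting function can never end up pointing at the CDN.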

What’s next?

  • Drupal 6 support. You can expect this in a month or so.
  • UI to configure the synchronization filters.
  • Abstract the synchronization stuff into an API module, so other synchronization modules can benefit from it.
  • The remaining tasks in the issue queue.

More information