In this very brief article, I highlight the key properties of CDNs: what differentiates them and which technical implications you should keep in mind.
A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity.
It is extremely hard to decide which CDN to use. In fact, if you judge a CDN by its performance alone, it is close to impossible (see “Content Owners Struggling To Compare One CDN To Another” and “How Is CDNs Network Performance For Streaming Measured?”)!
That is why CDNs differentiate themselves through their feature sets, not through performance. Depending on your audience, the geographical spread (the number of PoPs, or points of presence, around the world) may be very important to you. A 100% uptime SLA is also nice to have: it means that the CDN guarantees that it will be online 100% of the time.
You may also choose a CDN based on the population methods it supports. There are two big categories here: push and pull. Pull requires virtually no work on your side: all you have to do is rewrite the URLs to your files, replacing your own domain name with the CDN’s domain name. The CDN will then apply the Origin Pull technique and periodically pull the files from the origin (that is, your server). How often it does so depends on how you have configured your HTTP headers (particularly the Expires header). It also depends on the software driving the CDN, as there is no standard in this field. Origin Pull may result in redundant traffic, because files are pulled from the origin server more often than they actually change, but this is a minor drawback in most situations.
Push, on the other hand, requires a fair amount of work on your part to sync files to the CDN. But you gain flexibility, because you decide when files are synced, how often, and whether any preprocessing should happen. That is much harder to do with Origin Pull CDNs. See this table for an overview:
| | Pull | Push |
|---|---|---|
| Transfer protocol | none | FTP, SFTP, WebDAV, Amazon S3 … |
| Advantages | virtually no setup | flexibility |
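
To make the "virtually no setup" of Origin Pull concrete, here is a minimal sketch of the URL rewriting step in Python. The host names `www.example.com` and `cdn.example.net` and the helper `to_cdn_url` are hypothetical placeholders, not part of any particular CDN's API; a real site would typically do this inside its templating or asset pipeline.

```python
from urllib.parse import urlsplit, urlunsplit

ORIGIN_HOST = "www.example.com"  # hypothetical: your own domain name
CDN_HOST = "cdn.example.net"     # hypothetical: the domain name your CDN gives you


def to_cdn_url(url: str, origin_host: str = ORIGIN_HOST, cdn_host: str = CDN_HOST) -> str:
    """Return `url` with the origin host replaced by the CDN host.

    URLs that point at any other host are returned unchanged.
    Note: this sketch drops an explicit port, if the URL had one.
    """
    parts = urlsplit(url)
    if parts.hostname != origin_host:
        return url
    # Only the host changes; scheme, path, query and fragment are kept.
    return urlunsplit(parts._replace(netloc=cdn_host))
```

With these assumptions, `to_cdn_url("http://www.example.com/img/logo.png")` yields a URL on the CDN host with the same path, and the CDN fetches the file from the origin the first time it is requested.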
It should also be noted that some CDNs, if not most, support both Origin Pull and one or more push methods.
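
The extra work and the extra flexibility of push can be sketched as follows. This is an illustration under stated assumptions, not a real CDN client: `sync_changed_files`, the `upload` callable (which would wrap FTP, SFTP, WebDAV or similar) and the optional `preprocess` hook are all hypothetical names. The idea is that you track which files changed yourself and push only those, optionally transforming them first.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, List


def content_hash(data: bytes) -> str:
    """Fingerprint file content so changes can be detected between syncs."""
    return hashlib.sha256(data).hexdigest()


def sync_changed_files(
    root: Path,
    synced: Dict[str, str],
    upload: Callable[[str, bytes], None],
    preprocess: Callable[[str, bytes], bytes] = lambda name, data: data,
) -> List[str]:
    """Push every file under `root` whose content changed since the last sync.

    `synced` maps a relative path to the hash of the source content that was
    last pushed; it is updated in place. `upload` is a hypothetical transport
    callable. Returns the relative paths that were pushed.
    """
    pushed = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        name = path.relative_to(root).as_posix()
        data = path.read_bytes()
        digest = content_hash(data)
        if synced.get(name) == digest:
            continue  # unchanged since the last sync: no traffic at all
        upload(name, preprocess(name, data))  # preprocessing happens before the push
        synced[name] = digest
        pushed.append(name)
    return pushed
```

Because you call this function yourself, you decide when and how often files are synced, which is exactly the control that Origin Pull does not give you.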
The last thing to consider is vendor lock-in. Some CDNs offer highly specialized features, such as video transcoding. If you then discover another CDN that is significantly cheaper, you cannot easily move, because you depend on your current CDN’s specific features.
My aim is to support the following CDNs in this thesis:
- any CDN that supports Origin Pull
- any CDN that supports FTP
- Amazon S3 and Amazon CloudFront. Amazon S3 (or Simple Storage Service in full) is a storage service that can be accessed via the web (via REST and SOAP interfaces). It is used by many other web sites and web services. It has a pay-per-use pricing model: per GB of file transfer and per GB of storage.
Amazon S3 is designed to be a storage service and only has servers in one location in the U.S. and one location in Europe. Recently, Amazon CloudFront has been added. This is a service on top of S3 (files must be on S3 before they can be served from CloudFront), which has edge servers everywhere in the world, thereby acting as a CDN.