Drupal 8 now has page caching enabled by default

published on April 9, 2015

After more than a year and probably hundreds of patches, yesterday it finally happened! As of 13:11:56 CET, April 8, 2015, Drupal 8 officially has page caching enabled by default![1] And not the same page caching as in Drupal 7: this page cache is instantly updated when something is changed.

The hundreds of patches can be summarized very simply: cache tags, cache tags, cache tags. Slightly less simple: cacheability metadata is of vital importance in Drupal 8. Without it, we’d have to do the same as in Drupal 7: whenever content is created or a comment is posted, clear the entire page cache. Yes, that is as bad as it sounds! But without that metadata, it simply isn’t possible to do better.[2]

I’ve been working on this near-full time since the end of 2013 thanks to Acquia, but obviously I didn’t do this alone — so enormous thanks to all of you who helped!

This is arguably the biggest step yet to make Drupal Fast By Default. I hate slow sites with a passion, so you can probably see why I personally see this as a big victory :)

(One could argue you could just enable Drupal 7’s page cache, but there are 3 reasons why this is inferior to Drupal 8’s page cache: a) no instantaneous updates; b) any node or comment posted causes the entire page cache to be cleared (!), c) it’s not enabled by default: many users don’t know they should enable this. Sure, you can get similar performance, but you’ll have to give up certain things, which makes it an apples vs. oranges comparison.)

Benchmark

By default, Drupal 8 is now between 2 and 200 times faster than Drupal 7 for anonymous users: Drupal 8 will respond in constant time, for Drupal 7 it depends on the complexity of the page.

On my machine (ab -c1 -n 1000, PHP 5.5.11, Intel Core i7 2.8 GHz laptop, warm caches):

Drupal 7

  • Front-page: 18.5 ms/request (55 requests/s)
  • node/1: 23.5 ms/request (43 requests/s)
  • More complex pages: easily hundreds of milliseconds, only a few requests per second.

Drupal 8

Always 6.5 ms/request (154 requests/s).[3]

Easily tenfold that on actual servers

The above is tested with a concurrency of 1 client, to ignore web server performance (Apache in my case).
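With a single client sending requests back to back, throughput is simply the inverse of the time per request. The following is a small sketch of that arithmetic, not part of the original benchmark; the function names are made up for illustration. It also checks the requests-per-day figure used later for shared hosting, assuming a constant rate over 24 hours.

```python
def requests_per_second(ms_per_request: float) -> float:
    """Throughput of one client issuing requests back to back (concurrency 1)."""
    return 1000.0 / ms_per_request


def requests_per_day(per_second: float) -> float:
    """Requests served in 24 hours at a constant rate (an idealized assumption)."""
    return per_second * 60 * 60 * 24


# Timings taken from the benchmark above; ab rounds its own reported figures,
# so the computed values can differ from the post by about one request/s.
for label, ms in [
    ("Drupal 7 front page", 18.5),
    ("Drupal 7 node/1", 23.5),
    ("Drupal 8 page cache", 6.5),
]:
    print(f"{label}: {ms} ms/request is about {requests_per_second(ms):.1f} requests/s")

print(f"100 requests/s is {requests_per_day(100):,.0f} requests/day")
```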

Rasmus Lerdorf — creator of the PHP language — also did some benchmarking, and he found that Drupal 8 was able to serve >2500 requests/second on his server, with 20 concurrent clients.

Win-win

The real beauty is that it’s a win-win: enterprise (Acquia), medium, small, tiny (hobbyist) all win:

  • Enterprise sites get very nice reverse proxy/CDN-based hosting
  • Tiny sites can easily serve 100 requests/second (>8 million requests/day) on shared hosting.

So my work was sponsored by Acquia, but benefits everyone!

People have been expressing concerns that Drupal 8 has become too complex, that it doesn’t care about site builders anymore, that it is only for enterprises, etc. I think this is a good counterexample.
Yes, we added the complexity of cacheability metadata, but that only affects developers — for whom we have good documentation. And most importantly: site builders reap the benefits: they don’t even have to think about this anymore. Manually clearing caches is a thing of the past starting with Drupal 8!

Page cache is just a built-in reverse proxy

Drupal’s page cache is just a built-in reverse proxy. It’s basically “poormansvarnish”.

Drupal 8 bubbles all cacheability metadata up along the render tree, just like JavaScript events bubble up along the DOM tree. When it reaches the tree’s root, it also bubbles up to the response level, in the form of the X-Drupal-Cache-Tags header.
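To make the bubbling idea concrete, here is a minimal sketch in Python (not Drupal's actual PHP implementation). The `RenderNode` class, the example tree and the tag names are assumptions for illustration, as is the space-separated header format: every node contributes its own cache tags, and the root ends up with the union of all tags below it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RenderNode:
    """Hypothetical stand-in for one element of a render tree."""
    name: str
    cache_tags: set[str] = field(default_factory=set)
    children: list[RenderNode] = field(default_factory=list)


def bubble_cache_tags(node: RenderNode) -> set[str]:
    """Return the node's own tags merged with the tags of all its descendants."""
    tags = set(node.cache_tags)
    for child in node.children:
        tags |= bubble_cache_tags(child)
    return tags


# Illustrative tree: a page containing a node teaser and a comment list.
page = RenderNode("page", {"config:system.site"}, [
    RenderNode("node teaser", {"node:1", "user:3"}),
    RenderNode("comments", {"comment:7"}, [
        RenderNode("comment author", {"user:3"}),
    ]),
])

# At the root, the merged tags become the value of the response header
# (assumed here to be a space-separated list).
header_value = " ".join(sorted(bubble_cache_tags(page)))
print("X-Drupal-Cache-Tags:", header_value)
```

Because the tags are merged as sets, a tag that appears in several places in the tree (such as `user:3` here) ends up in the header only once.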

The page cache uses that header to know what cache tags it should be invalidated by. And because of that, other (“real”) reverse proxies can do exactly the same. The company behind Varnish even blogged about it. And CDNs are even starting to support this exact technique out of the box, for example Fastly.
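As a rough illustration of what any tag-aware cache in front of Drupal has to do, the sketch below stores each page together with the tags from its header and drops every page that carries an invalidated tag. This is a toy model under stated assumptions, not how Drupal's page cache, Varnish or Fastly implement it: the `TaggedPageCache` class is hypothetical, and the header is assumed to be a space-separated list of tags.

```python
class TaggedPageCache:
    """Toy reverse-proxy cache that supports invalidation by cache tag."""

    def __init__(self):
        self._pages = {}        # url -> (body, frozenset of tags)
        self._urls_by_tag = {}  # tag -> set of urls carrying that tag

    def _drop(self, url):
        """Remove one page and its entries in the tag index."""
        entry = self._pages.pop(url, None)
        if entry is not None:
            for tag in entry[1]:
                self._urls_by_tag.get(tag, set()).discard(url)

    def store(self, url, body, cache_tags_header):
        """Cache a response, indexing it by every tag in its header."""
        self._drop(url)  # forget tags from an older copy of this page
        tags = frozenset(cache_tags_header.split())
        self._pages[url] = (body, tags)
        for tag in tags:
            self._urls_by_tag.setdefault(tag, set()).add(url)

    def get(self, url):
        """Return the cached body, or None on a cache miss."""
        entry = self._pages.get(url)
        return entry[0] if entry is not None else None

    def invalidate_tags(self, tags):
        """Drop all and only the pages that carry one of the given tags."""
        for tag in tags:
            for url in list(self._urls_by_tag.get(tag, ())):
                self._drop(url)


cache = TaggedPageCache()
cache.store("/", "<front page>", "node:1 node:2 config:system.site")
cache.store("/node/1", "<node 1>", "node:1 user:3")
cache.store("/node/2", "<node 2>", "node:2")

# Node 1 is edited: every page showing it is dropped, nothing else.
cache.invalidate_tags(["node:1"])
print(cache.get("/"), cache.get("/node/1"), cache.get("/node/2"))
```

The point of the design is that the cache never needs a list of URLs on which a piece of content appears; the tags recorded at store time already provide that mapping.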

Last but not least: all of Drupal 8’s integration tests use the page cache by default, which means all of our integration tests effectively verify that Drupal works correctly even when it is behind a reverse proxy!

New possibilities for small sites (and shared hosting)

On one end of the spectrum, I see great shared hosting providers starting to offer Varnish even on their smallest plans. For example: Gandi offers Varnish on their €4/month plans. If users can configure Varnish — or even better, if they pre-configure Varnish to support Drupal 8’s cache tag-based invalidation — then almost all traffic will be handled by Varnish. (Update: see the official docs on how to use Varnish + cache tags.)

For 90% or more of all sites, this would quite simply be good enough: very cheap, very fast, very flexible.[4]

I can’t wait until we see the first hosting provider offering such awesome integration out of the box!

New possibilities for enterprise sites (and enterprise hosting)

On the other end of the spectrum, enterprise hosting now gains the ability to invalidate (purge) all and only the affected pages on a CDN.[5] Without having to generate a list of URLs that a modified piece of content may appear on, and then purge those URLs. Without having to write lots of hooks to catch all the cases where said content is being modified.

At least equally important: it finally allows for caching content that previously was generated dynamically for every request, because it was a strong requirement that the information always be up-to-date.[6] With cache tag support, and strong guarantees that cache tags indeed are invalidated when necessary, such use cases can now cache the content and still be confident that updates will immediately propagate.

New possibilities for developers

Finally, the addition of cache tags and, by extension, all render cacheability metadata (cache tags, contexts and max-age) allows for greater insight and tooling when analyzing hosting, infrastructure, performance and caching problems. Previously, you had to analyze/debug a lot of code to figure out why something that was cached was not being invalidated when appropriate by said code.

Because it’s now all standardized, we can build better tools — we can even automatically detect likely problems: suspiciously frequent cache tag invalidations, suspiciously many cache tags … (but also cache contexts that cause too many variations, too low or too high maximum ages …).
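A sketch of what such automatic detection could look like, assuming a log of (timestamp, tag) invalidation events and a map from URL to cache tags is available. The function names, the log format and the thresholds are all hypothetical; no existing Drupal tool or API is implied.

```python
from collections import Counter


def invalidations_per_minute(invalidation_log, window_seconds):
    """Average invalidations per minute for each tag over the observed window."""
    counts = Counter(tag for _timestamp, tag in invalidation_log)
    minutes = window_seconds / 60
    return {tag: count / minutes for tag, count in counts.items()}


def suspiciously_frequent_tags(invalidation_log, window_seconds, max_per_minute):
    """Tags invalidated more often than the (arbitrary) threshold allows."""
    rates = invalidations_per_minute(invalidation_log, window_seconds)
    return {tag: rate for tag, rate in rates.items() if rate > max_per_minute}


def pages_with_too_many_tags(tags_by_url, max_tags):
    """URLs whose responses carry more cache tags than the (arbitrary) limit."""
    return sorted(url for url, tags in tags_by_url.items() if len(tags) > max_tags)


# Illustrative data: one tag is invalidated every 5 seconds for 10 minutes,
# another only once.
log = [(second, "node_list") for second in range(0, 600, 5)] + [(30, "node:1")]
frequent = suspiciously_frequent_tags(log, window_seconds=600, max_per_minute=6)

heavy = pages_with_too_many_tags(
    {"/": {f"node:{i}" for i in range(50)}, "/about": {"node:7"}},
    max_tags=20,
)
print(frequent, heavy)
```

A tag that is invalidated this often makes every page carrying it effectively uncacheable, which is exactly the kind of problem that was hard to spot before the metadata was standardized.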

Next steps

Warm cache performance is now excellent, but only for anonymous users.

Next week, at Drupal Dev Days Montpellier, we’ll be working on improving Drupal 8’s cold cache performance (including bootstrap and routing performance). That will also help improve performance for authenticated users.

But we have already been working for several weeks on improving performance for authenticated users. Together with the above, we should be able to outperform Drupal 7. This is the plan that Fabian Franz and I have been working towards:

  1. smartly caching partial pages for all users (including authenticated users): d.o/node/2429617, which requires cache contexts to be correct
  2. sending the dynamic, uncacheable parts of the page via a BigPipe-like mechanism: d.o/node/2429287
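Why step 1 depends on correct cache contexts can be shown with a toy fragment cache. This is a sketch under assumptions, not the design in the linked issues: the `FragmentCache` class, the request dictionaries and the context names are illustrative only. A fragment's cache key varies by the current value of each context it declares, so a missing context means one user's output is served to another.

```python
def cache_key(fragment_id, contexts, request):
    """Build a key that varies by the current value of each declared context."""
    parts = [fragment_id] + [f"{name}={request[name]}" for name in sorted(contexts)]
    return "|".join(parts)


class FragmentCache:
    """Toy cache for parts of a page, shared by all users."""

    def __init__(self):
        self._store = {}
        self.renders = 0  # how often a fragment really had to be built

    def render(self, fragment_id, contexts, request, build):
        key = cache_key(fragment_id, contexts, request)
        if key not in self._store:
            self.renders += 1
            self._store[key] = build(request)
        return self._store[key]


def menu(request):
    return f"menu for {request['user.roles']}"


def greeting(request):
    return f"Hello {request['user']}"


# Hypothetical requests: the current value of each context for two users.
alice = {"user": "alice", "user.roles": "editor"}
bob = {"user": "bob", "user.roles": "editor"}
cache = FragmentCache()

# Varies by role only: built once, shared by both editors.
menu_alice = cache.render("menu", ["user.roles"], alice, menu)
menu_bob = cache.render("menu", ["user.roles"], bob, menu)

# Varies per user: built once for each user.
greeting_alice = cache.render("greeting", ["user"], alice, greeting)
greeting_bob = cache.render("greeting", ["user"], bob, greeting)

# Missing context: Bob is served the fragment that was built for Alice.
cache.render("greeting_without_context", [], alice, greeting)
leaked = cache.render("greeting_without_context", [], bob, greeting)
print(menu_bob, greeting_bob, leaked, cache.renders)
```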

  1. That’s commit 25c41d0a6d7806b403a4c0c555f7dadea2d349f2

  2. In other words: all of this is made possible thanks to optimal cache invalidation. Yes, that quote

  3. We made the page cache faster. We went down from 8.3 ms/request (120 requests/second) when this blog post was published on April 9, to 6.5 ms/request (154 requests/second) on April 17. It should be possible to achieve 5 ms/request, or 200 requests per second. Drupal 7 is still significantly faster though, at 2.5 ms/request (on my machine, see the Benchmark section). It’s likely Drupal 8 won’t be able to match that because the early bootstrapping is heavier.

  4. And not something any other CMS offers as far as I know — if there is one, please leave a comment! 

  5. Keep an eye on the Purge module for Drupal 8. It will make it very easy to apply cache tag-based invalidation to self-hosted reverse proxies (Varnish, nginx…), but also to put your entire site behind a CDN and still enjoy instantaneous invalidations!

  6. You could already use #cache[expire] in Drupal 7, but in Drupal 8, the combination of #cache[max-age] and #cache[tags] means that you have both time-based invalidation and instantaneous tag-based invalidation. Whichever invalidation happens first, invalidates the cached data. And therefore: updates occur as expected. 
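The "whichever happens first" rule from footnote 6 can be sketched as a validity check on a cache entry. This is an illustration in Python, not Drupal's implementation; `CacheEntry`, `is_valid` and the explicit timestamps are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical cached item with both kinds of cacheability metadata."""
    data: str
    created: float    # time the entry was written, in seconds
    max_age: float    # time-based invalidation, in seconds
    tags: frozenset   # tag-based invalidation


def is_valid(entry, now, last_invalidated):
    """An entry is valid only if it has neither expired nor had a tag invalidated.

    `last_invalidated` maps a tag to the time it was most recently invalidated.
    """
    expired = now >= entry.created + entry.max_age
    tag_invalidated = any(
        last_invalidated.get(tag, float("-inf")) >= entry.created
        for tag in entry.tags
    )
    return not (expired or tag_invalidated)


entry = CacheEntry("<rendered node 1>", created=0.0, max_age=300.0,
                   tags=frozenset({"node:1"}))

print(is_valid(entry, now=100.0, last_invalidated={}))               # still fresh
print(is_valid(entry, now=100.0, last_invalidated={"node:1": 50.0}))  # tag invalidated first
print(is_valid(entry, now=301.0, last_invalidated={}))               # max-age reached first
```

An invalidation of an unrelated tag (for example `node:2`) leaves the entry valid, which is what keeps the invalidation granular.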

Comments

anavarre

Absolutely thrilled to see this shaping up! This is almost too good to be true and we can’t even yet picture the number of wins this will lead to. Great job, Wim and all!

lussoluca

Wow Wim that’s awesome! Thank you for your hard work!

we can even automatically detect likely problems: suspiciously frequent cache tag invalidations, suspiciously many cache tags … (but also cache contexts that cause too many variations, too low or too high maximum ages …).

This is a job for Webprofiler! I’ve been very busy these past few months, but let’s discuss how I can help extract those metrics. I’m planning to resume implementing new functionality soon!

Wim Leers

There’s definitely a place for Webprofiler there :)

But not for all of these things: some of these need to be tracked/monitored across many requests, not in a single request, and some of them you may even want to monitor in production. Neither of those seems to be Webprofiler’s intended usage, right?

That being said, yes, absolutely, I imagine people developing/building Drupal 8 sites will do so with Webprofiler enabled :)

lussoluca

some of these need to be tracked/monitored across many requests, not in a single request

I’m writing a set of Drupal Console commands to do statistical analysis on stored profiles so we can get metrics over time.

and some of them you may even want to monitor in production.

Yes, you are right: Webprofiler isn’t a production tool.

David Rothstein

Awesome, and congratulations! The cache invalidation work in Drupal 8 is definitely very impressive.

An important clarification, however:

Out of the box, Drupal 8 is now between 2 and 200 times faster than Drupal 7 for anonymous users: Drupal 8 will respond in constant time, for Drupal 7 it depends on the complexity of the page.

The key phrase there is “out of the box”. Specifically, these numbers come from comparing Drupal 7 with page caching turned off to Drupal 8 with page caching turned on.

If you go to the Performance page of a Drupal 7 site and click the checkbox to turn on page caching, Drupal 7’s response time for anonymous users will not depend on the complexity of the page either, and it will be faster than Drupal 8. Some rough testing I did just now (not very scientific) showed Drupal 7 at around 4 ms/request for anonymous users, and (in agreement with your numbers) Drupal 8 around 8 ms/request. So Drupal 7 still seems to be around twice as fast as Drupal 8 for cached page responses (although that could certainly change as time goes on).

Of course, “out of the box” definitely does matter in practice… which is why I’m proposing in https://www.drupal.org/node/606840 that we think seriously about turning on caching by default for Drupal 7 too.

Wim Leers

Thank you for your thoughtful comment!

Yes, absolutely, the key phrase is “out of the box”. This is why the entire intro explains how Drupal 8’s page caching is different:

  1. And not the same page caching as in Drupal 7: this page cache is instantly updated when something is changed.
  2. Without [cache tags], we’d have to do the same as in Drupal 7: whenever content is created or a comment is posted, clear the entire page cache. Yes, that is as bad as it sounds! But without that metadata, it simply isn’t possible to do better.

But since that apparently is not yet clear enough, I added a new paragraph to the intro:

(One could argue you could just enable Drupal 7’s page cache, but there are 3 reasons why this is inferior to Drupal 8’s page cache: a) no instantaneous updates; b) any node or comment posted causes the entire page cache to be cleared (!), c) it’s not enabled by default: many users don’t know they should enable this. Sure, you can get similar performance, but you’ll have to give up certain things, which makes it an apples vs. oranges comparison.)

It is also for that reason, and for the reasons I posted in that issue you linked, that I think it’s a bad idea to start enabling the page cache by default in Drupal 7.

Finally, just before you posted this comment, I updated footnote 3, to clarify that Drupal 7’s page cache is indeed faster than Drupal 8 (I actually have 2.5 ms/request, so even better than your 4 ms/request). Drupal 8 will still improve in terms of ms/request from the page cache, but is unlikely to become as fast as Drupal 7. See that footnote.

David Rothstein

Thanks, yes, the extra paragraph is helpful.

Regarding the footnotes, glad you mentioned that. I looked at the blog post right after posting my comment and saw it then — at which point I was really confused how I managed to miss it the first time, since it was 100% relevant to my comment. Good to know it wasn’t actually there the first time and that I’m not losing my mind :)

Wim Leers

Hah! Sorry for the confusion :)

Nicolas

Great work, Wim! I’m eager to test out this caching mechanism.

Wim Leers

Yes, feel free to try and break it! We’ll be doing that at Drupal Dev Days Montpellier.

I see I forgot to mention that in the blog post: we have an issue for collecting any pages that aren’t invalidated when appropriate: d.o/node/2467071. Hopefully see you there :)

Paul Hudson

Great job, again.

We’ve been using Perusio’s Nginx config as a base for our reverse proxy https://github.com/perusio/drupal-with-nginx.

I’ve kept a D8 branch on my radar for a while and look forward to folding these new Drupal headers in.

As an aside, I do wonder why Varnish is still so popular over Nginx or others. HTTPS and SPDY are becoming the de facto standard and it’s just nicer to be able to terminate them at your reverse proxy.

Wim Leers

I guess it’s because Varnish has been around for longer and has a narrower use case, which makes it clearer what to use it for?

But yeah, from what I can tell, nginx is equally awesome :) Looking forward to seeing how you let nginx take advantage of the X-Drupal-Cache-Tags header!

Joël Pittet

This is so awesome, I’m excited about this change! Granular and automatic cache invalidation! No more explaining why content hasn’t changed yet for anonymous users! Thank you so much Wim!

Wim Leers

Exactly! Taking the “WTF” out of the WWW! Don’t thank me, thank the folks who brought the cache tag concept to Drupal 8 in the first place!

Ilmari Oranen

Wow, fantastic work! I have also used (entity ID) based cache-tagging + purging for some years, but in rather limited use cases. So far the use cases required some added complexity with custom code for both Drupal 7 and Varnish (vcl-config).

Having all that working out of the box on D8 is more than I could wish for, it’s huge!

Wim Leers

That’s great to hear! :) Thanks for leaving that comment!

It’d be great if you could set up complex scenarios and see where things break down. That’s what we need to battle-harden Page Cache/reverse proxy support: https://www.drupal.org/node/2467071.

Thomas Svenson

This is just brilliant news and has been a top item on my personal wish list for the core caching system. To be able to trust that changed or added content is available instantly for site users is a real game changer.

That it will be on by default is great as I hope that means developers will code and configure with it on and thus use the same caching mechanism as is used on live production sites.

Thanks Wim and everyone else who has helped make this happen.

Wim Leers

Glad to hear it was also on your personal wish list :)

Thomas Svenson

It was one of those things that, in my mind, was very possible to automate, and that without automation means time, energy and resources wasted.

Not just having to personally remember when to manually clear the cache, but especially how much computing power is wasted today regenerating stuff that doesn’t need to be.

Basically what your news did to me was “poof” and I knew something wasteful was gone that I no longer need to “worry” about.

Thanks mate, I’ll buy the beer next time…

Wim Leers

Hah :)

Even ignoring the amount of wasted CPU cycles and therefore electricity, there’s also the enormous waste of human time. How many sites take many seconds to load? Far too often still, I am forced to watch about:blank on fast, reliable WiFi… which means it’s really the site that’s to blame, not my device or connection. Then imagine what it’s like on 3G, on a train, which often has >5-second round-trip times!

Thomas Svenson

That’s what I mean with “poof gone”! ;)

This for me is the best and purest of fruits grown from the collaboration called open sourcery…

dbeall

Great news, I’ve been wondering when I should move to D8. A great cache ready to run is certainly attractive to thousands of site developers. I’ve been milking the awesome Boost module with D6 for a long time on a shared host with excellent results. This should help bring faster upgrades to D8 for lots of sites. I know what was involved with getting Boost up to speed, this had to be lots of work as well. This is a great thread to stumble across!

Wim Leers

Boost may still be faster (because it relies only on the web server; no PHP is involved), but it’s not able to deliver instantaneous updates.

Also, with the rise of reverse proxies (such as Varnish) on cheap hosts, you’ll be able to easily outperform Boost, while still retaining instantaneous updates :)

Jim

With all of the changes in Drupal 8’s caching compared to Drupal 7, it sounds to me like support for self-hosted reverse-proxy is “in there”.

For me, I use NGINX as a web server — both as a front-end SSL terminator and as a backend to a Varnish v4 cache.

Now that Drupal 8’s here I’m trying — and so far failing — to install it on that stack.

In your post I read

If users can configure Varnish — or even better, if they pre-configure Varnish to support Drupal 8’s cache tag-based invalidation — then almost all traffic will be handled by Varnish.

Well, there’s the catch! How?

Is there any official/complete documentation on how to configure Drupal 8 properly for use with Varnish 4?

I’m specifically looking for deployment docs that DO NOT assume/use the ‘Varnish’ module. I.e., I want a config that works with out-of-the-box Drupal 8, with no additional modules installed or required.

Is that even possible with Drupal 8? And if so where are complete docs?

Your post is the closest thing I’ve found anywhere about Drupal 8 plus Varnish 4.

It’d be really helpful if you could point to any further guidance! Love to just get Drupal 8 up & running!