Drupal 8's Dynamic Page Cache

Published on 12 October, 2015

Drupal 8 now has a Dynamic Page Cache. The Page Cache module only works for anonymous users, the Dynamic Page Cache module takes that a step further: it works for any user.

Since April 8, Drupal 8 had its Page Cache enabled by default. Exactly 5 months later, on September 8, the Dynamic Page Cache1 module was added to Drupal 8, and also enabled by default.

What?

The Page Cache module caches fully rendered HTML responses — it assumes only one variant of each response exists, which is only true for anonymous users2. The innovation in 8 on top of 7’s Page Cache is the addition of cache tags, which allow one to use the Page Cache but still have instantaneous updates: no more stale content.

The Dynamic Page Cache module caches mostly rendered HTML responses — because it does not assume only a single variant exists: thanks to cache contexts, it knows the different variants that exist of each part of the page, and thus also of the final (fully assembled) page. During the rendering process, auto-placeholdering ensures that the parts of the page that are too dynamic to cache3 or are simply uncacheable are replaced with placeholders. These placeholders are only rendered at the very last moment. The Dynamic Page Cache module caches the page just before those placeholders are replaced.

So, conceptually speaking:

Page CacheDynamic Page Cache
  • URL
  • URL
  • cache contexts
↓↓
  • final response
  • cache tags
  • partial response
  • placeholders
  • cache tags

This table also illustrates quite clearly the strengths and weaknesses of both: Page Cache is faster because it contains the final response, unlike Dynamic Page Cache which only contains a partial response, but thanks to that response not yet being personalized it can be used across users.

In other words: Dynamic Page Cache saves us less work, but in doing so, allows us to apply it in many more cases.

Benchmark

Drupal 8’s Page Cache can respond in constant time. So, for anonymous users, Drupal 8 already responds in constant (amortized) time.

With Dynamic Page Cache, Drupal 8 is now able to respond in constant amortized time + the time it takes to render the placeholders. Which means that most parts of most pages can be cached in the constant part (and it doesn’t matter how much content is in there!). The parts of the page that end up as placeholders are the ones you want to optimize, and if possible, avoid. (Also see BigPipe further down.)

On my machine (ab -c1 -n 1000, PHP 5.5.11, Intel Core i7 2.8 GHz laptop, warm caches, 15 nodes with one comment each, 10 of which are listed on the frontpage):

  • Without Dynamic Page Cache
  • Front page: 61.3 ms/request (16 requests/s)
  • node/1: 92.3 ms/request (10.8 requests/s)
  • With Dynamic Page Cache
    • Front page: 38.3 ms/request (26 requests/s)
    • node/1: 73.8 ms/request (13.5 requests/s)
Histogram showing the before vs. after of the front page with Dynamic Page Cache.

Analysis

The front page only contains a single placeholder (for messages). This makes it is a strong indicator of the lowest response times to expect: roughly 38 ms on my machine.

  • ~38 ms goes to bootstrapping, routing, route access checking, and of course retrieving the cached response from the Dynamic Page Cache.4
  • Compare that with the node/1 case: 35.5 more milliseconds are spent replacing placeholders there. Which means that we do some very expensive rendering there… and that is the comment form.5

As you can see: Dynamic Page Cache actually becomes a helpful assistant while developing: it surfaces the impossible-to-cache-and-expensive-to-render parts of the page!

Note that we do have an issue to make Dynamic Page Cache run much earlier, see more about that below.

Easily tenfold that on actual servers

The above is tested with a concurrency of 1 client, to ignore web server performance (Apache in my case). See e.g. Rasmus Lerdorf’s profiling of the Page Cache.

Win-win

The real beauty is that it’s a win-win: enterprise (Acquia), medium, small, tiny (hobbyist) all win:

  • Enterprise sites get better tunability and reverse proxy/CDN-based hosting even for authenticated users
  • Tiny sites can handle more simultaneous authenticated users, and serve those faster

So my work was sponsored by Acquia, but benefits everyone!

People have been expressing concerns that Drupal 8 has become too complex, that it doesn’t care about site builders anymore, that it is only for enterprises, etc. I think this is a good counterexample — in fact an even stronger one than Page Cache being enabled by default: this was absolutely out of reach for the majority of Drupal 7 sites. Yet every Drupal 8 site will have this enabled by default.
Yes, we added the complexity of cacheability metadata, but that only affects developers — for whom we have good documentation. And most importantly: site builders reap the benefits: they don’t even have to think about this anymore. Manually clearing caches is a thing of the past starting with Drupal 8!

Drupal is very much like LEGO. Until Drupal 8, if you combined too many LEGO blocks, your Drupal site would become very slow. Setting up caching was very often prohibitively complex. Now the LEGO blocks must specify their cacheability metadata. Which allows Drupal to automate a huge portion of “making things fast”.

New possibilities for small sites (and shared hosting)

Smaller sites will be able to handle more authenticated users simultaneously. And have pages be much faster for those users.

Without any configuration.

Without any frustrating searching on the internet.

New possibilities for enterprise sites (and enterprise hosting)

On the other hand of the spectrum, enterprise hosting now gains the ability to tune: if they prefer to running caching infrastructure with enough storage capacity rather than rendering placeholders on the fly (i.e. more storage for less computation), then they can: just override the default configuration for auto-placeholdering conditions.

It also opens the door to — for the first time — caching responses for authenticated users in a CDN.

New possibilities for developers

The addition of cacheability metadata (cache tags, contexts and max-age), allow for greater insight and tooling when analyzing hosting, infrastructure, performance and caching problems. Previously, you had to analyze/debug a lot of code to figure out why something that was cached was not being invalidated when appropriate by said code.

Because it’s now all standardized, we can build better tools — we can even automatically detect likely problems: suspiciously many cache contexts, suspiciously low maximum ages, suspiciously frequent cache tag invalidations…

And finally, it makes it possible to do BigPipe-style rendering, making Drupal 8 the fastest Drupal yet:

Dynamic Page Cache in 8 vs. Authcache in 7

Dynamic Page Cache’s closest comparison in Drupal 7 is the Authcache module. It’s possible to achieve great performance with Authcache in Drupal 7, but it requires a huge amount of work: all code (contrib & custom) must be analyzed to ensure all aspects that cause variations in the output (rendered pages) are taken into account. All of these factors must then be passed to Authcache so that it can calculate its “authcache key” correctly. The “key properties” that you specify there are conceptually similar to Drupal 8’s cache contexts. However, it is very easy to miss anything, because it’s you as a site owner that is responsible for identifying all the variations (and corresponding key properties) across all modules across all pages6. By default, Authcache assumes the only thing that your site varies by, are a site’s base URL and a user’s roles. So unless your site is very simple (for example not even multilingual), you will need an enormous amount of knowledge/research in order to be able to use Authcache.

(For example, Drupal.org — which currently still runs on Drupal 7 — rolled out render caching of comments several months ago. It works very much like Authcache: you must specify the things that cause variations. One aspect (timezone) was forgotten, and it led to rather confusing broken output. If even Drupal.org’s very strong team can get it wrong, then what hope does a team with less expertise have?)

Dynamic Page Cache in 8 simply builds on the existing cacheability metadata concepts and infrastructure. There’s no need for custom hooks. There’s no possibility of something being forgotten as a site owner. It’s the individual modules’ responsibility now — and those modules know far better what they’re doing. The entire “analyze the entire codebase” phase is gone. And of course, Dynamic Page Cache also supports cache tags, so unlike Authcache in Drupal 7, responses cached in Dynamic Page Cache will actually be updated instantaneously.

In Drupal 8.0.0, Dynamic Page Cache runs after bootstrapping, routing and route access checking. Because Authcache makes certain simplifying assumptions, it can run before Drupal 7 has even fully bootstrapped, thus it is much faster most of the time7. We do have an issue to make Dynamic Page Cache run much earlier, which would result in a great speed-up: #2541284 — we will aim to do that in Drupal 8.x.0.

Finally, because it is enabled by default in Drupal 8, Dynamic Page Cache has one enormous advantage: it brings this functionality to all Drupal sites and all Drupal developers. All Drupal developers are thus heavily encouraged to specify the appropriate cache contexts in their code. Which means that Drupal 8 contrib modules will be battle-hardened by many (tens of) thousands of websites, and missing cacheability metadata will be added. Instead of an enormous burden lying on the shoulders of the Authcache site owner, a very small burden lies on the shoulders of every contrib module maintainer (and user). Drupal 8 contrib modules should have integration test coverage with Dynamic Page Cache enabled.


Historical note

Somewhere in mid-February, Dries asked me what the state was of ESI support in Drupal 8, whether partial pre-rendering was possible, and so on. I replied in two ways: in short (No, ESI is not fully supported yet8) and comprehensively, describing how full ESI support was now more feasible than ever, which other exciting things are possible with the render- and caching-related improvements, and so on. That e-mail I sent formed the foundation of Dries’ Making Drupal 8 fly blog post :)

But a side effect of me describing at a high level the exciting new possibilities got me thinking… I had spent a full year working on improving Drupal 8’s performance & cacheability. At that point, cache tags were ~90% done. Thanks to that, it had become clear that enabling page caching by default had finally become within reach (and it effectively happened two months later).

Cache contexts were also getting in better and better shape. What if… I would be able to leverage cache contexts in a similar way I was able to leverage cache tags? Complete cache tag support allowed us to enable page cache by default. What would complete cache context support allow? Cache tags capture data dependencies. Cache contexts capture (request) context dependencies. Cache tags allow for correct invalidations. Anonymous users do not get any variations, so cache tags are sufficient for them. Which means… that cache contexts would allow me to … do page caching, but for all users — i.e. also authenticated ones!

Or: how articulating an answer to a fairly specific question can make one think of other ideas :)

I started experimenting and was blown away by the results of my rough proof and concept patches. I opened the “SmartCache” ([#2429617]) issue and the “finalize cache contexts API & DX/usage, enable a leap forward in performance” ([#2429287]) issue that captured the many, many places where we’d still have to finish cache context support in order to be able to actually do what the experiment describes.

At the same time, I was talking to Fabian Franz. Together with him, I devised an algorithm (and a proof) to make it possible to bubble cache contexts, which is absolutely essential for Dynamic Page Cache.

Fast forward seven months and we’ve fixed many dozens of blockers for Dynamic Page Cache, and have landed Dynamic Page Cache itself, thanks to the hard work of many people!


  1. Originally, it was called Smart Cache. See issue #2429617↩

  2. Not every site serves the same exact responses to all their anonymous users, but most do. Therefore this is a great default. If your site serves different responses to different anonymous users, then you can just disable that module. The Page Cache therefore operates just like any other simple reverse proxy setup. â†©

  3. Because they contain user-specific or session-specific data, i.e. because they have the user or the session cache context associated. â†©

  4. That is significantly slower than 7. We still have lots of work ahead in Drupal 8.1, 8.2 etc to make bootstrapping and routing much faster. â†©

  5. We’re working on making forms cacheable in #2578855, which should then put the node/1 response time roughly on par with the front page’s response time when using Dynamic Page Cache. â†©

  6. And it always varies by the same key on all pages, even if page A only varies by role, and page B varies by role, user, interface language and organic group. So Authcache always varies by the superset of all possible variations across the entire site. Which means that AuthCache will generate many more variations, even unnecessary ones, and will therefore consume much more space in the cache back-end. â†© â†©

  7. When a user first logs in, they will get a full bootstrap, so that the user is authenticated. Since the key is the same on all pages (again, see 6), it is then able to cache this key per session and reuse it on all future requests, which means that all subsequent requests for this session will be much faster. See _authcache_builtin_cacheinc_cache_key_get()↩

  8. Technically, Drupal 7 supported ESI, but it was very, very complex, either via drupal_render_cache_set()’s ability to have cache backends modify the markup they’re being asked to cache (in other words: ESI via side effects) which links to custom endpoints in custom code to generate the dynamic content, or via the ESI module, which is complex to use and also requires custom code. “Full support” in my book means that anything can be served over ESI, transparently, without any custom code, code changes or deep technical knowledge. â†©

Erich Beyrent

9 years 3 months ago

When you say “caching responses for authenticated users in a CDN”, do you mean any CDN, including something like Amazon CloudFront?

CloudFront itself is capable of using lots of different things as cache keys, so if there’s anything in the stream to differentiate one authenticated user from another, it could use it.

To avoid security issues it would need to use something in the response, not the request, I would think. Anything in the request except maybe a cryptographic hash could be forged by an attacker.

Amazon CloudFront’s capabilities are currently far too limited to do this.

The only CDN currently capable of caching authenticated user responses is Fastly. They’re basically Varnish-as-a-Service. We need some level of control, some logic (though not much!), on the edge, in order to pull this off.

I recommend watching the talk I linked to for the full explanation.

Yes, that’s perfectly possible. Especially combined with Page Cache and Dynamic Page Cache, which lessen the load if multiple of Fastly’s (or any CDN’s) edges talk to the origin to populate that edge node’s cache.

Lostandfound

9 years 3 months ago

Wim, I just wanted to say thanks for the article. Both here and in the issue queues you always provide explanations which can be understood clearly, not just by core commiters. This really helps in developing understanding of Drupal in general, and more specifically the caching system.

Thank you! :)

And I’m very glad to hear that — that’s exactly what I’m trying to do. Sometimes that means parts of my comments can be repetitive (and perhaps annoying), but in my opinion that’s worth it if it allows more people to follow along and participate.

Thanks for letting me know :)

Thank you for excellent article. I confirm that Drupal 8 page cache works incredibly well in our Drupal 8 RC2 benchmarks. We were able to get our test 8 CPU 8GB RAM VPS to serve 1200 visitors per second (1 minute test via loader.io) with no timeouts. The results and our environment could be viewed on our blog.

Have you done more recent ab benchmarking with Drupal 8 RC2? The reason I am asking is that our ab testing gives over 100 requests per second on the homepage, while yours are much lower.

Before engaging in an actual discussion, I’d like to ask you to update your blog post to provide some missing information:

  • Did you test with anonymous or authenticated users?
  • If you tested with authenticated users, how many unique authenticated users? Do they have the same access patterns?
  • Did you disable the Dynamic Page Cache module?
  • What is this “clients per second” thing? I think you mean “500 simultaneous clients, that each perform a new request as soon as the last one finished”?
  • The tests mention “SSL port”. I suspect you mean you’re requesting those pages over HTTPS? What do the results look like without using HTTPS?
  • Note that the minimum response time for Drupal 8 is 240 ms versus 592 for Drupal 7. Both of those numbers seem much higher than they should be. Did you investigate why the minimum response time is so high?

I think it’s great you did this test, but without more precise information about the testing methodology and without determining the root cause for some of the suspicious numbers, this test raises more questions than it answers. It’d be great if you could update the blog post to update the remarks I made above, because it’d greatly improve the trustworthiness of your research :)

RK

7 years ago

Explained very well, Thank you so much for putting your time and creating such a nice tutorial. I would love to hear your thoughts on varnish and redis.