In this article, I explain what was required to integrate the Episodes page loading performance monitoring system with Drupal.
Episodes was written by Steve Souders, who is well known for his research on high-performance web sites and has authored multiple books on this subject.
The work I am doing as part of my bachelor thesis on improving Drupal's page loading performance should be practical, not theoretical. It should have a real-world impact.
To ensure that it does, I wrote the Episodes module. This module integrates the Episodes framework for timing web pages (see the “Episodes” section in my “Page loading profiling tools” article) with Drupal on several levels, all without modifying Drupal core:
- The necessary JavaScript is automatically inserted as the first tag in the head tag.
- Each Drupal behavior (in Drupal.behaviors) is automatically turned into its own episode.
- To measure the css, headerjs and footerjs episodes, you need to change a couple of lines in the page.tpl.php file of your theme. That is the only modification you have to make by hand. It is acceptable because a theme must always be tweaked for a given web site.
I actually wrote two Drupal modules: the Episodes module and the Episodes Server module. The former is the actual integration and can be used without the latter. The latter provides basic reports and can be installed on a separate Drupal web site or on the same one. It is recommended to install it on a separate Drupal web site, and preferably even on a separate web server, because it has to process a lot of data and is not optimized; optimizing it would have led me too far outside the scope of this bachelor thesis.
You could also choose not to enable the Episodes Server module and use an external web service to generate reports, but for now, no such services exist. This void will probably be filled in the next few years by the business world. It might become the subject of my master's thesis.
The goal is to measure the different episodes of loading a web page. Let me clarify that via a timeline, while referencing the HTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <link rel="shortcut icon" href="/misc/favicon.ico" type="image/x-icon" />
  <link type="text/css" rel="stylesheet" media="all" href="main.css" />
  <link type="text/css" rel="stylesheet" media="print" href="more.css" />
  <script type="text/javascript">
  <!--//--><![CDATA[//><!--
  jQuery.extend(Drupal.settings, { "basePath": "/drupal/", "more": true });
  //--><!]]>
  </script>
  <!--[if lt IE 7]>
    <link type="text/css" rel="stylesheet" media="all" href="fix-ie.css" />
  <![endif]-->
</head>
<body>
  <!-- lots of HTML here -->
</body>
</html>
The main measurement points are:
- starttime: time of requesting the web page (when the onbeforeunload event fires, the time is stored in a cookie); not in the HTML file
- firstbyte: time of arrival of the first byte of the HTML file (the JavaScript to measure this time should be as early in the HTML as possible for the highest possible accuracy); line 1 of the HTML file
- domready: when the entire HTML document is loaded, but just the HTML, not the referenced files
- pageready: when the onload event fires; this happens when all referenced files are loaded as well
- totaltime: when everything, including lazily-loaded content, is loaded (i.e. pageready + the time to lazy-load content)
The basic episodes derived from these points are:
- backend episode = firstbyte - starttime
- frontend episode = pageready - firstbyte
- domready episode = domready - firstbyte; this episode is contained within the frontend episode
- totaltime episode = totaltime - starttime; this episode contains the backend and frontend episodes
These are just the basic time measurements and episodes. It is also possible to measure, for example, the time it took to load the CSS files (the stylesheet link tags in the head; this would be the css episode) and the JavaScript in the header (the script tag in the head; this would be the headerjs episode) and in the footer (scripts at the end of the body, not shown in the sample HTML; this would be the footerjs episode). It is possible to measure just about anything you want.
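To make the arithmetic concrete, here is a minimal sketch of how the basic episodes follow from the measurement points. The function name computeEpisodes and the sample timestamps are illustrative assumptions and not part of episodes.js; only the subtraction rules come from the description above.

```javascript
// Hypothetical helper: derive the basic episodes (in ms) from measurement points.
// `marks` maps a measurement point name to a timestamp in milliseconds.
function computeEpisodes(marks) {
  var episodes = {};
  // Episodes that start at firstbyte can always be computed.
  episodes.frontend = marks.pageready - marks.firstbyte;
  episodes.domready = marks.domready - marks.firstbyte;
  // Episodes that start at starttime need the value stored by the previous page.
  if (typeof marks.starttime === "number") {
    episodes.backend = marks.firstbyte - marks.starttime;
    if (typeof marks.totaltime === "number") {
      episodes.totaltime = marks.totaltime - marks.starttime;
    }
  }
  return episodes;
}

// Made-up timestamps, for illustration only.
var sampleEpisodes = computeEpisodes({
  starttime: 1000,
  firstbyte: 1250,
  domready: 1700,
  pageready: 2400,
  totaltime: 3100
});
console.log(sampleEpisodes);
```

When starttime is missing, the sketch simply leaves out the episodes that depend on it instead of reporting a misleading value.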
For a visual example of all the above, see the figure “Results of Episodes module in the Episodes Firebug add-on”.
The episodes.js file provided with the Episodes example is in fact just a rough sample implementation that indicates what an implementation should look like. It contains several hardcoded URLs, does not measure the sensible default episodes, and contains a few bugs. In short, it is an excellent and solid start, but it needs some work to be truly reusable.
There also seems to be a bug in Episodes when used in Internet Explorer 8. It is actually a bug in Internet Explorer 8 itself: near the end of the page loading sequence, Internet Explorer 8 seems to randomly disable the window.postMessage() JavaScript function, thereby causing JavaScript errors. After searching for the cause for a while without success, I gave up and made Internet Explorer 8 also use the backwards-compatibility script (episodes-compat.js), which overrides the window.postMessage() method. The problem vanished. This is not ideal, but at least it works reliably now.
Finally, there was also a bug in the referrer matching logic: it only worked reliably in Internet Explorer and worked intermittently in Firefox, due to differences between browsers in cookie handling. Because of this bug, many backend episodes were not being measured; with the fix, they are.
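The referrer check can be sketched as follows. This is a simplified illustration, not the code from episodes.js: the cookie name EPISODES, its s (start time) and r (URL of the previous page) fields, and the function name findStartTime are assumptions made for this example. The idea is that the start time stored by the previous page is only trusted when that page's URL matches the referrer of the current page.

```javascript
// Hypothetical sketch: recover starttime from a cookie written by the previous page.
// Assumed cookie format: EPISODES=s=<timestamp>&r=<encoded URL of the previous page>
function findStartTime(cookieString, referrer) {
  var cookies = cookieString.split(";");
  for (var i = 0; i < cookies.length; i++) {
    var cookie = cookies[i].replace(/^\s+/, ""); // Strip leading whitespace.
    if (cookie.indexOf("EPISODES=") !== 0) {
      continue;
    }
    var startTime = null;
    var storedUrl = null;
    var fields = cookie.substring("EPISODES=".length).split("&");
    for (var j = 0; j < fields.length; j++) {
      if (fields[j].indexOf("s=") === 0) {
        startTime = Number(fields[j].substring(2));
      } else if (fields[j].indexOf("r=") === 0) {
        storedUrl = decodeURIComponent(fields[j].substring(2));
      }
    }
    // Only trust the start time if the page that stored it is our referrer.
    if (startTime !== null && !isNaN(startTime) && storedUrl === referrer) {
      return startTime;
    }
  }
  return null; // No usable start time: the backend episode cannot be measured.
}

// Usage with a made-up cookie string.
var exampleCookie = "has_js=1; EPISODES=s=1234567890123&r=" +
  encodeURIComponent("http://example.com/node/1");
console.log(findStartTime(exampleCookie, "http://example.com/node/1"));
console.log(findStartTime(exampleCookie, "http://example.com/elsewhere"));
```

Comparing decoded URLs rather than raw cookie text is one way to avoid depending on how a particular browser encodes cookie values; whether that matches the actual fix is not something the text above specifies.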
I improved episodes.js to make it reusable, so that I could integrate it with Drupal without adding Drupal-specific code to it. I made it so that all you have to do is something like this:
<head>

<!-- Initialize EPISODES. -->
<script type="text/javascript">
  var EPISODES = EPISODES || {};
  EPISODES.frontendStartTime = Number(new Date());
  EPISODES.compatScriptUrl = "lib/episodes-compat.js";
  EPISODES.logging = true;
  EPISODES.beaconUrl = "episodes/beacon";
</script>

<!-- Load episodes.js. -->
<script type="text/javascript" src="lib/episodes.js"></script>

<!-- Rest of head tag. -->
<!-- ... -->

</head>
All you need to do is set a few variables and load episodes.js. Line 6 (the EPISODES.frontendStartTime assignment) should be as early in the page as possible, because it is the most important reference time stamp.
Here is a brief overview with the highlights of what had to be done to integrate the Episodes framework with Drupal.
- hook_install(), through which I set a module weight of -1000. This extremely low module weight ensures the hook implementations of this module are always executed before all others.
- hook_init(), which is invoked at the end of the Drupal bootstrap process. Through this hook I automatically insert into the <head> tag the JavaScript that is necessary to make Episodes work (see the “Making episodes.js reusable” section). Thanks to the extremely low module weight, the JavaScript code it inserts is the first tag in the <head> tag.
- Drupal.episodes.js, which provides the actual integration with Drupal. It automatically creates an episode for each Drupal “behavior”. (A behavior is written in JavaScript and adds interactivity to the web page.) Each time new content is added to the page through AHAH, Drupal.attachBehaviors() is called and automatically attaches behaviors to new content, but not to existing content. Through Drupal.episodes.js, Drupal's default Drupal.attachBehaviors() method is overridden, which is very easy in JavaScript. In this overridden version, each behavior is automatically measured as an episode.
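Drupal.attachBehaviors = function(context) {
  // Declare locals with var so they do not leak into the global scope.
  var url = document.location;
  for (var behavior in Drupal.behaviors) {
    // Mark the start of this behavior's episode.
    window.postMessage("EPISODES:mark:" + behavior, url);
    Drupal.behaviors[behavior](context);
    // Measure the time elapsed since the mark.
    window.postMessage("EPISODES:measure:" + behavior, url);
  }
};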
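The same mark-and-measure idea can be tried outside the browser. The sketch below is an illustration under assumptions: timeBehaviors, the ignored list and the injected now() clock are hypothetical names, and it records durations in a plain object instead of posting EPISODES messages. The behavior names are placeholders.

```javascript
// Hypothetical sketch: run every behavior and record how long each one took.
// `behaviors` maps a name to a function, like Drupal.behaviors.
// `ignored` lists behavior names that are run but not measured.
// `now` returns the current time in ms (injected so a fake clock can be used).
function timeBehaviors(behaviors, context, ignored, now) {
  var durations = {};
  for (var name in behaviors) {
    if (!Object.prototype.hasOwnProperty.call(behaviors, name)) {
      continue;
    }
    if (ignored.indexOf(name) !== -1) {
      behaviors[name](context); // Still attach the behavior, just do not time it.
      continue;
    }
    var start = now();               // Plays the role of the "mark" message.
    behaviors[name](context);
    durations[name] = now() - start; // Plays the role of the "measure" message.
  }
  return durations;
}

// Usage with a fake clock that advances 5 ms on every call.
var fakeTime = 0;
var attached = [];
var behaviorDurations = timeBehaviors(
  {
    teaser: function (context) { attached.push("teaser"); },
    tableDrag: function (context) { attached.push("tableDrag"); }
  },
  {},
  ["tableDrag"],
  function () { fakeTime += 5; return fakeTime; }
);
console.log(behaviorDurations, attached);
```

Injecting the clock keeps the sketch deterministic; in a browser you would pass a function that returns Number(new Date()).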
- Find out where *.js files exist, create a scan job for each of these and queue them in Drupal's Batch API. Each of these jobs scans each *.js file, looking for behaviors. Every detected behavior is stored in the database and can be marked as ignored through a simple UI that uses the Hierarchical Select module.
- To measure the css and headerjs episodes, it is necessary to make a couple of simple (copy-and-paste) changes to the page.tpl.php of the Drupal theme(s) you are using. These changes are explained in the README.txt file that ships with the Episodes module. This is the only manual code change involved; it is recommended, but not required.
Only basic reports are provided, highlighting the most important statistics and visualizing them through charts. Advanced/detailed reports are beyond the scope of this bachelor thesis, because they require extensive performance research (to be able to handle massive datasets), database indexing optimization and usability research.
httpd.conf configuration file for his Apache HTTP server. As just mentioned, my implementation is derived from Jiffy's, yet every configuration line is different.
Due to lack of time, the basic reports are … well … very basic. It would be nice to have more charts and to be able to filter the data of the charts. In particular, these three filters would be very useful:
- onbeforeunload method to log the time when a next page was requested. The major disadvantage of this method is that it is impossible to measure the backend episode for each page load: it is only possible to measure the backend episode when the user navigates through our site (more specifically, when the referrer is on the same domain as the current page).
- hook_boot() and hook_exit() hooks and came to this conclusion.
- onbeforeunload cookie is not yet set and therefore the backend episode cannot be calculated, which in turn prevents the pageready and totaltime episodes from being calculated. This is of course also a problem when cookies are disabled, because then the backend episode can never be calculated. There is no way around this until the day that browsers provide something like document.requestTime.
I explained to Steve Souders what I wanted to achieve through this bachelor thesis and the initial work I had already done on integrating Episodes with Drupal. This is how his reply started:
Wow.
Wow, this is awesome.
So, at least he thinks that this was a worthwhile job, which suggests that it will probably be worthwhile/helpful for the Drupal community as well.
Unfortunately for me, Steve Souders is a very busy man, speaking at many web-related conferences, teaching at Stanford, writing books and working at Google. He did not manage to get back to the questions I asked him.
This is a republished part of my bachelor thesis text, with thanks to Hasselt University for allowing me to republish it. This is section eight in the full text.
Comments
nice, thanks for this.
It looks like this is going to be a great tool! I like how it picks up all the Drupal.behaviors automatically, and you even gave us unit tests- nice... Regarding the ability to scale, I think your idea of using a external web-service would be good idea. Think mollom, but for profiling ;-)
Thanks for all your hard work Wim.
Yes, exactly, like Mollom, but for profiling. Now, if I only had the time … :) It's a good candidate for a master thesis I think? I'll see. You'll definitely hear more from me on that topic! :)