| Commit | Line | Data |
|---|---|---|
| 45b69d54 AB | 1 | // $Id$ |
| 2 | ||
| 3 | NOTE: this module is currently in an alpha state. Come back in a bit unless | |
| 4 | you're an experienced user and don't mind figuring things out on your own. | |
| 5 | ||
| 6 | DESCRIPTION | |
| 7 | ----------- | |
| 8 | This module provides static page caching for Drupal 4.7, enabling a | |
| 9 | potentially very significant performance and scalability boost for | |
| 10 | heavily-trafficked Drupal sites. | |
| 11 | ||
| 12 | For an introduction, read the original blog post at: | |
| 13 | http://bendiken.net/2006/05/28/static-page-caching-for-drupal | |
| 14 | ||
| 15 | FEATURES | |
| 16 | -------- | |
| 17 | * Maximally fast page serving for the anonymous visitors to your Drupal | |
| 18 | site, reducing web server load and boosting your site's scalability. | |
| 19 | * On-demand page caching (static file created after first page request). | |
| 20 | * Full support for multi-site Drupal installations. | |
| 21 | * Command line administration support (requires the drush module). | |
| 22 | ||
| 23 | INSTALLATION | |
| 24 | ------------ | |
| 25 | Please refer to the accompanying file INSTALL.txt for installation | |
| 26 | requirements and instructions. | |
| 27 | ||
| 28 | HOW IT WORKS | |
| 29 | ------------ | |
| 30 | Once Boost has been installed and enabled, page requests by anonymous | |
| 31 | visitors will be cached as static HTML pages on the server's file system. | |
| 32 | Periodically (when the Drupal cron job runs), stale pages (i.e. files | |
| 33 | exceeding the maximum cache lifetime setting) are purged; each one is | |
| 34 | then recreated the next time an anonymous visitor requests that | |
| 35 | page again. | |
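The cron-time purge described above can be pictured with a small Python sketch. It is not Boost's own code: the function name `purge_stale`, its parameters, and the use of file modification time as the measure of age are all assumptions made for illustration.

```python
import os
import time


def purge_stale(cache_dir, max_lifetime, now=None):
    """Delete cached .html files older than max_lifetime seconds.

    Hypothetical helper: age is taken from the file's modification
    time, which is an assumption about how staleness is measured.
    Returns the sorted list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if not name.endswith(".html"):
                continue
            path = os.path.join(root, name)
            # lstat() does not follow symlinks, so a dangling alias
            # link cannot raise here.
            if now - os.lstat(path).st_mtime > max_lifetime:
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Once a file has been removed this way, the next anonymous request for that page finds no cached copy and falls through to Drupal, which can then write a fresh one.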
| 36 | ||
| 37 | New rewrite rules are added to the .htaccess file supplied with Drupal, | |
| 38 | directing the web server to try to fulfill page requests by anonymous | |
| 39 | visitors first and foremost from the static page cache, and to only pass the | |
| 40 | request through to Drupal if the requested page is not cacheable, hasn't yet | |
| 41 | been cached, or the cached copy is stale. | |
| 42 | ||
| 43 | FILE SYSTEM CACHE | |
| 44 | ----------------- | |
| 45 | The cached files are stored (by default) in the cache/ directory under your | |
| 46 | Drupal installation directory. The Drupal pages' URL paths are translated | |
| 47 | into file system names in the following manner: | |
| 48 | ||
| 49 | http://mysite.com/ | |
| 50 | => cache/mysite.com/0/index.html | |
| 51 | ||
| 52 | http://mysite.com/about | |
| 53 | => cache/mysite.com/0/about.html | |
| 54 | ||
| 55 | http://mysite.com/about/staff | |
| 56 | => cache/mysite.com/0/about/staff.html | |
| 57 | ||
| 58 | http://mysite.com/node/42 | |
| 59 | => cache/mysite.com/0/node/42.html | |
| 60 | ||
| 61 | You'll note that the directory path includes the Drupal site name, enabling | |
| 62 | support for multi-site Drupal installations. The zero that follows, on the | |
| 63 | other hand, denotes the user ID the content has been cached for -- in this | |
| 64 | case the anonymous user (which is the default, and only, choice available | |
| 65 | for the time being). | |
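As a rough illustration of the mapping shown above, the following Python sketch reproduces the four example translations. The function name `cache_path` and its parameters are hypothetical; the module's actual implementation may handle ports, trailing slashes, and special characters differently.

```python
import posixpath
from urllib.parse import urlsplit


def cache_path(url, cache_root="cache", uid=0):
    """Translate a page URL into a cache file name.

    Hypothetical helper mirroring the README examples:
    <cache_root>/<host>/<uid>/<url path>.html, with the site's
    front page stored as index.html.
    """
    parts = urlsplit(url)
    # An empty path (the front page) maps to "index".
    path = parts.path.strip("/") or "index"
    return posixpath.join(cache_root, parts.hostname, str(uid), path + ".html")


print(cache_path("http://mysite.com/about/staff"))
```

The `uid` argument defaults to 0 because the anonymous user is the only choice available for the time being.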
| 66 | ||
| 67 | DISPATCH MECHANISM | |
| 68 | ------------------ | |
| 69 | For each incoming page request, the new Apache mod_rewrite directives in | |
| 70 | .htaccess will check if a cached version of the requested page should be | |
| 71 | served as per the following simple rules: | |
| 72 | ||
| 73 | 1. First, we check that the HTTP request method being used is GET. | |
| 74 | POST requests are not cacheable, and are passed through to Drupal. | |
| 75 | ||
| 76 | 2. Next, we make sure that the URL doesn't contain a query string (i.e. | |
| 77 | the part after the `?' character, such as `?q=cats+and+dogs'). A query | |
| 78 | string implies dynamic data, and any request that contains one will | |
| 79 | be passed through to Drupal. (This also allows one to easily obtain the | |
| 80 | current, non-cached version of a page by simply adding a bogus query | |
| 81 | string to a URL path -- very useful for testing purposes.) | |
| 82 | ||
| 83 | 3. Since only anonymous visitors can benefit from the static page cache at | |
| 84 | present, we check that the page request doesn't include a cookie that | |
| 85 | is set when a user logs in to the Drupal site. If the cookie is | |
| 86 | present, we simply let Drupal handle the page request dynamically. | |
| 87 | ||
| 88 | 4. Now, for the important bit: we check whether we actually have a cached | |
| 89 | HTML file for the request URL path available in the file system cache. | |
| 90 | If we do, we direct the web server to serve that file directly and to | |
| 91 | terminate the request immediately after; in this case, Drupal (and | |
| 92 | indeed PHP) is never invoked, meaning the page request will be served | |
| 93 | by the web server itself at full speed. | |
| 94 | ||
| 95 | 5. If, however, we couldn't locate a cached version of the page, we just | |
| 96 | pass the request on to Drupal, which will serve it dynamically in the | |
| 97 | normal manner. | |
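The five rules above amount to a short decision procedure. The sketch below restates them in Python purely for illustration; the real checks are Apache mod_rewrite directives in .htaccess, and the function name `dispatch`, its parameters, and the cookie name `LOGGED_IN` are hypothetical placeholders (this README does not name the login cookie).

```python
import os


def dispatch(method, query_string, cookies, cached_file,
             login_cookie="LOGGED_IN", exists=os.path.isfile):
    """Return "static" if the cached file should be served, else "drupal".

    cookies is a dict of cookie names to values. login_cookie is a
    placeholder name, not the cookie Drupal or Boost actually sets.
    """
    # 1. Only GET requests are cacheable.
    if method != "GET":
        return "drupal"
    # 2. A query string implies dynamic data.
    if query_string:
        return "drupal"
    # 3. Logged-in users always get dynamic content.
    if login_cookie in cookies:
        return "drupal"
    # 4. Serve the cached file if one exists.
    if exists(cached_file):
        return "static"
    # 5. Otherwise let Drupal build the page.
    return "drupal"
```

The `exists` parameter is there only so that the logic can be exercised without touching the file system.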
| 98 | ||
| 99 | IMPORTANT NOTES | |
| 100 | --------------- | |
| 74e552d0 AB | 101 | * Drupal URL aliases get written out to disk as relative symbolic links |
| 102 | pointing to the file representing the internal Drupal URL path. For this | |
| 103 | to work correctly with Apache, ensure your .htaccess file contains the | |
| 104 | following line (as it will by default if you've installed the file shipped | |
| 105 | with Boost): | |
| 106 | Options +FollowSymLinks | |
| 107 | * To check whether you got a static or dynamic version of a page, look at | |
| 108 | the very end of the page's HTML source. You have the static version if the | |
| 109 | last line looks like this: | |
| 110 | <!-- Page cached by Boost at 2006-11-24 15:06:31 --> | |
| 45b69d54 AB | 111 | * If your Drupal URL paths contain non-ASCII characters, you may have to |
| 112 | tweak your locale settings on the server in order to ensure the URL paths | |
| 113 | get correctly translated into directory paths on the file system. | |
| 114 | Non-ASCII URL paths have currently not been tested at all and feedback on | |
| 115 | them would be appreciated. | |
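The static-versus-dynamic check mentioned in the notes above can also be automated. The following Python sketch tests whether a page's HTML source ends with the Boost comment; the function name `is_static_copy` is hypothetical, and the pattern assumes the timestamp always has the format shown in the example line.

```python
import re

# Matches the trailing comment from the README example, allowing
# for trailing whitespace after it. The timestamp format is assumed.
BOOST_MARKER = re.compile(
    r"<!-- Page cached by Boost at \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} -->\s*$"
)


def is_static_copy(html):
    """Return True if the HTML source ends with the Boost cache comment."""
    return BOOST_MARKER.search(html) is not None
```

Fetching the same URL with a bogus query string appended should return the dynamic version, for which this check is expected to return False.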
| 116 | ||
| 117 | LIMITATIONS | |
| 118 | ----------- | |
| 119 | * Only anonymous visitors will be served cached versions of pages; logged-in | |
| 120 | users will get dynamic content. This may somewhat limit the usefulness of | |
| 121 | this module for those community sites that require user registration and | |
| 122 | login for active participation. | |
| 123 | * Only content of the type `text/html' will get cached at present. RSS feeds | |
| 124 | and URL paths that have some other content type (e.g. set by a third-party | |
| 125 | module) will be silently ignored by Boost. | |
| 126 | * In contrast to Drupal's built-in caching, static caching will lose any | |
| 127 | additional HTTP headers set for an HTML page by a module. This is unlikely | |
| 128 | to be a problem except for some very specific modules and rare use cases. | |
| 129 | * Web server software other than Apache is not supported at the moment. | |
| 130 | Adding Lighttpd support would be desirable but is not a high priority for | |
| 131 | the author at present (see TODO.txt). (Note that while the LiteSpeed web | |
| 132 | server has not been specifically tested by the author, it may, in fact, | |
| 133 | work, since they claim to support .htaccess files and to have mod_rewrite | |
| 134 | compatibility. Feedback on this would be appreciated.) | |
| 135 | * At the moment, Windows users are S.O.L. due to the use of symlinks and | |
| 136 | Unix-specific shell commands. The author has no personal interest in | |
| 137 | supporting Windows but will accept well-documented, non-detrimental | |
| 138 | patches to that effect. | |
| 139 | ||
| 140 | BUG REPORTS | |
| 141 | ----------- | |
| 142 | Post feature requests and bug reports to the issue tracking system at: | |
| 143 | http://drupal.org/node/add/project_issue/boost | |
| 144 | ||
| 145 | CREDITS | |
| 146 | ------- | |
| 147 | Developed and maintained by Arto Bendiken <http://bendiken.net/> |