// $Id$

NOTE: this module is currently in an alpha state. Come back in a bit unless
you're an experienced user and don't mind figuring things out on your own.

DESCRIPTION
-----------
This module provides static page caching for Drupal 4.7, enabling a
potentially very significant performance and scalability boost for
heavily-trafficked Drupal sites.

For an introduction, read the original blog post at:
  http://bendiken.net/2006/05/28/static-page-caching-for-drupal

FEATURES
--------
* Maximally fast page serving for the anonymous visitors to your Drupal
  site, reducing web server load and boosting your site's scalability.
* On-demand page caching (static file created after first page request).
* Full support for multi-site Drupal installations.
* Command line administration support (requires the drush module).

INSTALLATION
------------
Please refer to the accompanying file INSTALL.txt for installation
requirements and instructions.

HOW IT WORKS
------------
Once Boost has been installed and enabled, pages requested by anonymous
visitors will be cached as static HTML files on the server's file system.
Periodically (when the Drupal cron job runs), stale pages (i.e. files
exceeding the maximum cache lifetime setting) will be purged, allowing each
of them to be recreated the next time an anonymous visitor requests that
page.

New rewrite rules are added to the .htaccess file supplied with Drupal,
directing the web server to try to fulfill page requests by anonymous
visitors first and foremost from the static page cache, and to pass the
request through to Drupal only if the requested page is not cacheable,
hasn't yet been cached, or the cached copy is stale.

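As a rough illustration of the purge step described above, the following
Python sketch removes cached files that are older than a given lifetime.
It is not Boost's own code: the function name purge_stale, its parameters,
and the use of the file modification time as the measure of age are all
assumptions made for this example.

```python
import os
import time


def purge_stale(cache_dir, max_lifetime, now=None):
    """Delete cached .html files older than max_lifetime seconds.

    Hypothetical helper; returns the sorted list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if not name.endswith(".html"):
                continue
            path = os.path.join(root, name)
            # lstat judges a symlink by its own age and also works
            # when the link's target has already been removed.
            if now - os.lstat(path).st_mtime > max_lifetime:
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Under these assumptions, calling purge_stale("cache", 3600) from a
scheduled job would drop every cached page more than an hour old, so that
the next anonymous request recreates it.
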
FILE SYSTEM CACHE
-----------------
The cached files are stored (by default) in the cache/ directory under your
Drupal installation directory. The Drupal pages' URL paths are translated
into file system names in the following manner:

  http://mysite.com/
  => cache/mysite.com/0/index.html

  http://mysite.com/about
  => cache/mysite.com/0/about.html

  http://mysite.com/about/staff
  => cache/mysite.com/0/about/staff.html

  http://mysite.com/node/42
  => cache/mysite.com/0/node/42.html

You'll note that the directory path includes the Drupal site name, enabling
support for multi-site Drupal installations. The zero that follows, on the
other hand, denotes the user ID the content has been cached for -- in this
case the anonymous user (which is the default, and only, choice available
for the time being).

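The mapping above can be sketched in a few lines of Python. This is only
an illustration that reproduces the four examples; the function name
cache_path and its parameters are hypothetical, and it assumes that the
site name is simply the host name of the URL.

```python
from urllib.parse import urlsplit


def cache_path(url, uid=0, cache_dir="cache"):
    """Translate a page URL into the path of its static cache file.

    Hypothetical helper; assumes the site name equals the URL's host name.
    """
    parts = urlsplit(url)
    path = parts.path.strip("/")
    # The front page has an empty path and is stored as index.html.
    name = (path or "index") + ".html"
    return "/".join([cache_dir, parts.hostname, str(uid), name])
```

For instance, cache_path("http://mysite.com/about/staff") yields
cache/mysite.com/0/about/staff.html, matching the listing above.
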
DISPATCH MECHANISM
------------------
For each incoming page request, the new Apache mod_rewrite directives in
.htaccess will check if a cached version of the requested page should be
served as per the following simple rules:

  1. First, we check that the HTTP request method being used is GET.
     POST requests are not cacheable, and are passed through to Drupal.

  2. Next, we make sure that the URL doesn't contain a query string (i.e.
     the part after the `?' character, such as `?q=cats+and+dogs'). A query
     string implies dynamic data, and any request that contains one will
     be passed through to Drupal. (This also allows one to easily obtain the
     current, non-cached version of a page by simply adding a bogus query
     string to a URL path -- very useful for testing purposes.)

  3. Since only anonymous visitors can benefit from the static page cache at
     present, we check that the page request doesn't include a cookie that
     is set when a user logs in to the Drupal site. If the cookie is
     present, we simply let Drupal handle the page request dynamically.

  4. Now, for the important bit: we check whether we actually have a cached
     HTML file for the request URL path available in the file system cache.
     If we do, we direct the web server to serve that file directly and to
     terminate the request immediately after; in this case, Drupal (and
     indeed PHP) is never invoked, meaning the page request will be served
     by the web server itself at full speed.

  5. If, however, we couldn't locate a cached version of the page, we just
     pass the request on to Drupal, which will serve it dynamically in the
     normal manner.

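The five rules above amount to a short decision procedure. The Python
sketch below models that logic only; the real checks are performed by
mod_rewrite directives in .htaccess, not by code like this. The function
name serve_from_cache and its parameters are hypothetical, and the name of
the login cookie is passed in by the caller because this document does not
state it.

```python
def serve_from_cache(method, query_string, cookies, cached_file_exists,
                     login_cookie):
    """Return True to serve the static file, False to pass on to Drupal.

    Hypothetical model of the dispatch rules; cookies is a collection of
    cookie names sent with the request.
    """
    if method != "GET":          # rule 1: only GET requests are cacheable
        return False
    if query_string:             # rule 2: a query string implies dynamic data
        return False
    if login_cookie in cookies:  # rule 3: logged-in users get dynamic pages
        return False
    # rules 4 and 5: serve the file if it exists, else fall through
    return bool(cached_file_exists)
```

With a placeholder cookie name such as "LOGGED_IN", a plain anonymous GET
for a page that has a cache file returns True, while the same request with
a bogus query string appended returns False and so reaches Drupal.
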
IMPORTANT NOTES
---------------
* Drupal URL aliases get written out to disk as relative symbolic links
  pointing to the file representing the internal Drupal URL path. For this
  to work correctly with Apache, ensure your .htaccess file contains the
  following line (as it will by default if you've installed the file shipped
  with Boost):
    Options +FollowSymLinks
* To check whether you got a static or dynamic version of a page, look at
  the very end of the page's HTML source. You have the static version if the
  last line looks like this:
    <!-- Page cached by Boost at 2006-11-24 15:06:31 -->
* If your Drupal URL paths contain non-ASCII characters, you may have to
  tweak your locale settings on the server in order to ensure the URL paths
  get correctly translated into directory paths on the file system.
  Non-ASCII URL paths have not yet been tested at all and feedback on
  them would be appreciated.

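Two of the notes above lend themselves to a small Python sketch: checking
for the trailing Boost comment, and linking an alias to its internal path
with a relative symbolic link. Neither function is part of Boost; the
names is_static_copy and link_alias, the timestamp pattern (inferred from
the single example above), and the alias used in the usage note are
assumptions made for illustration.

```python
import os
import re

# Pattern inferred from the example marker shown above:
# <!-- Page cached by Boost at 2006-11-24 15:06:31 -->
BOOST_MARKER = re.compile(
    r"<!-- Page cached by Boost at \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} -->$"
)


def is_static_copy(html):
    """Return True if the last non-blank line ends with the Boost marker."""
    lines = html.rstrip().splitlines()
    return bool(lines) and BOOST_MARKER.search(lines[-1]) is not None


def link_alias(cache_root, internal_path, alias_path):
    """Create <alias_path>.html as a relative symlink to <internal_path>.html.

    Hypothetical helper; both paths are relative to cache_root.
    """
    target = os.path.join(cache_root, internal_path + ".html")
    link = os.path.join(cache_root, alias_path + ".html")
    os.makedirs(os.path.dirname(link), exist_ok=True)
    # A relative target keeps the link valid if the cache tree is moved.
    os.symlink(os.path.relpath(target, os.path.dirname(link)), link)
    return link
```

If a hypothetical alias about/contact pointed at node/42, then
link_alias(root, "node/42", "about/contact") would create
about/contact.html as a link to ../node/42.html, which Apache serves only
when FollowSymLinks is enabled.
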
LIMITATIONS
-----------
* Only anonymous visitors will be served cached versions of pages; logged-in
  users will get dynamic content. This may somewhat limit the usefulness of
  this module for those community sites that require user registration and
  login for active participation.
* Only content of the type `text/html' will get cached at present. RSS feeds
  and URL paths that have some other content type (e.g. set by a third-party
  module) will be silently ignored by Boost.
* In contrast to Drupal's built-in caching, static caching will lose any
  additional HTTP headers set for an HTML page by a module. This is unlikely
  to be a problem except for some very specific modules and rare use cases.
* Web server software other than Apache is not supported at the moment.
  Adding Lighttpd support would be desirable but is not a high priority for
  the author at present (see TODO.txt). (Note that while the LiteSpeed web
  server has not been specifically tested by the author, it may, in fact,
  work, since they claim to support .htaccess files and to have mod_rewrite
  compatibility. Feedback on this would be appreciated.)
* At the moment, Windows users are S.O.L. due to the use of symlinks and
  Unix-specific shell commands. The author has no personal interest in
  supporting Windows but will accept well-documented, non-detrimental
  patches to that effect.

BUG REPORTS
-----------
Post feature requests and bug reports to the issue tracking system at:
  http://drupal.org/node/add/project_issue/boost

CREDITS
-------
Developed and maintained by Arto Bendiken <http://bendiken.net/>