/* $Id: README.txt,v 1.1.2.1.2.23 2009/06/21 20:11:45 pwolanin Exp $ */

This module integrates Drupal with the Apache Solr search platform. Solr search
can be used as a replacement for core content search and offers both extra
features and better performance. Among the extra features is faceted search on
fields ranging from content author to taxonomy to arbitrary CCK fields.

The module comes with schema.xml and solrconfig.xml files, which should be used
in your Solr installation.

This module depends on the search framework in core. However, you may not want
the core searches and only want Solr search. If that is the case, use the Core
Searches module in tandem with this module.

When used in combination with the core search module, Apache Solr is not the
default search. Access it via a new tab on the default search page, called
"Search".

Installation
------------

Prerequisites: Java 5 or higher (a.k.a. 1.5.x). PHP 5.1.4 or higher.

Those with PHP < 5.2.0 must install the PECL json module or download
the Json code from the Zend Framework (see below).

Install the Apache Solr Drupal module as you would any Drupal module.

Before enabling it, you must also do the following:

Get the PHP library from the external project. The project is
found at: http://code.google.com/p/solr-php-client/
From the apachesolr module directory, run this command:

svn checkout -r6 http://solr-php-client.googlecode.com/svn/trunk/ SolrPhpClient

Note that revision 6 is the currently tested and suggested revision.
Make sure that the final directory is named SolrPhpClient under the apachesolr
module directory. Note: the 2009-03-11 version of the library from the
googlecode page is r5 and will not work with beta6+.

If you are maintaining your code base in subversion, you may choose instead to
use svn export or svn externals. For an export (writing a copy to your local
directory without .svn files to track changes) use:

svn export -r6 http://solr-php-client.googlecode.com/svn/trunk/ SolrPhpClient

Instead of checking out, externals can be used too. Externals can be seen as
(remote) symlinks in svn. This requires your own project in your own svn
repository, of course. In the apachesolr module directory, issue the command:

svn propedit svn:externals

Your editor will open. Add the line:

SolrPhpClient -r6 http://solr-php-client.googlecode.com/svn/trunk/

On exports and checkouts, svn will grab the externals, but it will keep the
references on the remote server.

Those without svn may instead try the bundled Acquia Search download, which
includes all the items that are not in Drupal.org CVS due to the CVS use
policy. See the download link here:
http://acquia.com/documentation/acquia-search/activation

Download Solr trunk (candidate 1.4.x build) from a nightly build or build it
from svn: http://people.apache.org/builds/lucene/solr/nightly/

Once Solr 1.4 is released, you will be able to download from:
http://www.apache.org/dyn/closer.cgi/lucene/solr/

Unpack the tarball somewhere not visible to the web (not in your Apache docroot
and not inside of your Drupal directory).

The Solr download comes with an example application that you can use for
testing, development, and even for smaller production sites. This
application is found at apache-solr-nightly/example.

Rename apache-solr-nightly/example/solr/conf/schema.xml to something like
schema.bak. Then move the schema.xml that comes with the ApacheSolr Drupal
module to take its place.

Similarly, rename apache-solr-nightly/example/solr/conf/solrconfig.xml to
something like solrconfig.bak. Then move the solrconfig.xml that comes with the
ApacheSolr Drupal module to take its place.

Now start the Solr application by opening a shell, changing directory to
apache-solr-nightly/example, and executing the command:

java -jar start.jar

Test that your Solr server is now available by visiting
http://localhost:8983/solr/admin/

For those using PHP 5.1, you must either install the PECL json extension
into PHP on your server, or use the Zend Framework Json library.
For the PECL extension see: http://pecl.php.net/package/json
The Solr client has been tested with Zend Framework release 1.7.7.
To get this code, you may use svn from the apachesolr directory:
svn co http://framework.zend.com/svn/framework/standard/tags/release-1.7.7/library/Zend
However, the only required parts are:
http://framework.zend.com/svn/framework/standard/tags/release-1.7.7/library/Zend/Exception.php
http://framework.zend.com/svn/framework/standard/tags/release-1.7.7/library/Zend/Json/
The 'Zend' directory should normally be under the apachesolr
directory, but may be elsewhere if you set that location to be
in your PHP include path.

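To find out which of the two JSON options applies to a given server, a small
check along the following lines may help. This is only a sketch: the function
name apachesolr_example_json_support() is made up for illustration and is not
part of the module, and the location of the 'Zend' directory is passed in by
the caller rather than detected.

```php
<?php
/**
 * Hypothetical helper (not part of the apachesolr module): reports which
 * JSON implementation, if any, is available to PHP.
 */
function apachesolr_example_json_support($zend_dir) {
  // PHP 5.2.0 and later bundle the json extension; on 5.1 it comes from PECL.
  if (function_exists('json_decode')) {
    return 'native';
  }
  // Otherwise look for the Zend Framework Json code in the given directory.
  // The file name is an assumption based on the Zend/Json/ directory layout.
  if (file_exists($zend_dir . '/Json/Decoder.php')) {
    return 'zend';
  }
  return 'none';
}

// Example call, assuming 'Zend' sits next to this script.
print apachesolr_example_json_support(dirname(__FILE__) . '/Zend') . "\n";
```

A result of 'none' means one of the two options above still has to be installed.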
Now, you should enable the "Apache Solr framework" and "Apache Solr search"
modules. Check that you can connect to Solr at ?q=admin/settings/apachesolr
Now run cron on your Drupal site until your content is indexed. You
can monitor the index at ?q=admin/settings/apachesolr/index

The solrconfig.xml that comes with this module defines auto-commit, so
it may take a few minutes between running cron and when the new content
is visible in search.

Enable blocks for facets first at
Administer > Site configuration > Apache Solr > Enabled filters,
then position them as you like at Administer > Site building > Blocks.

Configuration variables
-----------------------

The module provides some (hidden) variables that can be used to tweak its
behavior:

- apachesolr_luke_limit: the limit (in terms of number of documents in the
  index) above which the module will not retrieve the number of terms per field
  when performing LUKE queries (for performance reasons).

- apachesolr_tags_to_index: the list of HTML tags that the module will index
  (see apachesolr_add_tags_to_document()).

- apachesolr_ping_timeout: the timeout (in seconds) after which the module will
  consider the Apache Solr server unavailable.

- apachesolr_optimize_interval: the interval (in seconds) between automatic
  optimizations of the Apache Solr index. Set to 0 to disable.

- apachesolr_cache_delay: the interval (in seconds) after an update after which
  the module will requery Apache Solr for the index structure. Set it to
  your autocommit delay plus a few seconds.

- apachesolr_service_class: the Apache_Solr_Service class used for communicating
  with the Apache Solr server.

- apachesolr_query_class: the default query class to use.

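Because these variables have no settings form, one way to set them is Drupal's
$conf override array in settings.php. The values below are illustrative
assumptions, not the module's defaults; in particular, the cache delay assumes
an autocommit delay of 120 seconds.

```php
<?php
// In settings.php. All values here are examples only.

// Skip per-field term counts once the index exceeds this many documents.
$conf['apachesolr_luke_limit'] = 20000;

// Treat Solr as unavailable if it does not answer within 4 seconds.
$conf['apachesolr_ping_timeout'] = 4;

// 0 disables automatic index optimization.
$conf['apachesolr_optimize_interval'] = 0;

// Assumed autocommit delay of 120 seconds, plus 10 seconds of slack.
$conf['apachesolr_cache_delay'] = 120 + 10;
```

Values set this way take precedence over anything stored in the variable table.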
Troubleshooting
---------------

Problem:
Links to nodes appear in the search results with a different host name or
subdomain than is preferred, e.g. sometimes at http://example.com
and sometimes at http://www.example.com

Solution:
Set $base_url in settings.php to ensure that an identical absolute URL is
generated at all times when nodes are indexed. Alternatively, set up a redirect
in .htaccess to prevent site visitors from accessing the site via more than one
site address.

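As a minimal sketch of the first solution, the line in settings.php could look
like this. The host name is a placeholder; use whichever address you want to
appear in search results.

```php
<?php
// In settings.php. Placeholder host name; note that there is no trailing slash.
$base_url = 'http://www.example.com';
```

Nodes indexed before the change keep their old URLs in the index until they
are indexed again.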
Developers
----------

Exposed Hooks in 6.x:

hook_apachesolr_modify_query(&$query, &$params, $caller)

Any module performing a search should call
apachesolr_modify_query($query, $params, 'modulename').
That function then invokes this hook. It allows modules to modify the query
object and params array. $caller indicates which module is invoking the hook.

Example:

  function my_module_apachesolr_modify_query(&$query, &$params, $caller) {
    // I only want to see articles by the admin!
    $query->add_field("uid", 1);
  }

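Since the hook runs for every caller, an implementation will often want to
check $caller before changing anything. The sketch below rests on two
assumptions that this README does not confirm: that the search module passes
'apachesolr_search' as the caller, and that $params holds Solr request
parameters keyed by name.

```php
<?php
function my_module_apachesolr_modify_query(&$query, &$params, $caller) {
  // Leave queries built by other modules untouched.
  // The caller string is an assumption; check the calling module's code.
  if ($caller != 'apachesolr_search') {
    return;
  }
  // 'rows' is Solr's standard parameter for the number of results returned.
  $params['rows'] = 20;
}
```

Because both arguments are passed by reference, no return value is needed.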
hook_apachesolr_cck_fields_alter(&$mappings)

Add or alter index mappings for CCK types. The default mappings array handles
just text fields with option widgets:

  $mappings['text'] = array(
    'optionwidgets_select' => array('callback' => '', 'index_type' => 'string'),
    'optionwidgets_buttons' => array('callback' => '', 'index_type' => 'string')
  );

In your _alter hook implementation you can add additional field types such as:

  $mappings['number_integer']['number'] = array('callback' => '', 'index_type' => 'integer');

You can also add a mapping for a specific field. This will take precedence over
any mapping for a general field type. A field-specific mapping would look like:

  $mappings['per-field']['field_model_name'] = array('callback' => '', 'index_type' => 'string');

or

  $mappings['per-field']['field_model_price'] = array('callback' => '', 'index_type' => 'float');

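Put together, a complete implementation might look like the sketch below. The
module name my_module is a placeholder, and the field names are the sample
names used in this section, not fields that exist on your site.

```php
<?php
function my_module_apachesolr_cck_fields_alter(&$mappings) {
  // Index all integer number fields that use the 'number' widget.
  $mappings['number_integer']['number'] = array(
    'callback' => '',
    'index_type' => 'integer',
  );
  // Index one specific field as a float, regardless of its general type.
  $mappings['per-field']['field_model_price'] = array(
    'callback' => '',
    'index_type' => 'float',
  );
}
```

Existing entries, such as the default 'text' mappings, are left in place
because the array is only added to.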
hook_apachesolr_types_exclude($namespace)

Invoked by apachesolr.module when generating a list of nodes to index for a
given namespace. Return an array of node types to be excluded from indexing for
that namespace (e.g. 'apachesolr_search'). This is used by apachesolr_search
module to exclude certain node types from the index.

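A minimal sketch of an implementation follows. The node type names are
examples only; substitute the machine names of the types on your site.

```php
<?php
function my_module_apachesolr_types_exclude($namespace) {
  // Only affect the index built for the apachesolr_search namespace.
  if ($namespace == 'apachesolr_search') {
    // Example type names; these are assumptions, not module defaults.
    return array('page', 'webform');
  }
  // Exclude nothing for any other namespace.
  return array();
}
```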
hook_apachesolr_node_exclude($node, $namespace)

This is invoked by apachesolr.module for each node to be added to the index. If
any module returns TRUE, the node is skipped for indexing.

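As an illustration, the sketch below skips nodes authored by the anonymous user
(uid 0). The rule itself is an arbitrary example; any condition on $node can
be used in its place.

```php
<?php
function my_module_apachesolr_node_exclude($node, $namespace) {
  // Example rule: keep anonymous content out of the search index.
  if ($namespace == 'apachesolr_search' && $node->uid == 0) {
    return TRUE;
  }
  // FALSE leaves the decision to other modules.
  return FALSE;
}
```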
hook_apachesolr_update_index(&$document, $node)

Allows a module to change the contents of the $document object before it is
sent to the Solr Server. To add a new field to the document, you should
generally use one of the pre-defined dynamic fields. Follow the naming
conventions for the type of data being added based on the schema.xml file.

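The sketch below adds one extra value to the document. It assumes that the
schema.xml in use defines a single-valued string dynamic field with the prefix
'ss_' and that fields can be set as properties on the document object; check
both against your schema.xml and the SolrPhpClient library. The field name
itself is made up.

```php
<?php
function my_module_apachesolr_update_index(&$document, $node) {
  // 'ss_' is an assumed dynamic field prefix; verify it in schema.xml.
  // The part after the prefix is a hypothetical name chosen by this module.
  $document->ss_my_module_language = $node->language;
}
```

Content that is already indexed only picks up the new field when it is indexed
again.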
hook_apachesolr_search_result_alter(&$doc)

This is invoked by apachesolr_search.module for each document returned in a
search. It is new in 6.x-beta7 as a replacement for the call to hook_nodeapi().

hook_apachesolr_sort_links_alter(&$sort_links)

Called by the sort link block code. Allows other modules to modify, add or
remove sorts.

Themers
-------

See inline docs in apachesolr_theme and apachesolr_search_theme functions
within apachesolr.module and apachesolr_search.module.