Setting up the Xapian module
============================
In order to use the Xapian module, the following packages are necessary:

Required
--------
. Xapian
. Xapian's PHP5 bindings (on the web server)

Optional (Perl extras need this)
--------------------------------
. Xapian's Perl bindings (Search::Xapian)


Installation of the Xapian requirements
=======================================
Debian/Ubuntu
-------------
apt-get install php5-xapian libsearch-xapian-perl libxapian15

Note: On Ubuntu Hardy, the library (xapian.php) was not placed in PHP's
built-in include path. Creating this symlink resolved that:

sudo ln -s /usr/share/php5/xapian.php /usr/share/php/xapian.php
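
If you would rather not create the symlink blindly, a guarded version is
one option. This is only a sketch: link_if_missing is a hypothetical helper
name, not something shipped with the module, and it creates the link only
when the source file exists and the target is absent.

```shell
#!/bin/sh
# Sketch: create a symlink only when it is actually needed.
# link_if_missing is a hypothetical helper, not part of the Xapian module.
link_if_missing() {
    src=$1
    dst=$2
    # Refuse to create a dangling link.
    if [ ! -e "$src" ]; then
        echo "source not found: $src" >&2
        return 1
    fi
    # Leave an existing file or link alone.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "already present: $dst"
        return 0
    fi
    ln -s "$src" "$dst" && echo "linked: $dst -> $src"
}
```

As root, it would be called with the two paths from the note:
link_if_missing /usr/share/php5/xapian.php /usr/share/php/xapian.php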

Red Hat
-------
yum install xapian-core-libs xapian-core perl-Search-Xapian

Any other requirements should be satisfiable from the Xapian project pages:
http://www.xapian.org/docs/install.html
http://www.xapian.org/download.php

At worst, you may have to build it from source, but that is not hard to do.


Module installation
===================
Follow the standard Drupal installation procedure and copy the module
files to sites/xxx/modules/xapian, where xxx is the site directory
(for example, default).
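
The copy step might look something like the sketch below. The helper name
install_module and both of its arguments are assumptions made for
illustration; substitute the location of the unpacked module and your own
site directory.

```shell
#!/bin/sh
# Sketch: copy an unpacked module directory into a site's modules directory.
# install_module and its arguments are hypothetical names for illustration.
install_module() {
    module_src=$1    # unpacked module directory, e.g. ./xapian
    site_dir=$2      # site directory, e.g. <drupal root>/sites/default
    if [ ! -d "$module_src" ]; then
        echo "module source not found: $module_src" >&2
        return 1
    fi
    mkdir -p "$site_dir/modules/xapian" || return 1
    # Copy the contents of the source directory, including dotfiles.
    cp -R "$module_src/." "$site_dir/modules/xapian/"
}
```

For example: install_module ./xapian sites/default (run from the Drupal
root directory).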

You should then be able to enable the Xapian module in the admin interface
at admin/build/modules.

Note: The standard Drupal search module still needs to be enabled as well,
as the Xapian module only provides a back end, not the front-end forms etc.

Next, navigate to the settings page, admin/settings/xapian.

Settings
========
Database
--------
In the settings page, choose whether to use a local or remote database.
For the local database, the "local database options" fields must be
populated, specifically the path to the search database.

By default, the database will be created in the Drupal files area,
which ensures that the web server user has access to update the database.
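
If you choose a different path, it may be worth confirming that the
directory is writable before pointing the module at it. The sketch below
assumes a hypothetical helper named check_db_dir; it only reports on the
current user, so it should be run as the web server user.

```shell
#!/bin/sh
# Sketch: report whether a directory exists and is writable by the current user.
# check_db_dir is a hypothetical helper; run it as the web server user.
check_db_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi
    echo "ok: $dir"
}
```

The function prints one status line and returns non-zero when the directory
is missing or not writable, so it can also be used in a larger script.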

For the remote database, the "remote database options" fields must be
populated; both hostname and port are required. See "Advanced Xapian usage"
for more information.

Display
-------
Enter the number of search results to appear per page.

The result count field determines how Xapian reports the total number of
results. Xapian does not return an exact number; it estimates the result
count. This setting modifies that behaviour.

Performance
-----------
If you have "Queue reindexing of changed nodes" enabled, then the
Xapian module will process changed/updated nodes when cron is run.

Enabling this will also enable the "Add all nodes to queue" button.
This button will add all nodes to the Xapian module reindex queue.

Diagnostics
-----------
It is also possible to enable query logging, which records both the search
query and the time required to execute it. Once you have patched Drupal
core, the response times for the equivalent Drupal query are also logged,
making a convenient way to compare the Xapian module with the
standard Drupal search.

Patching Drupal core
====================
The included patch file - drupal-5-7.diff - modifies the core Drupal search
module to not declare the do_search function if the Xapian module is enabled.

Applying the patch usually involves command line access to the root Drupal
directory and running this command from there:

patch -p0 < sites/default/modules/xapian/drupal-5-7.diff
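
To check that the patch applies cleanly before touching any files, a dry
run can be done first. This is a sketch only: apply_patch_safely is a
hypothetical helper name, and it assumes a patch program that supports
the --dry-run option (GNU patch does).

```shell
#!/bin/sh
# Sketch: apply a patch only if a dry run succeeds first.
# apply_patch_safely is a hypothetical helper; --dry-run is a GNU patch option.
apply_patch_safely() {
    patch_file=$1
    if ! command -v patch >/dev/null 2>&1; then
        echo "patch is not installed" >&2
        return 1
    fi
    # The dry run reports problems without modifying any files.
    if patch -p0 --dry-run < "$patch_file" >/dev/null; then
        patch -p0 < "$patch_file"
    else
        echo "patch would not apply cleanly: $patch_file" >&2
        return 1
    fi
}
```

From the root Drupal directory, it would be called as:
apply_patch_safely sites/default/modules/xapian/drupal-5-7.diff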


Perl extras
===========
If you have trouble, or just want to take a look, there are a couple of
Perl-based utilities included as well.

xapian-index.pl
---------------
The xapian-index.pl script creates a database called xapian_database
in the current directory. The size of the index and the time taken to build
it depend on the number of nodes indexed; 6000 nodes take only about a minute.

xapian-query.pl
---------------
The xapian-query.pl script allows you to query the database from the
command line to test what results would be returned.


Advanced Xapian usage
=====================
Xapian has the built-in ability to host the database on a separate machine
from the web server. If you'd like to do this, the remote host must have
Xapian installed and must run "xapian-tcpsrv --port <port>
<path/to/database>", which will start a server that can be accessed by the
Xapian library. The Drupal module defaults to port 6431, but any port can
be used.
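
As a rough illustration, the command line for the remote host can be
assembled as in the sketch below. tcpsrv_command is a hypothetical helper
name; it falls back to the module's default port of 6431 when no port is
given and only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build the xapian-tcpsrv command line for a given database path.
# tcpsrv_command is a hypothetical helper; 6431 is the module's default port.
tcpsrv_command() {
    db_path=$1
    port=${2:-6431}
    # Reject anything that is not a plain decimal number.
    case $port in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    echo "xapian-tcpsrv --port $port $db_path"
}
```

Printing the command first makes it easy to check the port and path before
starting the server on the remote host. The port chosen here must match the
port entered in the "remote database options" fields.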