Audit files for Drupal 6.x
==========================

Audit files is a module designed to help keep your uploaded files in
check. It can run four reports, which are accessed from Administer > Reports >
Audit files.

Audit files not on the server
-----------------------------
This report lists files that are named in the {files} table in the database
but that do not exist on the server. These missing files may mean that
nodes do not display as expected; for example, images may not display or
downloads may not be available.

From this report you can view or edit the related node to work out what is
wrong and fix it.

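The check behind this report can be sketched in a few lines. This is only an illustration, not the module's own code: the function name `find_missing_on_server` is hypothetical, the (fid, filepath) pairs stand in for rows of the {files} table, and paths are assumed to be relative to the site root.

```python
import os
import tempfile


def find_missing_on_server(file_rows, root):
    """Return the (fid, filepath) rows whose file does not exist under root.

    file_rows stands in for rows of the {files} table (an assumption);
    each filepath is taken to be relative to root.
    """
    missing = []
    for fid, filepath in file_rows:
        if not os.path.isfile(os.path.join(root, filepath)):
            missing.append((fid, filepath))
    return missing


# Demo: a temporary directory plays the part of the site root.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sites/default/files"))
    open(os.path.join(root, "sites/default/files/present.jpg"), "w").close()
    rows = [
        (1, "sites/default/files/present.jpg"),
        (2, "sites/default/files/gone.pdf"),
    ]
    missing = find_missing_on_server(rows, root)
    print(missing)
```

Only the second row is reported, because its file was never created on disk.
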
Audit files not in the database
-------------------------------
This report lists files that are on the server but are not referred to by
the {files} table. These may be orphan files whose parent node has been
deleted, or they may be the result of a module not tidying up after itself.
You can sort the table by node number or by filename as you prefer.

From this report you can mark files for deletion. There is intentionally no
"select all" checkbox, because you probably don't want to accidentally get rid
of everything in one hit!

Be careful with the delete feature: deletion is permanent, so be sure the
file is no longer needed before erasing it!

If you're not sure what a file is, you can click on the filename to
open the file in your browser.

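The reverse comparison, files on disk with no database record, might look like the sketch below. It is an illustration under assumptions: `find_not_in_database` is a hypothetical name, and `known_paths` stands in for the filepath values of the {files} table, expressed relative to the files directory.

```python
import os
import tempfile


def find_not_in_database(files_dir, known_paths):
    """Return sorted relative paths under files_dir that are not in known_paths."""
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(files_dir):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), files_dir)
            rel = rel.replace(os.sep, "/")  # normalise to forward slashes
            if rel not in known_paths:
                orphans.append(rel)
    return sorted(orphans)


# Demo: three files on disk, two of them known to the (pretend) database.
with tempfile.TemporaryDirectory() as files_dir:
    os.makedirs(os.path.join(files_dir, "images"))
    for rel in ("a.txt", "images/b.png", "images/c.png"):
        open(os.path.join(files_dir, rel), "w").close()
    known = {"a.txt", "images/b.png"}
    orphans = find_not_in_database(files_dir, known)
    print(orphans)
```

The sketch only lists candidates; it deliberately deletes nothing, in keeping with the caution above.
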
Missing references
------------------
Listed here are file references embedded in node bodies which do not have
exact correspondences in the {files} and {upload} tables. If there is a
file in the {files} table with a corresponding base name, that is listed.
Scenarios are:

* No match at all in the {files} table. Go to the Not in database report and make
  sure any files that exist have been added to the database. If they have, you
  should either find and upload the missing file and run this report again, or
  remove the reference from the node.
* Multiple matches in the {files} table. This can happen when the same filename
  is in multiple directories in the uploaded file hierarchy. You can review the
  alternate files and delete any that are true duplicates. When different files
  have the same basename, you can select the one that goes with the given node
  and choose Attach selected files. This will rewrite the reference in the node
  to use the canonical relative URL to the file, and if necessary add the reference
  to the {upload} table.
* A single match in the {files} table. You can make the attachments between these
  nodes and the corresponding files one-by-one by selecting them and choosing Attach
  selected files, or automatically apply it to all single-match cases with Attach
  all unique matches.

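The three scenarios above amount to grouping {files} paths by base name and counting the candidates for each reference. A minimal sketch of that idea follows; `classify_references` and the sample paths are hypothetical, and the module's real matching rules may differ.

```python
import posixpath
from collections import defaultdict


def classify_references(references, file_paths):
    """Sort each reference into 'none', 'single' or 'multiple' by base name."""
    by_basename = defaultdict(list)
    for path in file_paths:
        by_basename[posixpath.basename(path)].append(path)

    result = {"none": [], "single": [], "multiple": []}
    for ref in references:
        candidates = by_basename.get(posixpath.basename(ref), [])
        if not candidates:
            result["none"].append(ref)
        elif len(candidates) == 1:
            result["single"].append((ref, candidates[0]))
        else:
            result["multiple"].append((ref, sorted(candidates)))
    return result


# Sample data: paths as they might appear in {files}, and references
# as they might appear in node bodies (all made up for illustration).
file_paths = [
    "sites/default/files/logo.png",
    "sites/default/files/a/photo.jpg",
    "sites/default/files/b/photo.jpg",
]
references = ["/old/img/logo.png", "images/photo.jpg", "images/lost.gif"]
result = classify_references(references, file_paths)
print(result)
```

Only the "single" bucket is safe to process automatically; the "multiple" bucket needs a human to pick the right candidate.
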
Unreferenced
------------
The files listed here are in the {files} table but no nodes are recorded as
referencing them (i.e., there's no entry in the {upload} table). This might mean
the node has been deleted without deleting the file, or that the files were uploaded
by some means other than the upload module (e.g., FTP) and the relationships between
files and nodes have not been made. If you have used the Missing references report and
accounted for all files that should be referenced, and are sure that the files below
are not needed, you can delete them.

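In SQL terms, this report is roughly a left join from {files} to {upload} that keeps the rows with no partner. The sketch below runs that query against an in-memory SQLite database; the two-column tables are a simplified stand-in for the real schema and the sample rows are invented.

```python
import sqlite3

# Simplified stand-ins for the {files} and {upload} tables (assumption:
# only the columns needed for the join are modelled here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (fid INTEGER PRIMARY KEY, filepath TEXT);
    CREATE TABLE upload (fid INTEGER, nid INTEGER);
    INSERT INTO files VALUES (1, 'sites/default/files/used.jpg');
    INSERT INTO files VALUES (2, 'sites/default/files/unused.jpg');
    INSERT INTO upload VALUES (1, 10);
""")

# Files with no matching row in upload are the unreferenced ones.
unreferenced = conn.execute("""
    SELECT f.fid, f.filepath
    FROM files f
    LEFT JOIN upload u ON u.fid = f.fid
    WHERE u.fid IS NULL
    ORDER BY f.fid
""").fetchall()
print(unreferenced)
```

On a real site the table names carry whatever prefix the installation uses, so the query would need adjusting before being run directly.
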
Configuration
-------------
There may be some files, folders or extensions that are reported by the audit
that you do not want to be included. You can set exclusions at Administer >
Site configuration > Audit files. By default the audit excludes .htaccess files
and the contents of the color directory.

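To make the idea of exclusions concrete, here is one way such a filter could work. It is a guess at the behaviour, not the module's implementation: `is_excluded` and its parameters are hypothetical, with defaults chosen to mirror the .htaccess and color exclusions mentioned above.

```python
import posixpath


def is_excluded(rel_path, files=(".htaccess",), dirs=("color",), extensions=()):
    """Return True if rel_path matches an excluded file name, folder or extension."""
    parts = rel_path.split("/")
    name = parts[-1]
    if name in files:
        return True
    # Any excluded folder name anywhere above the file excludes it.
    if any(part in dirs for part in parts[:-1]):
        return True
    ext = posixpath.splitext(name)[1].lstrip(".").lower()
    return ext in extensions


print(is_excluded(".htaccess"))
print(is_excluded("color/garland-1234/style.css"))
print(is_excluded("images/a.png"))
print(is_excluded("images/a.png", extensions=("png",)))
```

A filter like this would be applied before either of the file-system reports, so excluded items never appear as candidates for deletion.
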
Migration
---------
In typical usage, the reports can be used independently and the occasional issues
they reveal dealt with one-by-one. Another application is the migration of content and
images from another web site (for example, using the node_import module or pasting
HTML content manually into a node creation form), which typically breaks any
embedded image references. The file audit tools can automate much of the work of
rectifying this situation.

A typical workflow for migrating content containing embedded images would be:

1. Copy all referenced images to the Drupal files directory (typically sites/default/files).
2. Go to the Not In Database report and, after sanity-checking the list, execute the Add All
   Files to Database action. This will add each file you've copied in to the {files} table.
3. Go to the Missing References report and, after sanity-checking the list, execute the
   Attach All Unique Matches action. This will rewrite the image references in your content
   to properly point to their path on the Drupal server, in every case where there is a
   single matching filename.
4. The Missing References report will now show you image references which don't match any
   files in the Drupal files directory. Review these to see if you missed any. Some cases
   may be offsite references that can be safely ignored; in other cases, where you can't
   track down the missing image, you can go to the node and edit it to remove the image reference.
5. The Unreferenced report will now show you files that are in the {files} table but have no
   entry in the {upload} table. Review these carefully to make sure they aren't being used in
   some way not picked up by the other tools (e.g., through JavaScript). Suggestion: take some
   sample filenames and query the Drupal database directly:

       SELECT * FROM node_revisions WHERE body LIKE '%file.jpg%'

   If you find any files you're sure are not being used, you may (after making sure you have
   a fresh backup of the directory tree) delete them from the Unreferenced report.

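The spot check suggested in step 5 can be scripted so that several filenames are tested in one go. The sketch below uses an in-memory SQLite table as a stand-in for node_revisions; the helper name `nodes_mentioning`, the reduced column set and the sample rows are all assumptions made for illustration.

```python
import sqlite3


def nodes_mentioning(conn, filename):
    """Return (nid, vid) pairs whose body contains filename.

    The filename is passed as a bound parameter. Note that '_' and '%'
    inside it still act as LIKE wildcards, so this can over-match, which
    errs on the side of keeping a file.
    """
    pattern = "%" + filename + "%"
    return conn.execute(
        "SELECT nid, vid FROM node_revisions WHERE body LIKE ? ORDER BY nid, vid",
        (pattern,),
    ).fetchall()


# Stand-in table with only the columns this check needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node_revisions (nid INTEGER, vid INTEGER, body TEXT)")
conn.execute("INSERT INTO node_revisions VALUES (1, 1, '<img src=\"/files/file.jpg\">')")
conn.execute("INSERT INTO node_revisions VALUES (2, 2, 'no images here')")

hits = nodes_mentioning(conn, "file.jpg")
print(hits)
```

An empty result for a filename is a hint, not proof, that the file is unused; references held outside node bodies would not be found this way.
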
Issues
------
Files are associated with nodes using the {upload} table, behind the upload module's back. The
main issue here is that we may associate a single file with multiple nodes, but the upload
module assumes that each file is only associated with a single node and thus deletes the file
when the node is deleted. This suggests that our generated associations should be saved in our
own table, which would require every place that now checks the {upload} table to check two
tables.

The newer functionality is untested with private downloads.

; $Id: README.txt,v 1.5 2007/12/05 22:33:19 stuartgreenfield Exp $