Revision 1.10, Mon Mar 23 02:44:56 2009 UTC, by patnaik (branch MAIN, CVS tag HEAD)
Log message: Altered some words

$Id: readme.txt,v 1.9 2008/05/18 01:40:29 patnaik Exp $

htmLawed Drupal 7.x module
==========================

GPL v3 license
Copyright Santosh Patnaik, MD, PhD
Initiated May 2008


About the module
----------------

The htmLawed Drupal module enables the use of the htmLawed (X)HTML filter/purifier in Drupal. Unlike Drupal's HTML filter, htmLawed allows fine control over the HTML markup (e.g., restricting URLs by protocol and limiting element-specific attributes), ensures proper nesting and balancing of tags, etc. Unlike filters like HTMLPurifier, the single-file htmLawed is much faster, more customizable, uses 10-20x less memory, is 10-20x smaller, works with PHP 4, covers all HTML markup, etc.

The module:

* allows CONTENT (node)-TYPE-SPECIFIC htmLawed settings (e.g., allowing a certain HTML tag/element in stories but not in blog-posts)

* allows INPUT FORMAT-SPECIFIC htmLawed settings

* provides OPTION TO FILTER BEFORE STORAGE in the database (in-built Drupal filters don't do this)

* allows DIFFERENT SETTINGS FOR COMMENTS & TEASERS

* allows setting DEFAULT VALUES for use with any content-type

The module does not add Drupal database tables or modify the structure of existing ones; all information is stored in the 'variable' table in items named 'htmLawed_format_x', where 'x' is the number identifying an input format.

If you enable htmLawed, it is important that you understand the security implications of the settings you use and the limitations of htmLawed. You should also try htmLawed with various 'Config.' and 'Spec.' values on the demo page of the htmLawed website.

The version of htmLawed used by the module is indicated on the 'help' page of the module. Keeping the module up-to-date with the latest htmLawed version is as simple as replacing the htmLawed/htmLawed.php and htmLawed/htmLawed_README.htm files in the htmLawed module folder.


About htmLawed
--------------

htmLawed is single-file PHP software that makes input text more secure, more standards-compliant, and generally more suitable, from the viewpoint of a web-page administrator, for use in the body of HTML 4, XHTML 1 or XHTML 1.1 documents. It is thus a customizable HTML/XHTML filter, processor, purifier, sanitizer, beautifier, etc., like HTML Tidy or PHP scripts such as Kses and HTMLPurifier.

The lawing-in of input text is needed to ensure that HTML code in the text is standards-compliant, does not introduce security vulnerabilities, and does not break a web-page's design/layout. htmLawed tries to do this by, for example, making HTML well-formed with balanced and properly nested tags, neutralizing code that may be used for cross-site scripting (XSS) attacks, and allowing only specified HTML elements/tags and attributes.

For htmLawed download and forum-based support, visit the htmLawed home page at http://www.bioinformatics.org/phplabware/internal_utilities/htmlawed/index.php.


Module installation
-------------------

1. Move the 'htmLawed' folder into 'modules/' or 'sites/all/modules' (you may have to create the latter sub-folder).

2. Browse to the 'Administer' > 'Site building' > 'Modules' section of your Drupal site and enable the 'htmLawed (X)HTML filter/purifier' module.

3. Browse to the 'Administer' > 'Site configuration' > 'Text formats' section. There you can 'configure' a text format to use htmLawed by selecting htmLawed in the list of filters available for that format.

With htmLawed turned on, you may safely disable Drupal's 'HTML filter'. Depending on the other filters enabled for the text format, you may need to 'rearrange' the filters. Usually, htmLawed should be set to run as the last filter.

If a filter that relies on the '<', '>' or '&' character (such as Drupal's 'PHP evaluator') is being used with the text format, then that filter should run before htmLawed. Further, if that filter generates HTML markup, then htmLawed should be configured to permit such markup.

4. The htmLawed filter is customizable. Two values, 'Config.' and 'Spec.', dictate the customization. Configuring the htmLawed module thus involves specifying the 'Config.' and 'Spec.' values in the settings form. The htmLawed module permits you to use different 'Config.' and 'Spec.' values for different text formats, content-types, etc.

To get to the settings form, choose to 'configure' a text format and then choose the 'Configure' link on the ensuing page. A sub-form ('Default') can be used to set the default values to be used for any content-type. Content-type-specific sub-forms allow you to over-ride the default values as well as to choose to use (or disable) htmLawed.

The 'Config.' form-fields are filled with comma-separated, quoted, key-value pairs; e.g., '"safe"=>1, "elements"=>"a, em, strong"' (these are interpreted as PHP array elements). The 'Spec.' field is optional. The 'Help' field should be filled with information/tips about the filter (such as what tags are allowed) to be displayed to the users. A checkbox is provided in the content-type-specific sub-forms to allow the 'Default' values to be used. If it is unchecked, the content-type-specific values will be used during filtering.

Filtering is further individualized for 'Body', 'Comment', 'Teaser' and 'Other'. 'Body' refers to the main content (such as a blog-post). 'Comment' refers to a user comment on the main content. 'Teaser' (called 'RSS' in version 1 of the module) refers to the news-feed (RSS) items and teasers generated from the main content. You may need 'Other' (in 'Default') if you use modules like 'Views' to have extra input fields (like 'Header') that are not content (node)-type-specific. Content-type-specific settings for 'Other' are therefore not possible.

* If htmLawed is enabled for 'Teaser', the htmLawed filtering is done at the end of all other filtering, including any prior htmLawed filtering because of 'Body'.

* For 'Body' and 'Comment', filtering can also be enabled for 'save', in which case the submitted input is filtered before being saved in the site database. However, you have to check whether this causes conflicts with filters (other than Drupal's 'PHP evaluator') that rely on the '<', '>' or '&' character.

* The default settings allow the a, em, strong, cite, code, ol, ul, li, dl, dt and dd HTML tags, and deny the id and style attributes, and any unsafe markup (such as the scriptable HTML attributes). For 'Teaser', the default settings will allow 'br' and 'p' as well.

* The default settings are used, both to pre-fill the htmLawed module form-fields and during filtering, only if the specific settings cannot be found. Emptying a 'Config.' field does not mean that the default settings will be used.

* Highly customized filtering can be achieved by appropriately setting the 'Config.' and 'Spec.' values. Refer to the htmLawed documentation for more.

5. To restrict user access to the administration of htmLawed settings, go to the 'Administer' > 'User management' section of your site. Ideally, only the main administrator of the site should have this access.

6. A Drupal handbook may be available for htmLawed. Check http://drupal.org/search/node/htmLawed+type%3Abook
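
The 'Config.' value format described in step 4 can be illustrated with a small sketch. The module itself interprets the value as PHP array elements; the Python below is only an illustration of the key-value structure, and the helper name 'parse_config' is hypothetical, not part of the module or of htmLawed. It handles just the two value kinds shown in the example (integers and double-quoted strings).

```python
import re

# One pair looks like "key"=>1 or "key"=>"some, quoted, text".
# Hypothetical pattern for illustration; the module evaluates the value in PHP.
PAIR = re.compile(r'"([^"]+)"\s*=>\s*("([^"]*)"|-?\d+)')


def parse_config(text):
    """Turn a 'Config.' field value into a dictionary of settings."""
    config = {}
    for match in PAIR.finditer(text):
        key, raw, quoted = match.group(1), match.group(2), match.group(3)
        # Quoted values stay strings; bare values are read as integers.
        config[key] = quoted if raw.startswith('"') else int(raw)
    return config


example = '"safe"=>1, "elements"=>"a, em, strong"'
print(parse_config(example))
```

Note that the commas inside "a, em, strong" belong to the quoted value, which is why the pairs cannot be separated by a plain split on commas.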


Notes
-----

1. Check for conflicts with any third-party filter modules in use.

2. You can replace files inside 'htmLawed/htmLawed/' with the latest versions from http://www.bioinformatics.org/phplabware/internal_utilities/htmlawed/index.php.

3. Deleting a content-type will delete the associated htmLawed settings.

4. Deleting a text format will NOT automatically delete the associated htmLawed settings. You'll have to run cron to delete the unneeded htmLawed settings: 'Administer' > 'Reports' > 'Status report' > 'run cron manually'.

5. Disabling htmLawed for a text format will not delete the associated settings.

6. Uninstalling the htmLawed module through 'Administer' > 'Site building' > 'Modules' > 'Uninstall' will delete all htmLawed settings.

7. Disabling the module will not delete any htmLawed setting.

8. The 'save' functionality is turned off by default for all text formats and content-types.

9. When a new content-type is created, the htmLawed settings to be used with it must be set; otherwise, the default settings will be used.

10. The latest version of Drupal 7 is recommended for use with this module.
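
Notes 4 and 9 can be pictured with a rough sketch. It is not the module's code: the function names and the shape of the stored data are assumptions made for illustration. Only the 'htmLawed_format_x' naming comes from this readme.

```python
import re

# Variable names follow the 'htmLawed_format_x' pattern described above.
FORMAT_VAR = re.compile(r"^htmLawed_format_(\d+)$")


def orphaned_settings(variable_names, existing_format_ids):
    """Note 4: list htmLawed variables whose text format no longer exists."""
    orphans = []
    for name in variable_names:
        match = FORMAT_VAR.match(name)
        if match and int(match.group(1)) not in existing_format_ids:
            orphans.append(name)
    return orphans


def settings_for(content_type, per_type_settings, default_settings):
    """Note 9: fall back to the default only when nothing is stored."""
    return per_type_settings.get(content_type, default_settings)


variables = ["htmLawed_format_1", "htmLawed_format_3", "site_name"]
print(orphaned_settings(variables, {1, 2}))
print(settings_for("poll", {"story": '"safe"=>1'}, '"safe"=>1, "elements"=>"a, em"'))
```

In this sketch a stored but empty value is still returned as-is, which mirrors the remark that emptying a 'Config.' field does not bring back the default settings.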


Filter workflow
---------------

The schematic below gives an idea of how filtering works in Drupal. Note that the 'content-type' of a comment is the 'content-type' of the item (such as a blog-post) for which the comment was made.


STEP 1: Submission
------------------------------------
| * Content such as a comment      |
| or a blog-post is created/edited |
| and submitted by a user          |
------------------------------------


STEP 2: Storage
----------------------------------
| * Unfiltered content is stored |   With htmLawed 'save' enabled
| * Teaser is auto-generated     |   content is first htmLawed-filtered
| * Teaser is stored             |   as per content's type
----------------------------------
                                     Teaser (like an RSS item) is generated from
                                     the stored content

STEP 3: Display
----------------------------------
| * Stored content is retrieved  |   With htmLawed 'show' enabled
| and filtered before display    |   content is htmLawed-filtered
| * Teaser is filtered for feeds |   as per content's type
----------------------------------
                                     Teaser is similarly filtered

                                     Depending on the text format,
                                     filters other than htmLawed
                                     may also process the data
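
The schematic can also be read as a short sketch. Everything in it is an assumption made for illustration: 'toy_htmlawed' is a crude stand-in that only drops script elements (the real htmLawed does far more), and 'submit' and 'render' are hypothetical names, not Drupal or module functions.

```python
import re


def toy_htmlawed(text):
    """Stand-in for htmLawed: removes <script> elements and nothing else."""
    return re.sub(r"<script\b.*?</script>", "", text, flags=re.I | re.S)


def submit(text, save_enabled):
    """STEP 2: return what would be written to the database."""
    return toy_htmlawed(text) if save_enabled else text


def render(stored, show_enabled, other_filters=()):
    """STEP 3: run the text format's other filters, then htmLawed last."""
    for apply_filter in other_filters:
        stored = apply_filter(stored)
    return toy_htmlawed(stored) if show_enabled else stored


raw = "Hello<script>alert(1)</script> world"
stored = submit(raw, save_enabled=False)   # 'save' is off by default
print(stored == raw)                       # stored text is unfiltered
print(render(stored, show_enabled=True))   # filtered only at display time
```

With 'save' disabled the database keeps exactly what the user submitted, and filtering happens on each display. With 'save' enabled the stored copy is already filtered.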
