/* $Id: README.txt,v 1.11 2009/11/29 15:59:17 smk Exp $ */

-- SUMMARY --

Provides a central transliteration service to other Drupal modules, and
sanitizes file names while uploading.

For a full description visit the project page:
  http://drupal.org/project/transliteration
Bug reports, feature suggestions and latest developments:
  http://drupal.org/project/issues/transliteration


-- INSTALLATION --

1. Install as usual; see http://drupal.org/node/70151 for further information.

2. If you are installing on an existing Drupal site, you might want to fix
   existing file names after installation. This renames all files whose names
   contain non-ASCII characters. However, if you have manually entered links
   to those files in any content, these links will break because the original
   files are renamed. It is therefore a good idea to test the conversion
   first on a copy of your web site. You'll find the retroactive conversion at
   Configuration and modules >> Media >> File system >> Transliteration.


-- CONFIGURATION --

This module doesn't require special permissions.

This module can be configured from the File system configuration page
(Configuration and modules >> Media >> File system >> Settings).

- Transliterate file names during upload: If you need more control over the
  resulting file names, you might want to disable this feature here and install
  the FileField Paths module (http://drupal.org/project/filefield_paths)
  instead.

- Lowercase transliterated file names: It is recommended to enable this option
  to prevent issues with case-insensitive file systems.
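
To illustrate the lowercase option: on a case-insensitive file system, two
names that differ only in letter case refer to the same file. The following
PHP sketch is not taken from the module. It only shows that lowercasing makes
such names compare equal, so a clash can be spotted before a file is written.
The function name example_names_collide() is hypothetical.

```php
<?php
// Hypothetical helper, not part of the Transliteration module.
// Returns TRUE if two (already transliterated, ASCII) file names would
// refer to the same file on a case-insensitive file system.
function example_names_collide($a, $b) {
  return strtolower($a) === strtolower($b);
}

// As plain strings the two names are different ...
var_dump('Report.PDF' === 'report.pdf');                     // bool(false)
// ... but they collide once letter case is ignored.
var_dump(example_names_collide('Report.PDF', 'report.pdf')); // bool(true)
```

With the option enabled, both uploads would end up with the same lowercase
name, so the conflict is visible on every file system rather than only on
case-insensitive ones.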


-- 3RD PARTY INTEGRATION --

Third-party developers seeking an easy way to transliterate text or file names
may use the transliteration functions as follows:

  if (function_exists('transliteration_get')) {
    // $unknown is the replacement for characters without an ASCII equivalent.
    $transliterated = transliteration_get($text, $unknown, $source_langcode);
  }

or, in case of file names:

  if (function_exists('transliteration_clean_filename')) {
    $transliterated = transliteration_clean_filename($filename, $source_langcode);
  }

Note that the optional $source_langcode parameter specifies the language code
of the input. If the source language is not known at the time of
transliteration, it is recommended to set this argument to the site default
language:

  $output = transliteration_get($text, '?', language_default('language'));

Otherwise the current display language will be used, which might produce
inconsistent results.
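
The calls above can be combined into a single helper, sketched below. This is
one possible arrangement, not code from the module: example_transliterate()
is a hypothetical name, and returning the input unchanged when the module is
missing is an assumption about what a caller would want.

```php
<?php
/**
 * Hypothetical wrapper, not part of the Transliteration module.
 *
 * Uses transliteration_get() when the module is loaded and returns the
 * input unchanged otherwise.
 */
function example_transliterate($text, $unknown = '?') {
  if (!function_exists('transliteration_get')) {
    // Module not available: fall back to the original text.
    return $text;
  }
  // Pin the source language to the site default when Drupal's
  // language_default() is available, so that the result does not depend
  // on the current display language.
  if (function_exists('language_default')) {
    return transliteration_get($text, $unknown, language_default('language'));
  }
  return transliteration_get($text, $unknown);
}

// Outside Drupal neither function exists, so the input is returned as is.
echo example_transliterate('Ärger'), "\n";
```

Keeping the function_exists() checks in one place means calling code does not
need to repeat them.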


-- LANGUAGE SPECIFIC REPLACEMENTS --

This module supports language-specific variations in addition to the basic
transliteration replacements. The following guide explains how to add them:

1. First find the Unicode character code you want to replace. As an example,
   we'll be adding a custom transliteration for the Cyrillic character 'г'
   (hexadecimal code 0x0433) using the ASCII character 'q' for Azerbaijani
   input.

2. Transliteration stores its mappings in banks with 256 characters each. The
   first two hexadecimal digits of the four-digit character code (04) tell you
   in which file you'll find the corresponding mapping. In our case it is
   data/x04.php.

3. If you open that file in an editor, you'll find the base replacement matrix
   consisting of 16 lines with 16 characters on each line, and zero or more
   additional language-specific variants. To add our custom replacement, we need
   to do two things: first, we need to create a new transliteration variant
   for Azerbaijani since it doesn't exist yet, and second, we need to map the
   last two digits of the hexadecimal character code (33) to the desired output
   string:

     $variant['az'] = array(0x33 => 'q');

   (see http://people.w3.org/rishida/names/languages.html for a list of
   language codes).

   Any Azerbaijani input will now use the appropriate variant.
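
The arithmetic behind steps 1 to 3 can be sketched in PHP as follows. The
helper example_code_point() is hypothetical, the $base array is a one-entry
stand-in for the 256-entry matrix in a real bank file, and the way the variant
is laid over the base is an assumption for illustration, not a copy of the
module's lookup code.

```php
<?php
/**
 * Hypothetical helper: decodes the first UTF-8 character of $char into its
 * Unicode code point. Assumes valid UTF-8 input.
 */
function example_code_point($char) {
  $first = ord($char[0]);
  if ($first < 0x80) {
    // Plain ASCII: the byte is the code point.
    return $first;
  }
  // The lead byte tells how many continuation bytes follow.
  if ($first >= 0xF0) {
    $extra = 3;
    $code = $first & 0x07;
  }
  elseif ($first >= 0xE0) {
    $extra = 2;
    $code = $first & 0x0F;
  }
  else {
    $extra = 1;
    $code = $first & 0x1F;
  }
  // Each continuation byte contributes its lower six bits.
  for ($i = 1; $i <= $extra; $i++) {
    $code = ($code << 6) | (ord($char[$i]) & 0x3F);
  }
  return $code;
}

// Step 1: character code of the Cyrillic 'г'.
$code = example_code_point('г');           // 0x0433
// Step 2: the upper byte selects the bank file.
$bank = $code >> 8;                        // 0x04
$file = sprintf('data/x%02x.php', $bank);  // data/x04.php
// Step 3: the lower byte is the key within the bank.
$offset = $code & 0xFF;                    // 0x33

// Stand-ins for the contents of the bank file.
$base = array(0x33 => 'g');
$variant['az'] = array(0x33 => 'q');

// Assumed lookup: variant entries take precedence over the base matrix.
$map = $variant['az'] + $base;
printf("%s, offset 0x%02x, az: %s\n", $file, $offset, $map[$offset]);
```

The array union operator keeps the left-hand value for keys present in both
arrays, which is why the Azerbaijani entry wins over the base entry here.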

Also take a look at data/x00.php, which already contains a number of
language-specific replacements. If you think your overrides are useful for
others, please file a patch at http://drupal.org/project/issues/transliteration.


-- CREDITS --

Authors:
* Stefan M. Kudwien (smk-ka) - http://drupal.org/user/48898
* Daniel F. Kudwien (sun) - http://drupal.org/user/54136

UTF-8 normalization is based on UtfNormal.php from MediaWiki
(http://www.mediawiki.org) and transliteration uses data from Sean M. Burke's
Text::Unidecode CPAN module
(http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm).