Diffstat (limited to 'includes/normal/README')
-rw-r--r-- includes/normal/README 59
1 files changed, 0 insertions, 59 deletions
diff --git a/includes/normal/README b/includes/normal/README
deleted file mode 100644
index 0f718d2c..00000000
--- a/includes/normal/README
+++ /dev/null
@@ -1,59 +0,0 @@
-This directory contains some Unicode normalization routines. These routines
-are meant to be reusable in other projects, so I'm not tying them to the
-MediaWiki utility functions.
-
-The main function to care about is UtfNormal::toNFC(); this will convert
-a given UTF-8 string to Normalization Form C if it's not already such.
-The function assumes that the input string is already valid UTF-8; if there
-are corrupt characters this may produce erroneous results.
-
-To also check for illegal characters, use UtfNormal::cleanUp(). This will
-strip illegal UTF-8 sequences and characters that are illegal in XML, and
-if necessary convert to normalization form C.
-
-Performance is kind of stinky in absolute terms, though it should be speedy
-on pure ASCII text. ;) On text that can be determined quickly to already be
-in NFC it's not too awful but it can quickly get uncomfortably slow,
-particularly for Korean text (the hangul decomposition/composition code is
-extra slow).
-
-
-== Regenerating data tables ==
-
-UtfNormalData.inc and UtfNormalDataK.inc are generated from the Unicode
-Character Database by the script UtfNormalGenerate.php. On a *nix system
-'make' should fetch the necessary files and regenerate the tables if the
-scripts have been changed or you remove the generated files.
-
-
-== Testing ==
-
-'make test' will run the conformance test (UtfNormalTest.php), fetching the
-data from the net if necessary. If it reports failure, something is
-going wrong!
-
-You may have to set up PHPUnit first.
-
-$ pear channel-discover pear.phpunit.de
-$ pear install phpunit/PHPUnit
-
-== Benchmarks ==
-
-Run 'make bench' to download some sample texts from Wikipedia and run some
-cheap benchmarks of some of the functions. Take all numbers with large
-grains of salt.
-
-
-== PHP module extension ==
-
-There's an experimental PHP extension module which wraps the ICU library's
-normalization functions. This is *MUCH* faster than doing this work in pure
-PHP code. This is at https://git.wikimedia.org/summary/mediawiki%2Fextensions%2Fnormal.git.
-It is used by the WMF, which currently runs PHP 5.3.10 on Linux. It hasn't been
-thoroughly tested on other configurations, but may work.
-
-If the php_normal.so module is loaded in php.ini, the normalization functions
-will automatically use it. If you can't (or don't want to) load it in php.ini,
-you may be able to load it using the dl() function before the inclusion of
-UtfNormal.php, and it will be picked up.
-