annotate dwtx/dwtxhelper/mangoicu/UNormalize.d @ 89:040da1cb0d76

Add a local copy of the mango ICU binding to work out the utf8 usability. Will hopefully go back into mango.
author Frank Benoit <benoit@tionex.de>
date Sun, 22 Jun 2008 22:57:31 +0200
parents
children 11e8159caf7a
rev   line source
89
040da1cb0d76 Add a local copy of the mango ICU binding to work out the utf8 usability. Will hopefully go back into mango.
Frank Benoit <benoit@tionex.de>
parents:
diff changeset
/*******************************************************************************

        @file UNormalize.d

        Copyright (c) 2004 Kris Bell

        This software is provided 'as-is', without any express or implied
        warranty. In no event will the authors be held liable for damages
        of any kind arising from the use of this software.

        Permission is hereby granted to anyone to use this software for any
        purpose, including commercial applications, and to alter it and/or
        redistribute it freely, subject to the following restrictions:

        1. The origin of this software must not be misrepresented; you must
           not claim that you wrote the original software. If you use this
           software in a product, an acknowledgment within documentation of
           said product would be appreciated but is not required.

        2. Altered source versions must be plainly marked as such, and must
           not be misrepresented as being the original software.

        3. This notice may not be removed or altered from any distribution
           of the source.

        4. Derivative works are permitted, but they must carry this notice
           in full and credit the original source.


                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


        @version        Initial version, October 2004
        @author         Kris

        Note that this package and documentation is built around the ICU
        project (http://oss.software.ibm.com/icu/). Below is the license
        statement as specified by that software:


                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


        ICU License - ICU 1.8.1 and later

        COPYRIGHT AND PERMISSION NOTICE

        Copyright (c) 1995-2003 International Business Machines Corporation and
        others.

        All rights reserved.

        Permission is hereby granted, free of charge, to any person obtaining a
        copy of this software and associated documentation files (the
        "Software"), to deal in the Software without restriction, including
        without limitation the rights to use, copy, modify, merge, publish,
        distribute, and/or sell copies of the Software, and to permit persons
        to whom the Software is furnished to do so, provided that the above
        copyright notice(s) and this permission notice appear in all copies of
        the Software and that both the above copyright notice(s) and this
        permission notice appear in supporting documentation.

        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
        OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
        MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
        OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
        HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL
        INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING
        FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
        NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
        WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

        Except as contained in this notice, the name of a copyright holder
        shall not be used in advertising or otherwise to promote the sale, use
        or other dealings in this Software without prior written authorization
        of the copyright holder.

        ----------------------------------------------------------------------

        All trademarks and registered trademarks mentioned herein are the
        property of their respective owners.

*******************************************************************************/

module dwtx.dwthelper.mangoicu.UNormalize;

private import  dwtx.dwthelper.mangoicu.ICU,
                dwtx.dwthelper.mangoicu.UString,
                dwtx.dwthelper.mangoicu.ULocale;

/*******************************************************************************

        UNormalize transforms Unicode text into an equivalent composed or
        decomposed form, allowing for easier sorting and searching
        of text. UNormalize supports the standard normalization forms
        described in http://www.unicode.org/unicode/reports/tr15/

        Characters with accents or other adornments can be encoded
        in several different ways in Unicode. For example, take the
        character A-acute. In Unicode, this can be encoded as a single
        character (the "composed" form):

                00C1    LATIN CAPITAL LETTER A WITH ACUTE

        or as two separate characters (the "decomposed" form):

                0041    LATIN CAPITAL LETTER A
                0301    COMBINING ACUTE ACCENT

        To a user of your program, however, both of these sequences
        should be treated as the same "user-level" character "A with
        acute accent". When you are searching or comparing text, you
        must ensure that these two sequences are treated equivalently.
        In addition, you must handle characters with more than one
        accent. Sometimes the order of a character's combining accents
        is significant, while in other cases accent sequences in different
        orders are really equivalent.

        Similarly, the string "ffi" can be encoded as three separate
        letters:

                0066    LATIN SMALL LETTER F
                0066    LATIN SMALL LETTER F
                0069    LATIN SMALL LETTER I

        or as the single character

                FB03    LATIN SMALL LIGATURE FFI

        The ffi ligature is not a distinct semantic character, and strictly
        speaking it shouldn't be in Unicode at all, but it was included for
        compatibility with existing character sets that already provided it.
        The Unicode standard identifies such characters by giving them
        "compatibility" decompositions into the corresponding semantic
        characters. When sorting and searching, you will often want to use
        these mappings.
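
        The two examples above (A-acute and the ffi ligature) can be
        checked with a minimal sketch. Note the assumptions: it uses the
        Phobos (D2) module std.uni rather than the ICU binding wrapped
        by this module, and canonicallyEqual is a hypothetical helper
        introduced only for illustration.

```d
// Sketch only: relies on Phobos (D2) std.uni, not on this module's ICU binding.
import std.uni;

// Hypothetical helper: two strings are canonically equivalent when their
// canonical decompositions (NFD) are identical.
bool canonicallyEqual(string a, string b)
{
    return normalize!NFD(a) == normalize!NFD(b);
}

void main()
{
    string composed   = "\u00C1";   // LATIN CAPITAL LETTER A WITH ACUTE
    string decomposed = "A\u0301";  // LATIN CAPITAL LETTER A, COMBINING ACUTE ACCENT

    // Different code point sequences, same user-level character.
    assert(composed != decomposed);
    assert(canonicallyEqual(composed, decomposed));
    assert(normalize!NFC(decomposed) == composed);
    assert(normalize!NFD(composed) == decomposed);

    // The ffi ligature has only a compatibility decomposition: canonical
    // normalization leaves it alone, NFKD maps it to the three letters.
    assert(normalize!NFD("\uFB03") == "\uFB03");
    assert(normalize!NFKD("\uFB03") == "ffi");
}
```

        Comparing after NFD treats the composed and decomposed spellings
        alike; matching the ligature against plain "ffi" additionally
        needs one of the compatibility forms.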

        unorm_normalize helps solve these problems by transforming text into
        the canonical composed and decomposed forms as shown in the first
        example above. In addition, you can have it perform compatibility
        decompositions so that you can treat compatibility characters the
        same as their equivalents. Finally, UNormalize rearranges
        accents into the proper canonical order, so that you do not have
        to worry about accent rearrangement on your own.
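
        Canonical reordering of accents can be sketched as follows. As
        an assumption for illustration only, the sketch again uses the
        Phobos (D2) module std.uni instead of the ICU binding wrapped
        here; the sample strings are made up for the example.

```d
// Sketch only: relies on Phobos (D2) std.uni, not on this module's ICU binding.
import std.uni;

void main()
{
    // COMBINING ACUTE ACCENT (U+0301) has combining class 230 and
    // COMBINING DOT BELOW (U+0323) has class 220. Marks with different
    // classes do not interact, so both typing orders are equivalent.
    assert(combiningClass('\u0301') == 230);
    assert(combiningClass('\u0323') == 220);

    string acuteFirst = "a\u0301\u0323";
    string dotFirst   = "a\u0323\u0301";

    // Canonical ordering sorts such marks by ascending combining class.
    assert(normalize!NFD(acuteFirst) == dotFirst);
    assert(normalize!NFD(dotFirst) == dotFirst);

    // COMBINING DIAERESIS (U+0308) shares class 230 with the acute accent,
    // so here the order is significant and normalization preserves it.
    assert(normalize!NFD("a\u0301\u0308") != normalize!NFD("a\u0308\u0301"));
}
```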

        Form FCD, "Fast C or D", is also designed for collation. It lets
        an algorithm that works under "canonical closure" (as collation
        does), i.e., one that treats precomposed characters and their
        decomposed equivalents the same, operate on strings that are not
        necessarily normalized.

        It is not a normalization form because it does not provide for
        uniqueness of representation. Multiple strings may be canonically
        equivalent (their NFDs are identical) and may all conform to FCD
Frank Benoit <benoit@tionex.de>
parents:
diff changeset
153 without being identical themselves.
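        As a hedged illustration of the paragraph above (a sketch, not part
        of the binding): the precomposed U+00E9 and the sequence "e" + U+0301
        are canonically equivalent, differ code unit by code unit, and are
        both expected to pass the FCD quick check. The import paths and the
        UString constructor taking a wchar[] are assumptions carried over
        from the upstream mango binding.

```d
// Sketch only: module paths and the UString (wchar[]) constructor are
// assumptions based on upstream mango.icu, not confirmed by this file.
import dwtx.dwtxhelper.mangoicu.UString;
import dwtx.dwtxhelper.mangoicu.UNormalize;

void fcdExample ()
{
        auto composed   = new UString ("\u00E9"w);    // precomposed e-acute
        auto decomposed = new UString ("e\u0301"w);   // "e" + combining acute

        // both quick checks are expected to yield UNormalize.Check.Yes,
        // although the two strings are not identical
        auto a = UNormalize.check (composed,   UNormalize.Mode.FCD);
        auto b = UNormalize.check (decomposed, UNormalize.Mode.FCD);
}
```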

        The form is defined such that the "raw decomposition", the
        recursive canonical decomposition of each character, results
        in a string that is canonically ordered. This means that
        precomposed characters are allowed as long as their
        decompositions do not need canonical reordering.

        Its advantage for a process like collation is that all NFD
        and most NFC texts - and many unnormalized texts - already
        conform to FCD and do not need to be normalized (NFD) for
        such a process. The FCD quick check will return UNORM_YES
        (Check.Yes in this binding) for most strings in practice.

        For more details on FCD see the collation design document:
        http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/collation/ICU_collation_design.htm

        ICU collation performs either NFD or FCD normalization
        automatically if normalization is turned on for the collator
        object. Beyond collation and string search, normalized strings
        may be useful for string equivalence comparisons, transliteration/
        transcription, unique representations, etc.

        The W3C generally recommends exchanging text in NFC. Note also
        that most legacy character encodings use only precomposed forms
        and often do not encode any combining marks by themselves. For
        conversion to such character encodings the Unicode text needs to
        be normalized to NFC. For more usage examples, see Unicode
        Standard Annex #15.

        See <A HREF="http://oss.software.ibm.com/icu/apiref/unorm_8h.html">
        this page</A> for full details.


*******************************************************************************/

class UNormalize : ICU
{
        enum Mode
        {
                None    = 1,
                NFD     = 2,
                NFKD    = 3,
                NFC     = 4,
                Default = NFC,
                NFKC    = 5,
                FCD     = 6,
                Count
        }

        enum Check
        {
                No,
                Yes,
                Maybe
        }

        enum Options
        {
                None      = 0x00,
                Unicode32 = 0x20
        }

        /***********************************************************************

                Normalize a string. The string will be normalized according
                to the specified normalization mode and options.
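
                A minimal usage sketch, under stated assumptions: the import
                paths and the UString constructors (one taking a wchar[],
                one taking no arguments) follow the upstream mango binding
                and are not confirmed by this file. It composes a decomposed
                sequence into NFC.

```d
// Sketch only: module paths and UString constructors are assumptions.
import dwtx.dwtxhelper.mangoicu.UString;
import dwtx.dwtxhelper.mangoicu.UNormalize;

void normalizeExample ()
{
        // "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form)
        auto src = new UString ("e\u0301"w);
        auto dst = new UString;

        // compose into NFC; dst should then hold the single character U+00E9
        UNormalize.normalize (src, dst, UNormalize.Mode.NFC);
}
```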

        ***********************************************************************/

        static void normalize (UText src, UString dst, Mode mode, Options o = Options.None)
        {
                uint fmt (wchar* dst, uint len, inout Error e)
                {
                        return unorm_normalize (src.get.ptr, src.len, mode, o, dst, len, e);
                }

                dst.format (&fmt, "failed to normalize");
        }

        /***********************************************************************

                Perform a quick check on a string to determine whether
                it is in a particular normalization form.

                Three results are possible: Yes, No or Maybe. Yes
                indicates that the argument string is in the desired
                normalization form, and No indicates that it is not.
                Maybe indicates that a more thorough check is required;
                the caller may have to normalize the string and compare
                the result with the original.
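
                One way to act on the three results, sketched under the
                assumption that UText is importable from the sibling UString
                module (as in upstream mango); inNFC is a hypothetical helper
                name. An inconclusive quick check falls back to the
                definitive isNormalized() test.

```d
// Sketch only: the import paths and the helper name inNFC are assumptions.
import dwtx.dwtxhelper.mangoicu.UString;
import dwtx.dwtxhelper.mangoicu.UNormalize;

bool inNFC (UText text)
{
        switch (UNormalize.check (text, UNormalize.Mode.NFC))
        {
        case UNormalize.Check.Yes:
                return true;

        case UNormalize.Check.No:
                return false;

        default:
                // Check.Maybe: the quick check is inconclusive,
                // so run the full (slower) test
                return UNormalize.isNormalized (text, UNormalize.Mode.NFC);
        }
}
```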

        ***********************************************************************/

        static Check check (UText t, Mode mode, Options o = Options.None)
        {
                Error e;

                Check c = cast(Check) unorm_quickCheckWithOptions (t.get.ptr, t.len, mode, o, e);
                testError (e, "failed to perform normalization check");
                return c;
        }

        /***********************************************************************

                Test if a string is in a given normalization form.

                Unlike check(), this function returns a definitive result,
                never a "maybe". For NFD, NFKD, and FCD, both functions
                work exactly the same. For NFC and NFKC, where check()
                may return "maybe", this function will perform further
                tests to arrive at a true/false result.

        ***********************************************************************/

        static bool isNormalized (UText t, Mode mode, Options o = Options.None)
        {
                Error e;

                byte b = unorm_isNormalizedWithOptions (t.get.ptr, t.len, mode, o, e);
                testError (e, "failed to perform normalization test");
                return b != 0;
        }

        /***********************************************************************

                Concatenate normalized strings, making sure that the result
                is normalized as well. If both the left and the right strings
                are in the normalization form according to "mode/options",
                then the result will be

                        dst = normalize (left + right, mode, options)

                With the input strings already being normalized, this function
                will use unorm_next() and unorm_previous() to find the adjacent
                end pieces of the input strings. Only the concatenation of these
                end pieces will be normalized and then concatenated with the
                remaining parts of the input strings.

                dst may be the same string as left, which avoids copying the
                entire left string.
040da1cb0d76 Add a local copy of the mango ICU binding to work out the utf8 usability. Will hopefully go back into mango.
Frank Benoit <benoit@tionex.de>
parents:
diff changeset
295
040da1cb0d76 Add a local copy of the mango ICU binding to work out the utf8 usability. Will hopefully go back into mango.
Frank Benoit <benoit@tionex.de>
parents:
diff changeset
296 ***********************************************************************/

        static void concatenate (UText left, UText right, UString dst, Mode mode, Options o = Options.None)
        {
                uint fmt (wchar* p, uint len, inout Error e)
                {
                        return unorm_concatenate (left.get.ptr, left.len, right.get.ptr, right.len, p, len, mode, o, e);
                }

                dst.format (&fmt, "failed to concatenate");
        }

        /***********************************************************************

                Compare two strings for canonical equivalence. Further
                options include case-insensitive comparison and code
                point order (as opposed to code unit order).

                Canonical equivalence between two strings is defined as
                their normalized forms (NFD or NFC) being identical.
                This function compares strings incrementally instead of
                normalizing (and optionally case-folding) both strings
                entirely, improving performance significantly.

                Bulk normalization is only necessary if the strings do
                not fulfill the FCD conditions. Only in this case, and
                only if the strings are relatively long, is memory
                allocated temporarily. For FCD strings and short non-FCD
                strings there is no memory allocation.

        ***********************************************************************/

        static int compare (UText left, UText right, Options o = Options.None)
        {
                Error e;

                int i = unorm_compare (left.get.ptr, left.len, right.get.ptr, right.len, o, e);
                testError (e, "failed to compare");
                return i;
        }


        /***********************************************************************

                Bind the ICU functions from a shared library. This is
                complicated by the issues regarding D and DLLs on the
                Windows platform.

        ***********************************************************************/

        private static void* library;

        /***********************************************************************

        ***********************************************************************/

        private static extern (C)
        {
                uint function (wchar*, uint, uint, uint, wchar*, uint, inout Error) unorm_normalize;
                uint function (wchar*, uint, uint, uint, inout Error) unorm_quickCheckWithOptions;
                byte function (wchar*, uint, uint, uint, inout Error) unorm_isNormalizedWithOptions;
                uint function (wchar*, uint, wchar*, uint, wchar*, uint, uint, uint, inout Error) unorm_concatenate;
                // ICU's unorm_compare returns a signed int32_t (negative, zero, or positive)
                int  function (wchar*, uint, wchar*, uint, uint, inout Error) unorm_compare;
        }

        /***********************************************************************

        ***********************************************************************/

        static FunctionLoader.Bind[] targets =
                [
                {cast(void**) &unorm_normalize,               "unorm_normalize"},
                {cast(void**) &unorm_quickCheckWithOptions,   "unorm_quickCheckWithOptions"},
                {cast(void**) &unorm_isNormalizedWithOptions, "unorm_isNormalizedWithOptions"},
                {cast(void**) &unorm_concatenate,             "unorm_concatenate"},
                {cast(void**) &unorm_compare,                 "unorm_compare"},
                ];

        /***********************************************************************

        ***********************************************************************/

        static this ()
        {
                library = FunctionLoader.bind (icuuc, targets);
        }

        /***********************************************************************

        ***********************************************************************/

        static ~this ()
        {
                FunctionLoader.unbind (library);
        }
}