annotate dwtx/dwtxhelper/mangoicu/UNormalize.d @ 89:040da1cb0d76
Add a local copy of the mango ICU binding to work out the utf8 usability. Will hopefully go back into mango.
author | Frank Benoit <benoit@tionex.de> |
---|---|
date | Sun, 22 Jun 2008 22:57:31 +0200 |
parents | |
children | 11e8159caf7a |
/*******************************************************************************

@file UNormalize.d

Copyright (c) 2004 Kris Bell

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for damages
of any kind arising from the use of this software.

Permission is hereby granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and/or
redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must
   not claim that you wrote the original software. If you use this
   software in a product, an acknowledgment within documentation of
   said product would be appreciated but is not required.

2. Altered source versions must be plainly marked as such, and must
   not be misrepresented as being the original software.

3. This notice may not be removed or altered from any distribution
   of the source.

4. Derivative works are permitted, but they must carry this notice
   in full and credit the original source.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


@version Initial version, October 2004
@author Kris

Note that this package and documentation is built around the ICU
project (http://oss.software.ibm.com/icu/). Below is the license
statement as specified by that software:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


ICU License - ICU 1.8.1 and later

COPYRIGHT AND PERMISSION NOTICE

Copyright (c) 1995-2003 International Business Machines Corporation and
others.

All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, and/or sell copies of the Software, and to permit persons
to whom the Software is furnished to do so, provided that the above
copyright notice(s) and this permission notice appear in all copies of
the Software and that both the above copyright notice(s) and this
permission notice appear in supporting documentation.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL
INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING
FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Except as contained in this notice, the name of a copyright holder
shall not be used in advertising or otherwise to promote the sale, use
or other dealings in this Software without prior written authorization
of the copyright holder.

----------------------------------------------------------------------

All trademarks and registered trademarks mentioned herein are the
property of their respective owners.

*******************************************************************************/

module dwtx.dwthelper.mangoicu.UNormalize;

private import  dwtx.dwthelper.mangoicu.ICU,
                dwtx.dwthelper.mangoicu.UString,
                dwtx.dwthelper.mangoicu.ULocale;

/*******************************************************************************

UNormalize transforms Unicode text into an equivalent composed or
decomposed form, allowing for easier sorting and searching
of text. UNormalize supports the standard normalization forms
described in http://www.unicode.org/unicode/reports/tr15/

Characters with accents or other adornments can be encoded
in several different ways in Unicode. For example, take the
character A-acute. In Unicode, this can be encoded as a single
character (the "composed" form):

    00C1 LATIN CAPITAL LETTER A WITH ACUTE

or as two separate characters (the "decomposed" form):

    0041 LATIN CAPITAL LETTER A
    0301 COMBINING ACUTE ACCENT

To a user of your program, however, both of these sequences
should be treated as the same "user-level" character "A with
acute accent". When you are searching or comparing text, you
must ensure that these two sequences are treated equivalently.
In addition, you must handle characters with more than one
accent. Sometimes the order of a character's combining accents
is significant, while in other cases accent sequences in different
orders are really equivalent.

Similarly, the string "ffi" can be encoded as three separate
letters:

    0066 LATIN SMALL LETTER F
    0066 LATIN SMALL LETTER F
    0069 LATIN SMALL LETTER I

or as the single character

    FB03 LATIN SMALL LIGATURE FFI

The ffi ligature is not a distinct semantic character, and strictly
speaking it shouldn't be in Unicode at all, but it was included for
compatibility with existing character sets that already provided it.
The Unicode standard identifies such characters by giving them
"compatibility" decompositions into the corresponding semantic
characters. When sorting and searching, you will often want to use
these mappings.
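
The two examples above can be sketched in a few lines. This is only an
illustration under stated assumptions: it uses normalize from the D2
Phobos module std.uni rather than this binding (whose methods are not
shown in this part of the file), and the helper names
canonicallyEquivalent and compatibilityEquivalent are hypothetical.

```d
import std.stdio : writeln;
import std.uni;   // assumption: D2 Phobos, not the ICU binding in this module

/// Hypothetical helper: true when `a` and `b` are canonically equivalent,
/// judged by comparing their NFD (canonical decomposed) forms.
bool canonicallyEquivalent(string a, string b)
{
    return normalize!NFD(a) == normalize!NFD(b);
}

/// Hypothetical helper: true when `a` and `b` match after compatibility
/// decomposition (NFKD), which also expands ligatures such as U+FB03.
bool compatibilityEquivalent(string a, string b)
{
    return normalize!NFKD(a) == normalize!NFKD(b);
}

void main()
{
    string composed   = "\u00C1";   // LATIN CAPITAL LETTER A WITH ACUTE
    string decomposed = "A\u0301";  // A followed by COMBINING ACUTE ACCENT

    writeln(composed == decomposed);                       // raw comparison
    writeln(canonicallyEquivalent(composed, decomposed));  // comparison after NFD

    string ligature = "\uFB03";     // LATIN SMALL LIGATURE FFI
    writeln(canonicallyEquivalent(ligature, "ffi"));       // canonical forms only
    writeln(compatibilityEquivalent(ligature, "ffi"));     // compatibility mapping
}
```

Comparing after a canonical form handles the A-acute case, while the
ligature only matches "ffi" once a compatibility form is used.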

unorm_normalize helps solve these problems by transforming text into
the canonical composed and decomposed forms as shown in the first
example above. In addition, you can have it perform compatibility
decompositions so that you can treat compatibility characters the
same as their equivalents. Finally, UNormalize rearranges
accents into the proper canonical order, so that you do not have
to worry about accent rearrangement on your own.
040da1cb0d76
Add a local copy of the mango ICU binding to work out the utf8 usability. Will hopefully go back into mango.
Frank Benoit <benoit@tionex.de>
parents:
diff
changeset
|
143 |
        Form FCD, "Fast C or D", is also designed for collation. It lets
        an algorithm that works under "canonical closure" (as collation
        does), i.e., one that treats precomposed characters and their
        decomposed equivalents the same, operate on strings that are not
        necessarily normalized.

        FCD is not a normalization form because it does not provide for
        uniqueness of representation. Multiple strings may be canonically
        equivalent (their NFDs are identical) and may all conform to FCD
        without being identical themselves.

        The form is defined such that the "raw decomposition", the
        recursive canonical decomposition of each character, results
        in a string that is canonically ordered. This means that
        precomposed characters are allowed as long as their
        decompositions do not need canonical reordering.

        Its advantage for a process like collation is that all NFD
        and most NFC texts - and many unnormalized texts - already
        conform to FCD and do not need to be normalized (NFD) for
        such a process. The FCD quick check will return UNORM_YES
        for most strings in practice.

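        A hedged sketch of how a caller might exploit this, using only
        names declared in this class. Here `text` stands for any UText
        and `dst` for a caller-supplied UString; both are placeholders
        and not part of the original binding:

                // Skip the NFD pass when the quick check already
                // reports that the text conforms to FCD.
                if (UNormalize.check (text, UNormalize.Mode.FCD) != UNormalize.Check.Yes)
                    UNormalize.normalize (text, dst, UNormalize.Mode.NFD);
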
        For more details on FCD see the collation design document:
        http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/collation/ICU_collation_design.htm

        ICU collation performs either NFD or FCD normalization
        automatically if normalization is turned on for the collator
        object. Beyond collation and string search, normalized strings
        may be useful for string equivalence comparisons, transliteration/
        transcription, unique representations, etc.

        The W3C generally recommends exchanging text in NFC. Note also
        that most legacy character encodings use only precomposed forms
        and often do not encode any combining marks by themselves. For
        conversion to such character encodings the Unicode text needs to
        be normalized to NFC. For more usage examples, see the Unicode
        Standard Annex.

        See <A HREF="http://oss.software.ibm.com/icu/apiref/unorm_8h.html">
        this page</A> for full details.


*******************************************************************************/

class UNormalize : ICU
{
        enum    Mode
                {
                None    = 1,
                NFD     = 2,
                NFKD    = 3,
                NFC     = 4,
                Default = NFC,
                NFKC    = 5,
                FCD     = 6,
                Count
                }

        enum    Check
                {
                No,
                Yes,
                Maybe
                }

        enum    Options
                {
                None      = 0x00,
                Unicode32 = 0x20
                }

        /***********************************************************************

                Normalize a string. The string will be normalized according
                to the specified normalization mode and options.

        ***********************************************************************/

        static void normalize (UText src, UString dst, Mode mode, Options o = Options.None)
        {
                uint fmt (wchar* dst, uint len, inout Error e)
                {
                        return unorm_normalize (src.get.ptr, src.len, mode, o, dst, len, e);
                }

                dst.format (&fmt, "failed to normalize");
        }

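        /***********************************************************************

                Usage sketch, not part of the original binding. It assumes
                that UString offers a constructor taking UTF-16 content and
                a default constructor for an empty destination; neither is
                shown in this file, so treat both as hypothetical.

                        // "A" followed by U+030A (combining ring above)
                        auto src = new UString ("A\u030A"w);
                        auto dst = new UString;

                        // canonical composition: the pair composes to U+00C5
                        UNormalize.normalize (src, dst, UNormalize.Mode.NFC);

                        // canonical decomposition of the same input
                        UNormalize.normalize (src, dst, UNormalize.Mode.NFD);

        ***********************************************************************/
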
        /***********************************************************************

                Perform a quick check on a string to determine whether
                it is in a particular normalization form.

                Three results are possible: Yes, No or Maybe. Yes
                indicates that the argument string is in the desired
                normalized form; No indicates that it is not. Maybe
                indicates that a more thorough check is required: the
                user may have to put the string into its normalized
                form and compare the results.

        ***********************************************************************/

        static Check check (UText t, Mode mode, Options o = Options.None)
        {
                Error e;

                Check c = cast(Check) unorm_quickCheckWithOptions (t.get.ptr, t.len, mode, o, e);
                testError (e, "failed to perform normalization check");
                return c;
        }

        /***********************************************************************

                Test if a string is in a given normalization form.

                Unlike check(), this function returns a definitive result,
                never a "maybe". For NFD, NFKD, and FCD, both functions
                work exactly the same. For NFC and NFKC, where check()
                may return "maybe", this function will perform further
                tests to arrive at a true/false result.

        ***********************************************************************/

        static bool isNormalized (UText t, Mode mode, Options o = Options.None)
        {
                Error e;

                byte b = unorm_isNormalizedWithOptions (t.get.ptr, t.len, mode, o, e);
                testError (e, "failed to perform normalization test");
                return b != 0;
        }

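        /***********************************************************************

                Usage sketch, not part of the original binding: one way a
                caller might act on the three possible check() results.
                Only names declared in this class are used; `text` stands
                for any UText supplied by the caller.

                        bool inNFC;

                        switch (UNormalize.check (text, UNormalize.Mode.NFC))
                        {
                        case UNormalize.Check.Yes:
                             inNFC = true;
                             break;

                        case UNormalize.Check.No:
                             inNFC = false;
                             break;

                        default:
                             // Check.Maybe: ask for the definitive answer
                             inNFC = UNormalize.isNormalized (text, UNormalize.Mode.NFC);
                             break;
                        }

        ***********************************************************************/
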
        /***********************************************************************

                Concatenate normalized strings, making sure that the result
                is normalized as well. If both the left and the right strings
                are in the normalization form according to "mode/options",
                then the result will be

                        dest=normalize(left+right, mode, options)

                With the input strings already being normalized, this function
                will use unorm_next() and unorm_previous() to find the adjacent
                end pieces of the input strings. Only the concatenation of these
                end pieces will be normalized and then concatenated with the
                remaining parts of the input strings.

                It is allowed to have dst==left to avoid copying the entire
                left string.

        ***********************************************************************/

        static void concatenate (UText left, UText right, UString dst, Mode mode, Options o = Options.None)
        {
                uint fmt (wchar* p, uint len, inout Error e)
                {
                        return unorm_concatenate (left.get.ptr, left.len, right.get.ptr, right.len, p, len, mode, o, e);
                }

                dst.format (&fmt, "failed to concatenate");
        }

        /***********************************************************************

                Compare two strings for canonical equivalence. Further
                options include case-insensitive comparison and code
                point order (as opposed to code unit order).

                Canonical equivalence between two strings is defined as
                their normalized forms (NFD or NFC) being identical.
                This function compares strings incrementally instead of
                normalizing (and optionally case-folding) both strings
                entirely, improving performance significantly.

                Bulk normalization is only necessary if the strings do
                not fulfill the FCD conditions. Only in this case, and
                only if the strings are relatively long, is memory
                allocated temporarily. For FCD strings and short non-FCD
                strings there is no memory allocation.

        ***********************************************************************/

        static int compare (UText left, UText right, Options o = Options.None)
        {
                Error e;

                int i = unorm_compare (left.get.ptr, left.len, right.get.ptr, right.len, o, e);
                testError (e, "failed to compare");
                return i;
        }


        /***********************************************************************

                Bind the ICU functions from a shared library. This is
                complicated by the issues regarding D and DLLs on the
                Windows platform.

        ***********************************************************************/

        private static void* library;

        /***********************************************************************

                Pointers to the ICU entry points. They are filled in
                when the shared library is bound.

        ***********************************************************************/

        private static extern (C)
        {
                uint function (wchar*, uint, uint, uint, wchar*, uint, inout Error) unorm_normalize;
                uint function (wchar*, uint, uint, uint, inout Error) unorm_quickCheckWithOptions;
                byte function (wchar*, uint, uint, uint, inout Error) unorm_isNormalizedWithOptions;
                uint function (wchar*, uint, wchar*, uint, wchar*, uint, uint, uint, inout Error) unorm_concatenate;
                // ICU declares unorm_compare as returning a signed int32_t
                int  function (wchar*, uint, wchar*, uint, uint, inout Error) unorm_compare;
        }

        /***********************************************************************

                Table pairing each function pointer above with the name
                of the ICU symbol it is bound to.

        ***********************************************************************/

        static FunctionLoader.Bind[] targets =
                [
                {cast(void**) &unorm_normalize,               "unorm_normalize"},
                {cast(void**) &unorm_quickCheckWithOptions,   "unorm_quickCheckWithOptions"},
                {cast(void**) &unorm_isNormalizedWithOptions, "unorm_isNormalizedWithOptions"},
                {cast(void**) &unorm_concatenate,             "unorm_concatenate"},
                {cast(void**) &unorm_compare,                 "unorm_compare"},
                ];

        /***********************************************************************

                Load the icuuc library and bind the targets at startup.

        ***********************************************************************/

        static this ()
        {
                library = FunctionLoader.bind (icuuc, targets);
        }

        /***********************************************************************

                Release the library at shutdown.

        ***********************************************************************/

        static ~this ()
        {
                FunctionLoader.unbind (library);
        }
}
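
/***********************************************************************

        Illustration only, not part of the mango binding. A minimal
        sketch of why concatenate() has to re-normalize the seam
        between its inputs, and of what compare() means by canonical
        equivalence. It assumes a D2 compiler and uses Phobos
        std.uni.normalize as a stand-in for the ICU calls bound
        above. The helper names joinNFC and canonicallyEqual are
        hypothetical.

***********************************************************************/

import std.uni : normalize, NFC, NFD;

// Concatenate two strings and normalize the whole result to NFC.
// Per the comment on concatenate(), ICU arrives at the same result
// by normalizing only the end pieces next to the seam.
string joinNFC (string left, string right)
{
        return normalize!NFC (left ~ right);
}

// Two strings are canonically equivalent when their NFD forms match.
bool canonicallyEqual (string a, string b)
{
        return normalize!NFD (a) == normalize!NFD (b);
}

unittest
{
        // "e" and U+0301 (combining acute accent) are each in NFC on
        // their own, but their plain concatenation is not: NFC
        // composes the pair into the single code point U+00E9.
        assert ("e" ~ "\u0301" != "\u00E9");
        assert (joinNFC ("e", "\u0301") == "\u00E9");

        // Precomposed and decomposed spellings are equivalent.
        assert (canonicallyEqual ("\u00E9", "e\u0301"));
        assert (!canonicallyEqual ("e", "\u00E9"));
}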