annotate lphobos/std/utf.d @ 86:fd32135dca3e trunk

[svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!! Lots of bugfixes. Added support for special foreach on strings. Added std.array, std.utf, std.ctype and std.uni to phobos. Changed all the .c files in the gen dir to .cpp (it *is* C++ after all)
author lindquist
date Sat, 03 Nov 2007 14:44:58 +0100
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
86
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
1 // utf.d
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
2
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
3 /*
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
4 * Copyright (C) 2003-2004 by Digital Mars, www.digitalmars.com
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
5 * Written by Walter Bright
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
6 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
7 * This software is provided 'as-is', without any express or implied
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
8 * warranty. In no event will the authors be held liable for any damages
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
9 * arising from the use of this software.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
10 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
11 * Permission is granted to anyone to use this software for any purpose,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
12 * including commercial applications, and to alter it and redistribute it
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
13 * freely, subject to the following restrictions:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
14 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
15 * o The origin of this software must not be misrepresented; you must not
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
16 * claim that you wrote the original software. If you use this software
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
17 * in a product, an acknowledgment in the product documentation would be
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
18 * appreciated but is not required.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
19 * o Altered source versions must be plainly marked as such, and must not
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
20 * be misrepresented as being the original software.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
21 * o This notice may not be removed or altered from any source
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
22 * distribution.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
23 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
24
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
25 /********************************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
26 * Encode and decode UTF-8, UTF-16 and UTF-32 strings.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
27 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
28 * For Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
29 * wchar type.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
30 * For linux systems, the C wchar_t type is UTF-32 and corresponds to
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
31 * the D utf.dchar type.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
32 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
33 * UTF character support is restricted to (\u0000 <= character <= \U0010FFFF).
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
34 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
35 * See_Also:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
36 * $(LINK2 http://en.wikipedia.org/wiki/Unicode, Wikipedia)<br>
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
37 * $(LINK http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8)<br>
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
38 * $(LINK http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n1335)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
39 * Macros:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
40 * WIKI = Phobos/StdUtf
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
41 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
42
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
43 module std.utf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
44
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
45 private import std.stdio;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
46
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
47 //debug=utf; // uncomment to turn on debugging printf's
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
48
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
49 deprecated class UtfError : Error
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
50 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
51 size_t idx; // index in string of where error occurred
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
52
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
53 this(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
54 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
55 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
56 super(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
57 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
58 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
59
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
60 /**********************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
61 * Exception class that is thrown upon any errors.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
62 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
63
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
64 class UtfException : Exception
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
65 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
66 size_t idx; /// index in string of where error occurred
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
67
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
68 this(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
69 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
70 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
71 super(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
72 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
73 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
74
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
75 /*******************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
76 * Test if c is a valid UTF-32 character.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
77 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
78 * \uFFFE and \uFFFF are considered valid by this function,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
79 * as they are permitted for internal use by an application,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
80 * but they are not allowed for interchange by the Unicode standard.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
81 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
82 * Returns: true if it is, false if not.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
83 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
84
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
85 bool isValidDchar(dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
86 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
87 /* Note: FFFE and FFFF are specifically permitted by the
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
88 * Unicode standard for application internal use, but are not
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
89 * allowed for interchange.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
90 * (thanks to Arcane Jill)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
91 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
92
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
93 return c < 0xD800 ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
94 (c > 0xDFFF && c <= 0x10FFFF /*&& c != 0xFFFE && c != 0xFFFF*/);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
95 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
96
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
97 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
98 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
99 debug(utf) printf("utf.isValidDchar.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
100 assert(isValidDchar(cast(dchar)'a') == true);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
101 assert(isValidDchar(cast(dchar)0x1FFFFF) == false);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
102 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
103
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
104
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
105 ubyte[256] UTF8stride =
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
106 [
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
107 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
108 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
109 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
110 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
111 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
112 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
113 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
114 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
115 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
116 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
117 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
118 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
119 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
120 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
121 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
122 4,4,4,4,4,4,4,4,5,5,5,5,6,6,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
123 ];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
124
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
125 /**
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
126 * stride() returns the length of a UTF-8 sequence starting at index i
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
127 * in string s.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
128 * Returns:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
129 * The number of bytes in the UTF-8 sequence or
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
130 * 0xFF meaning s[i] is not the start of of UTF-8 sequence.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
131 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
132
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
133 uint stride(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
134 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
135 return UTF8stride[s[i]];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
136 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
137
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
138 /**
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
139 * stride() returns the length of a UTF-16 sequence starting at index i
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
140 * in string s.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
141 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
142
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
143 uint stride(wchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
144 { uint u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
145 return 1 + (u >= 0xD800 && u <= 0xDBFF);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
146 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
147
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
148 /**
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
149 * stride() returns the length of a UTF-32 sequence starting at index i
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
150 * in string s.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
151 * Returns: The return value will always be 1.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
152 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
153
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
154 uint stride(dchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
155 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
156 return 1;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
157 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
158
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
159 /*******************************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
160 * Given an index i into an array of characters s[],
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
161 * and assuming that index i is at the start of a UTF character,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
162 * determine the number of UCS characters up to that index i.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
163 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
164
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
165 size_t toUCSindex(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
166 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
167 size_t n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
168 size_t j;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
169 size_t stride;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
170
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
171 for (j = 0; j < i; j += stride)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
172 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
173 stride = UTF8stride[s[j]];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
174 if (stride == 0xFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
175 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
176 n++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
177 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
178 if (j > i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
179 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
180 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
181 throw new UtfException("1invalid UTF-8 sequence", j);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
182 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
183 return n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
184 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
185
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
186 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
187
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
188 size_t toUCSindex(wchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
189 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
190 size_t n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
191 size_t j;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
192
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
193 for (j = 0; j < i; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
194 { uint u = s[j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
195
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
196 j += 1 + (u >= 0xD800 && u <= 0xDBFF);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
197 n++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
198 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
199 if (j > i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
200 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
201 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
202 throw new UtfException("2invalid UTF-16 sequence", j);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
203 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
204 return n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
205 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
206
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
207 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
208
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
209 size_t toUCSindex(dchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
210 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
211 return i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
212 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
213
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
214 /******************************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
215 * Given a UCS index n into an array of characters s[], return the UTF index.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
216 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
217
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
218 size_t toUTFindex(char[] s, size_t n)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
219 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
220 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
221
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
222 while (n--)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
223 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
224 uint j = UTF8stride[s[i]];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
225 if (j == 0xFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
226 throw new UtfException("3invalid UTF-8 sequence", i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
227 i += j;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
228 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
229 return i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
230 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
231
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
232 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
233
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
234 size_t toUTFindex(wchar[] s, size_t n)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
235 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
236 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
237
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
238 while (n--)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
239 { wchar u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
240
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
241 i += 1 + (u >= 0xD800 && u <= 0xDBFF);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
242 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
243 return i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
244 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
245
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
246 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
247
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
248 size_t toUTFindex(dchar[] s, size_t n)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
249 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
250 return n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
251 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
252
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
253 /* =================== Decode ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
254
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
255 /***************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
256 * Decodes and returns character starting at s[idx]. idx is advanced past the
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
257 * decoded character. If the character is not well formed, a UtfException is
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
258 * thrown and idx remains unchanged.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
259 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
260
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
261 dchar decode(char[] s, inout size_t idx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
262 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
263 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
264 assert(idx >= 0 && idx < s.length);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
265 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
266 out (result)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
267 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
268 assert(isValidDchar(result));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
269 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
270 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
271 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
272 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
273 dchar V;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
274 size_t i = idx;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
275 char u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
276
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
277 if (u & 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
278 { uint n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
279 char u2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
280
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
281 /* The following encodings are valid, except for the 5 and 6 byte
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
282 * combinations:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
283 * 0xxxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
284 * 110xxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
285 * 1110xxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
286 * 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
287 * 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
288 * 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
289 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
290 for (n = 1; ; n++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
291 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
292 if (n > 4)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
293 goto Lerr; // only do the first 4 of 6 encodings
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
294 if (((u << n) & 0x80) == 0)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
295 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
296 if (n == 1)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
297 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
298 break;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
299 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
300 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
301
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
302 // Pick off (7 - n) significant bits of B from first byte of octet
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
303 V = cast(dchar)(u & ((1 << (7 - n)) - 1));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
304
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
305 if (i + (n - 1) >= len)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
306 goto Lerr; // off end of string
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
307
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
308 /* The following combinations are overlong, and illegal:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
309 * 1100000x (10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
310 * 11100000 100xxxxx (10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
311 * 11110000 1000xxxx (10xxxxxx 10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
312 * 11111000 10000xxx (10xxxxxx 10xxxxxx 10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
313 * 11111100 100000xx (10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
314 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
315 u2 = s[i + 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
316 if ((u & 0xFE) == 0xC0 ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
317 (u == 0xE0 && (u2 & 0xE0) == 0x80) ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
318 (u == 0xF0 && (u2 & 0xF0) == 0x80) ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
319 (u == 0xF8 && (u2 & 0xF8) == 0x80) ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
320 (u == 0xFC && (u2 & 0xFC) == 0x80))
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
321 goto Lerr; // overlong combination
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
322
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
323 for (uint j = 1; j != n; j++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
324 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
325 u = s[i + j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
326 if ((u & 0xC0) != 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
327 goto Lerr; // trailing bytes are 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
328 V = (V << 6) | (u & 0x3F);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
329 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
330 if (!isValidDchar(V))
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
331 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
332 i += n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
333 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
334 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
335 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
336 V = cast(dchar) u;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
337 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
338 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
339
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
340 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
341 return V;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
342
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
343 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
344 //printf("\ndecode: idx = %d, i = %d, length = %d s = \n'%.*s'\n%x\n'%.*s'\n", idx, i, s.length, s, s[i], s[i .. length]);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
345 throw new UtfException("4invalid UTF-8 sequence", i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
346 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
347
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
348 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
349 { size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
350 dchar c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
351
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
352 debug(utf) printf("utf.decode.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
353
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
354 static char[] s1 = "abcd";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
355 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
356 c = decode(s1, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
357 assert(c == cast(dchar)'a');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
358 assert(i == 1);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
359 c = decode(s1, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
360 assert(c == cast(dchar)'b');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
361 assert(i == 2);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
362
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
363 static char[] s2 = "\xC2\xA9";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
364 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
365 c = decode(s2, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
366 assert(c == cast(dchar)'\u00A9');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
367 assert(i == 2);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
368
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
369 static char[] s3 = "\xE2\x89\xA0";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
370 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
371 c = decode(s3, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
372 assert(c == cast(dchar)'\u2260');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
373 assert(i == 3);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
374
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
375 static char[][] s4 =
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
376 [ "\xE2\x89", // too short
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
377 "\xC0\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
378 "\xE0\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
379 "\xF0\x80\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
380 "\xF8\x80\x80\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
381 "\xFC\x80\x80\x80\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
382 ];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
383
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
384 for (int j = 0; j < s4.length; j++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
385 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
386 try
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
387 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
388 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
389 c = decode(s4[j], i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
390 assert(0);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
391 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
392 catch (UtfException u)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
393 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
394 i = 23;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
395 delete u;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
396 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
397 assert(i == 23);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
398 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
399 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
400
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
401 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
402
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
403 dchar decode(wchar[] s, inout size_t idx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
404 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
405 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
406 assert(idx >= 0 && idx < s.length);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
407 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
408 out (result)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
409 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
410 assert(isValidDchar(result));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
411 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
412 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
413 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
414 char[] msg;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
415 dchar V;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
416 size_t i = idx;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
417 uint u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
418
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
419 if (u & ~0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
420 { if (u >= 0xD800 && u <= 0xDBFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
421 { uint u2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
422
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
423 if (i + 1 == s.length)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
424 { msg = "surrogate UTF-16 high value past end of string";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
425 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
426 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
427 u2 = s[i + 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
428 if (u2 < 0xDC00 || u2 > 0xDFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
429 { msg = "surrogate UTF-16 low value out of range";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
430 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
431 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
432 u = ((u - 0xD7C0) << 10) + (u2 - 0xDC00);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
433 i += 2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
434 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
435 else if (u >= 0xDC00 && u <= 0xDFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
436 { msg = "unpaired surrogate UTF-16 value";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
437 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
438 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
439 else if (u == 0xFFFE || u == 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
440 { msg = "illegal UTF-16 value";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
441 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
442 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
443 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
444 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
445 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
446 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
447 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
448 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
449 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
450
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
451 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
452 return cast(dchar)u;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
453
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
454 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
455 throw new UtfException(msg, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
456 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
457
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
458 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
459
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
460 dchar decode(dchar[] s, inout size_t idx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
461 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
462 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
463 assert(idx >= 0 && idx < s.length);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
464 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
465 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
466 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
467 size_t i = idx;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
468 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
469
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
470 if (!isValidDchar(c))
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
471 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
472 idx = i + 1;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
473 return c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
474
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
475 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
476 throw new UtfException("5invalid UTF-32 value", i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
477 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
478
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
479
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
480 /* =================== Encode ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
481
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
482 /*******************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
483 * Encodes character c and appends it to array s[].
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
484 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
485
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
486 void encode(inout char[] s, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
487 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
488 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
489 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
490 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
491 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
492 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
493 char[] r = s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
494
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
495 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
496 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
497 r ~= cast(char) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
498 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
499 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
500 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
501 char[4] buf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
502 uint L;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
503
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
504 if (c <= 0x7FF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
505 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
506 buf[0] = cast(char)(0xC0 | (c >> 6));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
507 buf[1] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
508 L = 2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
509 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
510 else if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
511 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
512 buf[0] = cast(char)(0xE0 | (c >> 12));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
513 buf[1] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
514 buf[2] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
515 L = 3;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
516 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
517 else if (c <= 0x10FFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
518 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
519 buf[0] = cast(char)(0xF0 | (c >> 18));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
520 buf[1] = cast(char)(0x80 | ((c >> 12) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
521 buf[2] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
522 buf[3] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
523 L = 4;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
524 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
525 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
526 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
527 assert(0);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
528 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
529 r ~= buf[0 .. L];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
530 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
531 s = r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
532 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
533
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
534 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
535 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
536 debug(utf) printf("utf.encode.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
537
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
538 char[] s = "abcd";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
539 encode(s, cast(dchar)'a');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
540 assert(s.length == 5);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
541 assert(s == "abcda");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
542
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
543 encode(s, cast(dchar)'\u00A9');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
544 assert(s.length == 7);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
545 assert(s == "abcda\xC2\xA9");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
546 //assert(s == "abcda\u00A9"); // BUG: fix compiler
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
547
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
548 encode(s, cast(dchar)'\u2260');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
549 assert(s.length == 10);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
550 assert(s == "abcda\xC2\xA9\xE2\x89\xA0");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
551 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
552
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
553 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
554
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
555 void encode(inout wchar[] s, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
556 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
557 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
558 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
559 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
560 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
561 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
562 wchar[] r = s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
563
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
564 if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
565 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
566 r ~= cast(wchar) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
567 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
568 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
569 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
570 wchar[2] buf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
571
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
572 buf[0] = cast(wchar) ((((c - 0x10000) >> 10) & 0x3FF) + 0xD800);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
573 buf[1] = cast(wchar) (((c - 0x10000) & 0x3FF) + 0xDC00);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
574 r ~= buf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
575 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
576 s = r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
577 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
578
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
579 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
580
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
581 void encode(inout dchar[] s, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
582 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
583 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
584 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
585 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
586 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
587 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
588 s ~= c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
589 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
590
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
591 /* =================== Validation ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
592
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
593 /***********************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
594 * Checks to see if string is well formed or not. Throws a UtfException if it is
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
595 * not. Use to check all untrusted input for correctness.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
596 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
597
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
598 void validate(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
599 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
600 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
601 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
602
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
603 for (i = 0; i < len; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
604 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
605 decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
606 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
607 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
608
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
609 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
610
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
611 void validate(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
612 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
613 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
614 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
615
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
616 for (i = 0; i < len; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
617 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
618 decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
619 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
620 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
621
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
622 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
623
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
624 void validate(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
625 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
626 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
627 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
628
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
629 for (i = 0; i < len; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
630 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
631 decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
632 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
633 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
634
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
635 /* =================== Conversion to UTF8 ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
636
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
637 char[] toUTF8(char[4] buf, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
638 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
639 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
640 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
641 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
642 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
643 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
644 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
645 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
646 buf[0] = cast(char) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
647 return buf[0 .. 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
648 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
649 else if (c <= 0x7FF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
650 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
651 buf[0] = cast(char)(0xC0 | (c >> 6));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
652 buf[1] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
653 return buf[0 .. 2];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
654 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
655 else if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
656 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
657 buf[0] = cast(char)(0xE0 | (c >> 12));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
658 buf[1] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
659 buf[2] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
660 return buf[0 .. 3];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
661 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
662 else if (c <= 0x10FFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
663 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
664 buf[0] = cast(char)(0xF0 | (c >> 18));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
665 buf[1] = cast(char)(0x80 | ((c >> 12) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
666 buf[2] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
667 buf[3] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
668 return buf[0 .. 4];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
669 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
670 assert(0);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
671 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
672
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
673 /*******************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
674 * Encodes string s into UTF-8 and returns the encoded string.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
675 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
676
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
677 char[] toUTF8(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
678 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
679 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
680 validate(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
681 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
682 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
683 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
684 return s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
685 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
686
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
687 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
688
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
689 char[] toUTF8(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
690 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
691 char[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
692 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
693 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
694
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
695 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
696
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
697 for (i = 0; i < slen; i++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
698 { wchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
699
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
700 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
701 r[i] = cast(char)c; // fast path for ascii
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
702 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
703 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
704 r.length = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
705 foreach (dchar c; s[i .. slen])
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
706 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
707 encode(r, c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
708 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
709 break;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
710 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
711 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
712 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
713 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
714
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
715 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
716
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
717 char[] toUTF8(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
718 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
719 char[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
720 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
721 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
722
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
723 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
724
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
725 for (i = 0; i < slen; i++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
726 { dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
727
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
728 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
729 r[i] = cast(char)c; // fast path for ascii
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
730 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
731 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
732 r.length = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
733 foreach (dchar d; s[i .. slen])
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
734 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
735 encode(r, d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
736 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
737 break;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
738 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
739 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
740 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
741 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
742
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
743 /* =================== Conversion to UTF16 ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
744
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
745 wchar[] toUTF16(wchar[2] buf, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
746 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
747 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
748 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
749 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
750 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
751 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
752 if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
753 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
754 buf[0] = cast(wchar) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
755 return buf[0 .. 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
756 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
757 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
758 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
759 buf[0] = cast(wchar) ((((c - 0x10000) >> 10) & 0x3FF) + 0xD800);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
760 buf[1] = cast(wchar) (((c - 0x10000) & 0x3FF) + 0xDC00);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
761 return buf[0 .. 2];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
762 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
763 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
764
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
765 /****************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
766 * Encodes string s into UTF-16 and returns the encoded string.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
767 * toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
768 * an LPWSTR or LPCWSTR argument.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
769 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
770
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
771 wchar[] toUTF16(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
772 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
773 wchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
774 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
775
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
776 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
777 r.length = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
778 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
779 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
780 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
781 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
782 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
783 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
784 r ~= cast(wchar)c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
785 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
786 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
787 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
788 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
789 encode(r, c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
790 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
791 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
792 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
793 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
794
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
795 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
796
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
797 wchar* toUTF16z(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
798 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
799 wchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
800 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
801
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
802 r.length = slen + 1;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
803 r.length = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
804 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
805 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
806 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
807 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
808 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
809 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
810 r ~= cast(wchar)c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
811 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
812 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
813 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
814 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
815 encode(r, c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
816 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
817 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
818 r ~= "\000";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
819 return r.ptr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
820 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
821
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
822 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
823
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
824 wchar[] toUTF16(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
825 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
826 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
827 validate(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
828 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
829 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
830 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
831 return s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
832 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
833
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
834 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
835
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
836 wchar[] toUTF16(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
837 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
838 wchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
839 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
840
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
841 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
842 r.length = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
843 for (size_t i = 0; i < slen; i++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
844 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
845 encode(r, s[i]);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
846 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
847 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
848 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
849
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
850 /* =================== Conversion to UTF32 ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
851
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
852 /*****
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
853 * Encodes string s into UTF-32 and returns the encoded string.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
854 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
855
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
856 dchar[] toUTF32(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
857 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
858 dchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
859 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
860 size_t j = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
861
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
862 r.length = slen; // r[] will never be longer than s[]
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
863 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
864 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
865 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
866 if (c >= 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
867 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
868 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
869 i++; // c is ascii, no need for decode
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
870 r[j++] = c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
871 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
872 return r[0 .. j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
873 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
874
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
875 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
876
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
877 dchar[] toUTF32(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
878 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
879 dchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
880 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
881 size_t j = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
882
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
883 r.length = slen; // r[] will never be longer than s[]
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
884 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
885 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
886 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
887 if (c >= 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
888 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
889 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
890 i++; // c is ascii, no need for decode
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
891 r[j++] = c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
892 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
893 return r[0 .. j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
894 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
895
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
896 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
897
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
898 dchar[] toUTF32(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
899 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
900 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
901 validate(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
902 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
903 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
904 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
905 return s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
906 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
907
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
908 /* ================================ tests ================================== */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
909
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
910 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
911 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
912 debug(utf) printf("utf.toUTF.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
913
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
914 char[] c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
915 wchar[] w;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
916 dchar[] d;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
917
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
918 c = "hello";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
919 w = toUTF16(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
920 assert(w == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
921 d = toUTF32(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
922 assert(d == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
923
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
924 c = toUTF8(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
925 assert(c == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
926 d = toUTF32(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
927 assert(d == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
928
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
929 c = toUTF8(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
930 assert(c == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
931 w = toUTF16(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
932 assert(w == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
933
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
934
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
935 c = "hel\u1234o";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
936 w = toUTF16(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
937 assert(w == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
938 d = toUTF32(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
939 assert(d == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
940
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
941 c = toUTF8(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
942 assert(c == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
943 d = toUTF32(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
944 assert(d == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
945
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
946 c = toUTF8(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
947 assert(c == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
948 w = toUTF16(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
949 assert(w == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
950
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
951
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
952 c = "he\U0010AAAAllo";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
953 w = toUTF16(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
954 //foreach (wchar c; w) printf("c = x%x\n", c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
955 //foreach (wchar c; cast(wchar[])"he\U0010AAAAllo") printf("c = x%x\n", c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
956 assert(w == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
957 d = toUTF32(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
958 assert(d == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
959
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
960 c = toUTF8(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
961 assert(c == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
962 d = toUTF32(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
963 assert(d == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
964
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
965 c = toUTF8(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
966 assert(c == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
967 w = toUTF16(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
968 assert(w == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
969 }