annotate lphobos/std/utf.d @ 883:b52d5de7783f

GC defines and linkage changes.
author Christian Kamm <kamm incasoftware de>
date Thu, 08 Jan 2009 18:20:02 +0100
parents fd32135dca3e
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
86
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
1 // utf.d
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
2
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
3 /*
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
4 * Copyright (C) 2003-2004 by Digital Mars, www.digitalmars.com
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
5 * Written by Walter Bright
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
6 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
7 * This software is provided 'as-is', without any express or implied
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
8 * warranty. In no event will the authors be held liable for any damages
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
9 * arising from the use of this software.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
10 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
11 * Permission is granted to anyone to use this software for any purpose,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
12 * including commercial applications, and to alter it and redistribute it
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
13 * freely, subject to the following restrictions:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
14 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
15 * o The origin of this software must not be misrepresented; you must not
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
16 * claim that you wrote the original software. If you use this software
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
17 * in a product, an acknowledgment in the product documentation would be
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
18 * appreciated but is not required.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
19 * o Altered source versions must be plainly marked as such, and must not
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
20 * be misrepresented as being the original software.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
21 * o This notice may not be removed or altered from any source
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
22 * distribution.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
23 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
24
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
25 /********************************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
26 * Encode and decode UTF-8, UTF-16 and UTF-32 strings.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
27 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
28 * For Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
29 * wchar type.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
30 * For linux systems, the C wchar_t type is UTF-32 and corresponds to
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
31 * the D utf.dchar type.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
32 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
33 * UTF character support is restricted to (\u0000 &lt;= character &lt;= \U0010FFFF).
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
34 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
35 * See_Also:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
36 * $(LINK2 http://en.wikipedia.org/wiki/Unicode, Wikipedia)<br>
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
37 * $(LINK http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8)<br>
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
38 * $(LINK http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n1335)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
39 * Macros:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
40 * WIKI = Phobos/StdUtf
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
41 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
42
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
43 module std.utf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
44
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
45 private import std.stdio;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
46
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
47 //debug=utf; // uncomment to turn on debugging printf's
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
48
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
49 deprecated class UtfError : Error
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
50 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
51 size_t idx; // index in string of where error occurred
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
52
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
53 this(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
54 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
55 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
56 super(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
57 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
58 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
59
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
60 /**********************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
61 * Exception class that is thrown upon any errors.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
62 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
63
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
64 class UtfException : Exception
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
65 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
66 size_t idx; /// index in string of where error occurred
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
67
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
68 this(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
69 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
70 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
71 super(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
72 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
73 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
74
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
75 /*******************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
76 * Test if c is a valid UTF-32 character.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
77 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
78 * \uFFFE and \uFFFF are considered valid by this function,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
79 * as they are permitted for internal use by an application,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
80 * but they are not allowed for interchange by the Unicode standard.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
81 *
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
82 * Returns: true if it is, false if not.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
83 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
84
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
85 bool isValidDchar(dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
86 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
87 /* Note: FFFE and FFFF are specifically permitted by the
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
88 * Unicode standard for application internal use, but are not
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
89 * allowed for interchange.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
90 * (thanks to Arcane Jill)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
91 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
92
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
93 return c < 0xD800 ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
94 (c > 0xDFFF && c <= 0x10FFFF /*&& c != 0xFFFE && c != 0xFFFF*/);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
95 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
96
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
97 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
98 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
99 debug(utf) printf("utf.isValidDchar.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
100 assert(isValidDchar(cast(dchar)'a') == true);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
101 assert(isValidDchar(cast(dchar)0x1FFFFF) == false);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
102 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
103
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
104
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
105 ubyte[256] UTF8stride =
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
106 [
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
107 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
108 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
109 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
110 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
111 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
112 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
113 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
114 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
115 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
116 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
117 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
118 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
119 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
120 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
121 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
122 4,4,4,4,4,4,4,4,5,5,5,5,6,6,0xFF,0xFF,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
123 ];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
124
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
125 /**
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
126 * stride() returns the length of a UTF-8 sequence starting at index i
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
127 * in string s.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
128 * Returns:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
129 * The number of bytes in the UTF-8 sequence or
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
130 * 0xFF meaning s[i] is not the start of of UTF-8 sequence.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
131 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
132
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
133 uint stride(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
134 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
135 return UTF8stride[s[i]];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
136 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
137
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
138 /**
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
139 * stride() returns the length of a UTF-16 sequence starting at index i
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
140 * in string s.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
141 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
142
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
143 uint stride(wchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
144 { uint u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
145 return 1 + (u >= 0xD800 && u <= 0xDBFF);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
146 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
147
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
148 /**
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
149 * stride() returns the length of a UTF-32 sequence starting at index i
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
150 * in string s.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
151 * Returns: The return value will always be 1.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
152 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
153
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
154 uint stride(dchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
155 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
156 return 1;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
157 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
158
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
159 /*******************************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
160 * Given an index i into an array of characters s[],
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
161 * and assuming that index i is at the start of a UTF character,
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
162 * determine the number of UCS characters up to that index i.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
163 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
164
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
165 size_t toUCSindex(char[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
166 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
167 size_t n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
168 size_t j;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
169 size_t stride;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
170
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
171 for (j = 0; j < i; j += stride)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
172 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
173 stride = UTF8stride[s[j]];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
174 if (stride == 0xFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
175 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
176 n++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
177 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
178 if (j > i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
179 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
180 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
181 throw new UtfException("1invalid UTF-8 sequence", j);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
182 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
183 return n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
184 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
185
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
186 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
187
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
188 size_t toUCSindex(wchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
189 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
190 size_t n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
191 size_t j;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
192
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
193 for (j = 0; j < i; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
194 { uint u = s[j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
195
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
196 j += 1 + (u >= 0xD800 && u <= 0xDBFF);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
197 n++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
198 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
199 if (j > i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
200 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
201 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
202 throw new UtfException("2invalid UTF-16 sequence", j);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
203 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
204 return n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
205 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
206
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
207 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
208
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
209 size_t toUCSindex(dchar[] s, size_t i)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
210 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
211 return i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
212 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
213
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
214 /******************************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
215 * Given a UCS index n into an array of characters s[], return the UTF index.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
216 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
217
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
218 size_t toUTFindex(char[] s, size_t n)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
219 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
220 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
221
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
222 while (n--)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
223 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
224 uint j = UTF8stride[s[i]];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
225 if (j == 0xFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
226 throw new UtfException("3invalid UTF-8 sequence", i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
227 i += j;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
228 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
229 return i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
230 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
231
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
232 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
233
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
234 size_t toUTFindex(wchar[] s, size_t n)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
235 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
236 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
237
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
238 while (n--)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
239 { wchar u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
240
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
241 i += 1 + (u >= 0xD800 && u <= 0xDBFF);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
242 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
243 return i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
244 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
245
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
246 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
247
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
248 size_t toUTFindex(dchar[] s, size_t n)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
249 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
250 return n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
251 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
252
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
253 /* =================== Decode ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
254
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
255 /***************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
256 * Decodes and returns character starting at s[idx]. idx is advanced past the
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
257 * decoded character. If the character is not well formed, a UtfException is
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
258 * thrown and idx remains unchanged.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
259 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
260
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
261 dchar decode(char[] s, inout size_t idx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
262 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
263 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
264 assert(idx >= 0 && idx < s.length);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
265 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
266 out (result)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
267 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
268 assert(isValidDchar(result));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
269 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
270 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
271 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
272 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
273 dchar V;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
274 size_t i = idx;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
275 char u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
276
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
277 if (u & 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
278 { uint n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
279 char u2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
280
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
281 /* The following encodings are valid, except for the 5 and 6 byte
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
282 * combinations:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
283 * 0xxxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
284 * 110xxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
285 * 1110xxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
286 * 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
287 * 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
288 * 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
289 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
290 for (n = 1; ; n++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
291 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
292 if (n > 4)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
293 goto Lerr; // only do the first 4 of 6 encodings
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
294 if (((u << n) & 0x80) == 0)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
295 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
296 if (n == 1)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
297 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
298 break;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
299 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
300 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
301
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
302 // Pick off (7 - n) significant bits of B from first byte of octet
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
303 V = cast(dchar)(u & ((1 << (7 - n)) - 1));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
304
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
305 if (i + (n - 1) >= len)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
306 goto Lerr; // off end of string
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
307
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
308 /* The following combinations are overlong, and illegal:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
309 * 1100000x (10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
310 * 11100000 100xxxxx (10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
311 * 11110000 1000xxxx (10xxxxxx 10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
312 * 11111000 10000xxx (10xxxxxx 10xxxxxx 10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
313 * 11111100 100000xx (10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
314 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
315 u2 = s[i + 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
316 if ((u & 0xFE) == 0xC0 ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
317 (u == 0xE0 && (u2 & 0xE0) == 0x80) ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
318 (u == 0xF0 && (u2 & 0xF0) == 0x80) ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
319 (u == 0xF8 && (u2 & 0xF8) == 0x80) ||
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
320 (u == 0xFC && (u2 & 0xFC) == 0x80))
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
321 goto Lerr; // overlong combination
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
322
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
323 for (uint j = 1; j != n; j++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
324 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
325 u = s[i + j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
326 if ((u & 0xC0) != 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
327 goto Lerr; // trailing bytes are 10xxxxxx
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
328 V = (V << 6) | (u & 0x3F);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
329 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
330 if (!isValidDchar(V))
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
331 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
332 i += n;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
333 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
334 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
335 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
336 V = cast(dchar) u;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
337 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
338 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
339
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
340 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
341 return V;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
342
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
343 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
344 //printf("\ndecode: idx = %d, i = %d, length = %d s = \n'%.*s'\n%x\n'%.*s'\n", idx, i, s.length, s, s[i], s[i .. length]);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
345 throw new UtfException("4invalid UTF-8 sequence", i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
346 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
347
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
348 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
349 { size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
350 dchar c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
351
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
352 debug(utf) printf("utf.decode.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
353
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
354 static char[] s1 = "abcd";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
355 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
356 c = decode(s1, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
357 assert(c == cast(dchar)'a');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
358 assert(i == 1);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
359 c = decode(s1, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
360 assert(c == cast(dchar)'b');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
361 assert(i == 2);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
362
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
363 static char[] s2 = "\xC2\xA9";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
364 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
365 c = decode(s2, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
366 assert(c == cast(dchar)'\u00A9');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
367 assert(i == 2);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
368
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
369 static char[] s3 = "\xE2\x89\xA0";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
370 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
371 c = decode(s3, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
372 assert(c == cast(dchar)'\u2260');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
373 assert(i == 3);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
374
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
375 static char[][] s4 =
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
376 [ "\xE2\x89", // too short
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
377 "\xC0\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
378 "\xE0\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
379 "\xF0\x80\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
380 "\xF8\x80\x80\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
381 "\xFC\x80\x80\x80\x80\x8A",
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
382 ];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
383
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
384 for (int j = 0; j < s4.length; j++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
385 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
386 try
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
387 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
388 i = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
389 c = decode(s4[j], i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
390 assert(0);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
391 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
392 catch (UtfException u)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
393 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
394 i = 23;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
395 delete u;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
396 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
397 assert(i == 23);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
398 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
399 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
400
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
401 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
402
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
403 dchar decode(wchar[] s, inout size_t idx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
404 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
405 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
406 assert(idx >= 0 && idx < s.length);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
407 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
408 out (result)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
409 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
410 assert(isValidDchar(result));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
411 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
412 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
413 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
414 char[] msg;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
415 dchar V;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
416 size_t i = idx;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
417 uint u = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
418
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
419 if (u & ~0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
420 { if (u >= 0xD800 && u <= 0xDBFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
421 { uint u2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
422
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
423 if (i + 1 == s.length)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
424 { msg = "surrogate UTF-16 high value past end of string";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
425 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
426 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
427 u2 = s[i + 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
428 if (u2 < 0xDC00 || u2 > 0xDFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
429 { msg = "surrogate UTF-16 low value out of range";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
430 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
431 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
432 u = ((u - 0xD7C0) << 10) + (u2 - 0xDC00);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
433 i += 2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
434 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
435 else if (u >= 0xDC00 && u <= 0xDFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
436 { msg = "unpaired surrogate UTF-16 value";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
437 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
438 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
439 else if (u == 0xFFFE || u == 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
440 { msg = "illegal UTF-16 value";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
441 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
442 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
443 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
444 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
445 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
446 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
447 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
448 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
449 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
450
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
451 idx = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
452 return cast(dchar)u;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
453
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
454 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
455 throw new UtfException(msg, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
456 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
457
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
458 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
459
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
460 dchar decode(dchar[] s, inout size_t idx)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
461 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
462 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
463 assert(idx >= 0 && idx < s.length);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
464 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
465 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
466 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
467 size_t i = idx;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
468 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
469
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
470 if (!isValidDchar(c))
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
471 goto Lerr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
472 idx = i + 1;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
473 return c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
474
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
475 Lerr:
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
476 throw new UtfException("5invalid UTF-32 value", i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
477 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
478
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
479
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
480 /* =================== Encode ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
481
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
482 /*******************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
483 * Encodes character c and appends it to array s[].
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
484 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
485
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
486 void encode(inout char[] s, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
487 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
488 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
489 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
490 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
491 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
492 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
493 char[] r = s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
494
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
495 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
496 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
497 r ~= cast(char) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
498 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
499 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
500 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
501 char[4] buf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
502 uint L;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
503
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
504 if (c <= 0x7FF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
505 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
506 buf[0] = cast(char)(0xC0 | (c >> 6));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
507 buf[1] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
508 L = 2;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
509 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
510 else if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
511 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
512 buf[0] = cast(char)(0xE0 | (c >> 12));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
513 buf[1] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
514 buf[2] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
515 L = 3;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
516 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
517 else if (c <= 0x10FFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
518 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
519 buf[0] = cast(char)(0xF0 | (c >> 18));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
520 buf[1] = cast(char)(0x80 | ((c >> 12) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
521 buf[2] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
522 buf[3] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
523 L = 4;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
524 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
525 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
526 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
527 assert(0);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
528 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
529 r ~= buf[0 .. L];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
530 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
531 s = r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
532 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
533
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
534 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
535 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
536 debug(utf) printf("utf.encode.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
537
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
538 char[] s = "abcd";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
539 encode(s, cast(dchar)'a');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
540 assert(s.length == 5);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
541 assert(s == "abcda");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
542
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
543 encode(s, cast(dchar)'\u00A9');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
544 assert(s.length == 7);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
545 assert(s == "abcda\xC2\xA9");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
546 //assert(s == "abcda\u00A9"); // BUG: fix compiler
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
547
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
548 encode(s, cast(dchar)'\u2260');
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
549 assert(s.length == 10);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
550 assert(s == "abcda\xC2\xA9\xE2\x89\xA0");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
551 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
552
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
553 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
554
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
555 void encode(inout wchar[] s, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
556 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
557 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
558 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
559 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
560 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
561 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
562 wchar[] r = s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
563
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
564 if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
565 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
566 r ~= cast(wchar) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
567 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
568 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
569 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
570 wchar[2] buf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
571
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
572 buf[0] = cast(wchar) ((((c - 0x10000) >> 10) & 0x3FF) + 0xD800);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
573 buf[1] = cast(wchar) (((c - 0x10000) & 0x3FF) + 0xDC00);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
574 r ~= buf;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
575 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
576 s = r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
577 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
578
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
579 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
580
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
581 void encode(inout dchar[] s, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
582 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
583 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
584 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
585 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
586 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
587 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
588 s ~= c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
589 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
590
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
591 /* =================== Validation ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
592
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
593 /***********************************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
594 * Checks to see if string is well formed or not. Throws a UtfException if it is
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
595 * not. Use to check all untrusted input for correctness.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
596 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
597
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
598 void validate(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
599 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
600 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
601 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
602
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
603 for (i = 0; i < len; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
604 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
605 decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
606 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
607 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
608
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
609 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
610
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
611 void validate(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
612 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
613 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
614 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
615
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
616 for (i = 0; i < len; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
617 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
618 decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
619 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
620 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
621
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
622 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
623
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
624 void validate(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
625 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
626 size_t len = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
627 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
628
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
629 for (i = 0; i < len; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
630 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
631 decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
632 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
633 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
634
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
635 /* =================== Conversion to UTF8 ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
636
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
637 char[] toUTF8(char[4] buf, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
638 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
639 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
640 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
641 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
642 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
643 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
644 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
645 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
646 buf[0] = cast(char) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
647 return buf[0 .. 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
648 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
649 else if (c <= 0x7FF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
650 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
651 buf[0] = cast(char)(0xC0 | (c >> 6));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
652 buf[1] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
653 return buf[0 .. 2];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
654 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
655 else if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
656 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
657 buf[0] = cast(char)(0xE0 | (c >> 12));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
658 buf[1] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
659 buf[2] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
660 return buf[0 .. 3];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
661 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
662 else if (c <= 0x10FFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
663 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
664 buf[0] = cast(char)(0xF0 | (c >> 18));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
665 buf[1] = cast(char)(0x80 | ((c >> 12) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
666 buf[2] = cast(char)(0x80 | ((c >> 6) & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
667 buf[3] = cast(char)(0x80 | (c & 0x3F));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
668 return buf[0 .. 4];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
669 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
670 assert(0);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
671 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
672
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
673 /*******************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
674 * Encodes string s into UTF-8 and returns the encoded string.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
675 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
676
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
677 char[] toUTF8(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
678 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
679 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
680 validate(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
681 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
682 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
683 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
684 return s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
685 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
686
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
687 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
688
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
689 char[] toUTF8(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
690 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
691 char[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
692 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
693 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
694
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
695 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
696
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
697 for (i = 0; i < slen; i++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
698 { wchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
699
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
700 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
701 r[i] = cast(char)c; // fast path for ascii
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
702 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
703 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
704 r.length = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
705 foreach (dchar c; s[i .. slen])
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
706 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
707 encode(r, c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
708 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
709 break;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
710 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
711 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
712 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
713 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
714
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
715 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
716
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
717 char[] toUTF8(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
718 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
719 char[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
720 size_t i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
721 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
722
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
723 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
724
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
725 for (i = 0; i < slen; i++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
726 { dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
727
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
728 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
729 r[i] = cast(char)c; // fast path for ascii
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
730 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
731 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
732 r.length = i;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
733 foreach (dchar d; s[i .. slen])
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
734 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
735 encode(r, d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
736 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
737 break;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
738 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
739 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
740 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
741 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
742
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
743 /* =================== Conversion to UTF16 ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
744
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
745 wchar[] toUTF16(wchar[2] buf, dchar c)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
746 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
747 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
748 assert(isValidDchar(c));
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
749 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
750 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
751 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
752 if (c <= 0xFFFF)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
753 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
754 buf[0] = cast(wchar) c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
755 return buf[0 .. 1];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
756 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
757 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
758 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
759 buf[0] = cast(wchar) ((((c - 0x10000) >> 10) & 0x3FF) + 0xD800);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
760 buf[1] = cast(wchar) (((c - 0x10000) & 0x3FF) + 0xDC00);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
761 return buf[0 .. 2];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
762 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
763 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
764
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
765 /****************
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
766 * Encodes string s into UTF-16 and returns the encoded string.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
767 * toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
768 * an LPWSTR or LPCWSTR argument.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
769 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
770
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
771 wchar[] toUTF16(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
772 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
773 wchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
774 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
775
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
776 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
777 r.length = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
778 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
779 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
780 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
781 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
782 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
783 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
784 r ~= cast(wchar)c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
785 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
786 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
787 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
788 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
789 encode(r, c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
790 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
791 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
792 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
793 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
794
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
795 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
796
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
797 wchar* toUTF16z(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
798 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
799 wchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
800 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
801
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
802 r.length = slen + 1;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
803 r.length = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
804 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
805 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
806 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
807 if (c <= 0x7F)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
808 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
809 i++;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
810 r ~= cast(wchar)c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
811 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
812 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
813 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
814 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
815 encode(r, c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
816 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
817 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
818 r ~= "\000";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
819 return r.ptr;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
820 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
821
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
822 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
823
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
824 wchar[] toUTF16(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
825 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
826 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
827 validate(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
828 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
829 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
830 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
831 return s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
832 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
833
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
834 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
835
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
836 wchar[] toUTF16(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
837 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
838 wchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
839 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
840
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
841 r.length = slen;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
842 r.length = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
843 for (size_t i = 0; i < slen; i++)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
844 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
845 encode(r, s[i]);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
846 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
847 return r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
848 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
849
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
850 /* =================== Conversion to UTF32 ======================= */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
851
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
852 /*****
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
853 * Encodes string s into UTF-32 and returns the encoded string.
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
854 */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
855
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
856 dchar[] toUTF32(char[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
857 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
858 dchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
859 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
860 size_t j = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
861
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
862 r.length = slen; // r[] will never be longer than s[]
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
863 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
864 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
865 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
866 if (c >= 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
867 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
868 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
869 i++; // c is ascii, no need for decode
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
870 r[j++] = c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
871 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
872 return r[0 .. j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
873 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
874
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
875 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
876
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
877 dchar[] toUTF32(wchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
878 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
879 dchar[] r;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
880 size_t slen = s.length;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
881 size_t j = 0;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
882
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
883 r.length = slen; // r[] will never be longer than s[]
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
884 for (size_t i = 0; i < slen; )
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
885 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
886 dchar c = s[i];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
887 if (c >= 0x80)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
888 c = decode(s, i);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
889 else
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
890 i++; // c is ascii, no need for decode
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
891 r[j++] = c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
892 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
893 return r[0 .. j];
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
894 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
895
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
896 /** ditto */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
897
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
898 dchar[] toUTF32(dchar[] s)
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
899 in
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
900 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
901 validate(s);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
902 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
903 body
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
904 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
905 return s;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
906 }
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
907
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
908 /* ================================ tests ================================== */
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
909
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
910 unittest
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
911 {
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
912 debug(utf) printf("utf.toUTF.unittest\n");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
913
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
914 char[] c;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
915 wchar[] w;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
916 dchar[] d;
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
917
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
918 c = "hello";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
919 w = toUTF16(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
920 assert(w == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
921 d = toUTF32(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
922 assert(d == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
923
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
924 c = toUTF8(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
925 assert(c == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
926 d = toUTF32(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
927 assert(d == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
928
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
929 c = toUTF8(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
930 assert(c == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
931 w = toUTF16(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
932 assert(w == "hello");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
933
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
934
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
935 c = "hel\u1234o";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
936 w = toUTF16(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
937 assert(w == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
938 d = toUTF32(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
939 assert(d == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
940
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
941 c = toUTF8(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
942 assert(c == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
943 d = toUTF32(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
944 assert(d == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
945
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
946 c = toUTF8(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
947 assert(c == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
948 w = toUTF16(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
949 assert(w == "hel\u1234o");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
950
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
951
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
952 c = "he\U0010AAAAllo";
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
953 w = toUTF16(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
954 //foreach (wchar c; w) printf("c = x%x\n", c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
955 //foreach (wchar c; cast(wchar[])"he\U0010AAAAllo") printf("c = x%x\n", c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
956 assert(w == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
957 d = toUTF32(c);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
958 assert(d == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
959
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
960 c = toUTF8(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
961 assert(c == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
962 d = toUTF32(w);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
963 assert(d == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
964
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
965 c = toUTF8(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
966 assert(c == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
967 w = toUTF16(d);
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
968 assert(w == "he\U0010AAAAllo");
fd32135dca3e [svn r90] Major updates to the gen directory. Redesigned the 'elem' struct. Much more... !!!
lindquist
parents:
diff changeset
969 }