annotate druntime/src/compiler/dmd/util/utf.d @ 759:d3eb054172f9

Added copy of druntime from DMD 2.020 modified for LDC.
author Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
date Tue, 11 Nov 2008 01:52:37 +0100
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
759
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
1 // Written in the D programming language
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
2
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
3 /*
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
4 * Copyright (C) 2003-2004 by Digital Mars, www.digitalmars.com
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
5 * Written by Walter Bright
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
6 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
7 * This software is provided 'as-is', without any express or implied
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
8 * warranty. In no event will the authors be held liable for any damages
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
9 * arising from the use of this software.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
10 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
11 * Permission is granted to anyone to use this software for any purpose,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
12 * including commercial applications, and to alter it and redistribute it
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
13 * freely, subject to the following restrictions:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
14 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
15 * o The origin of this software must not be misrepresented; you must not
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
16 * claim that you wrote the original software. If you use this software
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
17 * in a product, an acknowledgment in the product documentation would be
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
18 * appreciated but is not required.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
19 * o Altered source versions must be plainly marked as such, and must not
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
20 * be misrepresented as being the original software.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
21 * o This notice may not be removed or altered from any source
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
22 * distribution.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
23 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
24
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
25 /********************************************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
26 * Encode and decode UTF-8, UTF-16 and UTF-32 strings.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
27 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
28 * For Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
29 * wchar type.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
30 * For linux systems, the C wchar_t type is UTF-32 and corresponds to
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
31 * the D utf.dchar type.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
32 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
33 * UTF character support is restricted to (\u0000 &lt;= character &lt;= \U0010FFFF).
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
34 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
35 * See_Also:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
36 * $(LINK2 http://en.wikipedia.org/wiki/Unicode, Wikipedia)<br>
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
37 * $(LINK http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8)<br>
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
38 * $(LINK http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n1335)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
39 * Macros:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
40 * WIKI = Phobos/StdUtf
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
41 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
42
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
43 module rt.util.utf;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
44
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
45
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
46 extern (C) void onUnicodeError( string msg, size_t idx );
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
47
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
48 /*******************************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
49 * Test if c is a valid UTF-32 character.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
50 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
51 * \uFFFE and \uFFFF are considered valid by this function,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
52 * as they are permitted for internal use by an application,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
53 * but they are not allowed for interchange by the Unicode standard.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
54 *
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
55 * Returns: true if it is, false if not.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
56 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
57
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
58 bool isValidDchar(dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
59 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
60 /* Note: FFFE and FFFF are specifically permitted by the
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
61 * Unicode standard for application internal use, but are not
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
62 * allowed for interchange.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
63 * (thanks to Arcane Jill)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
64 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
65
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
66 return c < 0xD800 ||
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
67 (c > 0xDFFF && c <= 0x10FFFF /*&& c != 0xFFFE && c != 0xFFFF*/);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
68 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
69
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
70 unittest
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
71 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
72 debug(utf) printf("utf.isValidDchar.unittest\n");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
73 assert(isValidDchar(cast(dchar)'a') == true);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
74 assert(isValidDchar(cast(dchar)0x1FFFFF) == false);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
75 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
76
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
77
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
78
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
79 auto UTF8stride =
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
80 [
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
81 cast(ubyte)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
82 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
83 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
84 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
85 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
86 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
87 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
88 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
89 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
90 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
91 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
92 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
93 0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
94 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
95 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
96 3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
97 4,4,4,4,4,4,4,4,5,5,5,5,6,6,0xFF,0xFF,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
98 ];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
99
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
100 /**
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
101 * stride() returns the length of a UTF-8 sequence starting at index i
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
102 * in string s.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
103 * Returns:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
104 * The number of bytes in the UTF-8 sequence or
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
105 * 0xFF meaning s[i] is not the start of of UTF-8 sequence.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
106 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
107 uint stride(in char[] s, size_t i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
108 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
109 return UTF8stride[s[i]];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
110 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
111
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
112 /**
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
113 * stride() returns the length of a UTF-16 sequence starting at index i
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
114 * in string s.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
115 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
116 uint stride(in wchar[] s, size_t i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
117 { uint u = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
118 return 1 + (u >= 0xD800 && u <= 0xDBFF);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
119 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
120
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
121 /**
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
122 * stride() returns the length of a UTF-32 sequence starting at index i
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
123 * in string s.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
124 * Returns: The return value will always be 1.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
125 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
126 uint stride(in dchar[] s, size_t i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
127 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
128 return 1;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
129 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
130
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
131 /*******************************************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
132 * Given an index i into an array of characters s[],
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
133 * and assuming that index i is at the start of a UTF character,
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
134 * determine the number of UCS characters up to that index i.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
135 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
136
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
137 size_t toUCSindex(in char[] s, size_t i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
138 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
139 size_t n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
140 size_t j;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
141
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
142 for (j = 0; j < i; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
143 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
144 j += stride(s, j);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
145 n++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
146 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
147 if (j > i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
148 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
149 onUnicodeError("invalid UTF-8 sequence", j);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
150 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
151 return n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
152 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
153
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
154 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
155 size_t toUCSindex(in wchar[] s, size_t i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
156 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
157 size_t n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
158 size_t j;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
159
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
160 for (j = 0; j < i; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
161 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
162 j += stride(s, j);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
163 n++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
164 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
165 if (j > i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
166 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
167 onUnicodeError("invalid UTF-16 sequence", j);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
168 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
169 return n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
170 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
171
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
172 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
173 size_t toUCSindex(in dchar[] s, size_t i)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
174 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
175 return i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
176 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
177
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
178 /******************************************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
179 * Given a UCS index n into an array of characters s[], return the UTF index.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
180 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
181
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
182 size_t toUTFindex(in char[] s, size_t n)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
183 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
184 size_t i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
185
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
186 while (n--)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
187 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
188 uint j = UTF8stride[s[i]];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
189 if (j == 0xFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
190 onUnicodeError("invalid UTF-8 sequence", i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
191 i += j;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
192 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
193 return i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
194 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
195
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
196 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
197 size_t toUTFindex(in wchar[] s, size_t n)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
198 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
199 size_t i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
200
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
201 while (n--)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
202 { wchar u = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
203
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
204 i += 1 + (u >= 0xD800 && u <= 0xDBFF);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
205 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
206 return i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
207 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
208
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
209 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
210 size_t toUTFindex(in dchar[] s, size_t n)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
211 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
212 return n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
213 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
214
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
215 /* =================== Decode ======================= */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
216
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
217 /***************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
218 * Decodes and returns character starting at s[idx]. idx is advanced past the
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
219 * decoded character. If the character is not well formed, a UtfException is
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
220 * thrown and idx remains unchanged.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
221 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
222 dchar decode(in char[] s, inout size_t idx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
223 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
224 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
225 assert(idx >= 0 && idx < s.length);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
226 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
227 out (result)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
228 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
229 assert(isValidDchar(result));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
230 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
231 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
232 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
233 size_t len = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
234 dchar V;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
235 size_t i = idx;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
236 char u = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
237
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
238 if (u & 0x80)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
239 { uint n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
240 char u2;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
241
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
242 /* The following encodings are valid, except for the 5 and 6 byte
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
243 * combinations:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
244 * 0xxxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
245 * 110xxxxx 10xxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
246 * 1110xxxx 10xxxxxx 10xxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
247 * 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
248 * 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
249 * 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
250 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
251 for (n = 1; ; n++)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
252 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
253 if (n > 4)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
254 goto Lerr; // only do the first 4 of 6 encodings
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
255 if (((u << n) & 0x80) == 0)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
256 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
257 if (n == 1)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
258 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
259 break;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
260 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
261 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
262
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
263 // Pick off (7 - n) significant bits of B from first byte of octet
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
264 V = cast(dchar)(u & ((1 << (7 - n)) - 1));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
265
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
266 if (i + (n - 1) >= len)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
267 goto Lerr; // off end of string
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
268
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
269 /* The following combinations are overlong, and illegal:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
270 * 1100000x (10xxxxxx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
271 * 11100000 100xxxxx (10xxxxxx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
272 * 11110000 1000xxxx (10xxxxxx 10xxxxxx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
273 * 11111000 10000xxx (10xxxxxx 10xxxxxx 10xxxxxx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
274 * 11111100 100000xx (10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
275 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
276 u2 = s[i + 1];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
277 if ((u & 0xFE) == 0xC0 ||
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
278 (u == 0xE0 && (u2 & 0xE0) == 0x80) ||
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
279 (u == 0xF0 && (u2 & 0xF0) == 0x80) ||
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
280 (u == 0xF8 && (u2 & 0xF8) == 0x80) ||
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
281 (u == 0xFC && (u2 & 0xFC) == 0x80))
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
282 goto Lerr; // overlong combination
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
283
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
284 for (uint j = 1; j != n; j++)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
285 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
286 u = s[i + j];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
287 if ((u & 0xC0) != 0x80)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
288 goto Lerr; // trailing bytes are 10xxxxxx
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
289 V = (V << 6) | (u & 0x3F);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
290 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
291 if (!isValidDchar(V))
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
292 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
293 i += n;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
294 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
295 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
296 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
297 V = cast(dchar) u;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
298 i++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
299 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
300
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
301 idx = i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
302 return V;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
303
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
304 Lerr:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
305 onUnicodeError("invalid UTF-8 sequence", i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
306 return V; // dummy return
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
307 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
308
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
309 unittest
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
310 { size_t i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
311 dchar c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
312
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
313 debug(utf) printf("utf.decode.unittest\n");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
314
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
315 static s1 = "abcd"c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
316 i = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
317 c = decode(s1, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
318 assert(c == cast(dchar)'a');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
319 assert(i == 1);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
320 c = decode(s1, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
321 assert(c == cast(dchar)'b');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
322 assert(i == 2);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
323
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
324 static s2 = "\xC2\xA9"c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
325 i = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
326 c = decode(s2, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
327 assert(c == cast(dchar)'\u00A9');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
328 assert(i == 2);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
329
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
330 static s3 = "\xE2\x89\xA0"c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
331 i = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
332 c = decode(s3, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
333 assert(c == cast(dchar)'\u2260');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
334 assert(i == 3);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
335
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
336 static s4 =
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
337 [ "\xE2\x89"c, // too short
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
338 "\xC0\x8A",
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
339 "\xE0\x80\x8A",
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
340 "\xF0\x80\x80\x8A",
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
341 "\xF8\x80\x80\x80\x8A",
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
342 "\xFC\x80\x80\x80\x80\x8A",
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
343 ];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
344
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
345 for (int j = 0; j < s4.length; j++)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
346 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
347 try
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
348 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
349 i = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
350 c = decode(s4[j], i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
351 assert(0);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
352 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
353 catch (Object o)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
354 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
355 i = 23;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
356 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
357 assert(i == 23);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
358 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
359 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
360
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
361 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
362
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
363 dchar decode(in wchar[] s, inout size_t idx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
364 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
365 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
366 assert(idx >= 0 && idx < s.length);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
367 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
368 out (result)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
369 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
370 assert(isValidDchar(result));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
371 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
372 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
373 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
374 string msg;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
375 dchar V;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
376 size_t i = idx;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
377 uint u = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
378
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
379 if (u & ~0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
380 { if (u >= 0xD800 && u <= 0xDBFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
381 { uint u2;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
382
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
383 if (i + 1 == s.length)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
384 { msg = "surrogate UTF-16 high value past end of string";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
385 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
386 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
387 u2 = s[i + 1];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
388 if (u2 < 0xDC00 || u2 > 0xDFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
389 { msg = "surrogate UTF-16 low value out of range";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
390 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
391 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
392 u = ((u - 0xD7C0) << 10) + (u2 - 0xDC00);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
393 i += 2;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
394 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
395 else if (u >= 0xDC00 && u <= 0xDFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
396 { msg = "unpaired surrogate UTF-16 value";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
397 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
398 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
399 else if (u == 0xFFFE || u == 0xFFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
400 { msg = "illegal UTF-16 value";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
401 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
402 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
403 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
404 i++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
405 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
406 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
407 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
408 i++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
409 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
410
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
411 idx = i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
412 return cast(dchar)u;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
413
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
414 Lerr:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
415 onUnicodeError(msg, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
416 return cast(dchar)u; // dummy return
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
417 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
418
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
419 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
420
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
421 dchar decode(in dchar[] s, inout size_t idx)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
422 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
423 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
424 assert(idx >= 0 && idx < s.length);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
425 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
426 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
427 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
428 size_t i = idx;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
429 dchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
430
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
431 if (!isValidDchar(c))
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
432 goto Lerr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
433 idx = i + 1;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
434 return c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
435
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
436 Lerr:
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
437 onUnicodeError("invalid UTF-32 value", i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
438 return c; // dummy return
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
439 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
440
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
441
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
442 /* =================== Encode ======================= */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
443
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
444 /*******************************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
445 * Encodes character c and appends it to array s[].
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
446 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
447 void encode(inout char[] s, dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
448 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
449 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
450 assert(isValidDchar(c));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
451 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
452 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
453 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
454 char[] r = s;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
455
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
456 if (c <= 0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
457 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
458 r ~= cast(char) c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
459 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
460 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
461 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
462 char[4] buf;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
463 uint L;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
464
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
465 if (c <= 0x7FF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
466 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
467 buf[0] = cast(char)(0xC0 | (c >> 6));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
468 buf[1] = cast(char)(0x80 | (c & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
469 L = 2;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
470 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
471 else if (c <= 0xFFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
472 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
473 buf[0] = cast(char)(0xE0 | (c >> 12));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
474 buf[1] = cast(char)(0x80 | ((c >> 6) & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
475 buf[2] = cast(char)(0x80 | (c & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
476 L = 3;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
477 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
478 else if (c <= 0x10FFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
479 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
480 buf[0] = cast(char)(0xF0 | (c >> 18));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
481 buf[1] = cast(char)(0x80 | ((c >> 12) & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
482 buf[2] = cast(char)(0x80 | ((c >> 6) & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
483 buf[3] = cast(char)(0x80 | (c & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
484 L = 4;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
485 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
486 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
487 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
488 assert(0);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
489 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
490 r ~= buf[0 .. L];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
491 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
492 s = r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
493 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
494
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
495 unittest
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
496 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
497 debug(utf) printf("utf.encode.unittest\n");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
498
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
499 char[] s = "abcd".dup;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
500 encode(s, cast(dchar)'a');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
501 assert(s.length == 5);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
502 assert(s == "abcda");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
503
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
504 encode(s, cast(dchar)'\u00A9');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
505 assert(s.length == 7);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
506 assert(s == "abcda\xC2\xA9");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
507 //assert(s == "abcda\u00A9"); // BUG: fix compiler
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
508
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
509 encode(s, cast(dchar)'\u2260');
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
510 assert(s.length == 10);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
511 assert(s == "abcda\xC2\xA9\xE2\x89\xA0");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
512 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
513
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
514 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
515
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
516 void encode(inout wchar[] s, dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
517 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
518 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
519 assert(isValidDchar(c));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
520 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
521 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
522 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
523 wchar[] r = s;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
524
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
525 if (c <= 0xFFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
526 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
527 r ~= cast(wchar) c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
528 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
529 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
530 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
531 wchar[2] buf;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
532
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
533 buf[0] = cast(wchar) ((((c - 0x10000) >> 10) & 0x3FF) + 0xD800);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
534 buf[1] = cast(wchar) (((c - 0x10000) & 0x3FF) + 0xDC00);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
535 r ~= buf;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
536 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
537 s = r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
538 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
539
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
540 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
541 void encode(inout dchar[] s, dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
542 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
543 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
544 assert(isValidDchar(c));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
545 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
546 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
547 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
548 s ~= c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
549 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
550
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
551 /**
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
552 Returns the code length of $(D c) in the encoding using $(D C) as a
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
553 code point. The code is returned in character count, not in bytes.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
554 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
555
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
556 ubyte codeLength(C)(dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
557 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
558
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
559 static if (C.sizeof == 1)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
560 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
561 return
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
562 c <= 0x7F ? 1
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
563 : c <= 0x7FF ? 2
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
564 : c <= 0xFFFF ? 3
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
565 : c <= 0x10FFFF ? 4
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
566 : (assert(false), 6);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
567 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
568
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
569 else static if (C.sizeof == 2)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
570 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
571 return c <= 0xFFFF ? 1 : 2;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
572 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
573 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
574 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
575 static assert(C.sizeof == 4);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
576 return 1;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
577 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
578 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
579
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
580 /* =================== Validation ======================= */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
581
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
582 /***********************************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
583 Checks to see if string is well formed or not. $(D S) can be an array
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
584 of $(D char), $(D wchar), or $(D dchar). Throws a $(D UtfException)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
585 if it is not. Use to check all untrusted input for correctness.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
586 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
587 void validate(S)(in S s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
588 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
589 auto len = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
590 for (size_t i = 0; i < len; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
591 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
592 decode(s, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
593 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
594 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
595
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
596 /* =================== Conversion to UTF8 ======================= */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
597
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
598 char[] toUTF8(char[4] buf, dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
599 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
600 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
601 assert(isValidDchar(c));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
602 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
603 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
604 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
605 if (c <= 0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
606 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
607 buf[0] = cast(char) c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
608 return buf[0 .. 1];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
609 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
610 else if (c <= 0x7FF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
611 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
612 buf[0] = cast(char)(0xC0 | (c >> 6));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
613 buf[1] = cast(char)(0x80 | (c & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
614 return buf[0 .. 2];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
615 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
616 else if (c <= 0xFFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
617 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
618 buf[0] = cast(char)(0xE0 | (c >> 12));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
619 buf[1] = cast(char)(0x80 | ((c >> 6) & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
620 buf[2] = cast(char)(0x80 | (c & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
621 return buf[0 .. 3];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
622 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
623 else if (c <= 0x10FFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
624 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
625 buf[0] = cast(char)(0xF0 | (c >> 18));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
626 buf[1] = cast(char)(0x80 | ((c >> 12) & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
627 buf[2] = cast(char)(0x80 | ((c >> 6) & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
628 buf[3] = cast(char)(0x80 | (c & 0x3F));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
629 return buf[0 .. 4];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
630 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
631 assert(0);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
632 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
633
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
634 /*******************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
635 * Encodes string s into UTF-8 and returns the encoded string.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
636 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
637 string toUTF8(string s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
638 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
639 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
640 validate(s);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
641 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
642 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
643 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
644 return s;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
645 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
646
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
647 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
648 string toUTF8(in wchar[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
649 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
650 char[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
651 size_t i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
652 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
653
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
654 r.length = slen;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
655
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
656 for (i = 0; i < slen; i++)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
657 { wchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
658
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
659 if (c <= 0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
660 r[i] = cast(char)c; // fast path for ascii
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
661 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
662 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
663 r.length = i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
664 foreach (dchar c; s[i .. slen])
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
665 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
666 encode(r, c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
667 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
668 break;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
669 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
670 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
671 return cast(string)r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
672 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
673
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
674 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
675 string toUTF8(in dchar[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
676 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
677 char[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
678 size_t i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
679 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
680
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
681 r.length = slen;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
682
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
683 for (i = 0; i < slen; i++)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
684 { dchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
685
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
686 if (c <= 0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
687 r[i] = cast(char)c; // fast path for ascii
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
688 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
689 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
690 r.length = i;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
691 foreach (dchar d; s[i .. slen])
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
692 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
693 encode(r, d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
694 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
695 break;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
696 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
697 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
698 return cast(string)r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
699 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
700
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
701 /* =================== Conversion to UTF16 ======================= */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
702
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
703 wchar[] toUTF16(wchar[2] buf, dchar c)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
704 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
705 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
706 assert(isValidDchar(c));
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
707 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
708 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
709 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
710 if (c <= 0xFFFF)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
711 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
712 buf[0] = cast(wchar) c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
713 return buf[0 .. 1];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
714 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
715 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
716 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
717 buf[0] = cast(wchar) ((((c - 0x10000) >> 10) & 0x3FF) + 0xD800);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
718 buf[1] = cast(wchar) (((c - 0x10000) & 0x3FF) + 0xDC00);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
719 return buf[0 .. 2];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
720 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
721 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
722
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
723 /****************
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
724 * Encodes string s into UTF-16 and returns the encoded string.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
725 * toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
726 * an LPWSTR or LPCWSTR argument.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
727 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
728 wstring toUTF16(in char[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
729 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
730 wchar[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
731 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
732
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
733 r.length = slen;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
734 r.length = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
735 for (size_t i = 0; i < slen; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
736 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
737 dchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
738 if (c <= 0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
739 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
740 i++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
741 r ~= cast(wchar)c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
742 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
743 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
744 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
745 c = decode(s, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
746 encode(r, c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
747 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
748 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
749 return cast(wstring)r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
750 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
751
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
752 alias const(wchar)* wptr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
753 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
754 wptr toUTF16z(in char[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
755 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
756 wchar[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
757 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
758
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
759 r.length = slen + 1;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
760 r.length = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
761 for (size_t i = 0; i < slen; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
762 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
763 dchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
764 if (c <= 0x7F)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
765 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
766 i++;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
767 r ~= cast(wchar)c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
768 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
769 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
770 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
771 c = decode(s, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
772 encode(r, c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
773 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
774 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
775 r ~= "\000";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
776 return r.ptr;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
777 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
778
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
779 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
780 wstring toUTF16(wstring s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
781 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
782 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
783 validate(s);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
784 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
785 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
786 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
787 return s;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
788 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
789
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
790 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
791 wstring toUTF16(in dchar[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
792 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
793 wchar[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
794 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
795
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
796 r.length = slen;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
797 r.length = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
798 for (size_t i = 0; i < slen; i++)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
799 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
800 encode(r, s[i]);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
801 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
802 return cast(wstring)r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
803 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
804
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
805 /* =================== Conversion to UTF32 ======================= */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
806
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
807 /*****
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
808 * Encodes string s into UTF-32 and returns the encoded string.
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
809 */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
810 dstring toUTF32(in char[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
811 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
812 dchar[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
813 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
814 size_t j = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
815
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
816 r.length = slen; // r[] will never be longer than s[]
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
817 for (size_t i = 0; i < slen; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
818 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
819 dchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
820 if (c >= 0x80)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
821 c = decode(s, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
822 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
823 i++; // c is ascii, no need for decode
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
824 r[j++] = c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
825 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
826 return cast(dstring)r[0 .. j];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
827 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
828
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
829 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
830 dstring toUTF32(in wchar[] s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
831 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
832 dchar[] r;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
833 size_t slen = s.length;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
834 size_t j = 0;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
835
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
836 r.length = slen; // r[] will never be longer than s[]
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
837 for (size_t i = 0; i < slen; )
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
838 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
839 dchar c = s[i];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
840 if (c >= 0x80)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
841 c = decode(s, i);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
842 else
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
843 i++; // c is ascii, no need for decode
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
844 r[j++] = c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
845 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
846 return cast(dstring)r[0 .. j];
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
847 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
848
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
849 /** ditto */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
850 dstring toUTF32(dstring s)
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
851 in
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
852 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
853 validate(s);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
854 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
855 body
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
856 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
857 return s;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
858 }
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
859
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
860 /* ================================ tests ================================== */
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
861
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
862 unittest
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
863 {
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
864 debug(utf) printf("utf.toUTF.unittest\n");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
865
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
866 auto c = "hello"c;
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
867 auto w = toUTF16(c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
868 assert(w == "hello");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
869 auto d = toUTF32(c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
870 assert(d == "hello");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
871
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
872 c = toUTF8(w);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
873 assert(c == "hello");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
874 d = toUTF32(w);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
875 assert(d == "hello");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
876
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
877 c = toUTF8(d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
878 assert(c == "hello");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
879 w = toUTF16(d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
880 assert(w == "hello");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
881
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
882
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
883 c = "hel\u1234o";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
884 w = toUTF16(c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
885 assert(w == "hel\u1234o");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
886 d = toUTF32(c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
887 assert(d == "hel\u1234o");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
888
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
889 c = toUTF8(w);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
890 assert(c == "hel\u1234o");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
891 d = toUTF32(w);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
892 assert(d == "hel\u1234o");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
893
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
894 c = toUTF8(d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
895 assert(c == "hel\u1234o");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
896 w = toUTF16(d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
897 assert(w == "hel\u1234o");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
898
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
899
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
900 c = "he\U0010AAAAllo";
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
901 w = toUTF16(c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
902 //foreach (wchar c; w) printf("c = x%x\n", c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
903 //foreach (wchar c; cast(wstring)"he\U0010AAAAllo") printf("c = x%x\n", c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
904 assert(w == "he\U0010AAAAllo");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
905 d = toUTF32(c);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
906 assert(d == "he\U0010AAAAllo");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
907
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
908 c = toUTF8(w);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
909 assert(c == "he\U0010AAAAllo");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
910 d = toUTF32(w);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
911 assert(d == "he\U0010AAAAllo");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
912
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
913 c = toUTF8(d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
914 assert(c == "he\U0010AAAAllo");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
915 w = toUTF16(d);
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
916 assert(w == "he\U0010AAAAllo");
d3eb054172f9 Added copy of druntime from DMD 2.020 modified for LDC.
Tomas Lindquist Olsen <tomas.l.olsen@gmail.com>
parents:
diff changeset
917 }