annotate mde/mergetag/doc/file-format-text.txt @ 8:f63f4f41a2dc
Big changes to init; got some way towards input event support; changed mergetag ID to char[] from uint.
committer: Diggory Hardy <diggory.hardy@gmail.com>
author | Diggory Hardy <diggory.hardy@gmail.com> |
---|---|
date | Fri, 25 Jan 2008 18:17:38 +0000 |
parents | 9a990644948c |
children |
rev | line source |
---|---|
0 | 1 This is the file format for mergetag text files. |
2 Version: 0.1 unfinalised | |
3 | |
4 | |
5 The encoding should be Unicode: UTF-8, UTF-16 or UTF-32. Files in any encoding other than UTF-8 must include a BOM. | |
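As an illustration of the encoding rule above, the following Python sketch picks a decoder from the BOM and falls back to UTF-8 when no BOM is present. The name decode_mergetag_bytes is hypothetical and not part of mde. Accepting an optional UTF-8 BOM is an assumption; the text above only requires a BOM for UTF-16 and UTF-32.

```python
import codecs

# BOMs are checked longest-first because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_mergetag_bytes(raw: bytes) -> str:
    """Decode raw file contents, choosing the encoding from the BOM.

    Hypothetical helper: without a BOM the data is assumed to be UTF-8.
    """
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(name)
    return raw.decode("utf-8")
```

The returned string has the BOM removed, so a header check can look at its first characters directly.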
6 | |
7 | |
8 Hierarchy: | |
9 + Sections (special section: see header) | |
10 ++ Data Tags | |
11 | |
12 | |
13 IDs: | |
14 IDs are used for several purposes; they are UTF-8 strings. They are stored in text files as unquoted strings; escape sequences are not supported and the strings should not contain the following characters, although this is not checked: <|=>{} |
15 All characters between the appropriate markers are consumed into the ID, hence whitespace is meaningful. |
0 | 16 Multiple section or data tags with the same ID are allowed; see the "Merging rules" section. |
17 | |
18 | |
19 At the top level only valid tags and whitespace are allowed; the whitespace is ignored. |
0 | 20 The following tags are valid (see below for details): |
21 tag purpose | |
22 {...} section identifiers | |
23 <...> data items | |
24 !{...} simple comment block | |
25 !<...> comment block parsed the same as <...> | |
26 Within tags, type specifications and data items, whitespace is allowed between symbols. | |
27 | |
28 | |
29 Section identifier tags: | |
30 Format: {ID} |
31 The ID is the section identifier/name. The ID type is DefaultData unless overridden by the code using the reader. |
32 A section identifier marks the beginning of a new section, extending until the next section identifier or the end of the file. |
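A minimal Python sketch of reading one section identifier, assuming the reader is already positioned on the opening brace. The name read_section_id is hypothetical. Every character between the braces is kept, since whitespace in IDs is meaningful.

```python
from typing import Tuple


def read_section_id(text: str, pos: int) -> Tuple[str, int]:
    """Read a {ID} tag starting at text[pos].

    Returns the ID exactly as written (whitespace included) and the
    index of the first character after the closing brace.
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    end = text.find("}", pos + 1)
    if end == -1:
        raise ValueError("unterminated section identifier")
    return text[pos + 1:end], end + 1
```

For example, reading "{ my section }" yields the ID " my section " with both spaces preserved.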
0 | 33 |
34 | |
35 Data item tags: | |
36 Format: <tp|ID=dt> | |
37 A data item with type tp, identifier ID and data dt. If the data does not fit the given type, it is an error and the tag is ignored. Once split into a type string, an ID and a data string, the contents are passed to an addTag() function within the DataSection class, which parses tags of a recognised type and either ignores other tags or prints a warning about them. |
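The split into type string, ID and data string can be sketched in Python as follows. This is only an illustration: split_data_tag is a hypothetical name, and the sketch assumes the caller has already isolated one complete tag, including handling of any > inside a quoted string. Trimming whitespace around the type and the data is also an assumption; the ID is returned untouched.

```python
from typing import Tuple


def split_data_tag(tag: str) -> Tuple[str, str, str]:
    """Split '<tp|ID=dt>' into (type string, ID, data string)."""
    if not (tag.startswith("<") and tag.endswith(">")):
        raise ValueError("not a data item tag")
    body = tag[1:-1]
    # The type is terminated by the first '|'.
    type_str, sep, rest = body.partition("|")
    if not sep:
        raise ValueError("missing '|' after the type")
    # IDs may not contain '=', so the first '=' ends the ID.
    ident, sep, data = rest.partition("=")
    if not sep:
        raise ValueError("missing '=' after the ID")
    return type_str.strip(), ident, data.strip()
```

Because the ID is taken verbatim, "<int|num =5>" and "<int|num=5>" name two different items.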
0 | 38 |
39 | |
40 Data item tags: Type format: | |
41 Note: | |
42 The type is read as a single token terminated by any of these characters: <>|= |
43 There must not be spaces within the type; e.g. "char []" is invalid. |
0 | 44 The token must end at a |; any other terminating character is an error. |
45 Format: | |
46 tp a basic type | |
47 tp[] a dynamic list of sub-type tp | |
48 t1[t2] an associative array with key-type t2 |
0 | 49 Possible future additions: |
50 tp() a dynamic merging list of sub-type tp (only valid as the primary type, i.e. <subtype()|...>, not as a sub-type of a tuple or another dynamic list) |
0 | 51 {t1,t2,...,tn} a tuple with sub-types t1, t2, ..., tn |
52 | |
53 Basic types (only items with a + are currently supported, items with * are in DefaultData): |
54 name |
0 | 55 |
56 void --- less useful type |
57 +* bool --- integer types |
58 +* byte |
59 +* ubyte |
60 +* short |
61 +* ushort |
62 +* int |
63 +* uint |
64 +* long |
65 +* ulong |
66 cent |
67 ucent |
0 | 68 |
69 +* binary --- alias for ubyte[] |
0 | 70 |
71 +* float --- floating point types |
72 +* double |
73 +* real |
74 ifloat |
75 idouble |
76 ireal |
77 cfloat |
78 cdouble |
79 creal |
0 | 80 |
81 +* char --- single character types (actually these CANNOT support UTF8 symbols with length > 1) |
82 wchar |
83 dchar |
84 +* string --- alias for char[] --- (DOES support UTF8) |
85 wstring --- alias for wchar[] |
86 dstring --- alias for dchar[] |
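To make the type format concrete, here is a hedged Python sketch that parses a type string into a nested tuple. The name parse_type and the tuple layout are hypothetical. Only the forms tp, tp[] and t1[t2] are handled, the set of accepted basic types is limited to those marked + above, and the possible future forms tp() and {t1,...,tn} are deliberately rejected.

```python
# Basic types marked '+' (currently supported) in the list above.
BASIC_TYPES = {
    "bool", "byte", "ubyte", "short", "ushort", "int", "uint", "long",
    "ulong", "binary", "float", "double", "real", "char", "string",
}


def parse_type(spec: str):
    """Parse 'tp', 'tp[]' or 't1[t2]' into a nested tuple description."""
    if any(ch.isspace() for ch in spec):
        raise ValueError("spaces are not allowed within a type: %r" % spec)
    if spec.endswith("]"):
        # Walk backwards to find the '[' matching the final ']'.
        depth = 0
        open_idx = -1
        for i in range(len(spec) - 1, -1, -1):
            if spec[i] == "]":
                depth += 1
            elif spec[i] == "[":
                depth -= 1
                if depth == 0:
                    open_idx = i
                    break
        if open_idx <= 0:
            raise ValueError("malformed type: %r" % spec)
        element = parse_type(spec[:open_idx])
        key = spec[open_idx + 1:-1]
        if key == "":
            return ("list", element)
        return ("assoc", parse_type(key), element)
    if spec not in BASIC_TYPES:
        raise ValueError("unsupported basic type: %r" % spec)
    return ("basic", spec)
```

With this layout, "int[string]" becomes ("assoc", ("basic", "string"), ("basic", "int")): key type first, then element type.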
0 | 87 |
88 | |
89 Data item tags: Data format: | |
90 Valid chars: [](){},+-.0-9eEixXa-fA-F '.' ".*" | |
91 Format: | |
92 [d1,d2,...,dn] data all of type t corresponding to t[] | |
93 (d1,d2,...,dn) data all of type t corresponding to t() | |
94 {d1,d2,...,dn} data corresponding to a type declaration of {t1,t2,...,tn} | |
95 d a single data element | |
96 | |
97 Single data elements: | |
98 z an integer number (regexp: [+-]?[0-9]+) | |
99 z a floating point number (rough regexp: [+-]?[0-9]*[.]?[0-9]*(e[+-]?[0-9]+)?) | |
100 zi an imaginary floating point number (z is a floating point number) | |
101 y+zi, y-zi a complex number (4+0i may be written as 4, etc) (y, z are f.p.s) | |
102 0xz, -0xz a hexadecimal integer z (composed of chars 0-9,a-f,A-F) | |
103 'c' a char/wchar/dchar character, depending on the type specified (c may be an escape sequence or any single character except ') |
104 "string" equivalent to ['s','t','r','i','n','g'] --- may contain the following escape sequences as defined in D: \" \' \\ \a \b \f \n \r \t \v |
0 | 105 XX...XX Binary (ubyte[]); each pair of chars is read as a hex ubyte |
106 <void> void "data" has no symbols | |
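A short Python sketch of parsing two of the single data elements above: integers (decimal or hexadecimal) and binary data. The names parse_integer and parse_binary are hypothetical. Accepting an upper-case X in the hex prefix is an assumption based on the list of valid characters; range checks against the declared type are left out.

```python
import re

_DEC = re.compile(r"[+-]?[0-9]+\Z")
_HEX = re.compile(r"(-?)0[xX]([0-9a-fA-F]+)\Z")


def parse_integer(text: str) -> int:
    """Parse a decimal integer ([+-]?[0-9]+) or a hex integer (0xz, -0xz)."""
    text = text.strip()
    match = _HEX.match(text)
    if match:
        value = int(match.group(2), 16)
        return -value if match.group(1) else value
    if _DEC.match(text):
        return int(text)
    raise ValueError("not an integer: %r" % text)


def parse_binary(text: str) -> bytes:
    """Parse XX...XX, where each pair of characters is one hex ubyte."""
    text = text.strip()
    if len(text) % 2 != 0 or not re.fullmatch(r"[0-9a-fA-F]*", text):
        raise ValueError("binary data must be pairs of hex digits: %r" % text)
    return bytes.fromhex(text)
```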
107 | |
108 | |
109 Data format: Escape sequences: | |
110 To be created and written. | |
111 | |
112 | |
113 Comment tags (there are no line comments): | |
114 Simple comment blocks: | |
115 Format: !{...} | |
116 This is a simple comment block, and only curly braces ({,}) are treated specially. A {, whether or not it is preceded by a !, starts an embedded comment block, and a } ends either an embedded block or the actual comment block. Note: beware commenting out anything containing curly braces which aren't in matching pairs. |
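The brace-matching rule for simple comment blocks can be sketched in Python as a depth counter. The name skip_comment_block is hypothetical, and the sketch assumes pos points at the { that follows the leading !.

```python
def skip_comment_block(text: str, pos: int) -> int:
    """Return the index just past the '}' that closes the block at text[pos].

    Every '{' (with or without a preceding '!') opens an embedded block,
    and every '}' closes the innermost open one.
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unterminated comment block")
```

An unmatched brace inside the commented text makes the block end early or never end, which is the hazard the note above warns about.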
0 | 117 Commented data tags: |
118 Format: !<tp|ID=dt> | |
119 Basically a commented out data tag. Conformance to the above spec may not be checked as strictly as normal, but the dt section is checked for strings so that a > within a string won't end the tag. |
0 | 120 |
121 | |
122 Merging rules: | |
123 If, when a data item is read, a data item with the same identifier | |
124 within the same section exists in the DataSet being read into: | |
125 + if the types are identical: | |
126 ++ if the primary type is a tp() mergeable dynamic list: | |
127 +++ the entries from the item being read are concatenated to those in the item | |
128 +++ in the DataSet | |
129 ++ else: | |
130 ++- the item already in the DataSet takes priority and is left untouched | |
131 + else: | |
132 +- a warning is issued, and the data item within the DataSet is left untouched | |
133 This allows merging some config settings in a user config file with the remaining settings in a | |
134 complete system config file and some support for modifications overriding or adding to some data. | |
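The merging rules above can be sketched in Python as follows. This is an illustration only: merge_item is a hypothetical name, a section is modelled as a dict from ID to a (type string, value) pair, and the tp() mergeable list is treated as if it were already supported. The outcome is reported through the return value so that the caller can issue the warning on a type mismatch.

```python
from typing import Any, Dict, Tuple

Section = Dict[str, Tuple[str, Any]]


def merge_item(section: Section, ident: str, type_str: str, value: Any) -> str:
    """Apply the merging rules to one newly read data item.

    Returns "added", "merged", "kept" or "type-mismatch".
    """
    if ident not in section:
        section[ident] = (type_str, value)
        return "added"
    existing_type, existing_value = section[ident]
    if existing_type != type_str:
        # The caller should warn; the stored item is left untouched.
        return "type-mismatch"
    if type_str.endswith("()"):
        # Mergeable dynamic list: append the new entries to the stored ones.
        section[ident] = (existing_type, list(existing_value) + list(value))
        return "merged"
    # Identical non-mergeable type: the item already stored takes priority.
    return "kept"
```

Reading the user config file before the system config file therefore lets user settings win, while mergeable lists accumulate entries from both.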
135 | |
136 | |
137 Header: | |
138 The header is a standard section which is mandatory and must be the first section. Its section identifier must start at the beginning of the file with no whitespace, declared with: |
139 {MTXY} where XY is a two-digit CAPITAL HEX version number representing the mergetag format version, e.g. {MT01}. |
0 | 140 If these are not the first 6 characters of the file the file will not be regarded as valid. |
141 This formatting is very strict to allow reliable low-level parsing. | |
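A minimal Python sketch of the strict header check, assuming the text has already been decoded and any BOM removed. The name read_format_version is hypothetical. The two hex digits are returned as written, because this document does not say how they map to a version number.

```python
import re

# '{MT' + two capital hex digits + '}', which is exactly 6 characters.
_HEADER = re.compile(r"\{MT([0-9A-F]{2})\}")


def read_format_version(text: str) -> str:
    """Return the XY digits of a leading {MTXY} header, or raise ValueError."""
    match = _HEADER.fullmatch(text[:6])
    if match is None:
        raise ValueError("file does not start with a valid {MTXY} header")
    return match.group(1)
```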
142 | |
143 | |
144 The data tags within the header have no special meaning; any may be used such as the following: | |
145 <string|"Author"="..."> | |
146 <string|"Name"="..."> | |
147 <string|"Description"="..."> | |
148 <string|"Program"="..."> (which program created/uses this?) | |
149 <*|"Version"=...> (use any supported type) | |
150 <string|"Date"="YYYYMMDD"> (reverse date format; optionally "YYYYMMDDhhmmss") | |
151 <{u16,u8,u8}|"Date"={YYYY,MM,DD}> (actually this type probably won't be supported by a standard section) |
0 | 152 <string|"Copyright"=...> |
153 | |
154 | |
155 Example: !THIS IS NO LONGER VALID! |
0 | 156 {MT01} |
157 {example section} | |
158 <u32|"num"=5> | |
159 <{u32,UTF8[]}()|"DATA"=( | |
160 {1,['a']}, | |
161 {59,['w','o','r','d']}, | |
162 {2,"strings can be written like this"} )> | |
163 <wchar[]|"name"="This string is stored in UTF16, regardless of the file's encoding."> | |
164 <{u32,UTF8[]}()|"DATA"=( | |
165 {3,"this is appended to the previous 'DATA' item"} )> | |
166 {"section: section identifiers and tuples are not confused since tuples only occur inside <...> items"} | |
167 <void|Empty tag= > | |
168 !{this is a comment {containing a comment}} |