Mercurial > projects > dwt2

/*******************************************************************************

        @file UChar.d

        Copyright (c) 2004 Kris Bell

        This software is provided 'as-is', without any express or implied
        warranty. In no event will the authors be held liable for damages
        of any kind arising from the use of this software.

        Permission is hereby granted to anyone to use this software for any
        purpose, including commercial applications, and to alter it and/or
        redistribute it freely, subject to the following restrictions:

        1. The origin of this software must not be misrepresented; you must
           not claim that you wrote the original software. If you use this
           software in a product, an acknowledgment within documentation of
           said product would be appreciated but is not required.

        2. Altered source versions must be plainly marked as such, and must
           not be misrepresented as being the original software.

        3. This notice may not be removed or altered from any distribution
           of the source.

        4. Derivative works are permitted, but they must carry this notice
           in full and credit the original source.


                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


        @version        Initial version, October 2004
        @author         Kris


        Note that this package and documentation is built around the ICU
        project (http://oss.software.ibm.com/icu/). Below is the license
        statement as specified by that software:


                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


        ICU License - ICU 1.8.1 and later

        COPYRIGHT AND PERMISSION NOTICE

        Copyright (c) 1995-2003 International Business Machines Corporation and
        others.

        All rights reserved.

        Permission is hereby granted, free of charge, to any person obtaining a
        copy of this software and associated documentation files (the
        "Software"), to deal in the Software without restriction, including
        without limitation the rights to use, copy, modify, merge, publish,
        distribute, and/or sell copies of the Software, and to permit persons
        to whom the Software is furnished to do so, provided that the above
        copyright notice(s) and this permission notice appear in all copies of
        the Software and that both the above copyright notice(s) and this
        permission notice appear in supporting documentation.

        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
        OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
        MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
        OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
        HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL
        INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING
        FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
        NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
        WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

        Except as contained in this notice, the name of a copyright holder
        shall not be used in advertising or otherwise to promote the sale, use
        or other dealings in this Software without prior written authorization
        of the copyright holder.

        ----------------------------------------------------------------------

        All trademarks and registered trademarks mentioned herein are the
        property of their respective owners.

*******************************************************************************/

module com.ibm.icu.mangoicu.UChar;

private import  com.ibm.icu.mangoicu.ICU;

/*******************************************************************************

        This API provides low-level access to the Unicode Character
        Database. In addition to raw property values, some convenience
        functions calculate derived properties, for example for Java-style
        programming.

        Unicode assigns each code point (not just assigned character)
        values for many properties. Most of them are simple boolean
        flags, or constants from a small enumerated list. For some
        properties, values are strings or other relatively more complex
        types.

        For more information see "About the Unicode Character Database"
        (http://www.unicode.org/ucd/) and the ICU User Guide chapter on
        Properties (http://oss.software.ibm.com/icu/userguide/properties.html).

        Many functions are designed to match java.lang.Character functions.
        See the individual function documentation, and see the JDK 1.4.1
        java.lang.Character documentation at
        http://java.sun.com/j2se/1.4.1/docs/api/java/lang/Character.html

        There are also functions that provide easy migration from C/POSIX
        functions like isblank(). Their use is generally discouraged because
        the C/POSIX standards do not define their semantics beyond the ASCII
        range, which means that different implementations exhibit very different
        behavior. Instead, Unicode properties should be used directly.

        There are also only a few, broad C/POSIX character classes, and they
        tend to be used for conflicting purposes. For example, the "isalpha()"
        class is sometimes used to determine word boundaries, while a more
        sophisticated approach would at least distinguish initial letters from
        continuation characters (the latter including combining marks). (In
        ICU, BreakIterator is the most sophisticated API for word boundaries.)
        Another example: There is no "istitle()" class for titlecase characters.

        A summary of the behavior of some C/POSIX character classification
        implementations for Unicode is available at
        http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/posix_classes.html

        See <A HREF="http://oss.software.ibm.com/icu/apiref/uchar_8h.html">
        this page</A> for full details.

*******************************************************************************/

class UChar : ICU
{
        public enum     Property
                        {
                        Alphabetic = 0,
                        BinaryStart = Alphabetic,
                        AsciiHexDigit,
                        BidiControl,
                        BidiMirrored,
                        Dash,
                        DefaultIgnorableCodePoint,
                        Deprecated,
                        Diacritic,
                        Extender,
                        FullCompositionExclusion,
                        GraphemeBase,
                        GraphemeExtend,
                        GraphemeLink,
                        HexDigit,
                        Hyphen,
                        IdContinue,
                        IdStart,
                        Ideographic,
                        IdsBinaryOperator,
                        IdsTrinaryOperator,
                        JoinControl,
                        LogicalOrderException,
                        Lowercase,
                        Math,
                        NoncharacterCodePoint,
                        QuotationMark,
                        Radical,
                        SoftDotted,
                        TerminalPunctuation,
                        UnifiedIdeograph,
                        Uppercase,
                        WhiteSpace,
                        XidContinue,
                        XidStart,
                        CaseSensitive,
                        STerm,
                        VariationSelector,
                        NfdInert,
                        NfkdInert,
                        NfcInert,
                        NfkcInert,
                        SegmentStarter,
                        BinaryLimit,
                        BidiClass = 0x1000,
                        IntStart = BidiClass,
                        Block, CanonicalCombiningClass,
                        DecompositionType,
                        EastAsianWidth,
                        GeneralCategory,
                        JoiningGroup,
                        JoiningType,
                        LineBreak,
                        NumericType,
                        Script,
                        HangulSyllableType,
                        NfdQuickCheck,
                        NfkdQuickCheck,
                        NfcQuickCheck,
                        NfkcQuickCheck,
                        LeadCanonicalCombiningClass,
                        TrailCanonicalCombiningClass,
                        IntLimit,
                        GeneralCategoryMask = 0x2000,
                        MaskStart = GeneralCategoryMask,
                        MaskLimit,
                        NumericValue = 0x3000,
                        DoubleStart = NumericValue,
                        DoubleLimit,
                        Age = 0x4000,
                        StringStart = Age,
                        BidiMirroringGlyph,
                        CaseFolding,
                        IsoComment,
                        LowercaseMapping,
                        Name,
                        SimpleCaseFolding,
                        SimpleLowercaseMapping,
                        SimpleTitlecaseMapping,
                        SimpleUppercaseMapping,
                        TitlecaseMapping,
                        Unicode1Name,
                        UppercaseMapping,
                        StringLimit,
                        InvalidCode = -1
                        }

        public enum     Category
                        {
                        Unassigned = 0,
                        GeneralOtherTypes = 0,
                        UppercaseLetter = 1,
                        LowercaseLetter = 2,
                        TitlecaseLetter = 3,
                        ModifierLetter = 4,
                        OtherLetter = 5,
                        NonSpacingMark = 6,
                        EnclosingMark = 7,
                        CombiningSpacingMark = 8,
                        DecimalDigitNumber = 9,
                        LetterNumber = 10,
                        OtherNumber = 11,
                        SpaceSeparator = 12,
                        LineSeparator = 13,
                        ParagraphSeparator = 14,
                        ControlChar = 15,
                        FormatChar = 16,
                        PrivateUseChar = 17,
                        Surrogate = 18,
                        DashPunctuation = 19,
                        StartPunctuation = 20,
                        EndPunctuation = 21,
                        ConnectorPunctuation = 22,
                        OtherPunctuation = 23,
                        MathSymbol = 24,
                        CurrencySymbol = 25,
                        ModifierSymbol = 26,
                        OtherSymbol = 27,
                        InitialPunctuation = 28,
                        FinalPunctuation = 29,
                        Count
                        }

        public enum     Direction
                        {
                        LeftToRight = 0,
                        RightToLeft = 1,
                        EuropeanNumber = 2,
                        EuropeanNumberSeparator = 3,
                        EuropeanNumberTerminator = 4,
                        ArabicNumber = 5,
                        CommonNumberSeparator = 6,
                        BlockSeparator = 7,
                        SegmentSeparator = 8,
                        WhiteSpaceNeutral = 9,
                        OtherNeutral = 10,
                        LeftToRightEmbedding = 11,
                        LeftToRightOverride = 12,
                        RightToLeftArabic = 13,
                        RightToLeftEmbedding = 14,
                        RightToLeftOverride = 15,
                        PopDirectionalFormat = 16,
                        DirNonSpacingMark = 17,
                        BoundaryNeutral = 18,
                        Count
                        }

        public enum     BlockCode
                        {
                        NoBlock = 0,
                        BasicLatin = 1,
                        Latin1Supplement = 2,
                        LatinExtendedA = 3,
                        LatinExtendedB = 4,
                        IpaExtensions = 5,
                        SpacingModifierLetters = 6,
                        CombiningDiacriticalMarks = 7,
                        Greek = 8,
                        Cyrillic = 9,
                        Armenian = 10,
                        Hebrew = 11,
                        Arabic = 12,
                        Syriac = 13,
                        Thaana = 14,
                        Devanagari = 15,
                        Bengali = 16,
                        Gurmukhi = 17,
                        Gujarati = 18,
                        Oriya = 19,
                        Tamil = 20,
                        Telugu = 21,
                        Kannada = 22,
                        Malayalam = 23,
                        Sinhala = 24,
                        Thai = 25,
                        Lao = 26,
                        Tibetan = 27,
                        Myanmar = 28,
                        Georgian = 29,
                        HangulJamo = 30,
                        Ethiopic = 31,
                        Cherokee = 32,
                        UnifiedCanadianAboriginalSyllabics = 33,
                        Ogham = 34,
                        Runic = 35,
                        Khmer = 36,
                        Mongolian = 37,
                        LatinExtendedAdditional = 38,
                        GreekExtended = 39,
                        GeneralPunctuation = 40,
                        SuperscriptsAndSubscripts = 41,
                        CurrencySymbols = 42,
                        CombiningMarksForSymbols = 43,
                        LetterlikeSymbols = 44,
                        NumberForms = 45,
                        Arrows = 46,
                        MathematicalOperators = 47,
                        MiscellaneousTechnical = 48,
                        ControlPictures = 49,
                        OpticalCharacterRecognition = 50,
                        EnclosedAlphanumerics = 51,
                        BoxDrawing = 52,
                        BlockElements = 53,
                        GeometricShapes = 54,
                        MiscellaneousSymbols = 55,
                        Dingbats = 56,
                        BraillePatterns = 57,
                        CjkRadicalsSupplement = 58,
                        KangxiRadicals = 59,
                        IdeographicDescriptionCharacters = 60,
                        CjkSymbolsAndPunctuation = 61,
                        Hiragana = 62,
                        Katakana = 63,
                        Bopomofo = 64,
                        HangulCompatibilityJamo = 65,
                        Kanbun = 66,
                        BopomofoExtended = 67,
                        EnclosedCjkLettersAndMonths = 68,
                        CjkCompatibility = 69,
                        CjkUnifiedIdeographsExtensionA = 70,
                        CjkUnifiedIdeographs = 71,
                        YiSyllables = 72,
                        YiRadicals = 73,
                        HangulSyllables = 74,
                        HighSurrogates = 75,
                        HighPrivateUseSurrogates = 76,
                        LowSurrogates = 77,
                        PrivateUse = 78,
                        PrivateUseArea = PrivateUse,
                        CjkCompatibilityIdeographs = 79,
                        AlphabeticPresentationForms = 80,
                        ArabicPresentationFormsA = 81,
                        CombiningHalfMarks = 82,
                        CjkCompatibilityForms = 83,
                        SmallFormVariants = 84,
                        ArabicPresentationFormsB = 85,
                        Specials = 86,
                        HalfwidthAndFullwidthForms = 87,
                        OldItalic = 88,
                        Gothic = 89,
                        Deseret = 90,
                        ByzantineMusicalSymbols = 91,
                        MusicalSymbols = 92,
                        MathematicalAlphanumericSymbols = 93,
                        CjkUnifiedIdeographsExtensionB = 94,
                        CjkCompatibilityIdeographsSupplement = 95,
                        Tags = 96,
                        CyrillicSupplementary = 97,
                        CyrillicSupplement = CyrillicSupplementary,
                        Tagalog = 98,
                        Hanunoo = 99,
                        Buhid = 100,
                        Tagbanwa = 101,
                        MiscellaneousMathematicalSymbolsA = 102,
                        SupplementalArrowsA = 103,
                        SupplementalArrowsB = 104,
                        MiscellaneousMathematicalSymbolsB = 105,
                        SupplementalMathematicalOperators = 106,
                        KatakanaPhoneticExtensions = 107,
                        VariationSelectors = 108,
                        SupplementaryPrivateUseAreaA = 109,
                        SupplementaryPrivateUseAreaB = 110,
                        Limbu = 111,
                        TaiLe = 112,
                        KhmerSymbols = 113,
                        PhoneticExtensions = 114,
                        MiscellaneousSymbolsAndArrows = 115,
                        YijingHexagramSymbols = 116,
                        LinearBSyllabary = 117,
                        LinearBIdeograms = 118,
                        AegeanNumbers = 119,
                        Ugaritic = 120,
                        Shavian = 121,
                        Osmanya = 122,
                        CypriotSyllabary = 123,
                        TaiXuanJingSymbols = 124,
                        VariationSelectorsSupplement = 125,
                        Count,
                        InvalidCode = -1
                        }

        public enum     EastAsianWidth
                        {
                        Neutral,
                        Ambiguous,
                        Halfwidth,
                        Fullwidth,
                        Narrow,
                        Wide,
                        Count
                        }

        public enum     CharNameChoice
                        {
                        Unicode,
                        Unicode10,
                        Extended,
                        Count
                        }

        public enum     NameChoice
                        {
                        Short,
                        Long,
                        Count
                        }

        public enum     DecompositionType
                        {
                        None,
                        Canonical,
                        Compat,
                        Circle,
                        Final,
                        Font,
                        Fraction,
                        Initial,
                        Isolated,
                        Medial,
                        Narrow,
                        Nobreak,
                        Small,
                        Square,
                        Sub,
                        Super,
                        Vertical,
                        Wide,
                        Count
                        }

        public enum     JoiningType
                        {
                        NonJoining,
                        JoinCausing,
                        DualJoining,
                        LeftJoining,
                        RightJoining,
                        Transparent,
                        Count
                        }

        public enum     JoiningGroup
                        {
                        NoJoiningGroup,
                        Ain,
                        Alaph,
                        Alef,
                        Beh,
                        Beth,
                        Dal,
                        DalathRish,
                        E,
                        Feh,
                        FinalSemkath,
                        Gaf,
                        Gamal,
                        Hah,
                        HamzaOnHehGoal,
                        He,
                        Heh,
                        HehGoal,
                        Heth,
                        Kaf,
                        Kaph,
                        KnottedHeh,
                        Lam,
                        Lamadh,
                        Meem,
                        Mim,
                        Noon,
                        Nun,
                        Pe,
                        Qaf,
                        Qaph,
                        Reh,
                        Reversed_Pe,
                        Sad,
                        Sadhe,
                        Seen,
                        Semkath,
                        Shin,
                        Swash_Kaf,
                        Syriac_Waw,
                        Tah,
                        Taw,
                        Teh_Marbuta,
                        Teth,
                        Waw,
                        Yeh,
                        Yeh_Barree,
                        Yeh_With_Tail,
                        Yudh,
                        Yudh_He,
                        Zain,
                        Fe,
                        Khaph,
                        Zhain,
                        Count
                        }

        public enum     LineBreak
                        {
                        Unknown,
                        Ambiguous,
                        Alphabetic,
                        BreakBoth,
                        BreakAfter,
                        BreakBefore,
                        MandatoryBreak,
                        ContingentBreak,
                        ClosePunctuation,
                        CombiningMark,
                        CarriageReturn,
                        Exclamation,
                        Glue,
                        Hyphen,
                        Ideographic,
                        Inseperable,
                        Inseparable = Inseperable,
                        InfixNumeric,
                        LineFeed,
                        Nonstarter,
                        Numeric,
                        OpenPunctuation,
                        PostfixNumeric,
                        PrefixNumeric,
                        Quotation,
                        ComplexContext,
                        Surrogate,
                        Space,
                        BreakSymbols,
                        Zwspace,
                        NextLine,
                        WordJoiner,
                        Count
                        }

        public enum     NumericType
                        {
                        None,
                        Decimal,
                        Digit,
                        Numeric,
                        Count
                        }

        public enum     HangulSyllableType
                        {
                        NotApplicable,
                        LeadingJamo,
                        VowelJamo,
                        TrailingJamo,
                        LvSyllable,
                        LvtSyllable,
                        Count
                        }

        /***********************************************************************

                Get the property value for an enumerated or integer
                Unicode property for a code point. Also returns binary
                and mask property values.

                Unicode, especially in version 3.2, defines many more
                properties than the original set in UnicodeData.txt.

                The properties APIs are intended to reflect Unicode
                properties as defined in the Unicode Character Database
                (UCD) and Unicode Technical Reports (UTR). For details
                about the properties see http://www.unicode.org/ . For
                names of Unicode properties see the file PropertyAliases.txt

        ***********************************************************************/

        uint getProperty (dchar c, Property p)
        {
                return u_getIntPropertyValue (cast(uint) c, cast(uint) p);
        }

        /***********************************************************************

                Get the minimum value for an enumerated/integer/binary
                Unicode property

        ***********************************************************************/

        uint getPropertyMinimum (Property p)
        {
                return u_getIntPropertyMinValue (p);
        }

        /***********************************************************************

                Get the maximum value for an enumerated/integer/binary
                Unicode property

        ***********************************************************************/

        uint getPropertyMaximum (Property p)
        {
                return u_getIntPropertyMaxValue (p);
        }

        /***********************************************************************

                Returns the bidirectional category value for the code
                point, which is used in the Unicode bidirectional algorithm
                (UAX #9 http://www.unicode.org/reports/tr9/).

        ***********************************************************************/

        Direction charDirection (dchar c)
        {
                return cast(Direction) u_charDirection (c);
        }

        /***********************************************************************

                Returns the Unicode allocation block that contains the
                character

        ***********************************************************************/

        BlockCode getBlockCode (dchar c)
        {
                return cast(BlockCode) ublock_getCode (c);
        }

        /***********************************************************************

                Retrieve the name of a Unicode character.

        ***********************************************************************/

        char[] getCharName (dchar c, CharNameChoice choice, ref char[] dst)
        {
                UErrorCode e;

                uint len = u_charName (c, choice, dst.ptr, dst.length, e);
                testError (e, "failed to extract char name (buffer too small?)");
                return dst [0..len];
        }

        /***********************************************************************

                Get the ISO 10646 comment for a character.

        ***********************************************************************/

        char[] getComment (dchar c, ref char[] dst)
        {
                UErrorCode e;

                uint len = u_getISOComment (c, dst.ptr, dst.length, e);
                testError (e, "failed to extract comment (buffer too small?)");
                return dst [0..len];
        }

        /***********************************************************************

                Find a Unicode character by its name and return its code
                point value.

        ***********************************************************************/

        dchar charFromName (CharNameChoice choice, char[] name)
        {
                UErrorCode e;

                dchar c = u_charFromName (choice, toString(name), e);
                testError (e, "failed to locate char name");
                return c;
        }

        /***********************************************************************

                Return the Unicode name for a given property, as given in the
                Unicode database file PropertyAliases.txt

        ***********************************************************************/

        char[] getPropertyName (Property p, NameChoice choice)
        {
                return toArray (u_getPropertyName (p, choice));
        }

        /***********************************************************************

                Return the Unicode name for a given property value, as given
                in the Unicode database file PropertyValueAliases.txt.

        ***********************************************************************/

        char[] getPropertyValueName (Property p, NameChoice choice, uint value)
        {
                return toArray (u_getPropertyValueName (p, value, choice));
        }

        /***********************************************************************

                Gets the Unicode version information

        ***********************************************************************/

        void getUnicodeVersion (ref Version v)
        {
                u_getUnicodeVersion (v);
        }

        /***********************************************************************

                Get the "age" of the code point

        ***********************************************************************/

        void getCharAge (dchar c, ref Version v)
        {
                u_charAge (c, v);
        }


        /***********************************************************************

                These are externalised directly to the client (sans wrapper),
                but this may have to change for linux, depending upon the
                ICU function-naming conventions within the Posix libraries.

        ***********************************************************************/

        static extern (C)
        {
                /***************************************************************

                        Check if a code point has the Alphabetic Unicode
                        property.

                ***************************************************************/

                bool function (dchar c) isUAlphabetic;

                /***************************************************************

                        Check if a code point has the Lowercase Unicode
                        property.

                ***************************************************************/

                bool function (dchar c) isULowercase;

                /***************************************************************

                        Check if a code point has the Uppercase Unicode
                        property.

                ***************************************************************/

                bool function (dchar c) isUUppercase;

                /***************************************************************

                        Check if a code point has the White_Space Unicode
                        property.

                ***************************************************************/

                bool function (dchar c) isUWhiteSpace;

                /***************************************************************

                        Determines whether the specified code point has the
                        general category "Ll" (lowercase letter).

                ***************************************************************/

                bool function (dchar c) isLower;

                /***************************************************************

                        Determines whether the specified code point has the
                        general category "Lu" (uppercase letter).

                ***************************************************************/

                bool function (dchar c) isUpper;

                /***************************************************************

                        Determines whether the specified code point is a
                        titlecase letter.

                ***************************************************************/

                bool function (dchar c) isTitle;

                /***************************************************************

                        Determines whether the specified code point is a
                        digit character according to Java.

                ***************************************************************/

                bool function (dchar c) isDigit;

                /***************************************************************

                        Determines whether the specified code point is a
                        letter character.

                ***************************************************************/

                bool function (dchar c) isAlpha;

                /***************************************************************

                        Determines whether the specified code point is an
                        alphanumeric character (letter or digit) according
                        to Java.

                ***************************************************************/

                bool function (dchar c) isAlphaNumeric;

                /***************************************************************

                        Determines whether the specified code point is a
                        hexadecimal digit.

                ***************************************************************/

                bool function (dchar c) isHexDigit;

                /***************************************************************

                        Determines whether the specified code point is a
                        punctuation character.

                ***************************************************************/

                bool function (dchar c) isPunct;

                /***************************************************************

                        Determines whether the specified code point is a
                        "graphic" character (printable, excluding spaces).

                ***************************************************************/

                bool function (dchar c) isGraph;

                /***************************************************************

                        Determines whether the specified code point is a
                        "blank" or "horizontal space", a character that
                        visibly separates words on a line.

                ***************************************************************/

                bool function (dchar c) isBlank;

                /***************************************************************

                        Determines whether the specified code point is
                        "defined", which usually means that it is assigned
                        a character.

                ***************************************************************/

                bool function (dchar c) isDefined;

                /***************************************************************

                        Determines if the specified character is a space
                        character or not.

                ***************************************************************/

                bool function (dchar c) isSpace;

                /***************************************************************

                        Determine if the specified code point is a space
                        character according to Java.

                ***************************************************************/

                bool function (dchar c) isJavaSpaceChar;

                /***************************************************************

                        Determines if the specified code point is a whitespace
                        character according to Java/ICU.

                ***************************************************************/

                bool function (dchar c) isWhiteSpace;

                /***************************************************************

                        Determines whether the specified code point is a
                        control character (as defined by this function).

                ***************************************************************/

                bool function (dchar c) isCtrl;

                /***************************************************************

                        Determines whether the specified code point is an ISO
                        control code.

                ***************************************************************/

                bool function (dchar c) isISOControl;

                /***************************************************************

                        Determines whether the specified code point is a
                        printable character.

                ***************************************************************/

                bool function (dchar c) isPrint;

                /***************************************************************

                        Determines whether the specified code point is a
                        base character.

                ***************************************************************/

                bool function (dchar c) isBase;

                /***************************************************************

                        Determines if the specified character is permissible
                        as the first character in an identifier according to
                        Unicode (The Unicode Standard, Version 3.0, chapter
                        5.16 Identifiers).

                ***************************************************************/

                bool function (dchar c) isIDStart;

                /***************************************************************

                        Determines if the specified character is permissible
                        in an identifier according to Java.

                ***************************************************************/

                bool function (dchar c) isIDPart;

                /***************************************************************

                        Determines if the specified character should be
                        regarded as an ignorable character in an identifier,
                        according to Java.

                ***************************************************************/

                bool function (dchar c) isIDIgnorable;

                /***************************************************************

                        Determines if the specified character is permissible
                        as the first character in a Java identifier.

                ***************************************************************/

                bool function (dchar c) isJavaIDStart;

                /***************************************************************

                        Determines if the specified character is permissible
                        in a Java identifier.

                ***************************************************************/

                bool function (dchar c) isJavaIDPart;

                /***************************************************************

                        Determines whether the code point has the
                        Bidi_Mirrored property.

                ***************************************************************/

                bool function (dchar c) isMirrored;

                /***************************************************************

                        Returns the decimal digit value of a decimal digit
                        character.

                ***************************************************************/

                ubyte function (dchar c) charDigitValue;

                /***************************************************************

                        Maps the specified character to a "mirror-image"
                        character.

                ***************************************************************/

                dchar function (dchar c) charMirror;

                /***************************************************************

                        Returns the general category value for the code point.

                ***************************************************************/

                ubyte function (dchar c) charType;

                /***************************************************************

                        Returns the combining class of the code point as
                        specified in UnicodeData.txt.

                ***************************************************************/

                ubyte function (dchar c) getCombiningClass;

                /***************************************************************

                        The given character is mapped to its lowercase
                        equivalent according to UnicodeData.txt; if the
                        character has no lowercase equivalent, the
                        character itself is returned.

                ***************************************************************/

                dchar function (dchar c) toLower;

                /***************************************************************

                        The given character is mapped to its uppercase equivalent
                        according to UnicodeData.txt; if the character has no
                        uppercase equivalent, the character itself is returned.

                ***************************************************************/

                dchar function (dchar c) toUpper;

                /***************************************************************

                        The given character is mapped to its titlecase
                        equivalent according to UnicodeData.txt; if none
                        is defined, the character itself is returned.

                ***************************************************************/

                dchar function (dchar c) toTitle;

                /***************************************************************

                        The given character is mapped to its case folding
                        equivalent according to UnicodeData.txt and
                        CaseFolding.txt; if the character has no case folding
                        equivalent, the character itself is returned.

                ***************************************************************/

                dchar function (dchar c, uint options) foldCase;

                /***************************************************************

                        Returns the decimal digit value of the code point in
                        the specified radix.

                ***************************************************************/

                uint function (dchar ch, ubyte radix) digit;

                /***************************************************************

                        Determines the character representation for a specific
                        digit in the specified radix.

                ***************************************************************/

                dchar function (uint digit, ubyte radix) forDigit;

                /***************************************************************

                        Get the numeric value for a Unicode code point as
                        defined in the Unicode Character Database.

                ***************************************************************/

                double function (dchar c) getNumericValue;
        }


        /***********************************************************************

                Bind the ICU functions from a shared library. This is
                complicated by the issues regarding D and DLLs on the
                Windows platform

        ***********************************************************************/

        mixin(genICUNative!("uc"
                ,"uint   function (uint, uint)", "u_getIntPropertyValue"
                ,"uint   function (uint)", "u_getIntPropertyMinValue"
                ,"uint   function (uint)", "u_getIntPropertyMaxValue"
                ,"uint   function (dchar)", "u_charDirection"
                ,"uint   function (dchar)", "ublock_getCode"
                ,"uint   function (dchar, uint, char*, uint, ref UErrorCode)", "u_charName"
                ,"uint   function (dchar, char*, uint, ref UErrorCode)", "u_getISOComment"
                ,"uint   function (uint, char*, ref UErrorCode)", "u_charFromName"
                ,"char*  function (uint, uint)", "u_getPropertyName"
                ,"char*  function (uint, uint, uint)", "u_getPropertyValueName"
                ,"void   function (ref Version)", "u_getUnicodeVersion"
                ,"void   function (dchar, ref Version)", "u_charAge"
        ));
}
author	Jacob Carlborg <doob@me.com>
date	Thu, 11 Aug 2011 19:55:14 +0200
parents	536e43f63c81
children