# HG changeset patch
# User Aziz Köksal
# Date 1197225452 -3600
# Node ID 49c201b5c465d8428d10ba7b8657f2376d09d182
# Parent  0ffcc4ff82f3029e458b2a1ba28f3cb3d07369c1
Refactored scanners for block and nested comments.

diff -r 0ffcc4ff82f3 -r 49c201b5c465 trunk/src/dil/Lexer.d
--- a/trunk/src/dil/Lexer.d	Sun Dec 09 14:58:38 2007 +0100
+++ b/trunk/src/dil/Lexer.d	Sun Dec 09 19:37:32 2007 +0100
@@ -216,7 +216,7 @@
     This is the old scan method.
     TODO: profile old and new to see which one is faster.
   +/
-  public void scan(out Token t)
+  public void scan(ref Token t)
   in
   {
     assert(text.ptr <= p && p < end);
@@ -676,7 +676,7 @@
     const char[] case_L3 = case_!(str, tok, "Lcommon");
   }
 
-  public void scan_(out Token t)
+  public void scan_(ref Token t)
   in
   {
     assert(text.ptr <= p && p < end);
@@ -1098,13 +1098,16 @@
     assert(p[-1] == '/' && *p == '*');
     auto tokenLineNum = lineNum;
     auto tokenLineBegin = lineBegin;
-    uint c;
+  Loop:
     while (1)
     {
-      c = *++p;
-    LswitchBC: // only jumped to from default case of next switch(c)
-      switch (c)
+      switch (*++p)
       {
+      case '*':
+        if (p[1] != '/')
+          continue;
+        p += 2;
+        break Loop;
       case '\r':
         if (p[1] == '\n')
           ++p;
@@ -1112,36 +1115,23 @@
         assert(isNewlineEnd(p));
         ++lineNum;
         setLineBegin(p+1);
-        continue;
-      case 0, _Z_:
-        error(tokenLineNum, tokenLineBegin, t.start, MID.UnterminatedBlockComment);
-        goto LreturnBC;
+        break;
       default:
-        if (!isascii(c))
+        if (!isascii(*p))
         {
-          c = decodeUTF8();
-          if (isUnicodeNewlineChar(c))
+          if (isUnicodeNewlineChar(decodeUTF8()))
             goto case '\n';
-          continue;
+        }
+        else if (isEOF(*p))
+        {
+          error(tokenLineNum, tokenLineBegin, t.start, MID.UnterminatedBlockComment);
+          break Loop;
         }
       }
-
-      c <<= 8;
-      c |= *++p;
-      switch (c)
-      {
-      case toUint!("*/"):
-        ++p;
-      LreturnBC:
-        t.type = TOK.Comment;
-        t.end = p;
-        return;
-      default:
-        c &= char.max;
-        goto LswitchBC;
-      }
     }
-    assert(0);
+    t.type = TOK.Comment;
+    t.end = p;
+    return;
   }
 
   void scanNestedComment(ref Token t)
@@ -1150,13 +1140,23 @@
     auto tokenLineNum = lineNum;
     auto tokenLineBegin = lineBegin;
     uint level = 1;
-    uint c;
+  Loop:
     while (1)
     {
-      c = *++p;
-    LswitchNC: // only jumped to from default case of next switch(c)
-      switch (c)
+      switch (*++p)
       {
+      case '/':
+        if (p[1] == '+')
+          ++p, ++level;
+        continue;
+      case '+':
+        if (p[1] != '/')
+          continue;
+        ++p;
+        if (--level != 0)
+          continue;
+        ++p;
+        break Loop;
       case '\r':
         if (p[1] == '\n')
           ++p;
@@ -1165,42 +1165,22 @@
         ++lineNum;
         setLineBegin(p+1);
         continue;
-      case 0, _Z_:
-        error(tokenLineNum, tokenLineBegin, t.start, MID.UnterminatedNestedComment);
-        goto LreturnNC;
       default:
-        if (!isascii(c))
+        if (!isascii(*p))
         {
-          c = decodeUTF8();
-          if (isUnicodeNewlineChar(c))
+          if (isUnicodeNewlineChar(decodeUTF8()))
             goto case '\n';
-          continue;
+        }
+        else if (isEOF(*p))
+        {
+          error(tokenLineNum, tokenLineBegin, t.start, MID.UnterminatedNestedComment);
+          break Loop;
         }
       }
-
-      c <<= 8;
-      c |= *++p;
-      switch (c)
-      {
-      case toUint!("/+"):
-        ++level;
-        continue;
-      case toUint!("+/"):
-        if (--level == 0)
-        {
-          ++p;
-        LreturnNC:
-          t.type = TOK.Comment;
-          t.end = p;
-          return;
-        }
-        continue;
-      default:
-        c &= char.max;
-        goto LswitchNC;
-      }
     }
-    assert(0);
+    t.type = TOK.Comment;
+    t.end = p;
+    return;
   }
 
   char scanPostfix()