
Bug #487 Lexer can't handle functions inside getParams()
Submitted: 2003-12-23 20:50 UTC
From: jlabonsk at eportation dot com
Assigned: cybot
Status: Closed
Package: SQL_Parser
PHP Version: Irrelevant
OS: Irrelevant
Roadmaps: 0.5.1
 [2003-12-23 20:50 UTC] jlabonsk at eportation dot com
Description:
------------
Valid SQL:

    INSERT INTO mytable (foo, bar, baz) VALUES (NOW(), 1, 'text')

Small dump from pear::error object:

    [error_message_prefix] =>
    [mode] => 1
    [level] => 1024
    [code] =>
    [message] => Parse error: Expected , or ) on line 1
    INSERT INTO mytable (foo, bar, baz) VALUES (NOW(), 1, 'text')
    ^ found: (
    [userinfo] =>

It seems that Lexer::lex() can't handle tokens that are functions.

Reproduce code:
---------------
$sql = "INSERT INTO mytable (foo, bar, baz) VALUES (NOW(), 1, 'text')";
$parser = new SQL_Parser($sql);
$tree = $parser->parse();

Expected result:
----------------
A value set of 'NOW()' with a type of 'function'

Actual result:
--------------
[backtrace] => Array
(
    [0] => Array
    (
        [file] => /usr/local/lib/php/PEAR.php
        [line] => 527
        [function] => pear_error
        [class] => pear_error
        [type] => ->
        [args] => Array
        (
            [0] => Parse error: Expected , or ) on line 1
            INSERT INTO mytable (foo, bar, baz) VALUES (NOW(), 1, 'text')
            ^ found: (
            [1] =>
            [2] => 1
            [3] => 1024
            [4] =>
        )
    )
    [1] => Array
    (
        [file] => /usr/local/lib/php/SQL/Parser.php
        [line] => 108
        [function] => raiseerror
        [class] => pear
        [type] => ::
        [args] => Array
        (
            [0] => Parse error: Expected , or ) on line 1
            INSERT INTO mytable (foo, bar, baz) VALUES (NOW(), 1, 'text')
            ^ found: (
        )
    )
    [2] => Array
    (
        [file] => /usr/local/lib/php/SQL/Parser.php
        [line] => 85
        [function] => raiseerror
        [class] => sql_parser
        [type] => ->
        [args] => Array
        (
            [0] => Expected , or )
        )
    )
    [3] => Array
    (
        [file] => /usr/local/lib/php/SQL/Parser.php
        [line] => 565
        [function] => getparams
        [class] => sql_parser
        [type] => ->
        [args] => Array
        (
            [0] => Array
            (
                [0] => NOW
            )
            [1] => Array
            (
                [0] => ident
            )
        )
    )
    [4] => Array
    (
        [file] => /usr/local/lib/php/SQL/Parser.php
        [line] => 878
        [function] => parseinsert
        [class] => sql_parser
        [type] => ->
        [args] => Array
        (
        )
    )
    [5] => Array
    (
        [file] => /home/jlabonsk/public_html/pns-head/htdocs/terminal/test.php
        [line] => 52
        [function] => parse
        [class] => sql_parser
        [type] => ->
        [args] => Array
        (
        )
    )
)
[callback] =>
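The failure mode in the report can be illustrated without the package itself. What follows is only a sketch under stated assumptions, written in current PHP (7.1 or later) rather than the PHP 4 style of SQL_Parser 0.4: `toyGetParams()` and its `[type, text]` token format are hypothetical and are not the real `SQL_Parser::getParams()`. Like the `getparams` frame in the backtrace, the sketch reads one value and then accepts only `,` or `)`, so the `(` that follows the ident `NOW` is rejected.

```php
<?php
// Hypothetical stand-in for a value-list loop; NOT the real SQL_Parser code.
// $tokens is a pre-split token list, each entry being [type, text].
function toyGetParams(array $tokens): array
{
    $values = [];
    $i = 0;
    $n = count($tokens);
    while ($i < $n) {
        [$type, $text] = $tokens[$i];
        $values[] = ['type' => $type, 'value' => $text];
        $i++;
        // After each value only a separator or the closing paren is accepted.
        $next = $tokens[$i][1] ?? null;
        if ($next === ')') {
            return $values;
        }
        if ($next !== ',') {
            throw new RuntimeException('Expected , or ) found: ' . var_export($next, true));
        }
        $i++;
    }
    throw new RuntimeException('Expected , or ) found: end of input');
}

// Tokens for the tail of the reported statement: NOW(), 1, 'text')
$tokens = [
    ['ident', 'NOW'], ['(', '('], [')', ')'], [',', ','],
    ['int_val', '1'], [',', ','], ['text_val', 'text'], [')', ')'],
];

$message = null;
try {
    toyGetParams($tokens);
} catch (RuntimeException $e) {
    $message = $e->getMessage(); // "Expected , or ) found: '('"
}
```

A value list without a function call, such as `1, 'text')`, passes through the same loop without error, which matches the report's observation that only function calls trip the parser.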

Comments

 [2003-12-23 20:51 UTC] jlabonsk at eportation dot com
All tested with SQL_Parser 0.4
 [2005-04-02 02:48 UTC] epte at ruffdogs dot com
Confirmed. Wow. This bug has been in the queue for a while. I don't have cvs access, but I am a professional programmer (who now has a professional interest in this project). So, I'll have a shot at it... *crack knuckles* More to come :)
 [2005-04-02 23:58 UTC] epte at ruffdogs dot com
Alright. I think I fixed it. The lexer now handles functions. It will return a type 'function' for them. IMHO, it handles nested functions fairly well. Namely, it would assign FUNC(NOW(BLAH)) all to one token with type 'function'.

The patch follows (diff from CVS).

? diff
Index: Dialect_MySQL.php
===================================================================
RCS file: /repository/pear/SQL_Parser/Dialect_MySQL.php,v
retrieving revision 1.4
diff -u -r1.4 Dialect_MySQL.php
--- Dialect_MySQL.php 3 Jul 2004 04:49:44 -0000 1.4
+++ Dialect_MySQL.php 2 Apr 2005 23:55:55 -0000
@@ -36,7 +36,7 @@
 'functions'=>array('avg','count','max','min','sum','nextval','currval','concat','date_format'),
-'reserved'=>array('absolute','action','add','all','allocate','and','any','are','asc','ascending','assertion','at','authorization','begin','bit_length','both','cascade','cascaded','case','cast','catalog','char_length','character_length','check','close','coalesce','collate','collation','column','commit','connect','connection','constraint','constraints','continue','convert','corresponding','cross','current','current_date','current_time','current_timestamp','current_user','cursor','day','deallocate','declare','default','deferrable','deferred','desc','descending','describe','descriptor','diagnostics','disconnect','distinct','domain','else','end','end-exec','escape','except','exception','exec','execute','exists','external','extract','false','fetch','first','for','foreign','found','full','get','global','go','goto','grant','group','having','hour','identity','immediate','indicator','initially','inner','input','insensitive','intersect','isolation','join','key','language','last','leading','left','level','limit','local','lower','match','minute','module','month','names','national','natural','next','no','null','nullif','octet_length','of','only','open','option','or','order','outer','output','overlaps','pad','partial','position','precision','prepare','preserve','primary','prior','privileges','procedure','public','read','references','relative','restrict','revoke','right','rollback','rows','schema','scroll','second','section','session','session_user','size','some','space','sql','sqlcode','sqlerror','sqlstate','substring','system_user','table','temporary','then','timezone_hour','timezone_minute','to','trailing','transaction','translate','translation','trim','true','union','unique','unknown','upper','usage','user','using','value','values','varying','view','when','whenever','work','write','year','zone','eoc'),
+'reserved'=>array('add','all','alter','analyze','and','as','asc','asensitive','auto_increment','bdb','before','berkeleydb','between','bigint','binary','blob','both','by','call','cascade','case','change','char','character','check','collate','column','columns','condition','connection','constraint','continue','create','cross','current_date','current_time','current_timestamp','cursor','database','databases','day_hour','day_microsecond','day_minute','day_second','dec','decimal','declare','default','delayed','delete','desc','describe','deterministic','distinct','distinctrow','div','double','drop','else','elseif','enclosed','escaped','exists','exit','explain','false','fetch','fields','float','for','force','foreign','found','frac_second','from','fulltext','grant','group','having','high_priority','hour_microsecond','hour_minute','hour_second','if','ignore','in','index','infile','inner','innodb','inout','insensitive','insert','int','integer','interval','into','io_thread','is','iterate','join','key','keys','kill','leading','leave','left','like','limit','lines','load','localtime','localtimestamp','lock','long','longblob','longtext','loop','low_priority','master_server_id','match','mediumblob','mediumint','mediumtext','middleint','minute_microsecond','minute_second','mod','natural','not','no_write_to_binlog','null','numeric','on','optimize','option','optionally','or','order','out','outer','outfile','precision','primary','privileges','procedure','purge','read','real','references','regexp','rename','repeat','replace','require','restrict','return','revoke','right','rlike','second_microsecond','select','sensitive','separator','set','show','smallint','some','soname','spatial','specific','sql','sqlexception','sqlstate','sqlwarning','sql_big_result','sql_calc_found_rows','sql_small_result','sql_tsi_day','sql_tsi_frac_second','sql_tsi_hour','sql_tsi_minute','sql_tsi_month','sql_tsi_quarter','sql_tsi_second','sql_tsi_week','sql_tsi_year','ssl','starting','straight_join','striped','table','tables','terminated','then','timestampadd','timestampdiff','tinyblob','tinyint','tinytext','to','trailing','true','undo','union','unique','unlock','unsigned','update','usage','use','user_resources','using','utc_date','utc_time','utc_timestamp','values','varbinary','varchar','varcharacter','varying','when','where','while','with','write','xor','year_month','zerofill'),
 'synonyms'=>array('decimal'=>'numeric','dec'=>'numeric','numeric'=>'numeric','float'=>'float','real'=>'real','double'=>'real','int'=>'int','integer'=>'int','interval'=>'interval','smallint'=>'smallint','timestamp'=>'timestamp','bool'=>'bool','boolean'=>'bool','set'=>'set','enum'=>'enum','text'=>'text','char'=>'char','character'=>'char','varchar'=>'varchar','ascending'=>'asc','asc'=>'asc','descending'=>'desc','desc'=>'desc','date'=>'date','time'=>'time'),
Index: Lexer.php
===================================================================
RCS file: /repository/pear/SQL_Parser/Lexer.php,v
retrieving revision 1.21
diff -u -r1.21 Lexer.php
--- Lexer.php 17 May 2004 13:15:29 -0000 1.21
+++ Lexer.php 2 Apr 2005 23:55:55 -0000
@@ -29,6 +29,29 @@
 // {{{ token definitions
 // variables: 'ident', 'sys_var'
 // values: 'real_val', 'text_val', 'int_val', null
+
+define('SQL_PARSER_STATE_START', 0);
+define('SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE', 1);
+define('SQL_PARSER_STATE_KEYWORD_OR_IDENT_COMPLETE', 2);
+define('SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE', 5);
+define('SQL_PARSER_STATE_INT_COMPLETE', 6);
+define('SQL_PARSER_STATE_REAL_INCOMPLETE', 7);
+define('SQL_PARSER_STATE_REAL_COMPLETE', 8);
+define('SQL_PARSER_STATE_COMPARISON_INCOMPLETE', 10);
+define('SQL_PARSER_STATE_COMPARISON_COMPLETE', 11);
+define('SQL_PARSER_STATE_STR_INCOMPLETE', 12);
+define('SQL_PARSER_STATE_STR_COMPLETE', 13);
+define('SQL_PARSER_STATE_COMMENT', 14);
+define('SQL_PARSER_STATE_SCI_EXPONENT_SIGN', 15);
+define('SQL_PARSER_STATE_SCI_EXPONENT_FIRST_DIGIT', 16);
+define('SQL_PARSER_STATE_SCI_EXPONENT_VALUE', 17);
+define('SQL_PARSER_STATE_SYSVAR_INCOMPLETE', 18);
+define('SQL_PARSER_STATE_SYSVAR_COMPLETE', 19);
+define('SQL_PARSER_STATE_FUNCTION_INCOMPLETE', 20);
+define('SQL_PARSER_STATE_FUNCTION_COMPLETE', 21);
+define('SQL_PARSER_STATE_UNKNOWN', 999);
+define('SQL_PARSER_STATE_EOF', 1000);
+
 // }}}

 /**
@@ -100,472 +123,479 @@
 function isCompop($c) {
 return (($c == '<') || ($c == '>') || ($c == '=') || ($c == '!'));
 }
-// }}}
+ // }}}
-// {{{ pushBack()
-/*
- * Push back a token, so the very next call to lex() will return that token.
- * Calls to this function will be ignored if there is no lookahead specified
- * to the constructor, or the pushBack() function has already been called the
- * maximum number of token's that can be looked ahead.
- */
-function pushBack()
-{
- if($this->lookahead>0 && count($this->tokenStack)>0 && $this->stackPtr>0) {
- $this->stackPtr--;
+ // {{{ pushBack()
+ /*
+ * Push back a token, so the very next call to lex() will return that token.
+ * Calls to this function will be ignored if there is no lookahead specified
+ * to the constructor, or the pushBack() function has already been called the
+ * maximum number of token's that can be looked ahead.
+ */
+ function pushBack()
+ {
+ if($this->lookahead>0 && count($this->tokenStack)>0 && $this->stackPtr>0) {
+ $this->stackPtr--;
+ }
 }
-}
-// }}}
-
-// {{{ lex()
-function lex()
-{
- if($this->lookahead>0) {
- // The stackPtr, should always be the same as the count of
- // elements in the tokenStack. The stackPtr, can be thought
- // of as pointing to the next token to be added. If however
- // a pushBack() call is made, the stackPtr, will be less than the
- // count, to indicate that we should take that token from the
- // stack, instead of calling nextToken for a new token.
-
- if ($this->stackPtr<count($this->tokenStack)) {
-
- $this->tokText = $this->tokenStack[$this->stackPtr]['tokText'];
- $this->skipText = $this->tokenStack[$this->stackPtr]['skipText'];
- $token = $this->tokenStack[$this->stackPtr]['token'];
-
- // We have read the token, so now iterate again.
- $this->stackPtr++;
- return $token;
-
- } else {
-
- // If $tokenStack is full (equal to lookahead), pop the oldest
- // element off, to make room for the new one.
+ // }}}
- if ($this->stackPtr == $this->lookahead) {
- // For some reason array_shift and
- // array_pop screw up the indexing, so we do it manually.
- for($i=0; $i<(count($this->tokenStack)-1); $i++) {
- $this->tokenStack[$i] = $this->tokenStack[$i+1];
+ // {{{ lex()
+ function lex()
+ {
+ if($this->lookahead>0) {
+ // The stackPtr, should always be the same as the count of
+ // elements in the tokenStack. The stackPtr, can be thought
+ // of as pointing to the next token to be added. If however
+ // a pushBack() call is made, the stackPtr, will be less than the
+ // count, to indicate that we should take that token from the
+ // stack, instead of calling nextToken for a new token.
+
+ if ($this->stackPtr < count($this->tokenStack)) {
+
+ $this->tokText = $this->tokenStack[$this->stackPtr]['tokText'];
+ $this->skipText = $this->tokenStack[$this->stackPtr]['skipText'];
+ $token = $this->tokenStack[$this->stackPtr]['token'];
+
+ // We have read the token, so now iterate again.
+ $this->stackPtr++;
+ return $token;
+
+ } else {
+
+ // If $tokenStack is full (equal to lookahead), pop the oldest
+ // element off, to make room for the new one.
+
+ if ($this->stackPtr == $this->lookahead) {
+ // For some reason array_shift and
+ // array_pop screw up the indexing, so we do it manually.
+ for($i=0; $i<(count($this->tokenStack)-1); $i++) {
+ $this->tokenStack[$i] = $this->tokenStack[$i+1];
+ }
+
+ // Indicate that we should put the element in
+ // at the stackPtr position.
+ $this->stackPtr--;
 }
- // Indicate that we should put the element in
- // at the stackPtr position.
- $this->stackPtr--;
+ $token = $this->nextToken();
+ $this->tokenStack[$this->stackPtr] =
+ array('token' => $token,
+ 'tokText' => $this->tokText,
+ 'skipText' => $this->skipText);
+ $this->stackPtr++;
+ return $token;
 }
-
- $token = $this->nextToken();
- $this->tokenStack[$this->stackPtr] =
- array('token'=>$token,
- 'tokText'=>$this->tokText,
- 'skipText'=>$this->skipText);
- $this->stackPtr++;
- return $token;
+ }
+ else
+ {
+ return $this->nextToken();
 }
 }
- else
+ // }}}
+
+ // {{{ setToken()
+ function setToken($tokText=NULL)
 {
- return $this->nextToken();
+ if ($tokText === NULL) {
+ $this->tokText = substr($this->string, $this->tokStart, $this->tokLen);
+ } else {
+ $this->tokText = $tokText;
+ }
+ $this->skipText = substr($this->string, $this->tokAbsStart,
+ $this->tokStart-$this->tokAbsStart);
+ $this->tokStart = $this->tokPtr;
 }
-}
-// }}}
+ // }}}
-// {{{ nextToken()
-function nextToken()
-{
- if ($this->string == '') return;
- $state = 0;
- $this->tokAbsStart = $this->tokStart;
-
- while (true){
- //echo "State: $state, Char: $c\n";
- switch($state) {
- // {{{ State 0 : Start of token
- case 0:
- $this->tokPtr = $this->tokStart;
- $this->tokText = '';
- $this->tokLen = 0;
- $c = $this->get();
+ // {{{ nextToken()
+ function nextToken()
+ {
+ if ($this->string == '') return;
+ $state = SQL_PARSER_STATE_START;
+ $this->tokAbsStart = $this->tokStart;
+
+ while (true){
+ //echo "State: $state, Char: $c\n";
+ switch($state) {
+ // {{{ State 0 : Start of token
+ case SQL_PARSER_STATE_START:
+ $this->tokPtr = $this->tokStart;
+ $this->tokText = '';
+ $this->tokLen = 0;
+ $c = $this->get();
- if (is_null($c)) { // End Of Input
- $state = 1000;
- break;
- }
+ if (is_null($c)) { // End Of Input
+ $state = SQL_PARSER_STATE_EOF;
+ break;
+ }
- while (($c == ' ') || ($c == "\t")
- || ($c == "\n") || ($c == "\r")) {
- if ($c == "\n" || $c == "\r") {
- // Handle MAC/Unix/Windows line endings.
- if($c == "\r") {
- $c = $this->skip();
-
- // If not DOS newline
- if($c != "\n")
- $this->unget();
+ while (($c == ' ') || ($c == "\t")
+ || ($c == "\n") || ($c == "\r")) {
+ if ($c == "\n" || $c == "\r") {
+ // Handle MAC/Unix/Windows line endings.
+ if($c == "\r") {
+ $c = $this->skip();
+
+ // If not DOS newline
+ if($c != "\n")
+ $this->unget();
+ }
+ ++$this->lineNo;
+ $this->lineBegin = $this->tokPtr;
 }
- ++$this->lineNo;
- $this->lineBegin = $this->tokPtr;
- }
-
- $c = $this->skip();
- $this->tokLen = 1;
- }
-
- // Escape quotes and backslashes
- if ($c == '\\') {
- $t = $this->get();
- if ($t == '\'' || $t == '\\' || $t == '"') {
- $this->tokText = $t;
- $this->tokStart = $this->tokPtr;
- return $this->tokText;
- } else {
- $this->unget();
- // Unknown token. Revert to single char
- $state = 999;
- break;
+ $c = $this->skip();
+ $this->tokLen = 1;
 }
- }
-
- if (($c == '\'') || ($c == '"')) { // text string
- $quote = $c;
- $state = 12;
- break;
- }
-
- if ($c == '_') { // system variable
- $state = 18;
- break;
- }
-
- if (ctype_alpha(ord($c))) { // keyword or ident
- $state = 1;
- break;
- }
-
- if (ctype_digit(ord($c))) { // real or int number
- $state = 5;
- break;
- }
-
- if ($c == '.') {
- $t = $this->get();
- if ($t == '.') { // ellipsis
- if ($this->get() == '.') {
- $this->tokText = '...';
+
+ // Escape quotes and backslashes
+ if ($c == '\\') {
+ $t = $this->get();
+ if ($t == '\'' || $t == '\\' || $t == '"') {
+ $this->tokText = $t;
 $this->tokStart = $this->tokPtr;
 return $this->tokText;
 } else {
- $state = 999;
+ $this->unget();
+
+ // Unknown token. Revert to single char
+ $state = SQL_PARSER_STATE_UNKNOWN;
 break;
 }
- } else if (ctype_digit(ord($t))) { // real number
- $this->unget();
- $state = 7;
+ }
+
+ if (($c == '\'') || ($c == '"')) { // text string
+ $quote = $c;
+ $state = SQL_PARSER_STATE_STR_INCOMPLETE;
 break;
- } else { // period
- $this->unget();
 }
- }
- if ($c == '#') { // Comments
- $state = 14;
- break;
- }
- if ($c == '-') {
- $t = $this->get();
- if ($t == '-') {
- $state = 14;
- break;
- } else { // negative number
- $this->unget();
- $state = 5;
+ if ($c == '_') { // system variable
+ $state = SQL_PARSER_STATE_SYSVAR_INCOMPLETE;
 break;
 }
- }
-
- if ($this->isCompop($c)) { // comparison operator
- $state = 10;
- break;
- }
- // Unknown token. Revert to single char
- $state = 999;
- break;
- // }}}
-
- // {{{ State 1 : Incomplete keyword or ident
- case 1:
- $c = $this->get();
- if (ctype_alnum(ord($c)) || ($c == '_') || ($c == '.')) {
- $state = 1;
- break;
- }
- $state = 2;
- break;
- // }}}
-
- /* {{{ State 2 : Complete keyword or ident */
- case 2:
- $this->unget();
- $this->tokText = substr($this->string, $this->tokStart,
- $this->tokLen);
-
- $testToken = strtolower($this->tokText);
- if (isset($this->symbols[$testToken])) {
-
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return $testToken;
- } else {
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return 'ident';
- }
- break;
- // }}}
- // {{{ State 5: Incomplete real or int number
- case 5:
- $c = $this->get();
- if (ctype_digit(ord($c))) {
- $state = 5;
- break;
- } else if ($c == '.') {
- $t = $this->get();
- if($t == '.') { // ellipsis
- $this->unget();
- } else { // real number
- $state = 7;
- break;
- }
- } else if(ctype_alpha(ord($c))) {
- // Do we allow idents to begin with a digit?
- if ($this->allowIdentFirstDigit) {
- $state = 1;
- } else { // a number must end with non-alpha character
- $state = 999;
- }
- break;
- } else {
- // complete number
- $state = 6;
- break;
- }
- // }}}
-
- // {{{ State 6: Complete integer number
- case 6:
- $this->unget();
- $this->tokText = intval(substr($this->string, $this->tokStart,
- $this->tokLen));
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return 'int_val';
- break;
- // }}}
-
- // {{{ State 7: Incomplete real number
- case 7:
- $c = $this->get();
+ if (ctype_alpha(ord($c))) { // keyword or ident
+ $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE;
+ break;
+ }
- if ($c == 'e' || $c == 'E') {
- $state = 15;
+ if (ctype_digit(ord($c))) { // real or int number
+ $state = SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE;
 break;
- }
+ }
- if (ctype_digit(ord($c))) {
- $state = 7;
- break;
- }
- $state = 8;
- break;
- // }}}
-
- // {{{ State 8: Complete real number
- case 8:
- $this->unget();
- $this->tokText = floatval(substr($this->string, $this->tokStart,
- $this->tokLen));
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return 'real_val';
- // }}}
-
- // {{{ State 10: Incomplete comparison operator
- case 10:
- $c = $this->get();
- if ($this->isCompop($c))
- {
- $state = 10;
- break;
- }
- $state = 11;
- break;
- // }}}
-
- // {{{ State 11: Complete comparison operator
- case 11:
- $this->unget();
- $this->tokText = substr($this->string, $this->tokStart,
- $this->tokLen);
- if($this->tokText) {
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return $this->tokText;
- }
- $state = 999;
- break;
- // }}}
-
- // {{{ State 12: Incomplete text string
- case 12:
- $bail = false;
- while (!$bail) {
- switch ($this->get()) {
- case '':
- $this->tokText = null;
- $bail = true;
- break;
- case "\\":
- if (!$this->get()) {
- $this->tokText = null;
- $bail = true;
+ if ($c == '.') {
+ $t = $this->get();
+ if ($t == '.') { // ellipsis
+ if ($this->get() == '.') {
+ $this->tokText = '...';
+ $this->tokStart = $this->tokPtr;
+ return $this->tokText;
+ } else {
+ $state = SQL_PARSER_STATE_UNKNOWN;
+ break;
 }
- //$bail = true;
- break;
- case $quote:
- $this->tokText = stripslashes(substr($this->string,
- ($this->tokStart+1), ($this->tokLen-2)));
- $bail = true;
+ } else if (ctype_digit(ord($t))) { // real number
+ $this->unget();
+ $state = SQL_PARSER_STATE_REAL_INCOMPLETE;
 break;
+ } else { // period
+ $this->unget();
+ }
 }
- }
- if (!is_null($this->tokText)) {
- $state = 13;
- break;
- }
- $state = 999;
- break;
- // }}}
-
- // {{{ State 13: Complete text string
- case 13:
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return 'text_val';
- break;
- // }}}
-
- // {{{ State 14: Comment
- case 14:
- $c = $this->skip();
- if ($c == "\n" || $c == "\r" || $c == "") {
- // Handle MAC/Unix/Windows line endings.
- if ($c == "\r") {
- $c = $this->skip();
- // If not DOS newline
- if ($c != "\n") {
+
+ if ($c == '#') { // Comments
+ $state = SQL_PARSER_STATE_COMMENT;
+ break;
+ }
+ if ($c == '-') {
+ $t = $this->get();
+ if ($t == '-') {
+ $state = SQL_PARSER_STATE_COMMENT;
+ break;
+ } else { // negative number
 $this->unget();
+ $state = SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE;
+ break;
 }
 }
- if ($c != "") {
- ++$this->lineNo;
- $this->lineBegin = $this->tokPtr;
+ if ($this->isCompop($c)) { // comparison operator
+ $state = SQL_PARSER_STATE_COMPARISON_INCOMPLETE;
+ break;
 }
+ // Unknown token. Revert to single char
+ $state = SQL_PARSER_STATE_UNKNOWN;
+ break;
+ // }}}
- // We need to skip all the text.
- $this->tokStart = $this->tokPtr;
- $state = 0;
- } else {
- $state = 14;
- }
- break;
- // }}}
+ // {{{ State 1 : Incomplete keyword or ident
+ case SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE:
+ $c = $this->get();
+ if (ctype_alnum(ord($c)) || ($c == '_') || ($c == '.')) {
+ $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE;
+ break;
+ } else if ($c == '(') {
+ $this->_function_paren_count = 1;
+ $state = SQL_PARSER_STATE_FUNCTION_INCOMPLETE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_COMPLETE;
+ break;
+ // }}}
+
+ /* {{{ State 2 : Complete keyword or ident */
+ case SQL_PARSER_STATE_KEYWORD_OR_IDENT_COMPLETE:
+ $this->unget();
+ $this->setToken();
+ $testToken = strtolower($this->tokText);
+ return (isset($this->symbols[$testToken])) ? $testToken : 'ident';
+ break;
+ // }}}
- // {{{ State 15: Exponent Sign in Scientific Notation
- case 15:
+ // {{{ State 20: Incomplete function (Doesn't yet handle nesting!)
+ case SQL_PARSER_STATE_FUNCTION_INCOMPLETE:
 $c = $this->get();
- if($c == '-' || $c == '+') {
- $state = 16;
+ if ($c == ')') {
+ $this->_function_paren_count--;
+ if ($this->_function_paren_count == 0) {
+ $state = SQL_PARSER_STATE_FUNCTION_COMPLETE;
 break;
+ }
+ } else if ($c == '(') {
+ $this->_function_paren_count++;
 }
- $state = 999;
+ $state = SQL_PARSER_STATE_FUNCTION_INCOMPLETE;
+ break;
+ // }}}
+
+ // {{{ State 21: Complete function
+ case SQL_PARSER_STATE_FUNCTION_COMPLETE:
+ $this->setToken();
+ return 'function';
 break;
- // }}}
+ // }}}
- // {{{ state 16: Exponent Value-first digit in Scientific Notation
- case 16:
+ // {{{ State 5: Incomplete real or int number
+ case SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE:
 $c = $this->get();
 if (ctype_digit(ord($c))) {
- $state = 17;
+ $state = SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE;
+ break;
+ } else if ($c == '.') {
+ $t = $this->get();
+ if($t == '.') { // ellipsis
+ $this->unget();
+ } else { // real number
+ $state = SQL_PARSER_STATE_REAL_INCOMPLETE;
 break;
+ }
+ } else if(ctype_alpha(ord($c))) {
+ // Do we allow idents to begin with a digit?
+ if ($this->allowIdentFirstDigit) {
+ $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE;
+ } else { // a number must end with non-alpha character
+ $state = SQL_PARSER_STATE_UNKNOWN;
+ }
+ break;
+ } else {
+ // complete number
+ $state = SQL_PARSER_STATE_INT_COMPLETE;
+ break;
 }
- $state = 999; // if no digit, then token is unknown
+ // }}}
+
+ // {{{ State 6: Complete integer number
+ case SQL_PARSER_STATE_INT_COMPLETE:
+ $this->unget();
+ $this->setToken();
+ $this->tokTest = intval($this->tokText);
+ return 'int_val';
 break;
- // }}}
+ // }}}
- // {{{ State 17: Exponent Value in Scientific Notation
- case 17:
+ // {{{ State 7: Incomplete real number
+ case SQL_PARSER_STATE_REAL_INCOMPLETE:
 $c = $this->get();
- if (ctype_digit(ord($c))) {
- $state = 17;
+
+ if ($c == 'e' || $c == 'E') {
+ $state = SQL_PARSER_STATE_SCI_EXPONENT_SIGN;
 break;
 }
- $state = 8; // At least 1 exponent digit was required
+
+ if (ctype_digit(ord($c))) {
+ $state = SQL_PARSER_STATE_REAL_INCOMPLETE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_REAL_COMPLETE;
 break;
- // }}}
+ // }}}
- // {{{ State 18 : Incomplete System Variable
- case 18:
- $c = $this->get();
- if (ctype_alnum(ord($c)) || $c == '_') {
- $state = 18;
+ // {{{ State 8: Complete real number
+ case SQL_PARSER_STATE_REAL_COMPLETE:
+ $this->unget();
+ $this->setToken();
+ $this->tokText = floatval($this->tokText);
+ return 'real_val';
+ // }}}
+
+ // {{{ State 10: Incomplete comparison operator
+ case SQL_PARSER_STATE_COMPARISON_INCOMPLETE:
+ $c = $this->get();
+ if ($this->isCompop($c)) {
+ $state = SQL_PARSER_STATE_COMPARISON_INCOMPLETE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_COMPARISON_COMPLETE;
 break;
- }
- $state = 19;
- break;
- // }}}
-
- // {{{ State 19: Complete Sys Var
- case 19:
- $this->unget();
- $this->tokText = substr($this->string, $this->tokStart,
- $this->tokLen);
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return 'sys_var';
- // }}}
-
- // {{{ State 999 : Unknown token. Revert to single char
- case 999:
- $this->revert();
- $this->tokText = $this->get();
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return $this->tokText;
- // }}}
-
- // {{{ State 1000 : End Of Input
- case 1000:
- $this->tokText = '*end of input*';
- $this->skipText = substr($this->string, $this->tokAbsStart,
- $this->tokStart-$this->tokAbsStart);
- $this->tokStart = $this->tokPtr;
- return null;
- // }}}
+ // }}}
+
+ // {{{ State 11: Complete comparison operator
+ case SQL_PARSER_STATE_COMPARISON_COMPLETE:
+ $this->unget();
+ $this->tokText = substr($this->string, $this->tokStart,
+ $this->tokLen);
+ if($this->tokText) {
+ $this->setToken($this->tokText);
+ return $this->tokText;
+ }
+ $state = SQL_PARSER_STATE_UNKNOWN;
+ break;
+ // }}}
+
+ // {{{ State 12: Incomplete text string
+ case SQL_PARSER_STATE_STR_INCOMPLETE:
+ $bail = false;
+ while (!$bail) {
+ switch ($this->get()) {
+ case '':
+ $this->tokText = null;
+ $bail = true;
+ break;
+ case "\\":
+ if (!$this->get()) {
+ $this->tokText = null;
+ $bail = true;
+ }
+ //$bail = true;
+ break;
+ case $quote:
+ $this->tokText = stripslashes(substr($this->string,
+ ($this->tokStart+1), ($this->tokLen-2)));
+ $bail = true;
+ break;
+ }
+ }
+ if (!is_null($this->tokText)) {
+ $state = SQL_PARSER_STATE_STR_COMPLETE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_UNKNOWN;
+ break;
+ // }}}
+
+ // {{{ State 13: Complete text string
+ case SQL_PARSER_STATE_STR_COMPLETE:
+ $this->setToken($this->tokText);
+ return 'text_val';
+ break;
+ // }}}
+
+ // {{{ State 14: Comment
+ case SQL_PARSER_STATE_COMMENT:
+ $c = $this->skip();
+ if ($c == "\n" || $c == "\r" || $c == "") {
+ // Handle MAC/Unix/Windows line endings.
+ if ($c == "\r") {
+ $c = $this->skip();
+ // If not DOS newline
+ if ($c != "\n") {
+ $this->unget();
+ }
+ }
+
+ if ($c != "") {
+ ++$this->lineNo;
+ $this->lineBegin = $this->tokPtr;
+ }
+
+ // We need to skip all the text.
+ $this->tokStart = $this->tokPtr;
+ $state = SQL_PARSER_STATE_START;
+ } else {
+ $state = SQL_PARSER_STATE_COMMENT;
+ }
+ break;
+ // }}}
+
+ // {{{ State 15: Exponent Sign in Scientific Notation
+ case SQL_PARSER_STATE_SCI_EXPONENT_SIGN:
+ $c = $this->get();
+ if($c == '-' || $c == '+') {
+ $state = SQL_PARSER_STATE_SCI_EXPONENT_FIRST_DIGIT;
+ break;
+ }
+ $state = SQL_PARSER_STATE_UNKNOWN;
+ break;
+ // }}}
+
+ // {{{ state 16: Exponent Value-first digit in Scientific Notation
+ case SQL_PARSER_STATE_SCI_EXPONENT_FIRST_DIGIT:
+ $c = $this->get();
+ if (ctype_digit(ord($c))) {
+ $state = SQL_PARSER_STATE_SCI_EXPONENT_VALUE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_UNKNOWN; // if no digit, then token is unknown
+ break;
+ // }}}
+
+ // {{{ State 17: Exponent Value in Scientific Notation
+ case SQL_PARSER_STATE_SCI_EXPONENT_VALUE:
+ $c = $this->get();
+ if (ctype_digit(ord($c))) {
+ $state = SQL_PARSER_STATE_SCI_EXPONENT_VALUE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_REAL_COMPLETE; // At least 1 exponent digit was required
+ break;
+ // }}}
+
+ // {{{ State 18 : Incomplete System Variable
+ case SQL_PARSER_STATE_SYSVAR_INCOMPLETE:
+ $c = $this->get();
+ if (ctype_alnum(ord($c)) || $c == '_') {
+ $state = SQL_PARSER_STATE_SYSVAR_INCOMPLETE;
+ break;
+ }
+ $state = SQL_PARSER_STATE_SYSVAR_COMPLETE;
+ break;
+ // }}}
+
+ // {{{ State 19: Complete Sys Var
+ case SQL_PARSER_STATE_SYSVAR_COMPLETE:
+ $this->unget();
+ $this->setToken();
+ return 'sys_var';
+ // }}}
+
+ // {{{ State 999 : Unknown token. Revert to single char
+ case SQL_PARSER_STATE_UNKNOWN:
+ $this->revert();
+ $this->setToken($this->get());
+ return $this->tokText;
+ // }}}
+
+ // {{{ State 1000 : End Of Input
+ case SQL_PARSER_STATE_EOF:
+ $this->setToken('*end of input*');
+ return null;
+ // }}}
+ }
 }
 }
-}
-// }}}
+ // }}}
 }
 ?>
Index: Parser.php
===================================================================
RCS file: /repository/pear/SQL_Parser/Parser.php,v
retrieving revision 1.27
diff -u -r1.27 Parser.php
--- Parser.php 3 Jul 2004 04:49:44 -0000 1.27
+++ Parser.php 2 Apr 2005 23:55:56 -0000
@@ -143,6 +143,7 @@
 // {{{ isVal()
 function isVal() {
 return (($this->token == 'real_val') ||
+ ($this->token == 'function') ||
 ($this->token == 'int_val') ||
 ($this->token == 'text_val') ||
 ($this->token == 'null'));
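The lexer change in the patch above comes down to counting parentheses once an identifier is followed by `(`, and emitting everything up to the matching `)` as one 'function' token. The following is a standalone sketch of that idea in current PHP (7.1 or later), not the patch's code: `scanFunctionToken()` is a hypothetical helper that works on a plain string instead of the lexer's `get()`/`unget()` state machine. Judging from the diff, the patch's function state counts every `(` and `)` it reads, including ones inside quoted strings, and the sketch deliberately keeps that limitation so the last example can show it.

```php
<?php
// Hypothetical helper, NOT part of SQL_Parser: starting at an identifier,
// return the text of a call such as NOW() or FUNC(NOW(BLAH)) as one string,
// or null if no balanced call starts at $start.
function scanFunctionToken(string $sql, int $start): ?string
{
    $len = strlen($sql);
    $i = $start;
    // Consume the function name (letters, digits, underscore).
    while ($i < $len && (ctype_alnum($sql[$i]) || $sql[$i] === '_')) {
        $i++;
    }
    if ($i === $start || $i >= $len || $sql[$i] !== '(') {
        return null; // an ident that is not followed by "(" is not a call
    }
    $depth = 0;
    for (; $i < $len; $i++) {
        if ($sql[$i] === '(') {
            $depth++;
        } elseif ($sql[$i] === ')') {
            $depth--;
            if ($depth === 0) {
                // Matching close paren found: the whole call is one token.
                return substr($sql, $start, $i - $start + 1);
            }
        }
    }
    return null; // ran out of input with parentheses still open
}

$simple = scanFunctionToken("NOW(), 1, 'text')", 0);    // "NOW()"
$nested = scanFunctionToken('FUNC(NOW(BLAH)) + 1', 0);  // "FUNC(NOW(BLAH))"
$ident  = scanFunctionToken('foo, bar', 0);             // null
// Limitation: the ")" inside the string literal ends the token early.
$quoted = scanFunctionToken("CONCAT('a)', b)", 0);      // "CONCAT('a)"
```

Treating the whole call as one opaque token also hides the arguments from the parser, which is one reason a parser-level treatment of function calls may be preferable.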
 [2005-04-05 03:50 UTC] epte at ruffdogs dot com
So, I was wrong about functions. They are a syntactic element -- I see that now. I have therefore taken out the Lexer changes I suggested above, and I have put in a very rudimentary parseFunction function (which should, when it develops further, call getParams, I think). It works for the above test case, but will not handle complex arguments yet.

Here are the relevant hunks of the patch of Lexer and Parser against CVS sources, including everything so far (4/4):

Index: Parser.php
===================================================================
RCS file: /repository/pear/SQL_Parser/Parser.php,v
retrieving revision 1.27
diff -r1.27 Parser.php
103a104,106
> } elseif (in_array($this->token, $this->functions)) {
> $types[] = 'function';
> $values[] = $this->parseFunction($this->token);
114a118,141
> // {{{ parseFunction()
> // this function really should call getParams
> function parseFunction($funcname) {
> $argcnt = 0;
> $func_value_tree = array('funcname' => $funcname);
> $this->getTok();
> if ($this->token != '(') {
> return $this->raiseError('Expected )');
> }
> $this->getTok();
> while ($this->token != ')') {
> $this->getTok();
> if ($this->token == ',') {
> $argcnt++;
> } else {
> $func_value_tree['arg_'.$argcnt]['type'] = $this->token;
> $func_value_tree['arg_'.$argcnt]['value'] = $this->tokText;
> }
> }
> $func_value_tree['arg_count'] = $argcnt;
> return $func_value_tree;
> }
> // }}}
>
Index: Lexer.php
===================================================================
RCS file: /repository/pear/SQL_Parser/Lexer.php,v
retrieving revision 1.21
diff -r1.21 Lexer.php
31a32,52
>
> define('SQL_PARSER_STATE_START', 0);
> define('SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE', 1);
> define('SQL_PARSER_STATE_KEYWORD_OR_IDENT_COMPLETE', 2);
> define('SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE', 5);
> define('SQL_PARSER_STATE_INT_COMPLETE', 6);
> define('SQL_PARSER_STATE_REAL_INCOMPLETE', 7);
> define('SQL_PARSER_STATE_REAL_COMPLETE', 8);
> define('SQL_PARSER_STATE_COMPARISON_INCOMPLETE', 10);
> define('SQL_PARSER_STATE_COMPARISON_COMPLETE', 11);
> define('SQL_PARSER_STATE_STR_INCOMPLETE', 12);
> define('SQL_PARSER_STATE_STR_COMPLETE', 13);
> define('SQL_PARSER_STATE_COMMENT', 14);
> define('SQL_PARSER_STATE_SCI_EXPONENT_SIGN', 15);
> define('SQL_PARSER_STATE_SCI_EXPONENT_FIRST_DIGIT', 16);
> define('SQL_PARSER_STATE_SCI_EXPONENT_VALUE', 17);
> define('SQL_PARSER_STATE_SYSVAR_INCOMPLETE', 18);
> define('SQL_PARSER_STATE_SYSVAR_COMPLETE', 19);
> define('SQL_PARSER_STATE_UNKNOWN', 999);
> define('SQL_PARSER_STATE_EOF', 1000);
>
103c124
< // }}}
---
> // }}}
105,115c126,137
< // {{{ pushBack()
< /*
< * Push back a token, so the very next call to lex() will return that token.
< * Calls to this function will be ignored if there is no lookahead specified
< * to the constructor, or the pushBack() function has already been called the
< * maximum number of token's that can be looked ahead.
< */
< function pushBack()
< {
< if($this->lookahead>0 && count($this->tokenStack)>0 && $this->stackPtr>0) {
< $this->stackPtr--;
---
> // {{{ pushBack()
> /*
> * Push back a token, so the very next call to lex() will return that token.
> * Calls to this function will be ignored if there is no lookahead specified
> * to the constructor, or the pushBack() function has already been called the
> * maximum number of token's that can be looked ahead.
> */
> function pushBack()
> {
> if($this->lookahead>0 && count($this->tokenStack)>0 && $this->stackPtr>0) {
> $this->stackPtr--;
> }
117,118c139
< }
< // }}}
---
> // }}}
120,150c141,176
< // {{{ lex()
< function lex()
< {
< if($this->lookahead>0) {
< // The stackPtr, should always be the same as the count of
< // elements in the tokenStack. The stackPtr, can be thought
< // of as pointing to the next token to be added. If however
< // a pushBack() call is made, the stackPtr, will be less than the
< // count, to indicate that we should take that token from the
< // stack, instead of calling nextToken for a new token.
<
< if ($this->stackPtr<count($this->tokenStack)) {
<
< $this->tokText = $this->tokenStack[$this->stackPtr]['tokText'];
< $this->skipText = $this->tokenStack[$this->stackPtr]['skipText'];
< $token = $this->tokenStack[$this->stackPtr]['token'];
<
< // We have read the token, so now iterate again.
< $this->stackPtr++;
< return $token;
<
< } else {
<
< // If $tokenStack is full (equal to lookahead), pop the oldest
< // element off, to make room for the new one.
<
< if ($this->stackPtr == $this->lookahead) {
< // For some reason array_shift and
< // array_pop screw up the indexing, so we do it manually.
< for($i=0; $i<(count($this->tokenStack)-1); $i++) {
< $this->tokenStack[$i] = $this->tokenStack[$i+1];
---
> // {{{ lex()
> function lex()
> {
> if($this->lookahead>0) {
> // The stackPtr, should always be the same as the count of
> // elements in the tokenStack. The stackPtr, can be thought
> // of as pointing to the next token to be added. If however
> // a pushBack() call is made, the stackPtr, will be less than the
> // count, to indicate that we should take that token from the
> // stack, instead of calling nextToken for a new token.
>
> if ($this->stackPtr < count($this->tokenStack)) {
>
> $this->tokText = $this->tokenStack[$this->stackPtr]['tokText'];
> $this->skipText = $this->tokenStack[$this->stackPtr]['skipText'];
> $token = $this->tokenStack[$this->stackPtr]['token'];
>
> // We have read the token, so now iterate again.
> $this->stackPtr++;
> return $token;
>
> } else {
>
> // If $tokenStack is full (equal to lookahead), pop the oldest
> // element off, to make room for the new one.
>
> if ($this->stackPtr == $this->lookahead) {
> // For some reason array_shift and
> // array_pop screw up the indexing, so we do it manually.
> for($i=0; $i<(count($this->tokenStack)-1); $i++) { > $this->tokenStack[$i] = $this->tokenStack[$i+1]; > } > > // Indicate that we should put the element in > // at the stackPtr position. > $this->stackPtr--; 153,155c179,185 < // Indicate that we should put the element in < // at the stackPtr position. < $this->stackPtr--; --- > $token = $this->nextToken(); > $this->tokenStack[$this->stackPtr] = > array('token' => $token, > 'tokText' => $this->tokText, > 'skipText' => $this->skipText); > $this->stackPtr++; > return $token; 157,164c187,190 < < $token = $this->nextToken(); < $this->tokenStack[$this->stackPtr] = < array('token'=>$token, < 'tokText'=>$this->tokText, < 'skipText'=>$this->skipText); < $this->stackPtr++; < return $token; --- > } > else > { > return $this->nextToken(); 167c193,196 < else --- > // }}} > > // {{{ setToken() > function setToken($tokText=NULL) 169c198,205 < return $this->nextToken(); --- > if ($tokText === NULL) { > $this->tokText = substr($this->string, $this->tokStart, $this->tokLen); > } else { > $this->tokText = $tokText; > } > $this->skipText = substr($this->string, $this->tokAbsStart, > $this->tokStart-$this->tokAbsStart); > $this->tokStart = $this->tokPtr; 171,172c207 < } < // }}} --- > // }}} 174,189c209,224 < // {{{ nextToken() < function nextToken() < { < if ($this->string == '') return; < $state = 0; < $this->tokAbsStart = $this->tokStart; < < while (true){ < //echo "State: $state, Char: $c\n"; < switch($state) { < // {{{ State 0 : Start of token < case 0: < $this->tokPtr = $this->tokStart; < $this->tokText = ''; < $this->tokLen = 0; < $c = $this->get(); --- > // {{{ nextToken() > function nextToken() > { > if ($this->string == '') return; > $state = SQL_PARSER_STATE_START; > $this->tokAbsStart = $this->tokStart; > > while (true){ > //echo "State: $state, Char: $c\n"; > switch($state) { > // {{{ State 0 : Start of token > case SQL_PARSER_STATE_START: > $this->tokPtr = $this->tokStart; > $this->tokText = ''; > $this->tokLen = 0; > 
$c = $this->get(); 191,194c226,229 < if (is_null($c)) { // End Of Input < $state = 1000; < break; < } --- > if (is_null($c)) { // End Of Input > $state = SQL_PARSER_STATE_EOF; > break; > } 196,205c231,243 < while (($c == ' ') || ($c == "\t") < || ($c == "\n") || ($c == "\r")) { < if ($c == "\n" || $c == "\r") { < // Handle MAC/Unix/Windows line endings. < if($c == "\r") { < $c = $this->skip(); < < // If not DOS newline < if($c != "\n") < $this->unget(); --- > while (($c == ' ') || ($c == "\t") > || ($c == "\n") || ($c == "\r")) { > if ($c == "\n" || $c == "\r") { > // Handle MAC/Unix/Windows line endings. > if($c == "\r") { > $c = $this->skip(); > > // If not DOS newline > if($c != "\n") > $this->unget(); > } > ++$this->lineNo; > $this->lineBegin = $this->tokPtr; 207,223d244 < ++$this->lineNo; < $this->lineBegin = $this->tokPtr; < } < < $c = $this->skip(); < $this->tokLen = 1; < } < < // Escape quotes and backslashes < if ($c == '\\') { < $t = $this->get(); < if ($t == '\'' || $t == '\\' || $t == '"') { < $this->tokText = $t; < $this->tokStart = $this->tokPtr; < return $this->tokText; < } else { < $this->unget(); 225,227c246,247 < // Unknown token. Revert to single char < $state = 999; < break; --- > $c = $this->skip(); > $this->tokLen = 1; 229,256c249,254 < } < < if (($c == '\'') || ($c == '"')) { // text string < $quote = $c; < $state = 12; < break; < } < < if ($c == '_') { // system variable < $state = 18; < break; < } < < if (ctype_alpha(ord($c))) { // keyword or ident < $state = 1; < break; < } < < if (ctype_digit(ord($c))) { // real or int number < $state = 5; < break; < } < < if ($c == '.') { < $t = $this->get(); < if ($t == '.') { // ellipsis < if ($this->get() == '.') { < $this->tokText = '...'; --- > > // Escape quotes and backslashes > if ($c == '\\') { > $t = $this->get(); > if ($t == '\'' || $t == '\\' || $t == '"') { > $this->tokText = $t; 260c258,261 < $state = 999; --- > $this->unget(); > > // Unknown token. 
Revert to single char > $state = SQL_PARSER_STATE_UNKNOWN; 263,265c264,268 < } else if (ctype_digit(ord($t))) { // real number < $this->unget(); < $state = 7; --- > } > > if (($c == '\'') || ($c == '"')) { // text string > $quote = $c; > $state = SQL_PARSER_STATE_STR_INCOMPLETE; 267,268d269 < } else { // period < $this->unget(); 270d270 < } 272,283c272,273 < if ($c == '#') { // Comments < $state = 14; < break; < } < if ($c == '-') { < $t = $this->get(); < if ($t == '-') { < $state = 14; < break; < } else { // negative number < $this->unget(); < $state = 5; --- > if ($c == '_') { // system variable > $state = SQL_PARSER_STATE_SYSVAR_INCOMPLETE; 286d275 < } 288,373c277,280 < if ($this->isCompop($c)) { // comparison operator < $state = 10; < break; < } < // Unknown token. Revert to single char < $state = 999; < break; < // }}} < < // {{{ State 1 : Incomplete keyword or ident < case 1: < $c = $this->get(); < if (ctype_alnum(ord($c)) || ($c == '_') || ($c == '.')) { < $state = 1; < break; < } < $state = 2; < break; < // }}} < < /* {{{ State 2 : Complete keyword or ident */ < case 2: < $this->unget(); < $this->tokText = substr($this->string, $this->tokStart, < $this->tokLen); < < $testToken = strtolower($this->tokText); < if (isset($this->symbols[$testToken])) { < < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return $testToken; < } else { < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return 'ident'; < } < break; < // }}} < < // {{{ State 5: Incomplete real or int number < case 5: < $c = $this->get(); < if (ctype_digit(ord($c))) { < $state = 5; < break; < } else if ($c == '.') { < $t = $this->get(); < if($t == '.') { // ellipsis < $this->unget(); < } else { // real number < $state = 7; < break; < } < } else if(ctype_alpha(ord($c))) { < // Do we allow idents to begin with a digit? 
< if ($this->allowIdentFirstDigit) { < $state = 1; < } else { // a number must end with non-alpha character < $state = 999; < } < break; < } else { < // complete number < $state = 6; < break; < } < // }}} < < // {{{ State 6: Complete integer number < case 6: < $this->unget(); < $this->tokText = intval(substr($this->string, $this->tokStart, < $this->tokLen)); < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return 'int_val'; < break; < // }}} < < // {{{ State 7: Incomplete real number < case 7: < $c = $this->get(); --- > if (ctype_alpha(ord($c))) { // keyword or ident > $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE; > break; > } 375,376c282,283 < if ($c == 'e' || $c == 'E') { < $state = 15; --- > if (ctype_digit(ord($c))) { // real or int number > $state = SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE; 378c285 < } --- > } 380,438c287,296 < if (ctype_digit(ord($c))) { < $state = 7; < break; < } < $state = 8; < break; < // }}} < < // {{{ State 8: Complete real number < case 8: < $this->unget(); < $this->tokText = floatval(substr($this->string, $this->tokStart, < $this->tokLen)); < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return 'real_val'; < // }}} < < // {{{ State 10: Incomplete comparison operator < case 10: < $c = $this->get(); < if ($this->isCompop($c)) < { < $state = 10; < break; < } < $state = 11; < break; < // }}} < < // {{{ State 11: Complete comparison operator < case 11: < $this->unget(); < $this->tokText = substr($this->string, $this->tokStart, < $this->tokLen); < if($this->tokText) { < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return $this->tokText; < } < $state = 999; < break; < // }}} < < // {{{ State 12: Incomplete text string < case 12: < $bail = false; < while (!$bail) { < 
switch ($this->get()) { < case '': < $this->tokText = null; < $bail = true; < break; < case "\\": < if (!$this->get()) { < $this->tokText = null; < $bail = true; --- > if ($c == '.') { > $t = $this->get(); > if ($t == '.') { // ellipsis > if ($this->get() == '.') { > $this->tokText = '...'; > $this->tokStart = $this->tokPtr; > return $this->tokText; > } else { > $state = SQL_PARSER_STATE_UNKNOWN; > break; 440,445c298,300 < //$bail = true; < break; < case $quote: < $this->tokText = stripslashes(substr($this->string, < ($this->tokStart+1), ($this->tokLen-2))); < $bail = true; --- > } else if (ctype_digit(ord($t))) { // real number > $this->unget(); > $state = SQL_PARSER_STATE_REAL_INCOMPLETE; 446a302,304 > } else { // period > $this->unget(); > } 448,474c306,316 < } < if (!is_null($this->tokText)) { < $state = 13; < break; < } < $state = 999; < break; < // }}} < < // {{{ State 13: Complete text string < case 13: < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return 'text_val'; < break; < // }}} < < // {{{ State 14: Comment < case 14: < $c = $this->skip(); < if ($c == "\n" || $c == "\r" || $c == "") { < // Handle MAC/Unix/Windows line endings. < if ($c == "\r") { < $c = $this->skip(); < // If not DOS newline < if ($c != "\n") { --- > > if ($c == '#') { // Comments > $state = SQL_PARSER_STATE_COMMENT; > break; > } > if ($c == '-') { > $t = $this->get(); > if ($t == '-') { > $state = SQL_PARSER_STATE_COMMENT; > break; > } else { // negative number 475a318,319 > $state = SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE; > break; 479,481c323,325 < if ($c != "") { < ++$this->lineNo; < $this->lineBegin = $this->tokPtr; --- > if ($this->isCompop($c)) { // comparison operator > $state = SQL_PARSER_STATE_COMPARISON_INCOMPLETE; > break; 482a327,330 > // Unknown token. Revert to single char > $state = SQL_PARSER_STATE_UNKNOWN; > break; > // }}} 484,491c332,341 < // We need to skip all the text. 
< $this->tokStart = $this->tokPtr; < $state = 0; < } else { < $state = 14; < } < break; < // }}} --- > // {{{ State 1 : Incomplete keyword or ident > case SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE: > $c = $this->get(); > if (ctype_alnum(ord($c)) || ($c == '_') || ($c == '.')) { > $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE; > break; > } > $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_COMPLETE; > break; > // }}} 493,494c343,353 < // {{{ State 15: Exponent Sign in Scientific Notation < case 15: --- > /* {{{ State 2 : Complete keyword or ident */ > case SQL_PARSER_STATE_KEYWORD_OR_IDENT_COMPLETE: > $this->unget(); > $this->setToken(); > $testToken = strtolower($this->tokText); > return (isset($this->symbols[$testToken])) ? $testToken : 'ident'; > break; > // }}} > > // {{{ State 5: Incomplete real or int number > case SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE: 496,497c355,363 < if($c == '-' || $c == '+') { < $state = 16; --- > if (ctype_digit(ord($c))) { > $state = SQL_PARSER_STATE_REAL_OR_INT_INCOMPLETE; > break; > } else if ($c == '.') { > $t = $this->get(); > if($t == '.') { // ellipsis > $this->unget(); > } else { // real number > $state = SQL_PARSER_STATE_REAL_INCOMPLETE; 498a365,377 > } > } else if(ctype_alpha(ord($c))) { > // Do we allow idents to begin with a digit? 
> if ($this->allowIdentFirstDigit) { > $state = SQL_PARSER_STATE_KEYWORD_OR_IDENT_INCOMPLETE; > } else { // a number must end with non-alpha character > $state = SQL_PARSER_STATE_UNKNOWN; > } > break; > } else { > // complete number > $state = SQL_PARSER_STATE_INT_COMPLETE; > break; 500c379,386 < $state = 999; --- > // }}} > > // {{{ State 6: Complete integer number > case SQL_PARSER_STATE_INT_COMPLETE: > $this->unget(); > $this->setToken(); > $this->tokText = intval($this->tokText); > return 'int_val'; 502c388 < // }}} --- > // }}} 504,505c390,391 < // {{{ state 16: Exponent Value-first digit in Scientific Notation < case 16: --- > // {{{ State 7: Incomplete real number > case SQL_PARSER_STATE_REAL_INCOMPLETE: 507,508c393,395 < if (ctype_digit(ord($c))) { < $state = 17; --- > > if ($c == 'e' || $c == 'E') { > $state = SQL_PARSER_STATE_SCI_EXPONENT_SIGN; 511c398,403 < $state = 999; // if no digit, then token is unknown --- > > if (ctype_digit(ord($c))) { > $state = SQL_PARSER_STATE_REAL_INCOMPLETE; > break; > } > $state = SQL_PARSER_STATE_REAL_COMPLETE; 513c405 < // }}} --- > // }}} 515,516c407,416 < // {{{ State 17: Exponent Value in Scientific Notation < case 17: --- > // {{{ State 8: Complete real number > case SQL_PARSER_STATE_REAL_COMPLETE: > $this->unget(); > $this->setToken(); > $this->tokText = floatval($this->tokText); > return 'real_val'; > // }}} > > // {{{ State 10: Incomplete comparison operator > case SQL_PARSER_STATE_COMPARISON_INCOMPLETE: 518,520c418,420 < if (ctype_digit(ord($c))) { < $state = 17; < break; --- > if ($this->isCompop($c)) { > $state = SQL_PARSER_STATE_COMPARISON_INCOMPLETE; > break; 522c422 < $state = 8; // At least 1 exponent digit was required --- > $state = SQL_PARSER_STATE_COMPARISON_COMPLETE; 524c424 < // }}} --- > // }}} 526,530c426,435 < // {{{ State 18 : Incomplete System Variable < case 18: < $c = $this->get(); < if (ctype_alnum(ord($c)) || $c == '_') { < $state = 18; --- > // {{{ State 11: Complete comparison operator > 
case SQL_PARSER_STATE_COMPARISON_COMPLETE: > $this->unget(); > $this->tokText = substr($this->string, $this->tokStart, > $this->tokLen); > if($this->tokText) { > $this->setToken($this->tokText); > return $this->tokText; > } > $state = SQL_PARSER_STATE_UNKNOWN; 532,565c437,567 < } < $state = 19; < break; < // }}} < < // {{{ State 19: Complete Sys Var < case 19: < $this->unget(); < $this->tokText = substr($this->string, $this->tokStart, < $this->tokLen); < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return 'sys_var'; < // }}} < < // {{{ State 999 : Unknown token. Revert to single char < case 999: < $this->revert(); < $this->tokText = $this->get(); < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return $this->tokText; < // }}} < < // {{{ State 1000 : End Of Input < case 1000: < $this->tokText = '*end of input*'; < $this->skipText = substr($this->string, $this->tokAbsStart, < $this->tokStart-$this->tokAbsStart); < $this->tokStart = $this->tokPtr; < return null; < // }}} --- > // }}} > > // {{{ State 12: Incomplete text string > case SQL_PARSER_STATE_STR_INCOMPLETE: > $bail = false; > while (!$bail) { > switch ($this->get()) { > case '': > $this->tokText = null; > $bail = true; > break; > case "\\": > if (!$this->get()) { > $this->tokText = null; > $bail = true; > } > //$bail = true; > break; > case $quote: > $this->tokText = stripslashes(substr($this->string, > ($this->tokStart+1), ($this->tokLen-2))); > $bail = true; > break; > } > } > if (!is_null($this->tokText)) { > $state = SQL_PARSER_STATE_STR_COMPLETE; > break; > } > $state = SQL_PARSER_STATE_UNKNOWN; > break; > // }}} > > // {{{ State 13: Complete text string > case SQL_PARSER_STATE_STR_COMPLETE: > $this->setToken($this->tokText); > return 'text_val'; > break; > // }}} > > // {{{ State 14: Comment > case SQL_PARSER_STATE_COMMENT: > 
$c = $this->skip(); > if ($c == "\n" || $c == "\r" || $c == "") { > // Handle MAC/Unix/Windows line endings. > if ($c == "\r") { > $c = $this->skip(); > // If not DOS newline > if ($c != "\n") { > $this->unget(); > } > } > > if ($c != "") { > ++$this->lineNo; > $this->lineBegin = $this->tokPtr; > } > > // We need to skip all the text. > $this->tokStart = $this->tokPtr; > $state = SQL_PARSER_STATE_START; > } else { > $state = SQL_PARSER_STATE_COMMENT; > } > break; > // }}} > > // {{{ State 15: Exponent Sign in Scientific Notation > case SQL_PARSER_STATE_SCI_EXPONENT_SIGN: > $c = $this->get(); > if($c == '-' || $c == '+') { > $state = SQL_PARSER_STATE_SCI_EXPONENT_FIRST_DIGIT; > break; > } > $state = SQL_PARSER_STATE_UNKNOWN; > break; > // }}} > > // {{{ state 16: Exponent Value-first digit in Scientific Notation > case SQL_PARSER_STATE_SCI_EXPONENT_FIRST_DIGIT: > $c = $this->get(); > if (ctype_digit(ord($c))) { > $state = SQL_PARSER_STATE_SCI_EXPONENT_VALUE; > break; > } > $state = SQL_PARSER_STATE_UNKNOWN; // if no digit, then token is unknown > break; > // }}} > > // {{{ State 17: Exponent Value in Scientific Notation > case SQL_PARSER_STATE_SCI_EXPONENT_VALUE: > $c = $this->get(); > if (ctype_digit(ord($c))) { > $state = SQL_PARSER_STATE_SCI_EXPONENT_VALUE; > break; > } > $state = SQL_PARSER_STATE_REAL_COMPLETE; // At least 1 exponent digit was required > break; > // }}} > > // {{{ State 18 : Incomplete System Variable > case SQL_PARSER_STATE_SYSVAR_INCOMPLETE: > $c = $this->get(); > if (ctype_alnum(ord($c)) || $c == '_') { > $state = SQL_PARSER_STATE_SYSVAR_INCOMPLETE; > break; > } > $state = SQL_PARSER_STATE_SYSVAR_COMPLETE; > break; > // }}} > > // {{{ State 19: Complete Sys Var > case SQL_PARSER_STATE_SYSVAR_COMPLETE: > $this->unget(); > $this->setToken(); > return 'sys_var'; > // }}} > > // {{{ State 999 : Unknown token. 
Revert to single char > case SQL_PARSER_STATE_UNKNOWN: > $this->revert(); > $this->setToken($this->get()); > return $this->tokText; > // }}} > > // {{{ State 1000 : End Of Input > case SQL_PARSER_STATE_EOF: > $this->setToken('*end of input*'); > return null; > // }}} > } 568,569c570 < } < // }}} --- > // }}}
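
A note on the argument loop in parseFunction() from the Parser.php hunk: as far as can be told from reading it, the loop calls getTok() before looking at the current token, so for a call such as FOO(1) the first argument is skipped and the closing ')' is what gets stored, and arg_count only counts commas. That does not matter for NOW(), which has no arguments. Below is a minimal standalone sketch of how the loop could inspect each token before advancing. It is not the SQL_Parser API: the function name parse_function_call() and the pre-tokenized array layout (array('type' => ..., 'text' => ...), with punctuation using the character itself as its type) are assumptions made for illustration, and it only handles single-token arguments, not nested calls or expressions.

```php
<?php
// Hypothetical helper, not part of SQL_Parser. Parses NAME ( arg , arg ... )
// from a pre-tokenized list, starting at index $pos (the function name).
// Returns array('funcname' => ..., 'args' => array(...)) on success,
// or array('error' => message) on a syntax problem.
function parse_function_call(array $tokens, $pos = 0)
{
    $tree = array('funcname' => $tokens[$pos]['text'], 'args' => array());

    // The token right after the name must be an opening parenthesis.
    $pos++;
    if (!isset($tokens[$pos]) || $tokens[$pos]['type'] !== '(') {
        return array('error' => 'Expected (');
    }

    // Empty argument list, e.g. NOW().
    $pos++;
    if (isset($tokens[$pos]) && $tokens[$pos]['type'] === ')') {
        return $tree;
    }

    while (true) {
        if (!isset($tokens[$pos])) {
            return array('error' => 'Unexpected end of input');
        }
        $tok = $tokens[$pos];
        if ($tok['type'] === ',' || $tok['type'] === ')') {
            return array('error' => 'Expected argument');
        }

        // Record the current token first, then advance.
        $tree['args'][] = array('type' => $tok['type'], 'value' => $tok['text']);
        $pos++;

        // After an argument, only ',' (more arguments) or ')' (done) is valid.
        if (!isset($tokens[$pos])) {
            return array('error' => 'Expected , or )');
        }
        if ($tokens[$pos]['type'] === ')') {
            return $tree;
        }
        if ($tokens[$pos]['type'] !== ',') {
            return array('error' => 'Expected , or )');
        }
        $pos++;
    }
}

// Usage: the NOW() call from the original report, already tokenized.
$tokens = array(
    array('type' => 'ident', 'text' => 'NOW'),
    array('type' => '(',     'text' => '('),
    array('type' => ')',     'text' => ')'),
);
print_r(parse_function_call($tokens));
```

Storing the arguments in a plain list means the argument count is just count($tree['args']), so it cannot drift from the number of commas seen.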
 [2007-07-13 14:30 UTC] cybot (Sebastian Mendel)
This bug has been fixed in CVS. If this was a documentation problem, the fix will appear on pear.php.net by the end of next Sunday (CET). If this was a problem with the pear.php.net website, the change should be live shortly. Otherwise, the fix will appear in the package's next release. Thank you for the report and for helping us make PEAR better.