Difference between revisions of "Plex and Pyacc"

From Free Pascal wiki
Revision as of 01:09, 9 March 2014

Free Pascal comes with Pascal counterparts of the classic Unix tools Lex and Yacc. They are called Plex and Pyacc, and they generate lexical analyzers and parsers (the front ends of compilers and interpreters) in Pascal instead of C.

Library contents

TP Lex and Yacc can be found here: http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/utils/tply/

Documentation

Download the manual here: http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/utils/tply/tply.doc?revision=1

Simple example application

This is a very simple calculator, which shows how to use plex and pyacc. The files in this simple project are:

  • build.sh
  • lexer.l
  • parser.y

WARNING: The code below compiles but does not work, because it passes tokens to the parser incorrectly. See "How to do it the right way" below for the explanation and a corrected version.

build.sh

plex lexer.l
pyacc -d parser.y
mv parser.pas calculadora.pas
fpc calculadora.pas

lexer.l

{ Calculator lexical analyzer for the PCS course

   Student: Felipe Monteiro de Carvalho                         }

%{
%}
%%

[0-9]+[dD]?          begin yylval.yyInteger := StrToInt(yytext); yyruleno := NUMBER; end;

[0-9a-fA-F]+[hH]     begin yylval.yyInteger := StrToInt(yytext); yyruleno := NUMBER; end;

[ \t]                begin end; { ignore whitespace }

[-]                  begin yyruleno := MENOS; end;

[+*/()]              begin yyruleno := Integer(yytext[0]); end;

[=][hH]              begin yyruleno := IGUALH; end;

[=][dD]?             begin yyruleno := IGUALD; end;

\n                   begin yyruleno := Integer(yytext[0]); end;

.                    begin yyerror('Caracter inexperado'); end;

%%
function meuyywrap (): Integer;
begin
  Result := 1;
end;

parser.y

/* Parser

   Student: Felipe Monteiro de Carvalho                         */
%{
program calculadora;

{$mode delphi}

uses SysUtils, yacclib, lexlib;

%}

%start entrada
%token <Integer> NUMBER IGUALD IGUALH MENOS
%type <Integer> expressao termo fator

%%

entrada
    : /* empty line */      {  }
    | entrada linha         {  }
    ;
linha
    : expressao IGUALD '\n' { WriteLn(Format('Resultado: %d', [$1])); }
    | expressao IGUALH '\n' { WriteLn(Format('Resultado: %H', [$1])); }
    ;
expressao
    : expressao '+' termo   { $$ := $1 + $3; }
    | expressao MENOS termo { $$ := $1 - $3; }
    | termo                 { $$ := $1; }
    ;
termo
    : termo '*' fator       { $$ := $1 * $3; }
    | termo '/' fator       { if ($3 = 0) then
                                 yyerror('Divisao por zero!')
                              else
                                 $$ := $1 div $3; }
    | fator                 { $$ := $1; }
    ;
fator
    : NUMBER                { $$ := $1; }
    | MENOS NUMBER          { $$ := -1 * $2; }
    | '(' expressao ')'     { $$ := $2; }
    ;

%%

procedure meuyyerror(s: PChar);  // Called by yyparse on error
begin
  WriteLn(Format('Erro: %s', [s]));
end;

{$include lexer.pas}

begin
  yywrap := @meuyywrap;
//  yyerror := @meuyyerror;
  yyparse ();
end.

How to use it

Type for example: 5+3=

And it will answer with: 8

You can also try 3*8=H to request the answer in hexadecimal instead of decimal.

How to do it the right way

Sorry that I have to do this, but this example is wrong!

First of all, I recommend that you read the manual and try to understand it.
Secondly, take a look at the examples provided with the original tply package by Albert Graef, distributed at: http://www.musikwissenschaft.uni-mainz.de/~ag/tply
There you will find a more sophisticated calculator example named 'expr'. If you look at it, you will see that it is a basic example of a Lex/Yacc-generated parser, and that it works.

Now I will explain why you can NOT do things that way!

We start with Lex:

  1. Lex returns a token to the parser, and a token is always an integer value. Named tokens are defined as constant values with the %token command. Tokens can only be returned to Yacc through two commands, return and returnc.
    1. return: the normal way to return a token (an integer).
    2. returnc: returns a single character; it is needed for the literals in Yacc rules.
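
To make the "token = integer" point concrete, here is a minimal standalone sketch. The numeric values of NUMBER and MENOS and the helper LiteralToken are assumptions for illustration only (Yacc-style generators conventionally number named tokens from 257 upward; the real values come from the generated parser):

```pascal
program TokenDemo;
{$mode delphi}

const
  // Assumed values for illustration: named tokens are conventionally
  // numbered above all 8-bit character codes so the two cannot collide.
  NUMBER = 257;
  MENOS  = 260;

// Hypothetical stand-in for what returnc hands to the parser:
// a literal token is just the ordinal value of the character.
function LiteralToken(c: Char): Integer;
begin
  Result := Ord(c);
end;

begin
  WriteLn('NUMBER token : ', NUMBER);
  WriteLn('MENOS token  : ', MENOS);
  WriteLn('''+'' token    : ', LiteralToken('+'));
  WriteLn('newline token: ', LiteralToken(#10));
end.
```

This is why a rule such as expressao '+' termo only matches when the lexer hands over the character code of +, and not some other number.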

Now we correct the return of a token from example above:

[0-9]+[dD]?          begin yylval.yyInteger := StrToInt(yytext); yyruleno := NUMBER; end; <- nothing, because yyruleno ends up in nirvana
[0-9]+[dD]?          begin yylval.yyInteger := StrToInt(yytext); return(NUMBER); end;     <- good

And the literal return:

[+*/()]              begin yyruleno := Integer(yytext[0]); end; <- ouch !
[+*/()]              returnc(yytext[1]);                        <- returns single characters we want

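Can you see the difference?
Why on earth would one use index 0? Pascal strings are indexed from 1, so the matched character is yytext[1]. There is also no need for begin and end around a single statement.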
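
The index problem can be seen in a minimal standalone sketch. It assumes that yytext behaves like a classic Turbo Pascal short string (here the ShortString type), where index 0 holds the length byte; FirstChar and LengthByte are hypothetical helper names:

```pascal
program IndexDemo;
{$mode delphi}

// Returns the first character of a short string (index 1).
function FirstChar(const s: ShortString): Char;
begin
  Result := s[1];
end;

// Returns what index 0 holds in a ShortString: the length byte.
function LengthByte(const s: ShortString): Integer;
begin
  Result := Ord(s[0]);
end;

begin
  WriteLn('s[1] = ', FirstChar('+'), ' (code ', Ord(FirstChar('+')), ')');
  WriteLn('s[0] as number = ', LengthByte('+'));
end.
```

Under that assumption, a one-character match such as + yields the length 1 at index 0, not the character code 43 that the parser expects for the literal '+'.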
And last but not least the strange function:

function meuyywrap (): Integer; <- ??? what is this for ??? this is not an overload of yywrap!
begin
  Result := 1;
end;

The yywrap function is for loading multiple files. Do we need this if we read from stdin and write to stdout? No, we don't!

The working Lex file:

{ Calculator lexical analyzer for the PCS course
 
   Student: Felipe Monteiro de Carvalho                         }
 
%{
%}
%%
 
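[0-9]+               begin yylval.yyInteger := StrToInt(yytext); return(NUMBER); end;
 
[0-9]+[dD]           begin yylval.yyInteger := StrToInt(Copy(yytext, 1, Length(yytext) - 1)); return(NUMBER); end; { drop the d suffix, which StrToInt rejects }
 
[0-9a-fA-F]+[hH]     begin yylval.yyInteger := StrToInt('$' + Copy(yytext, 1, Length(yytext) - 1)); return(NUMBER); end; { drop the h suffix; $ marks hexadecimal }
 
[ \t]                ; { ignore whitespace }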
 
[-]                  return(MENOS);
 
[+*/()]              returnc(yytext[1]);
 
[=][hH]              return(IGUALH);
 
[=][dD]?             return(IGUALD);
 
\n                   returnc(yytext[1]);
 
.                    yyerror('Caracter inexperado');
 
%%
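
One detail about the number rules: StrToInt does not accept a trailing d or h, so the suffix has to be removed before the conversion, and hexadecimal digits need a leading $. A minimal standalone sketch of that conversion, using a hypothetical helper ParseNumber that is not part of tply:

```pascal
program ParseNumberDemo;
{$mode delphi}

uses SysUtils;

// Hypothetical helper: converts the calculator's number spellings
// ('42', '42d', '2Ah') to an Integer.
function ParseNumber(const s: string): Integer;
var
  body: string;
begin
  if s = '' then
    raise EConvertError.Create('empty number');
  body := Copy(s, 1, Length(s) - 1);           // text without its last character
  case s[Length(s)] of
    'h', 'H': Result := StrToInt('$' + body);  // '$' marks hexadecimal
    'd', 'D': Result := StrToInt(body);        // explicit decimal suffix
  else
    Result := StrToInt(s);                     // plain decimal, no suffix
  end;
end;

begin
  WriteLn(ParseNumber('42'));
  WriteLn(ParseNumber('42d'));
  WriteLn(ParseNumber('2Ah'));
end.
```

All three calls print 42, since 2A in hexadecimal is 42 in decimal.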


Now we go on with Yacc:
The main block:
If we want to read a file, we have to assign and open it before we can parse it. If we want to read more than one file, we have to use yywrap.
A parser like this one only needs standard I/O, so we have nothing to do: Lex assigns and opens the 'files' known as stdin and stdout.
With this, we can forget everything but the yyparse() call.
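
For the file case, here is a minimal sketch of "assign and open before parsing" in plain Pascal. The function CountLines and its variable src are hypothetical; in a generated parser the file to assign would be the lexer's input variable (named yyinput in the tply manual, to the best of my knowledge, so check your lexlib), and yyparse would take the place of the reading loop:

```pascal
program OpenInputDemo;
{$mode delphi}

// Hypothetical helper: opens FileName (or standard input when the name
// is empty) and counts its lines, standing in for a call to yyparse.
function CountLines(const FileName: string): Integer;
var
  src: Text;
  line: string;
begin
  Assign(src, FileName);  // an empty name makes Reset open standard input
  Reset(src);
  Result := 0;
  while not Eof(src) do
  begin
    ReadLn(src, line);
    Inc(Result);
  end;
  Close(src);
end;

begin
  if ParamCount >= 1 then
    WriteLn('Lines read: ', CountLines(ParamStr(1)))
  else
    WriteLn('Lines read: ', CountLines(''));
end.
```

Run without arguments it reads standard input, which matches the default behaviour the calculator relies on.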

Additional functionality:
As a feature, we want to write user-defined error messages. For this, we simply overload the yyerror procedure by declaring our own version in the program.

The working Yacc file:

/* Parser
 
   Student: Felipe Monteiro de Carvalho                         */
%{
program calculadora;
 
{$mode delphi}
 
uses SysUtils, yacclib, lexlib;
 
procedure yyerror(s: string);  // Called by yyparse on error <- overload !
begin
  WriteLn(Format('Erro: %s', [s]));
end;
 
%}
 
%start entrada
%token <Integer> NUMBER IGUALD IGUALH MENOS
%type <Integer> expressao termo fator
 
%%
 
entrada
    : /* empty line */      {  }
    | entrada linha         {  }
    | entrada linha '\n'    { writeln('DONE.'); yyaccept; } /* <- nice end with RETURN */
    ;
linha
    : expressao IGUALD '\n' { WriteLn(Format('Resultado: %d', [$1])); }
    | expressao IGUALH '\n' { WriteLn(Format('Resultado: %X', [$1])); } /* <- hex = X */
    ;
expressao
    : expressao '+' termo   { $$ := $1 + $3; }
    | expressao MENOS termo { $$ := $1 - $3; }
    | termo                 { $$ := $1; }
    ;
termo
    : termo '*' fator       { $$ := $1 * $3; }
    | termo '/' fator       { if ($3 = 0) then
                                 yyerror('Divisao por zero!')
                              else
                                 $$ := $1 div $3; }
    | fator                 { $$ := $1; }
    ;
fator
    : NUMBER                { $$ := $1; }
    | MENOS NUMBER          { $$ := -1 * $2; }
    | '(' expressao ')'     { $$ := $2; }
    ;
 
%%
 

{$include lexer.pas}
 
begin
  yyparse ();
end.
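
The %d and %X format specifiers used in the linha rules can be tried on their own. A minimal sketch, with hypothetical helper names DecResult and HexResult:

```pascal
program FormatDemo;
{$mode delphi}

uses SysUtils;

// Formats a result the way the IGUALD rule does (decimal).
function DecResult(v: Integer): string;
begin
  Result := Format('Resultado: %d', [v]);
end;

// Formats a result the way the IGUALH rule does (hexadecimal).
function HexResult(v: Integer): string;
begin
  Result := Format('Resultado: %X', [v]);
end;

begin
  WriteLn(DecResult(24));   // Resultado: 24
  WriteLn(HexResult(24));   // Resultado: 18
end.
```

For the input 3*8=H the parser computes 24, which %X prints as 18.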

WOW. The calculator works as expected!

GVS 00:09, 9 March 2014 (CET)