Diffstat (limited to 'doc/ficl_parse.html')
| -rw-r--r-- | doc/ficl_parse.html | 197 |
1 files changed, 197 insertions, 0 deletions
diff --git a/doc/ficl_parse.html b/doc/ficl_parse.html new file mode 100644 index 0000000000000..a90607778f0e8 --- /dev/null +++ b/doc/ficl_parse.html @@ -0,0 +1,197 @@ +<!doctype html public "-//w3c//dtd html 4.0 transitional//en"> +<html> +<head> + <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> + <meta name="Author" content="john sadler"> + <meta name="Description" content="the coolest embedded scripting language ever"> + <title>Ficl Parse Steps</title> +</head> +<body> +<link REL="SHORTCUT ICON" href="ficl.ico"> +<table BORDER=0 CELLSPACING=3 COLS=1 WIDTH="675" ><tr><td> +<h1>Ficl Parse Steps</h1> +<script language="javascript" src="ficlheader.js"></script> + +<h2>Overview</h2> +<p> +Ficl 2.05 and later includes an extensible parser chain. Ficl feeds every incoming token +(chunk of text with no internal whitespace) to each step in the parse chain in turn. The +first parse step that successfully matches the token applies semantics to it and returns +a TRUE flag, ending the sequence. If all parse steps fire without a match, ficl prints +an error message and resets the virtual machine. Parse steps can be written in precompiled +code, or in ficl itself, and can be appended to the chain at run-time if you like. +</p> +<p> +More detail: +</p> +<ul> +<li> +If compiling and local variable support is enabled, attempt to find the token in the local +variable dictionary. If found, execute the token's compilation semantics and return +</li> +<li> +Attempt to find the token in the system dictionary. If found, execute the token's semantics +(may be different when compiling than when interpreting) and return +</li> +<li> +If prefix support is enabled (Compile-time constant FICL_WANT_PREFIX in sysdep.h is non-zero), +attempt to match the beginning of the token to the list of known prefixes. If there's a match, +execute the associated prefix method. +</li> +<li> +Attempt to convert the token to a number in the present <code>BASE</code>. 
If successful, push the +value onto the stack if interpreting, compile it if compiling. Return +</li> +<li> +All previous parse steps failed to recognize the token. Print "&lt;token&gt; not found" and abort +</li> +</ul> +You can add steps to the parse chain, and you can add prefixes. +<h2>Adding Parse Steps</h2> +You can add a parse step in two ways. The first is to write a ficl word that +has the correct stack signature for a parse step: +<pre> +my-parse-step ( c-addr u -- ??? flag ) +</pre> +Where <code>c-addr u</code> are the address and length of the incoming token, +and <code>flag</code> is <code>true</code> if the parse step recognizes the token +and <code>false</code> otherwise. +<br> +Install the parse step using <code>add-parse-step</code>. +A trivial example: +<pre> +: ?silly ( c-addr u -- flag ) + ." Oh no! Not another " type cr true ; +' ?silly add-parse-step +parse-order +</pre> +<p> +The other way to add a parse step is by writing it in C, and inserting it into the +parse chain with: +</p> +<pre> +void ficlAddPrecompiledParseStep(FICL_SYSTEM *pSys, char *name, FICL_PARSE_STEP pStep); +</pre> +Where <code>name</code> is the display name of the parse step in the parse chain (as revealed +by <code>parse-order</code>). Parameter pStep is a pointer to the code for the parse step itself, +and must match the following declaration: +<pre> +typedef int (*FICL_PARSE_STEP)(FICL_VM *pVM, STRINGINFO si); +</pre> +<p> +Upon entry to the parse step, <code>si</code> points to the incoming token. The parse step +must return <code>FICL_TRUE</code> if it succeeds in matching the token, and +<code>FICL_FALSE</code> otherwise. If it succeeds in matching a token, the parse step +applies semantics to it before returning. See <code>ficlParseNumber()</code> in words.c for +an example. +</p> + +<h2>Adding Prefixes</h2> +<p> +What's a prefix, anyway? A prefix (contributed by Larry Hastings) is a token that's +recognized as the beginning of another token. 
Its presence modifies the semantics of +the rest of the token. An example is <code>0x</code>, which causes digits following +it to be converted to hex regardless of the current value of <code>BASE</code>. +</p><p> +Caveat: Prefixes are matched in sequence, so the more of them there are, +the slower the interpreter gets. On the other hand, because the prefix parse step occurs +immediately after the dictionary lookup step, if you have a prefix for a particular purpose, +using it may save time since it stops the parse process. +</p><p> +Each prefix is a ficl word stored in a special wordlist called <code>&lt;prefixes&gt;</code>. When the +prefix parse step (<code>?prefix</code> AKA ficlParsePrefix()) fires, it searches each word +in <code>&lt;prefixes&gt;</code> in turn, comparing it with the initial characters of the incoming +token. If a prefix matches, the parse step returns the remainder of the token to the input stream +and executes the code associated with the prefix. This code can be anything you like, but it would +typically do something with the remainder of the token. If the prefix code does not consume the +rest of the token, it will go through the parse process again (which may be what you want). +</p><p> +Prefixes are defined in prefix.c and in softwords/prefix.fr. The easiest way to add a new prefix is +to insert it into prefix.fr and rebuild the system. You can also add prefixes interactively +by bracketing prefix definitions as follows (see prefix.fr): +</p> +<pre> +start-prefixes ( defined in prefix.fr ) +\ make dot-paren a prefix (create an alias for it in the prefixes list) +: .( .( ; +: 0b 2 __tempbase ; immediate +end-prefixes +</pre> +<p> +The precompiled word <code>__tempbase</code> is a helper for prefixes that specify a +temporary value of <code>BASE</code>. +</p><p> +Constant <code>FICL_EXTENDED_PREFIX</code> controls the inclusion of a bunch of additional +prefix definitions. 
This is turned off in the default build since several of these prefixes +alter standard behavior, but you might like them. +</p> + +<h2>Notes</h2> +<p> +Prefixes and parser extensions are non-standard, although with the exception of prefix support, +ficl's default parse order follows the standard. Inserting parse steps in some other order +will almost certainly break standard behavior. +</p> +<p> +The number of parse steps that can be added to the system is limited by the value of +<code>FICL_MAX_PARSE_STEPS</code> (defined in sysdep.h unless you define it first), which defaults +to 8. More parse steps means slower average interpret and compile performance, +so be sparing. Same applies to the number of prefixes defined for the system, since each one +has to be matched in turn before it can be proven that no prefix matches. On the other hand, +if prefixes are defined, use them when possible: since they are matched early in the parse order, +a prefix match short circuits the parse process, saving time relative to +(for example) using a number builder parse step at the end of the parse chain. +</p> +<p> +Compile time constant <code>FICL_EXTENDED_PREFIX</code> enables several more prefix +definitions in prefix.c and prefix.fr. Please note that this will slow average compile and +interpret speed in most cases. +</p> +<h2>Parser Glossary</h2> +<dl> +<dt><b><code>parse-order ( -- )</code></b></dt> +<dd> +Prints the list of parse steps in the order in which they are evaluated. +Each step is the name of a ficl word with the following signature: +<pre> +parse-step ( c-addr u -- ??? flag ) +</pre> +A parse step consumes a counted string (the incoming token) from the stack, +and exits leaving a flag on top of the stack (it may also leave other parameters as side effects). +The flag is true if the parse step succeeded at recognizing the token, false otherwise. +</dd> +<dt><b><code>add-parse-step ( xt -- )</code></b></dt> +<dd> +Appends a parse step to the parse chain. 
XT is the address (execution token) of a ficl +word to use as the parse step. The word must have the following signature: +<pre> +parse-step ( c-addr u -- ??? flag ) +</pre> +A parse step consumes a counted string (the incoming token) from the stack, +and exits leaving a flag on top of the stack (it may also leave other parameters as side effects). +The flag is true if the parse step succeeded at recognizing the token, false otherwise. +</dd> +<dt><b><code>show-prefixes ( -- )</code></b></dt> +<dd> +Defined in <code>softwords/prefix.fr</code>. +Prints the list of all prefixes. Each prefix is a ficl word that is executed if its name +is found at the beginning of a token. See <code>softwords/prefix.fr</code> and <code>prefix.c</code> for examples. +</dd> +<dt><b><code>start-prefixes ( -- )</code></b></dt> +<dd> +Defined in <code>softwords/prefix.fr</code>. +Declares the beginning of one or more prefix definitions (it just switches the compile wordlist +to <code>&lt;prefixes&gt;</code>). +</dd> +<dt><b><code>end-prefixes ( -- )</code></b></dt> +<dd> +Defined in <code>softwords/prefix.fr</code>. +Restores the compilation wordlist that was in effect before the last invocation of +<code>start-prefixes</code>. Note: the prior wordlist ID is stored in a Ficl variable, so +attempts to nest <code>start-prefixes end-prefixes</code> blocks will result in mildly silly +side effects. +</dd> +</dl> +</td></tr></table> +</body> +</html>
\ No newline at end of file
