docs/documentation.html

---
title: 'Documentation'
permalink: '/documentation.html'
layout: main-layout
stylesheets: ["css/documentation.css"]
---

<script src="js/examples.js?2022-06-03T00:53:11Z"></script>

    <h1>Documentation</h1>

<h2 id="table-of-contents">Table of Contents</h2>

<ul>
  <li>
    <a href="#installation">Installation</a>
    <ul>
      <li><a href="#installation-node-js">Node.js</a></li>
      <li><a href="#installation-browser">Browser</a></li>
    </ul>
  </li>
  <li>
    <a href="#generating-a-parser">Generating a Parser</a>
    <ul>
      <li><a href="#generating-a-parser-command-line">Command Line</a></li>
      <li>
        <a href="#generating-a-parser-javascript-api">JavaScript API</a>
        <ul>
          <li><a href="#error-reporting">Error Reporting</a></li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="#using-the-parser">Using the Parser</a></li>
  <li>
    <a href="#grammar-syntax-and-semantics">Grammar Syntax and Semantics</a>
    <ul>
      <li><a href="#grammar-syntax-and-semantics-parsing-expression-types">Parsing Expression Types</a></li>
      <li><a href="#action-execution-environment">Action Execution Environment</a></li>
      <li><a href="#parsing-lists">Parsing Lists</a></li>
    </ul>
  </li>
  <li><a href="#identifiers">Peggy Identifiers</a></li>
  <li><a href="#error-messages">Error Messages</a></li>
  <li><a href="#locations">Locations</a></li>
  <li>
    <a href="#plugins-api">Plugins API</a>
    <ul>
      <li><a href="#session-api">Session API</a></li>
    </ul>
  </li>
  <li><a href="#compatibility">Compatibility</a></li>
</ul>

<h2 id="installation">Installation</h2>

<h3 id="installation-node-js">Node.js</h3>

<p>To use the <code>peggy</code> command, install Peggy globally:</p>

<pre><code class="language-console">$ npm install -g peggy</code></pre>

<p>To use the JavaScript API, install Peggy locally:</p>

<pre><code class="language-console">$ npm install peggy</code></pre>

<p>If you need both the <code>peggy</code> command and the JavaScript API,
  install Peggy both ways.</p>

<h3 id="installation-browser">Browser</h3>

<p>The easiest way to use Peggy from the browser is to pull the latest version
  from a CDN. Either of these should work:</p>

<pre><code class="language-html">&lt;script src="https://unpkg.com/peggy">&lt;/script></code></pre>

<pre><code class="language-html">&lt;script src="https://cdn.jsdelivr.net/npm/peggy">&lt;/script></code></pre>

<p>Both of those CDNs support pinning a version number rather than always
  taking the latest. Not only is that good practice, it will save several
  redirects, improving performance. See their documentation for more
  information:</p>

<ul>
  <li><a href="https://unpkg.com/">unpkg</a></li>
  <li><a href="https://www.jsdelivr.com/">jsDelivr</a></li>
</ul>

<p>When your document is done loading, there will be a global <code>peggy</code> object.</p>

<h2 id="generating-a-parser">Generating a Parser</h2>

<p>Peggy generates parser from a grammar that describes expected input and can
  specify what the parser returns (using semantic actions on matched parts of the
  input). Generated parser itself is a JavaScript object with a simple API.</p>

<h3 id="generating-a-parser-command-line">Command Line</h3>

<p>To generate a parser from your grammar, use the <code>peggy</code>
  command:</p>

<pre><code class="language-console">$ peggy arithmetics.pegjs</code></pre>

<p>This writes parser source code into a file with the same name as the grammar
  file but with “.js” extension. You can also specify the output file
  explicitly:</p>

<pre><code class="language-console">$ peggy -o arithmetics-parser.js arithmetics.pegjs</code></pre>

<p>If you omit both input and output file, standard input and standard output
  are used.</p>

<p>By default, the generated parser is in the Node.js module format. You can
  override this using the <code>--format</code> option.</p>

<p>You can tweak the generated parser with several options:</p>

<dl>
  <dt><code>--allowed-start-rules &lt;rules&gt;</code></dt>
  <dd>Comma-separated list of rules the parser will be allowed to start parsing
    from.  Use '*' if you want any rule to be allowed as a start rule.
    (default: only the first rule in the grammar).</dd>

  <dt><code>--ast</code></dt>
  <dd>Outputting an internal AST representation of the grammar after
    all optimizations instead of the parser source code. Useful for plugin authors
    to see how their plugin changes the AST. This option cannot be mixed with the
    <code>-t/--test</code>, <code>-T/--test-file</code> and <code>-m/--source-map</code>
    options.
  </dd>

  <dt><code>--cache</code></dt>
  <dd>Makes the parser cache results, avoiding exponential parsing time in
    pathological cases but making the parser slower.</dd>

  <dt><code>-d</code>, <code>--dependency &lt;[name:]module&gt;</code></dt>
  <dd>Makes the parser require a specified dependency (can be specified
    multiple times). A variable name for the import/require/etc. may be given,
    followed by a colon. If no name is given, the module name will also be used
    for the variable name.</dd>

  <dt><code>-D</code>, <code>--dependencies &lt;json&gt;</code></dt>
  <dd>Dependencies, in JSON object format with variable:module pairs. (Can be
    specified multiple times).</dd>

  <dt><code>-e</code>, <code>--export-var &lt;variable&gt;</code></dt>
  <dd>Name of a global variable into which the parser object is assigned to when
    no module loader is detected.</dd>

  <dt><code>--extra-options &lt;options&gt;</code></dt>
  <dd>Additional options (in JSON format, as an object) to pass to
    <code>peg.generate</code>.
  </dd>

  <dt><code>-c</code>, <code>--extra-options-file &lt;file&gt;</code></dt>
  <dd>File with additional options (in JSON format, as an object) to pass to
    <code>peg.generate</code>.
  </dd>

  <dt><code>--format &lt;format&gt;</code></dt>
  <dd>Format of the generated parser: <code>amd</code>, <code>commonjs</code>,
    <code>globals</code>, <code>umd</code>, <code>es</code> (default: <code>commonjs</code>).
  </dd>

  <dt><code>-o</code>, <code>--output &lt;file&gt;</code></dt>
  <dd>File to send output to. Defaults to input file name with
    extension changed to <code>.js</code>, or stdout if no input file is given.</dd>

  <dt><code>--plugin</code></dt>
  <dd>Makes Peggy use a specified plugin (can be specified multiple
    times).</dd>

  <dt><code>-m</code>, <code>--source-map &lt;file&gt;</code></dt>
  <dd>Generate a source map. If name is not specified, the source map will be
    named "&lt;input_file&gt;.map" if input is a file and "source.map" if input
    is a standard input. If the special filename <code>inline</code> is given,
    the sourcemap will be embedded in the output file as a data URI. If the
    filename is prefixed with <code>hidden:</code>, no mapping URL will be
    included so that the mapping can be specified with an HTTP SourceMap:
    header. This option conflicts with the <code>-t/--test</code> and
    <code>-T/--test-file</code> options unless <code>-o/--output</code> is also
    specified
  </dd>

  <dt><code>-S</code>, <code>--start-rule &lt;rule&gt;</code></dt>
  <dd>When testing, use this rule as the start rule. Automatically added to
    the allowedStartRules.</dd>

  <dt><code>-t</code>, <code>--test &lt;text&gt;</code></dt>
  <dd>Test the parser with the given text, outputting the result of running
    the parser against this input.
    If the input to be tested is not parsed, the CLI will exit with code 2.</dd>

  <dt><code>-T</code>, <code>--test-file &lt;text&gt;</code></dt>
  <dd>Test the parser with the contents of the given file, outputting the
    result of running the parser against this input.
    If the input to be tested is not parsed, the CLI will exit with code 2.</dd>

  <dt><code>--trace</code></dt>
  <dd>Makes the parser trace its progress.</dd>

  <dt><code>-v</code>, <code>--version</code></dt>
  <dd>Output the version number.</dd>

  <dt><code>-h</code>, <code>--help</code></dt>
  <dd>Display help for the command.</dd>

</dl>

<p>If you specify options using <code>-c &lt;file&gt;</code> or
  <code>--extra-options-file &lt;file&gt;</code>, you will need to ensure you
  are using the correct types. In particular, you may specify "plugin" as a
  string, or "plugins" as an array of objects that have a <code>use</code>
  method. Always use the long (two-dash) form of the option, without the
  dashes, as the key. Options that contain internal dashes should be specified
  in camel case. You may also specify an "input" field instead of using the
  command line. For example:</p>

<pre><code class="language-javascript">// config.js or config.cjs
module.exports = {
  allowedStartRules = ["foo", "bar"],
  format: "umd",
  exportVar: "foo",
  input: "fooGrammar.peggy",
  plugins: [require("./plugin.js")],
  testFile: "myTestInput.foo",
  trace: true,
};
</code></pre>

<p>You can test generated parser immediately if you specify the
  <code>-t/--test</code> or <code>-T/--test-file</code>
  option. This option conflicts with the
  <code>--ast</code> option, and also conflicts with the
  <code>-m/--source-map</code> option unless <code>-o/--output</code> is also
  specified.</p>

<p>The CLI will exit with the code:</p>

<ul>
  <li><code>0</code>: if successful</li>
  <li><code>1</code>: if you supply incorrect or conflicting parameters</li>
  <li><code>2</code>: if you specified the
    <code>-t/--test</code> or <code>-T/--test-file</code> option and the specified
    input fails parsing with the specified grammar
  </li>
</ul>

<p>Examples:</p>

<pre><code class="language-console"># - write test results to stdout (42)
# - exit with the code 0
echo "foo = '1' { return 42 }" | peggy --test 1

# - write a parser error to stdout (Expected "1" but "2" found)
# - exit with the code 2
echo "foo = '1' { return 42 }" | peggy --test 2

# - write an error to stdout (Generation of the source map is useless if you don't
#   store a generated parser code, perhaps you forgot to add an `-o/--output` option?)
# - exit with the code 1
echo "foo = '1' { return 42 }" | peggy --source-map --test 1

# - write an error to stdout (Generation of the source map is useless if you don't
#   store a generated parser code, perhaps you forgot to add an `-o/--output` option?)
# - exit with the code 1
echo "foo = '1' { return 42 }" | peggy --source-map --test 2

# - write an output to `parser.js`,
# - write a source map to `parser.js.map`
# - write test results to stdout (42)
# - exit with the code 0
echo "foo = '1' { return 42 }" | peggy --output parser.js --source-map --test 1

# - write an output to `parser.js`,
# - write a source map to `parser.js.map`
# - write a parser error to stdout (Expected "1" but "2" found)
# - exit with the code 2
echo "foo = '1' { return 42 }" | peggy --output parser.js --source-map --test 2
</code></pre>

<h3 id="generating-a-parser-javascript-api">JavaScript API</h3>

<p>In Node.js, require the Peggy parser generator module:</p>

<pre><code class="language-javascript">const peggy = require("peggy");</code></pre>

<p>or:</p>

<pre><code class="language-javascript">import * as peggy from "peggy";</code></pre>

<p>For use in browsers, include the Peggy library in your web page or
  application using the <code>&lt;script&gt;</code> tag. If Peggy detects an
  <a href="https://requirejs.org/docs/whyamd.html">AMD</a> loader, it will define
  itself as a module, otherwise the API will be available in the
  <code>peg</code> global object.</p>

<p>To generate a parser, call the <code>peggy.generate</code> method and pass your
  grammar as a parameter:</p>

<pre><code class="language-javascript">const parser = peggy.generate("start = ('a' / 'b')+");</code></pre>

<p>The method will return generated parser object or its source code as a string
  (depending on the value of the <code>output</code> option — see below). It will
  throw an exception if the grammar is invalid. The exception will contain a
  <code>message</code> property with more details about the error.</p>

<p>You can tweak the generated parser by passing a second parameter with an
  options object to <code>peg.generate</code>. The following options are
  supported:</p>

<dl>
  <dt><code>allowedStartRules</code></dt>
  <dd>Rules the parser will be allowed to start parsing from (default: the
    first rule in the grammar). If any of the rules specified is "*", any of
    the rules in the grammar can be used as start rules.</dd>

  <dt><code>cache</code></dt>
  <dd>If <code>true</code>, makes the parser cache results, avoiding exponential
    parsing time in pathological cases but making the parser slower (default:
    <code>false</code>).
  </dd>

  <dt><code>dependencies</code></dt>
  <dd>Parser dependencies. The value is an object which maps variables used to
    access the dependencies in the parser to module IDs used to load them; valid
    only when <code>format</code> is set to <code>"amd"</code>,
    <code>"commonjs"</code>, <code>"es"</code>, or <code>"umd"</code>.
    Dependencies variables will be available in both the <em>global
    initializer</em> and the <em>per-parse initializer</em>. Unless the parser
    is to be generated in different formats, it is recommended to rather
    import dependencies from within the <em>global initializer</em> (default:
    <code>{}</code>).
  </dd>

  <dt><code>error</code></dt>
  <dd>Callback for errors. See <a href="#error-reporting">Error Reporting</a></dd>

  <dt><code>exportVar</code></dt>
  <dd>Name of a global variable into which the parser object is assigned to when
    no module loader is detected; valid only when <code>format</code> is set to
    <code>"globals"</code> or <code>"umd"</code> (default:
    <code>null</code>).
  </dd>

  <dt><code>format</code></dt>
  <dd>
    Format of the generated parser (<code>"amd"</code>, <code>"bare"</code>,
    <code>"commonjs"</code>, <code>"es"</code>, <code>"globals"</code>, or
    <code>"umd"</code>); valid only when <code>output</code> is set to
    <code>"source"</code>, <code>"source-and-map"</code>, or
    <code>"source-with-inline-map"</code>. (default: <code>"bare"</code>).
  </dd>

  <dt id="grammar-source"><code>grammarSource</code></dt>
  <dd>
    A string or object representing the "origin" of the input string being
    parsed. The <code>location()</code> API returns the supplied
    <code>grammarSource</code> in the <code>source</code> key. As an example, if
    you pass in <code>grammarSource</code> as "main.js", then errors with
    locations include <code>{ source: 'main.js', ... }</code>.
    <br><br>
    If you pass an object, the location API returns the entire object in the
    <code>source</code> key. If you format an error containing a location with
    <a href="#error-messages"><code>format()</code></a>, the formatter
    stringifies the object. If you pass an object, we recommend you add
    a <code>toString()</code> method to the object to improve error messages.
  </dd>

  <dt><code>info</code></dt>
  <dd>Callback for informational messages. See <a href="#error-reporting">Error Reporting</a></dd>

  <dt><code>output</code></dt>
  <dd>
    <p>A string, one of:</p>
    <ul>
      <li><code>"parser"</code> - return generated parser object.</li>
      <li><code>"source"</code> - return parser source code as a string.</li>
      <li><code>"source-and-map"</code> - return a
        <a href="https://github.com/mozilla/source-map#sourcenode"><code>SourceNode</code></a>
        object; you can get source code by calling <code>toString()</code>
        method or source code and mapping by calling
        <code>toStringWithSourceMap()</code> method, see the
        <a href="https://github.com/mozilla/source-map#sourcenode"><code>SourceNode</code></a>
        documentation.
      </li>
      <li><code>"source-with-inline-map"</code> - return the parser source along
        with an embedded source map as a <code>data:</code> URI. This option
        leads to a larger output string, but is the easiest to integrate with
        developer tooling.</li>
      <li><code>"ast"</code> - return the internal AST of the grammar as a JSON
        string. Useful for plugin authors to explore internals of Peggy and
        for automation.</li>
    </ul>
    <p>(default: <code>"parser"</code>)</p>
    <blockquote>
      <p><strong>Note</strong>: You should also set <code>grammarSource</code>
        to a not-empty string if you set this value to
        <code>"source-and-map"</code> or
        <code>"source-with-inline-map"</code>. The path should be relative to
        the location where the generated parser code will be stored. For
        example, if you are generating <code>lib/parser.js</code> from
        <code>src/parser.peggy</code>, then your options should be:
        <code>{ grammarSource: "../src/parser.peggy" }</code>
      </p>
    </blockquote>
  </dd>

  <dt><code>plugins</code></dt>
  <dd>Plugins to use. See the <a href="#plugins-api">Plugins API</a> section.</dd>

  <dt><code>trace</code></dt>
  <dd>Makes the parser trace its progress (default: <code>false</code>).</dd>

  <dt><code>warning</code></dt>
  <dd>Callback for warnings. See <a href="#error-reporting">Error Reporting</a></dd>
</dl>

<h4 id="error-reporting">Error Reporting</h4>

<p>While generating the parser, the compiler may throw a <code>GrammarError</code> which collects
  all of the issues that were seen.</p>

<p>There is also another way to collect problems as fast as they are reported —
  register one or more of these callbacks:</p>

<ul>
  <li><code>error(stage: Stage, message: string, location?: LocationRange, notes?: DiagnosticNote[]): void</code></li>
  <li><code>warning(stage: Stage, message: string, location?: LocationRange, notes?: DiagnosticNote[]): void</code></li>
  <li><code>info(stage: Stage, message: string, location?: LocationRange, notes?: DiagnosticNote[]): void</code></li>
</ul>

<p>All parameters are the same as the parameters of the <a href="#session-api">reporting API</a> except the first.
  The <code>stage</code> represent one of possible stages during which execution a diagnostic was generated.
  This is a string enumeration, that currently has one of three values:</p>

<ul>
  <li><code>check</code></li>
  <li><code>transform</code></li>
  <li><code>generate</code></li>
</ul>

<h2 id="using-the-parser">Using the Parser</h2>

<p>To use the generated parser, call its <code>parse</code>
  method and pass an input string as a parameter. The method will return a parse
  result (the exact value depends on the grammar used to generate the parser) or
  throw an exception if the input is invalid. The exception will contain
  <code>location</code>, <code>expected</code>, <code>found</code>,
  <code>message</code>, and <code>diagnostic</code> properties with more details about the error. The error
  will have a <code>format(SourceText[])</code> function, to which you pass an array
  of objects that look like <code>{ source: grammarSource, text: string }</code>; this
  will return a nicely-formatted error suitable for human consumption.</p>

<pre><code class="language-javascript">parser.parse("abba"); // returns ["a", "b", "b", "a"]

parser.parse("abcd"); // throws an exception</code></pre>

<p>You can tweak parser behavior by passing a second parameter with an options
object to the <code>parse</code> method. The following options are
supported:</p>

<dl>
  <dt><code>startRule</code></dt>
  <dd>Name of the rule to start parsing from.</dd>

  <dt><code>tracer</code></dt>
  <dd>
    Tracer to use. A tracer is an object containing a <code>trace()</code> function.
    <code>trace()</code> takes a single parameter which is an object containing
    "type" ("rule.enter", "rule.fail", "rule.match"), "rule" (the rule name as a
    string), "<a href="<a href=" #locations">location</a>", and, if the type is
    "rule.match", "result" (what the rule returned).
  </dd>

  <dt><code>...</code> (any others)</dt>
  <dd>Made available in the <code>options</code> variable</dd>
</dl>

<p>As you can see above, parsers can also support their own custom options.  For example:</p>

<pre><code class="language-javascript">const parser = peggy.generate(`
{
  // options are available in the per-parse initializer
  console.log(options.validWords);  // outputs "[ 'boo', 'baz', 'boop' ]"
}

validWord = @word:$[a-z]+ &{ return options.validWords.includes(word) }
`);

const result = parser.parse("boo", {
  validWords: [ "boo", "baz", "boop" ]
});

console.log(result);  // outputs "boo"
</code></pre>

<h2 id="grammar-syntax-and-semantics">Grammar Syntax and Semantics</h2>

<p>The grammar syntax is similar to JavaScript in that it is not line-oriented
  and ignores whitespace between tokens. You can also use JavaScript-style
  comments (<code>// ...</code> and <code>/* ... */</code>).</p>

<p>Let's look at example grammar that recognizes simple arithmetic expressions
  like <code>2*(3+4)</code>. A parser generated from this grammar computes their
  values.</p>

<pre><code class="language-peggy">start
  = additive

additive
  = left:multiplicative "+" right:additive { return left + right; }
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative { return left * right; }
  / primary

primary
  = integer
  / "(" additive:additive ")" { return additive; }

integer "simple number"
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }</code></pre>

<p>On the top level, the grammar consists of <em>rules</em> (in our example,
  there are five of them). Each rule has a <em>name</em> (e.g.
  <code>integer</code>) that identifies the rule, and a <em>parsing
  expression</em> (e.g. <code>digits:[0-9]+ { return parseInt(digits.join(""), 10); }</code>)
  that defines a pattern to match against the input text and
  possibly contains some JavaScript code that determines what happens when the
  pattern matches successfully. A rule can also contain <em>human-readable
  name</em> that is used in error messages (in our example, only the
  <code>integer</code> rule has a human-readable name). The parsing starts at the
  first rule, which is also called the <em>start rule</em>.</p>

<p>A rule name must be a Peggy <a href="#identifiers">identifier</a>. It is
  followed by an equality sign (“=”) and a parsing expression. If the rule has a
  human-readable name, it is written as a JavaScript string between the rule
  name and the equality sign. Rules need to be separated only by whitespace
  (their beginning is easily recognizable), but a semicolon (“;”) after the
  parsing expression is allowed.</p>

<p>The first rule can be preceded by a <em>global initializer</em> and/or a
  <em>per-parse initializer</em>, in that order. Both are pieces of JavaScript
  code in double curly braces (“{{'{{'}}” and “}}”) and single curly braces (“{” and
  “}”) respectively. All variables and functions defined in both
  <em>initializers</em> are accessible in rule actions and semantic predicates.
  Curly braces in both <em>initializers</em> code must be balanced.</p>

<p>The <em>global initializer</em> is executed once and only once, when the
  generated parser is loaded (through a <code>require</code> or an
  <code>import</code> statement for instance). It is the ideal location to
  require, to import, to declare constants, or to declare utility functions to be used in rule actions
  and semantic predicates.</p>

<p>The <em>per-parse initializer</em> is called before the generated parser
  starts parsing. The code inside the <em>per-parse initializer</em> can access
  the input string and the options passed to the parser using the
  <code>input</code> variable and the <code>options</code> variable respectively.
  It is the ideal location to create data structures that are unique to each
  parse or to modify the input before the parse.</p>

<p>Let's look at the example grammar from above using a <em>global
  initializer</em> and a <em>per-parse initializer</em>:</p>

<pre><code class="language-peggy">{{'{{'}}
  function makeInteger(o) {
    return parseInt(o.join(""), 10);
  }
}}

{
  if (options.multiplier) {
    input = `(${input})*(${options.multiplier})`;
  }
}

start
  = additive

additive
  = left:multiplicative "+" right:additive { return left + right; }
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative { return left * right; }
  / primary

primary
  = integer
  / "(" additive:additive ")" { return additive; }

integer "simple number"
  = digits:[0-9]+ { return makeInteger(digits); }</code></pre>

<p>The parsing expressions of the rules are used to match the input text to the
  grammar. There are various types of expressions &mdash; matching characters or
  character classes, indicating optional parts and repetition, etc. Expressions
  can also contain references to other rules. See
  <a href="#grammar-syntax-and-semantics-parsing-expression-types">detailed
  description below</a>.</p>

<p>If an expression successfully matches a part of the text when running the
  generated parser, it produces a <em>match result</em>, which is a JavaScript
  value. For example:</p>

<ul>
  <li>An expression matching a literal string produces a JavaScript string
    containing matched text.</li>

  <li>An expression matching repeated occurrence of some subexpression produces
    a JavaScript array with all the matches.</li>
</ul>

<p>The match results propagate through the rules when the rule names are used in
  expressions, up to the start rule. The generated parser returns start rule's
  match result when parsing is successful.</p>

<p>One special case of parser expression is a <em>parser action</em> &mdash; a
  piece of JavaScript code inside curly braces (“{” and “}”) that takes match
  results of the preceding expression and returns a JavaScript value.
  This value is then considered match result of the preceding expression (in other
  words, the parser action is a match result transformer).</p>

<p>In our arithmetics example, there are many parser actions. Consider the
  action in expression <code>digits:[0-9]+ { return parseInt(digits.join(""), 10); }</code>.
  It takes the match result of the expression [0-9]+, which is an array
  of strings containing digits, as its parameter. It joins the digits together to
  form a number and converts it to a JavaScript <code>number</code> object.</p>

<h3 id="grammar-syntax-and-semantics-parsing-expression-types">Parsing Expression Types</h3>

<p>There are several types of parsing expressions, some of them containing
  subexpressions and thus forming a recursive structure. Each example below is
  a part of a <a href="js/examples.peggy">full grammar</a>, which produces an
  object that contains <code>match</code> and <code>rest</code>.
  <code>match</code> is the part of the input that matched the example,
  <code>rest</code> is any remaining input after the match.</p>

<dl>
  <dt><code>"<em>literal</em>"<br>'<em>literal</em>'</code></dt>

  <dd>
    <p>Match exact literal string and return it. The string syntax is the same
      as in JavaScript. Appending <code>i</code> right after the literal makes the
      match case-insensitive.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>literal = "foo"</code></div>
        <div><em>Matches:</em> <code>"foo"</code></div>
        <div><em>Does not match:</em> <code>"Foo"</code>, <code>"fOo"</code>, <code>"bar"</code>, <code>"fo"</code>
        </div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="foo" class="exampleInput" name="literal">
        <div class="result"></div>
      </div>
    </div>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>literal_i = "foo"i</code></div>
        <div><em>Matches:</em> <code>"foo"</code>, <code>"Foo"</code>, <code>"fOo"</code></div>
        <div><em>Does not match:</em> <code>"bar"</code>, <code>"fo"</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="FOO" class="exampleInput" name="literal_i">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code>.</code> (U+002E: FULL STOP, or "period")</dt>

  <dd>
    <p>Match exactly one character and return it as a string.</p>
    <div class="example">
      <div>
        <div><em>Example:</em> <code>any = .</code></div>
        <div><em>Matches:</em> <code>"f"</code>, <code>"."</code>, <code>" "</code></div>
        <div><em>Does not match:</em> <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="f" class="exampleInput" name="any">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code>[<em>characters</em>]</code></dt>

  <dd>
    <p>Match one character from a set and return it as a string. The characters
      in the list can be escaped in exactly the same way as in JavaScript string.
      The list of characters can also contain ranges (e.g. <code>[a-z]</code>
      means “all lowercase letters”). Preceding the characters with <code>^</code>
      inverts the matched set (e.g. <code>[^a-z]</code> means “all character but
      lowercase letters”). Appending <code>i</code> right after the literal makes
      the match case-insensitive.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>class = [a-z]</code></div>
        <div><em>Matches:</em> <code>"f"</code></div>
        <div><em>Does not match:</em> <code>"A"</code>, <code>"-"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="f" class="exampleInput" name="class">
        <div class="result"></div>
      </div>
    </div>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>class_i = [^a-z]i</code></div>
        <div><em>Matches:</em> <code>"="</code>, <code>" "</code></div>
        <div><em>Does not match:</em> <code>"F"</code>, <code>"f"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="=" class="exampleInput" name="class_i">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>rule</em></code></dt>

  <dd>
    <p>Match a parsing expression of a rule (perhaps recursively) and return its match
      result.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>rule = child; child = "foo"</code></div>
        <div><em>Matches:</em> <code>"foo"</code></div>
        <div><em>Does not match:</em> <code>"Foo"</code>, <code>"fOo"</code>, <code>"bar"</code>, <code>"fo"</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="foo" class="exampleInput" name="rule">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code>( <em>expression</em> )</code></dt>

  <dd>
    <p>Match a subexpression and return its match result.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>paren = ("1" { return 2; })+</code></div>
        <div><em>Matches:</em> <code>"11"</code></div>
        <div><em>Does not match:</em> <code>"2"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="11" class="exampleInput" name="paren">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>expression</em> *</code></dt>

  <dd>
    <p>Match zero or more repetitions of the expression and return their match
      results in an array. The matching is greedy, i.e. the parser tries to match
      the expression as many times as possible. Unlike in regular expressions,
      there is no backtracking.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>star = "a"*</code></div>
        <div><em>Matches:</em> <code>"a"</code>, <code>"aaa"</code></div>
        <div><em>Does not match:</em> (always matches)</div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="aa" class="exampleInput" name="star">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>expression</em> +</code></dt>

  <dd>
    <p>Match one or more repetitions of the expression and return their match
      results in an array. The matching is greedy, i.e. the parser tries to match
      the expression as many times as possible. Unlike in regular expressions,
      there is no backtracking.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>plus = "a"+</code></div>
        <div><em>Matches:</em> <code>"a"</code>, <code>"aaa"</code></div>
        <div><em>Does not match:</em> <code>"b"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="aa" class="exampleInput" name="plus">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>expression</em> |count|
          <br><em>expression</em> |min..max|
          <br><em>expression</em> |count, delimiter|
          <br><em>expression</em> |min..max, delimiter|</code></dt>

  <dd>
    <p>Match exact <code>count</code> repetitions of <code>expression</code>.
      If the match succeeds, return their match results in an array.</p>

    <p><em>-or-</em></p>

    <p>Match expression at least <code>min</code> but not more then <code>max</code> times.
      If the match succeeds, return their match results in an array. Both <code>min</code>
      and <code>max</code> may be omitted. If <code>min</code> is omitted, then it is assumed
      to be <code>0</code>. If <code>max</code> is omitted, then it is assumed to be infinity.
      Hence</p>

    <ul>
      <li><code>expression |..|</code> is equivalent to <code>expression |0..|</code>
      and <code>expression *</code></li>
      <li><code>expression |1..|</code> is equivalent to <code>expression +</code></li>
    </ul>

    <p>Optionally, <code>delimiter</code> expression can be specified. The
      delimiter is a separate parser expression, its match results are ignored,
      and it must appear between matched expressions exactly once.</p>

    <p><code>count</code>, <code>min</code> and <code>max</code> can be
      represented as:</p>

    <ul>
      <li>positive integer:
        <pre><code class="language-peggy">start = "a"|2|;</code></pre>
      </li>
      <li>name of the preceding label:
        <pre><code class="language-peggy">start = count:n1 "a"|count|;
n1 = n:$[0-9] { return parseInt(n); };</code></pre>
      </li>
      <li>code block:
        <pre><code class="language-peggy">start = "a"|{ return options.count; }|;</code></pre>
      </li>
      Any non-number values, returned by the code block, will be interpreted as <code>0</code>.
    </ul>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>repetition = "a"|2..3, ","|</code></div>
        <div><em>Matches:</em> <code>"a,a"</code>, <code>"a,a,a"</code></div>
        <div><em>Does not match:</em> <code>"a"</code>, <code>"b,b"</code>,
          <code>"a,a,a,"</code>, <code>"a,a,a,a"</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="a,a" class="exampleInput" name="repetition">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>expression</em> ?</code></dt>

  <dd>
    <p>Try to match the expression. If the match succeeds, return its match
      result, otherwise return <code>null</code>. Unlike in regular expressions,
      there is no backtracking.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>maybe = "a"?</code></div>
        <div><em>Matches:</em> <code>"a"</code>, <code>""</code></div>
        <div><em>Does not match:</em> (always matches)</div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="aa" class="exampleInput" name="maybe">
        <div class="result"></div>
      </div>
    </div>

  </dd>

  <dt><code>&amp; <em>expression</em></code></dt>

  <dd>
    <p>This is a positive assertion. No input is consumed.</p>
    <p>Try to match the expression. If the match succeeds, just return
      <code>undefined</code> and do not consume any input, otherwise consider the
      match failed.
    </p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>posAssertion = "a" &"b"</code></div>
        <div><em>Matches:</em> <code>"ab"</code></div>
        <div><em>Does not match:</em> <code>"ac"</code>, <code>"a"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="ab" class="exampleInput" name="posAssertion">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code>! <em>expression</em></code></dt>

  <dd>
    <p>This is a negative assertion. No input is consumed.</p>

    <p>Try to match the expression. If the match does
      not succeed, just return <code>undefined</code> and do not consume any
      input, otherwise consider the match failed.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>negAssertion = "a" !"b"</code></div>
        <div><em>Matches:</em> <code>"a"</code>, <code>"ac"</code></div>
        <div><em>Does not match:</em> <code>"ab"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="ac" class="exampleInput" name="negAssertion">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt id="-predicate-"><code>&amp; { <em>predicate</em> }</code></dt>

  <dd>
    <p>This is a positive assertion. No input is consumed.</p>

    <p>The predicate should be JavaScript code, and it's executed as a
      function. Curly braces in the predicate must be balanced.</p>

    <p>The predicate should <code>return</code> a boolean value. If the result
      is truthy, it's match result is <code>undefined</code>, otherwise the
      match is considered failed. Failure to include the <code>return</code>
      keyword is a common mistake.</p>

    <p>The predicate has access to all variables and functions in the
      <a href="#action-execution-environment">Action Execution Environment</a>.
    </p>

    <div class="example">
      <div>
        <div><em>Example:</em> <br><code>posPredicate = [0-9]+ &amp;{return parseInt(match, 10) &lt; 100}</code></div>
        <div><em>Matches:</em> <code>"0"</code>, <code>"99"</code></div>
        <div><em>Does not match:</em> <code>"100"</code>, <code>"-1"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="56" class="exampleInput" name="posPredicate">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt id="--predicate-"><code>! { <em>predicate</em> }</code></dt>

  <dd>
    <p>This is a negative assertion. No input is consumed.</p>

    <p>The predicate should be JavaScript code, and it's executed as a
      function. Curly braces in the predicate must be balanced.</p>

    <p>The predicate should <code>return</code> a boolean value. If the result is
      falsy, it's match result is <code>undefined</code>, otherwise the match is
      considered failed.</p>

    <p>The predicate has access to all variables and functions in the
      <a href="#action-execution-environment">Action Execution Environment</a>.
    </p>

    <div class="example">
      <div>
        <div><em>Example:</em> <br><code>negPredicate = $[0-9]+ !{ return parseInt(match, 10) &lt; 100 }</code></div>
        <div><em>Matches:</em> <code>"100"</code>, <code>"156"</code></div>
        <div><em>Does not match:</em> <code>"56"</code>, <code>"-1"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="156" class="exampleInput" name="negPredicate">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt id="-expression-2"><code>$ <em>expression</em></code></dt>

  <dd>
    <p>Try to match the expression. If the match succeeds, return the matched
      text instead of the match result.</p>

    <p>If you need to return the matched text in an action, use the
      <a href="#action-execution-environment"><code>text()</code> function</a>.
    </p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>dollar = $"a"+</code></div>
        <div><em>Matches:</em> <code>"a"</code>, <code>"aa"</code></div>
        <div><em>Does not match:</em> <code>"b"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="aaa" class="exampleInput" name="dollar">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>label</em> : <em>expression</em></code></dt>

  <dd>
    <p>Match the expression and remember its match result under given label. The
      label must be a Peggy <a href="#identifiers">identifier</a>.</p>

    <p>Labeled expressions are useful together with actions, where saved match
      results can be accessed by action's JavaScript code.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>label = foo:"bar"i { return {foo}; }</code></div>
        <div><em>Matches:</em> <code>"bar"</code>, <code>"BAR"</code></div>
        <div><em>Does not match:</em> <code>"b"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="BAR" class="exampleInput" name="label">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>@</em> ( <em>label</em> : )? <em>expression</em></code></dt>

  <dd>
    <p>Match the expression and if the label exists, remember its match result
    under given label. The label must be a Peggy
    <a href="#identifiers">identifier</a>, and must be valid as a function parameter
      in the language that is being generated (by default, JavaScript).</p>

    <p>Return the value of this expression from the rule, or "pluck" it. You
      may not have an action for this rule. The expression must not be a
      semantic predicate (<a href="#-predicate-"><code>&{ predicate }</code></a> or
      <a href="#--predicate-"><code>!{ predicate }</code></a>). There may be multiple
      pluck expressions in a given rule, in which case an array of the plucked
      expressions is returned from the rule.
    </p>

    <p>Pluck expressions are useful for writing terse grammars, or returning
      parts of an expression that is wrapped in parentheses.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>pluck_1 = @$"a"+ " "+ @$"b"+</code></div>
        <div><em>Matches:</em> <code>"aaa "</code>, <code>"a  "</code></div>
        <div><em>Does not match:</em> <code>"b"</code>, <code>" "</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="aaa   " class="exampleInput" name="pluck_1">
        <div class="result"></div>
      </div>
    </div>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>pluck_2 = @$"a"+ " "+ @two:$"b"+</code></div>
        <div><em>Matches:</em> <code>"aaa b"</code>, <code>"a  bbb"</code></div>
        <div><em>Does not match:</em> <code>"b"</code>, <code>" "</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="aaa   bbb" class="exampleInput" name="pluck_2">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>expression<sub>1</sub></em> <em>expression<sub>2</sub></em> ...  <em>expression<sub>n</sub></em></code>
  </dt>

  <dd>
    <p>Match a sequence of expressions and return their match results in an array.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>sequence = "a" "b" "c"</code></div>
        <div><em>Matches:</em> <code>"abc"</code></div>
        <div><em>Does not match:</em> <code>"b"</code>, <code>" "</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="abc" class="exampleInput" name="sequence">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt><code><em>expression</em> { <em>action</em> }</code></dt>

  <dd>
    <p>If the expression matches successfully, run the action, otherwise
      consider the match failed.</p>

    <p>The action should be JavaScript code, and it's executed as a
      function. Curly braces in the action must be balanced.</p>

    <p>The action should <code>return</code> some value, which will be used as the
      match result of the expression.</p>

    <p>The action has access to all variables and functions in the
      <a href="#action-execution-environment">Action Execution Environment</a>.
    </p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>action = " "+ "a" { return location(); }</code></div>
        <div><em>Matches:</em> <code>"  a"</code></div>
        <div><em>Does not match:</em> <code>"a"</code>, <code>" "</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="  a" class="exampleInput" name="action">
        <div class="result"></div>
      </div>
    </div>
  </dd>

  <dt>
    <code><em>expression<sub>1</sub></em> / <em>expression<sub>2</sub></em> / ... / <em>expression<sub>n</sub></em></code>
  </dt>

  <dd>
    <p>Try to match the first expression, if it does not succeed, try the second
      one, etc. Return the match result of the first successfully matched
      expression. If no expression matches, consider the match failed.</p>

    <div class="example">
      <div>
        <div><em>Example:</em> <code>alt = "a" / "b" / "c"</code></div>
        <div><em>Matches:</em> <code>"a"</code>, <code>"b"</code>, <code>"c"</code></div>
        <div><em>Does not match:</em> <code>"d"</code>, <code>""</code></div>
      </div>
      <div class="try">
        <em>Try it:</em>
        <input type="text" value="a" class="exampleInput" name="alt">
        <div class="result"></div>
      </div>
    </div>
  </dd>
</dl>

<h3 id="action-execution-environment">Action Execution Environment</h3>

<p>Actions and predicates have these variables and functions
  available to them.</p>

<ul>
  <li>
    <p>All variables and functions defined in the initializer or the top-level
      initializer at the beginning of the grammar are available.</p>
  </li>
  <li>
    <p>Note, that all functions and variables, described below, are unavailable
      in the global initializer.</p>
  </li>
  <li>
    <p>Labels from preceding expressions are available as local variables,
      which will have the match result of the labelled expressions.</p>
    <p>A label is only available after its labelled expression is matched:</p>
    <pre><code class="language-peggy">rule = A:('a' B:'b' { /* B is available, A is not */ } )</code></pre>
    <p>A label in a sub-expression is only valid within the
      sub-expression:</p>
    <pre><code class="language-peggy">rule = A:'a' (B:'b') (C:'b' { /* A and C are available, B is not */ })</code></pre>
  </li>
  <li>
    <p><code>input</code> is a parsed string that was passed to the <code>parse()</code> method.</p>
  </li>
  <li>
    <p><code>options</code> is a variable that contains the parser options.
      That is the same object that was passed to the <code>parse()</code> method.</p>
  </li>
  <li>
    <p><code>error(message, where)</code> will report an error and throw an exception.
      <code>where</code> is optional; the default is the value of <code>location()</code>.
    </p>
  </li>
  <li>
    <p><code>expected(message, where)</code> is similar to <code>error</code>, but reports</p>
    <blockquote>
      <p>Expected <em>message</em> but &quot;<em>other</em>&quot; found.</p>
    </blockquote>
    <p>where <code>other</code> is, by default, the character in the <code>location().start.offset</code> position.</p>
  </li>
  <li>
    <p><code>location()</code> returns an object with the information about the parse position.
      Refer to <a href="#locations">the corresponding section</a> for the details.</p>
  </li>
  <li>
    <p><code>range()</code> is similar to <code>location()</code>, but returns an object with offsets only.
      Refer to <a href="#locations">the &quot;Locations&quot; section</a> for the details.</p>
  </li>
  <li>
    <p><code>offset()</code> returns only the start offset, i.e. <code>location().start.offset</code>.
      Refer to <a href="#locations">the &quot;Locations&quot; section</a> for the details.</p>
  </li>
  <li>
    <p><code>text()</code> returns the source text between <code>start</code> and <code>end</code> (which will be
      <code>&quot;&quot;</code> for
      predicates). Instead of using that function as a return value for the rule consider
      using the <a href="#-expression-2"><code>$</code> operator</a>.</p>
  </li>
</ul>

<h3 id="parsing-lists">Parsing Lists</h3>

<p>One of the most frequent questions about Peggy grammars is how to parse a
  delimited list of items.  The cleanest current approach is:</p>

<pre><code class="language-peggy">list
  = word|.., _ "," _|
word
  = $[a-z]i+
_
  = [ \t]*</code></pre>

<p>If you want to allow a trailing delimiter, append it to the end of the rule:</p>

<pre><code class="language-peggy">list
  = word|.., delimiter| delimiter?
delimiter
  = _ "," _
word
  = $[a-z]i+
_
  = [ \t]*</code></pre>

<p>In the grammars created before the repetition operator was added to the peggy
  (in 3.0.0) you could see that approach, which is equivalent of the new approach
  with the repetition operator, but less efficient on long lists:</p>

<pre><code class="language-peggy">list
  = head:word tail:(_ "," _ @word)* { return [head, ...tail]; }
word
  = $[a-z]i+
_
  = [ \t]*</code></pre>

<p>Note that the <code>@</code> in the tail section plucks the word out of the
  parentheses, NOT out of the rule itself.</p>

<h2 id="identifiers">Peggy Identifiers</h2>

<p>Peggy Identifiers are used as rule names, rule references, and label names.
  They are used as identifiers in the code that Peggy generates (by default,
  JavaScript), and as such, must conform to the limitations of the Peggy grammar
  as well as those of the target language.</p>

<p>Like all Peggy grammar constructs, identifiers MUST contain only codepoints in the
  <a href="https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane">Basic
    Multilingual Plane</a>.  They must begin with a codepoint whose Unicode
    General Category property is Lu, Ll, Lt, Lm, Lo, or Nl (letters),
    "_" (underscore), or a Unicode escape in the form <code>\uXXXX</code>.
    Subsequent codepoints can be any of those that are valid as an initial
    codepoint, "$", codepoints whose General Category property is Mn or Mc
    (combining characters), Nd (numbers), Pc (connector punctuation),
    "\u200C" (zero width non-joiner), or "\u200D (zero width joiner)"</p>

<p>Labels have a further restriction, which is that they must be valid as
  a function parameter in the language being generated.  For JavaScript, this
  means that they cannot be on the limited set of
  <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#reserved_words">JavaScript
  reserved words</a>. Plugins can modify the list of reserved words at compile time.
</p>

<p>Valid identifiers:</p>
<ul>
  <li><code>Foo</code></li>
  <li><code>Bär</code></li>
  <li><code>_foo</code></li>
  <li><code>foo$bar</code></li>
</ul>

<p><b>Invalid</b> identifiers:</p>
<ul>
  <li><code>const</code> (reserved word)</li>
  <li><code>𐓁𐒰͘𐓐𐓎𐓊𐒷</code> (valid in JavaScript, but not in the Basic Multilingual Plane)</li>
  <li><code>$Bar</code> (starts with "$")</li>
  <li><code>foo bar</code> (invalid JavaScript identifier containing space)</li>
</ul>

<h2 id="error-messages">Error Messages</h2>
<p>As described above, you can annotate your grammar rules with human-readable
  names that will be used in error messages. For example, this production:</p>

<pre><code class="language-peggy">integer "simple number"
  = digits:[0-9]+</code></pre>
<p>will produce an error message like:</p>

<blockquote>Expected simple number but "a" found.</blockquote>

<p>when parsing a non-number, referencing the human-readable name "simple
  number." Without the human-readable name, Peggy instead uses a description of
  the character class that failed to match:</p>

<blockquote>Expected [0-9] but "a" found.</blockquote>

<p>Aside from the text content of messages, human-readable names also have a subtler effect on <em>where</em> errors are reported. Peggy prefers to match named rules completely or not at all, but not partially. Unnamed rules, on the other hand, can produce an error in the middle of their subexpressions.</p>

<p>For example, for this rule matching a comma-separated list of integers:</p>

<pre><code class="language-peggy">seq
  = integer ("," integer)*</code></pre>
<p>an input like 1,2,a produces this error message:</p>

<blockquote>Expected integer but "a" found.</blockquote>

<p>But if we add a human-readable name to the seq production:</p>

<pre><code class="language-peggy">seq "list of numbers"
  = integer ("," integer)*</code></pre>
<p>then Peggy prefers an error message that implies a smaller attempted parse tree:</p>

<blockquote>Expected end of input but "," found.</blockquote>

<p>There are two classes of errors in Peggy:</p>

<ul>
  <li><code>SyntaxError</code>: Syntax errors, found during parsing the input.
    This kind of errors can be thrown both during <em>grammar</em> parsing and
    during <em>input</em> parsing. Although name is the same, errors of each
    generated parser (including Peggy parser itself) has its own unique
    class.</li>
  <li><code>GrammarError</code>: Grammar errors, found during construction of
    the parser. These errors can be thrown only in the parser generation phase.
    This error signals a logical mistake in the grammar, such as having two
    rules with the same name in one grammar, etc.</li>
</ul>

<p>By default, stringifying these errors produces an error string without location information. These errors also have a <code>format()</code> method that produces an error string with location information. If you provide an array of mappings from the <a href="#grammar-source"><code>grammarSource</code></a> to the input string being processed, then the formatted error string includes ASCII arrows and underlines highlighting the error(s) in the source.</p>

<pre><code class="language-javascript">let source = ...;
try {
  peggy.generate( , { grammarSource: 'recursion.pegjs', ... }); // throws SyntaxError or GrammarError
  parser.parse(input, { grammarSource: 'input.js', ... }); // throws SyntaxError
} catch (e) {
  if (typeof e.format === "function") {
    console.log(e.format([
      { source: 'main.pegjs', text },
      { source: 'input.js', text: input },
      ...
    ]));
  } else {
    throw e;
  }
}</code></pre>

<p>Messages generated by <code>format()</code> look like this</p>

<pre><code class="language-console">Error: Possible infinite loop when parsing (left recursion: start -> proxy -> end -> start)
 --> .\recursion.pegjs:1:1
  |
1 | start = proxy;
  | ^^^^^
note: Step 1: call of the rule "proxy" without input consumption
 --> .\recursion.pegjs:1:9
  |
1 | start = proxy;
  |         ^^^^^
note: Step 2: call of the rule "end" without input consumption
 --> .\recursion.pegjs:2:11
  |
2 | proxy = a:end { return a; };
  |           ^^^
note: Step 3: call itself without input consumption - left recursion
 --> .\recursion.pegjs:3:8
  |
3 | end = !start
  |        ^^^^^
  Error: Expected ";" or "{" but "x" found.
--> input.js:1:16
  |
1 | function main()x {}
  |                ^
</code></pre>

<p>A plugin may register additional passes that can generate
  <code>GrammarError</code>s to report about problems, but they shouldn't do
  that by throwing an instance of <code>GrammarError</code>. They should use the
  <a href="#session-api">session API</a> instead.</p>

<h2 id="locations">Locations</h2>

<p>During the parsing you can access to the information of the current parse
  location, such as offset in the parsed string, line and column information.
  You can get this information by calling <code>location()</code> function,
  which returns you the following object:</p>

<pre><code class="language-javascript">{
  source: options.grammarSource,
  start: { offset: 23, line: 5, column: 6 },
  end: { offset: 25, line: 5, column: 8 }
}
</code></pre>

<p><code>source</code> is the a string or object that was supplied in the
  <a href="#grammar-source"><code>grammarSource</code></a> parser option.</p>
  
<p>For certain special cases, you can use an instance of the
  <code>GrammarLocation</code> class as the <code>grammarSource</code>.
  <code>GrammarLocation</code> allows you to specify the offset of the grammar
  source in another file, such as when that grammar is embedded in a larger
  document.</p>

<p>If <code>source</code> is <code>null</code> or <code>undefined</code> it doesn't appear in the formatted messages.
  The default value for <code>source</code> is <code>undefined</code>.</p>

<p>For actions, <code>start</code> refers to the position at the beginning of the preceding
  expression, and <code>end</code> refers to the position after the end of the preceding
  expression.</p>

<p>For semantic predicates, <code>start</code> and <code>end</code> are equal, denoting the location where
  the predicate is evaluated.</p>

<p>For the per-parse initializer, the location is the start of the input, i.e.</p>

<pre><code class="language-javascript">{
  source: options.grammarSource,
  start: { offset: 0, line: 1, column: 1 },
  end: { offset: 0, line: 1, column: 1 }
}
</code></pre>

<p><code>offset</code> is a 0-based character index within the source text.
<code>line</code> and <code>column</code> are 1-based indices.</p>

<p>The line number is incremented each time the parser finds an end of line sequence in
  the input.</p>

<p>Line and column are somewhat expensive to compute, so if you just need the
  offset, there's also a function <code>offset()</code> that returns just the
  start offset, and a function <code>range()</code> that returns the object:</p>

<pre><code class="language-javascript">{
  source: options.grammarSource,
  start: 23,
  end: 25
}</code></pre>

<p>(i.e. difference from the <code>location()</code> result only in type of
  <code>start</code> and <code>end</code> properties, which contain just an
  offset instead of the <code>Location</code>
  object.)</p>

<p>All of the notes about values for <code>location()</code> object are also
  applicable to the <code>range()</code> and <code>offset()</code> calls.</p>

<p>Currently, Peggy grammars may only contain codepoints from the
  <a href="https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane">Basic
  Multilingual Plane (BMP)</a> of Unicode.
  This means that all offsets are measured in UTF-16 code units. If you
  include characters outside this Plane (for example, emoji, or any
  surrogate pairs), you may get an offset inside a code point.</p>

<p>Changing this behavior might be a breaking change, so it will likely cause
  a major version number increase if it happens. You can join to the discussion
  for this topic on the <a href="https://github.com/peggyjs/peggy/discussions/15">GitHub Discussions
    page</a>.</p>

<h2 id="plugins-api">Plugins API</h2>

<p>A plugin is an object with the <code>use(config, options)</code> method.
  That method will be called for all plugins in the <code>options.plugins</code>
  array, supplied to the <code>generate()</code> method.</p>

<p><code>use</code> accepts these parameters:</p>

<h3><code>config</code></h3>
<p>Object with the following properties:</p>

<dl>
  <dt><code>parser</code></dt>
  <dd><code>Parser</code> object, by default the <code>peggy.parser</code> instance. That object
    will be used to parse the grammar. Plugin can replace this object</dd>

  <dt><code>passes</code></dt>
  <dd>
    <p>Mapping <code>{ [stage: string]: Pass[] }</code> that represents compilation
      stages that would applied to the AST, returned by the <code>parser</code> object. That
      mapping will contain at least the following keys:</p>

    <ul>
      <li><code>check</code> — passes that check AST for correctness. They shouldn't change the AST</li>
      <li><code>transform</code> — passes that performs various optimizations. They can change
        the AST, add or remove nodes or their properties</li>
      <li><code>generate</code> — passes used for actual code generating</li>
    </ul>

    <p>A plugin that implements a pass should usually push it to the end of the correct
      array. Each pass is a function with the signature <code>pass(ast, options, session)</code>:</p>

    <ul>
      <li><code>ast</code> — the AST created by the
        <code>config.parser.parse()</code> method
      </li>
      <li><code>options</code> — compilation options passed to the
        <code>peggy.compiler.compile()</code> method. If parser generation is
        started because <code>generate()</code> function was called that is also an
        options, passed to the <code>generate()</code> method
      </li>
      <li><code>session</code> — a <a href="#session-api"><code>Session</code></a>
        object that allows raising errors, warnings and informational messages</li>
    </ul>
  </dd>

  <dt><code>reservedWords</code></dt>
  <dd>
    <p>String array with a list of words that shouldn't be used as label
      names. This list can be modified by plugins. That property is not required
      to be sorted or not contain duplicates, but it is recommend to remove
      duplicates.</p>

    <p>Default list contains <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar#reserved_words">JavaScript
      reserved words</a>, and can be found in the <code>peggy.RESERVED_WORDS</code>
      property.</p>
  </dd>
</dl>

<h3><code>options</code></h3>

<p>Build options passed to the <code>generate()</code> method. A best practice
  for a plugin would look for its own options under a
  <code>&lt;plugin_name&gt;</code> key.</p>


<h3 id="session-api">Session API</h3>

<p>Each compilation request is represented by a <code>Session</code> instance.
  An object of this class is created by the compiler and given to each pass as a
  3rd parameter. The session object gives access to the various compiler
  services. At the present time there is only one such service: reporting of
  diagnostics.</p>

<p>All diagnostics are divided into three groups: errors, warnings and
  informational messages. For each of them the <code>Session</code> object has a
  method, described below.</p>

<p>All reporting methods have an identical signature:</p>

<pre><code class="language-typescript">(message: string, location?: LocationRange, notes?: DiagnosticNote[]) =&gt; void;</code></pre>

<ul>
  <li><code>message</code>: a main diagnostic message</li>
  <li><code>location</code>: an optional location information if diagnostic is related to the grammar
    source code</li>
  <li><code>notes</code>: an array with additional details about diagnostic, pointing to the
    different places in the grammar. For example, each note could be a location of
    a duplicated rule definition</li>
</ul>

<dl>
  <dt><code>error(...)</code></dt>
  <dd>
    <p>Reports an error. Compilation process is subdivided into pieces called <em>stages</em> and
      each stage consist of one or more <em>passes</em>. Within the one stage all errors, reported
      by different passes, are collected without interrupting the parsing process.</p>

    <p>When all passes in the stage are completed, the stage is checked for errors. If one
      was registered, a <code>GrammarError</code> with all found problems in the <code>problems</code> property
      is thrown. If there are no errors, then the next stage is processed.</p>

    <p>After processing all three stages (<code>check</code>, <code>transform</code> and <code>generate</code>) the
      compilation
      process is finished.</p>

    <p>The process, described above, means that passes should be careful about what they do.
      For example, if you place your pass into the <code>check</code> stage there is no guarantee that
      all rules exists, because checking for existing rules is also performed during the
      <code>check</code> stage. On the contrary, passes in the <code>transform</code> and <code>generate</code> stages
      can be
      sure that all rules exists, because that precondition was checked on the <code>check</code> stage.
    </p>
  </dd>

  <dt><code>warning(...)</code></dt>
  <dd>Reports a warning. Warnings are similar to errors, but they do not interrupt a compilation.</dd>

  <dt><code>info(...)</code></dt>
  <dd>Report an informational message. This method can be used to inform user about
    significant changes in the grammar, for example, replacing proxy rules.</dd>
</dl>

<h2 id="compatibility">Compatibility</h2>

<p>Both the parser generator and generated parsers should run well in the
  following environments:</p>

<ul>
  <li>Node.js 14+</li>
  <li>Edge</li>
  <li>Firefox</li>
  <li>Chrome</li>
  <li>Safari</li>
  <li>Opera</li>
</ul>

<p>The generated parser is intended to run in older environments when the format
  chosen is "globals" or "umd". Extensive testing is NOT performed in these
  environments, but issues filed regarding the generated code will be fixed.</p>

<script>
function validateGrammar({target}) {
  const results = target.nextElementSibling;
  try {
    const res = peggyExamples.parse(target.value, {startRule: target.name});
    // not innerHTML, or needs to be escaped.
    results.innerText = JSON.stringify(res);
    results.classList.remove('error');
  } catch (e) {
    results.innerText = e.toString();
    results.classList.add('error');
  }
}
const inputs = document.querySelectorAll('.example input');
for (const i of inputs) {
  i.addEventListener("input", validateGrammar)
  validateGrammar({target: i});
}
</script>