- Terminology
- Base parsers and parser generators
- Parsimmon.createLanguage(parsers)
- Parsimmon(fn)
- Parsimmon.Parser(fn)
- Parsimmon.makeSuccess(index, value)
- Parsimmon.makeFailure(furthest, expectation)
- Parsimmon.isParser(obj)
- Parsimmon.string(string)
- Parsimmon.oneOf(string)
- Parsimmon.noneOf(string)
- Parsimmon.range(begin, end)
- Parsimmon.regexp(regexp)
- Parsimmon.regexp(regexp, group)
- Parsimmon.regex
- Parsimmon.notFollowedBy(parser)
- Parsimmon.lookahead(parser)
- Parsimmon.lookahead(string)
- Parsimmon.lookahead(regexp)
- Parsimmon.succeed(result)
- Parsimmon.of(result)
- Parsimmon.formatError(string, error)
- Parsimmon.seq(p1, p2, ...pn)
- Parsimmon.seqMap(p1, p2, ...pn, function(r1, r2, ...rn))
- Parsimmon.seqObj(...args)
- Parsimmon.alt(p1, p2, ...pn)
- Parsimmon.sepBy(content, separator)
- Parsimmon.sepBy1(content, separator)
- Parsimmon.lazy(fn)
- Parsimmon.lazy(description, fn)
- Parsimmon.fail(message)
- Parsimmon.letter
- Parsimmon.letters
- Parsimmon.digit
- Parsimmon.digits
- Parsimmon.whitespace
- Parsimmon.optWhitespace
- Parsimmon.cr
- Parsimmon.lf
- Parsimmon.crlf
- Parsimmon.newline
- Parsimmon.end
- Parsimmon.any
- Parsimmon.all
- Parsimmon.eof
- Parsimmon.index
- Parsimmon.test(predicate)
- Parsimmon.takeWhile(predicate)
- Parsimmon.custom(fn)
- Binary constructors
- Parser methods
- parser.parse(input)
- parser.tryParse(input)
- parser.or(otherParser)
- parser.chain(newParserFunc)
- parser.then(anotherParser)
- parser.map(fn)
- parser.contramap(fn)
- parser.promap(fn)
- parser.result(value)
- parser.fallback(value)
- parser.skip(otherParser)
- parser.trim(anotherParser)
- parser.wrap(before, after)
- parser.notFollowedBy(anotherParser)
- parser.lookahead(anotherParser)
- parser.lookahead(string)
- parser.lookahead(regexp)
- parser.assert(condition, message)
- parser.tie()
- parser.many()
- parser.times(n)
- parser.times(min, max)
- parser.atMost(n)
- parser.atLeast(n)
- parser.node(name)
- parser.mark()
- parser.thru(wrapper)
- parser.desc(description)
- FantasyLand support
- Tips
When the documentation says a function yields an array of strings, it means the function returns a parser that, when called with .parse, will return an object containing the array of strings.
The string passed to .parse is called the input.
A parser is said to consume the text that it parses, leaving only the unconsumed text for subsequent parsers to check.
These are either parsers or functions that return new parsers. These are the building blocks of parsers. They are all contained in the Parsimmon object.
createLanguage is the best starting point for building a language parser in Parsimmon. It organizes all of your parsers, collects them into a single namespace, and removes the need to worry about using Parsimmon.lazy.
Each function passed to createLanguage receives as its only parameter the entire language of parsers as an object. This is used for referring to other rules from within your current rule.
Example:
var Lang = Parsimmon.createLanguage({
Value: function(r) {
return Parsimmon.alt(r.Number, r.Symbol, r.List);
},
Number: function() {
return Parsimmon.regexp(/[0-9]+/).map(Number);
},
Symbol: function() {
return Parsimmon.regexp(/[a-z]+/);
},
List: function(r) {
return Parsimmon.string("(")
.then(r.Value.sepBy(r._))
.skip(Parsimmon.string(")"));
},
_: function() {
return Parsimmon.optWhitespace;
}
});
Lang.Value.tryParse("(list 1 2 foo (list nice 3 56 989 asdasdas))");
NOTE: You probably will never need to use this function. Most parsing can be accomplished using Parsimmon.regexp in combination with Parsimmon.seq and Parsimmon.alt.
You can add a primitive parser (similar to the included ones) by using Parsimmon(fn). This is an example of how to create a parser that matches any character except the one provided:
function notChar(char) {
return Parsimmon(function(input, i) {
if (input.charAt(i) !== char) {
return Parsimmon.makeSuccess(i + 1, input.charAt(i));
}
return Parsimmon.makeFailure(i, 'anything different than "' + char + '"');
});
}
This parser can then be used and composed the same way all the existing ones are used and composed, for example:
var parser = Parsimmon.seq(Parsimmon.string("a"), notChar("b").times(5));
parser.parse("accccc");
//=> {status: true, value: ['a', ['c', 'c', 'c', 'c', 'c']]}
Alias of Parsimmon(fn) for backward compatibility.
To be used inside of Parsimmon(fn). Generates an object describing how far the successful parse went (index), and what value it created doing so. See documentation for Parsimmon(fn).
To be used inside of Parsimmon(fn). Generates an object describing how far the unsuccessful parse went (furthest), and what kind of syntax it expected to see (expectation). The expected value may also be an array of different values. See documentation for Parsimmon(fn).
Returns true if obj is a Parsimmon parser, otherwise false.
Returns a parser that looks for string and yields that exact value.
Returns a parser that looks for exactly one character from string, and yields that character.
Returns a parser that looks for precisely one character NOT from string, and yields that character.
Parses a single character from begin to end, inclusive.
Example:
var firstChar = Parsimmon.alt(
Parsimmon.range("a", "z"),
Parsimmon.range("A", "Z"),
Parsimmon.oneOf("_$")
);
var restChar = firstChar.or(Parsimmon.range("0", "9"));
var identifier = Parsimmon.seq(firstChar, restChar.many().tie()).tie();
identifier.tryParse("__name$cool10__");
// => '__name$cool10__'
identifier.tryParse("3d");
// => Error
Parsimmon.range(begin, end) is equivalent to:
Parsimmon.test(function(c) {
return begin <= c && c <= end;
});
Returns a parser that looks for a match to the regexp and yields the entire text matched. The regexp will always match starting at the current parse location. The regexp may only use the following flags: imus. Any other flag will result in an error being thrown.
Like Parsimmon.regexp(regexp), but yields only the text in the specific regexp match group, rather than the match of the entire regexp.
This is an alias for Parsimmon.regexp.
Parses using parser, but does not consume what it parses. Yields null if the parser does not match the input. Otherwise, it fails.
Parses using parser, but does not consume what it parses. Yields an empty string.
Returns a parser that looks for string but does not consume it. Yields an empty string.
Returns a parser that wants the input to match regexp. Yields an empty string.
Returns a parser that doesn't consume any input and yields result.
This is an alias for Parsimmon.succeed(result).
Takes the string passed to parser.parse(string) and the error returned from parser.parse(string) and turns it into a human-readable error message string. Note that there are certainly better ways to format errors, so feel free to write your own.
Accepts any number of parsers and returns a new parser that expects them to match in order, yielding an array of all their results.
Matches all parsers sequentially, and passes their results as the arguments to a function, yielding the return value of that function. Similar to calling Parsimmon.seq and then .map, but the values are not put in an array. Example:
Parsimmon.seqMap(
Parsimmon.oneOf("abc"),
Parsimmon.oneOf("+-*"),
Parsimmon.oneOf("xyz"),
function(first, operator, second) {
console.log(first); // => 'a'
console.log(operator); // => '+'
console.log(second); // => 'x'
return [operator, first, second];
}
).parse("a+x");
Similar to Parsimmon.seq(...parsers), but yields an object of results named based on arguments.
Takes one or more arguments, where each argument is either a parser or a named parser pair ([stringKey, parser]).
Requires at least one named parser.
All named parser keys must be unique.
Example:
var _ = Parsimmon.optWhitespace;
var identifier = Parsimmon.regexp(/[a-z_][a-z0-9_]*/i);
var lparen = Parsimmon.string("(");
var rparen = Parsimmon.string(")");
var comma = Parsimmon.string(",");
var functionCall = Parsimmon.seqObj(
["function", identifier],
lparen,
["arguments", identifier.trim(_).sepBy(comma)],
rparen
);
functionCall.tryParse("foo(bar, baz, quux)");
// => { function: 'foo',
// arguments: [ 'bar', 'baz', 'quux' ] }
Tip: Combines well with .node(name) for a full-featured AST node.
Accepts any number of parsers, yielding the value of the first one that succeeds, backtracking in between.
This means that the order of parsers matters. If two parsers match the same prefix, the longer of the two must come first. Example:
Parsimmon.alt(Parsimmon.string("ab"), Parsimmon.string("a")).parse("ab");
// => {status: true, value: 'ab'}
Parsimmon.alt(Parsimmon.string("a"), Parsimmon.string("ab")).parse("ab");
// => {status: false, ...}
In the second case, Parsimmon.alt matches on the first parser, then there are extra characters left over ('b'), so Parsimmon returns a failure.
NOTE: This is not needed if you're using createLanguage.
Accepts a function that returns a parser, which is evaluated the first time the parser is used. This is useful for referencing parsers that haven't yet been defined, and for implementing recursive parsers. Example:
var Value = Parsimmon.lazy(function() {
return Parsimmon.alt(
Parsimmon.string("X"),
Parsimmon.string("(")
.then(Value)
.skip(Parsimmon.string(")"))
);
});
Value.parse("X"); // => {status: true, value: 'X'}
Value.parse("(X)"); // => {status: true, value: 'X'}
Value.parse("((X))"); // => {status: true, value: 'X'}
Equivalent to Parsimmon.lazy(fn).desc(description).
Returns a failing parser with the given message.
Equivalent to Parsimmon.regexp(/[a-z]/i).
Equivalent to Parsimmon.regexp(/[a-z]*/i).
Equivalent to Parsimmon.regexp(/[0-9]/).
Equivalent to Parsimmon.regexp(/[0-9]*/).
Equivalent to Parsimmon.regexp(/\s+/).
Equivalent to Parsimmon.regexp(/\s*/).
Equivalent to Parsimmon.string("\r").
This parser checks for the "carriage return" character, which is used as the line terminator for classic Mac OS 9 text files.
Equivalent to Parsimmon.string("\n").
This parser checks for the "line feed" character, which is used as the line terminator for Linux and macOS text files.
Equivalent to Parsimmon.string("\r\n").
This parser checks for the "carriage return" character followed by the "line feed" character, which is used as the line terminator for Windows text files and HTTP headers.
Equivalent to:
Parsimmon.alt(Parsimmon.crlf, Parsimmon.lf, Parsimmon.cr).desc("newline");
This flexible parser will match any kind of text file line ending.
Equivalent to Parsimmon.alt(Parsimmon.newline, Parsimmon.eof).
This is the most general purpose "end of line" parser. It allows the "end of file" in addition to all three text file line endings from Parsimmon.newline. This is important because text files frequently do not have line terminators at the end ("trailing newline").
A parser that consumes and yields the next character of the input.
A parser that consumes and yields the entire remainder of the input.
A parser that expects to be at the end of the input (zero characters left).
A parser that consumes no input and yields an object representing the current offset into the parse: it has a 0-based character offset property and 1-based line and column properties. Example:
Parsimmon.seqMap(
Parsimmon.oneOf("Q\n").many(),
Parsimmon.string("B"),
Parsimmon.index,
function(_prefix, B, index) {
console.log(index.offset); // => 8
console.log(index.line); // => 3
console.log(index.column); // => 5
return B;
}
).parse("QQ\n\nQQQB");
Returns a parser that yields a single character if it passes the predicate function. Example:
var SameUpperLower = Parsimmon.test(function(c) {
return c.toUpperCase() === c.toLowerCase();
});
SameUpperLower.parse("a"); // => {status: false, ...}
SameUpperLower.parse("-"); // => {status: true, ...}
SameUpperLower.parse(":"); // => {status: true, ...}
Returns a parser that yields a string containing all the next characters that pass the predicate. Example:
var CustomString = Parsimmon.string("%")
.then(Parsimmon.any)
.chain(function(start) {
var end =
{
"[": "]",
"(": ")",
"{": "}",
"<": ">"
}[start] || start;
return Parsimmon.takeWhile(function(c) {
return c !== end;
}).skip(Parsimmon.string(end));
});
CustomString.parse("%:a string:"); // => {status: true, value: 'a string'}
CustomString.parse("%[a string]"); // => {status: true, value: 'a string'}
CustomString.parse("%{a string}"); // => {status: true, value: 'a string'}
CustomString.parse("%(a string)"); // => {status: true, value: 'a string'}
CustomString.parse("%<a string>"); // => {status: true, value: 'a string'}
Deprecated: Please use Parsimmon(fn) going forward.
You can add a primitive parser (similar to the included ones) by using Parsimmon.custom. This is an example of how to create a parser that matches any character except the one provided:
function notChar(char) {
return Parsimmon.custom(function(success, failure) {
return function(input, i) {
if (input.charAt(i) !== char) {
return success(i + 1, input.charAt(i));
}
return failure(i, 'anything different than "' + char + '"');
};
});
}
This parser can then be used and composed the same way all the existing ones are used and composed, for example:
var parser = Parsimmon.seq(Parsimmon.string("a"), notChar("b").times(5));
parser.parse("accccc");
// => { status: true, value: ['a', ['c', 'c', 'c', 'c', 'c']] }
The Parsimmon.Binary constructors parse binary content using Node.js Buffers. These constructors can be combined with the normal parser combinators such as Parsimmon.seq and Parsimmon.seqObj, and still have all the same methods as text-based parsers (e.g. .map, .node, etc.).
Returns a parser that yields a byte (as a number) that matches the given input; similar to Parsimmon.digit and Parsimmon.letter.
var parser = Parsimmon.Binary.byte(0x3f);
parser.parse(Buffer.from([0x3f]));
// => { status: true, value: 63 }
Returns a parser that will consume some of a buffer and present it as a raw buffer for further transformation. This buffer is cloned, so in case you use a destructive method, it will not corrupt the original input buffer.
var parser = Parsimmon.Binary.buffer(2).skip(Parsimmon.any);
parser.parse(Buffer.from([1, 2, 3]));
// => { status: true, value: <Buffer 01 02> }
Parse length bytes, and then decode with a particular encoding.
var parser = Parsimmon.Binary.encodedString("utf8", 17);
parser.parse(
Buffer.from([
0x68,
0x65,
0x6c,
0x6c,
0x6f,
0x20,
0x74,
0x68,
0x65,
0x72,
0x65,
0x21,
0x20,
0xf0,
0x9f,
0x98,
0x84
])
);
// => { status: true, value: 'hello there! 😄' }
Parse an unsigned integer of 1 byte.
var parser = Parsimmon.Binary.uint8;
parser.parse(Buffer.from([0xff]));
// => { status: true, value: 255 }
Parse a signed integer of 1 byte.
var parser = Parsimmon.Binary.int8;
parser.parse(Buffer.from([0xff]));
// => { status: true, value: -1 }
Parse an unsigned integer (big-endian) of length bytes. Length cannot exceed 6.
var parser = Parsimmon.Binary.uintBE(4);
parser.parse(Buffer.from([1, 2, 3, 4]));
// => { status: true, value: 16909060 }
Parse a signed integer (big-endian) of length bytes. Length cannot exceed 6.
var parser = Parsimmon.Binary.intBE(4);
parser.parse(Buffer.from([0xff, 2, 3, 4]));
// => { status: true, value: -16645372 }
Parse an unsigned integer (little-endian) of length bytes. Length cannot exceed 6.
var parser = Parsimmon.Binary.uintLE(4);
parser.parse(Buffer.from([1, 2, 3, 4]));
// => { status: true, value: 67305985 }
Parse a signed integer (little-endian) of length bytes. Length cannot exceed 6.
var parser = Parsimmon.Binary.intLE(4);
parser.parse(Buffer.from([1, 2, 3, 0xff]));
// => { status: true, value: -16580095 }
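For sanity-checking expected values, the same bytes can be read directly with Node.js Buffer methods, which share the 1-to-6-byte length limit. The sketch below uses plain Node.js and no Parsimmon; whether Parsimmon delegates to these methods internally is an assumption, not something stated here.

```javascript
// Plain Node.js Buffer reads: readUIntBE(offset, byteLength) and friends.
var unsignedBE = Buffer.from([1, 2, 3, 4]).readUIntBE(0, 4);
// => 16909060 (0x01020304)

var signedBE = Buffer.from([0xff, 2, 3, 4]).readIntBE(0, 4);
// => -16645372 (0xff020304 read as two's complement)

var unsignedLE = Buffer.from([1, 2, 3, 4]).readUIntLE(0, 4);
// => 67305985 (0x04030201)

var signedLE = Buffer.from([1, 2, 3, 0xff]).readIntLE(0, 4);
// => -16580095 (0xff030201 read as two's complement)
```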
Parse an unsigned integer (big-endian) of 2 bytes.
var parser = Parsimmon.Binary.uint16BE;
parser.parse(Buffer.from([0xff, 0xfe]));
// => { status: true, value: 65534 }
Parse a signed integer (big-endian) of 2 bytes.
var parser = Parsimmon.Binary.int16BE;
parser.parse(Buffer.from([0xff, 0xfe]));
// => { status: true, value: -2 }
Parse an unsigned integer (little-endian) of 2 bytes.
var parser = Parsimmon.Binary.uint16LE;
parser.parse(Buffer.from([0xfe, 0xff]));
// => { status: true, value: 65534 }
Parse a signed integer (little-endian) of 2 bytes.
var parser = Parsimmon.Binary.int16LE;
parser.parse(Buffer.from([0xfe, 0xff]));
// => { status: true, value: -2 }
Parse an unsigned integer (big-endian) of 4 bytes.
var parser = Parsimmon.Binary.uint32BE;
parser.parse(Buffer.from([0x00, 0x00, 0x00, 0xff]));
// => { status: true, value: 255 }
Parse a signed integer (big-endian) of 4 bytes.
var parser = Parsimmon.Binary.int32BE;
parser.parse(Buffer.from([0xff, 0xff, 0xff, 0xfe]));
// => { status: true, value: -2 }
Parse an unsigned integer (little-endian) of 4 bytes.
var parser = Parsimmon.Binary.uint32LE;
parser.parse(Buffer.from([0xff, 0x00, 0x00, 0x00]));
// => { status: true, value: 255 }
Parse a signed integer (little-endian) of 4 bytes.
var parser = Parsimmon.Binary.int32LE;
parser.parse(Buffer.from([0xfe, 0xff, 0xff, 0xff]));
// => { status: true, value: -2 }
Parse a float (big-endian) of 4 bytes.
var parser = Parsimmon.Binary.floatBE;
parser.parse(Buffer.from([1, 2, 3, 4]));
// => { status: true, value: 2.387939260590663e-38 }
Parse a float (little-endian) of 4 bytes.
var parser = Parsimmon.Binary.floatLE;
parser.parse(Buffer.from([1, 2, 3, 4]));
// => { status: true, value: 1.539989614439558e-36 }
Parse a double (big-endian) of 8 bytes.
var parser = Parsimmon.Binary.doubleBE;
parser.parse(Buffer.from([1, 2, 3, 4, 5, 6, 7, 8]));
// => { status: true, value: 8.20788039913184e-304 }
Parse a double (little-endian) of 8 bytes.
var parser = Parsimmon.Binary.doubleLE;
parser.parse(Buffer.from([1, 2, 3, 4, 5, 6, 7, 8]));
// => { status: true, value: 5.447603722011605e-270 }
Parse a series of bits that do not have to be byte-aligned and consume them from a Buffer. The maximum number is 48 since more than 48 bits won't fit safely into a JavaScript number without losing precision. Also, the total of all bits in the sequence must be a multiple of 8 since parsing is still done at the byte level.
var parser = Parsimmon.Binary.bitSeq([3, 5, 5, 3]);
parser.parse(Buffer.from([0x04, 0xff]));
//=> { status: true, value: [0, 4, 31, 7] }
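To see where the values [0, 4, 31, 7] come from, the split can be reproduced by hand. The helper below is a hypothetical illustration in plain JavaScript, not Parsimmon's implementation: it writes the bytes out as one bit string and cuts it at the requested widths.

```javascript
// Hypothetical helper for illustration only (not part of Parsimmon).
function splitBits(widths, bytes) {
  // 0x04, 0xff becomes "00000100" + "11111111".
  var bits = bytes
    .map(function(b) {
      return b.toString(2).padStart(8, "0");
    })
    .join("");
  var pos = 0;
  return widths.map(function(width) {
    var chunk = bits.slice(pos, pos + width);
    pos += width;
    return parseInt(chunk, 2);
  });
}

var fields = splitBits([3, 5, 5, 3], [0x04, 0xff]);
// "000" "00100" "11111" "111" => [0, 4, 31, 7]
```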
Works like Parsimmon.Binary.bitSeq, except each item in the array is either a number of bits or a pair (array with length = 2) of name and bits. The bits are parsed in order and put into an object based on the name supplied. If there's no name for the bits, they will be parsed but discarded from the returned value.
var parser = Parsimmon.Binary.bitSeqObj([["a", 3], 5, ["b", 5], ["c", 3]]);
parser.parse(Buffer.from([0x04, 0xff]));
//=> { status: true, value: { a: 0, b: 31, c: 7 } }
These methods are all called off of existing parsers, not from the Parsimmon object itself. They all return new parsers, so you can chain as many of them together as you like.
Apply parser on the provided string input, returning an object that contains the status and parsed result.
If the parser succeeds, status is set to true, and the value will be available in the value property.
If the parser fails, status will be false. Further information on the error can be found at index and expected. index represents the furthest reached offset; it has a 0-based character offset and 1-based line and column properties. expected lists all tried parsers that were available at the offset, but the input couldn't continue with any of these.
var parser = Parsimmon.alt(
// Use `parser.desc(string)` in order to have meaningful failure messages
Parsimmon.string("a").desc("'a' character"),
Parsimmon.string("b").desc("'b' character")
);
parser.parse("a");
// => {status: true, value: 'a'}
parser.parse("ccc");
// => {status: false,
// index: {...},
// expected: ["'a' character", "'b' character"]}
Like parser.parse(input) but either returns the parsed value or throws an error on failure. The error object contains additional properties about the error.
var parser = Parsimmon.letters.sepBy1(Parsimmon.whitespace);
parser.tryParse("foo bar baz");
// => ['foo', 'bar', 'baz']
try {
parser.tryParse("123");
} catch (err) {
err.message;
// => 'expected one of EOF, whitespace at line 1 column 1, got \'123\''
err.type;
// => 'ParsimmonError'
err.result;
// => {status: false,
// index: {offset: 0, line: 1, column: 1},
// expected: ['EOF', 'whitespace']}
}
Returns a new parser which tries parser, and if it fails uses otherParser. Example:
var numberPrefix = Parsimmon.string("+")
.or(Parsimmon.string("-"))
.fallback("");
numberPrefix.parse("+"); // => {status: true, value: '+'}
numberPrefix.parse("-"); // => {status: true, value: '-'}
numberPrefix.parse(""); // => {status: true, value: ''}
Returns a new parser which tries parser, and on success calls the function newParserFunc with the result of the parse, which is expected to return another parser, which will be tried next. This allows you to dynamically decide how to continue the parse, which is impossible with the other combinators. Example:
var CustomString = Parsimmon.string("%")
.then(Parsimmon.any)
.chain(function(start) {
var end =
{
"[": "]",
"(": ")",
"{": "}",
"<": ">"
}[start] || start;
return Parsimmon.takeWhile(function(c) {
return c !== end;
}).skip(Parsimmon.string(end));
});
CustomString.parse("%:a string:"); // => {status: true, value: 'a string'}
CustomString.parse("%[a string]"); // => {status: true, value: 'a string'}
CustomString.parse("%{a string}"); // => {status: true, value: 'a string'}
CustomString.parse("%(a string)"); // => {status: true, value: 'a string'}
CustomString.parse("%<a string>"); // => {status: true, value: 'a string'}
Expects anotherParser to follow parser, and yields the result of anotherParser.
var parserA = p1.then(p2); // is equivalent to...
var parserB = Parsimmon.seqMap(p1, p2, function(x1, x2) {
return x2;
});
Transforms the output of parser with the given function. Example:
var pNum = Parsimmon.regexp(/[0-9]+/)
.map(Number)
.map(function(x) {
return x + 1;
});
pNum.parse("9"); // => {status: true, value: 10}
pNum.parse("123"); // => {status: true, value: 124}
Transforms the input of parser with the given function. Example:
var pNum = Parsimmon.string("A").contramap(function(x) {
return x.toUpperCase();
});
pNum.parse("a"); // => {status: true, value: 'A'}
pNum.parse("A"); // => {status: true, value: 'A'}
An important caveat of contramap is that it transforms the remaining input. This means that you cannot expect values after a contramap in general, like the following.
Parsimmon.seq(
Parsimmon.string("a"),
Parsimmon.string("c").contramap(function(x) {
return x.slice(1);
}),
Parsimmon.string("d")
)
.tie()
.parse("abcd"); //this will fail
Parsimmon.seq(
Parsimmon.string("a"),
Parsimmon.seq(Parsimmon.string("c"), Parsimmon.string("d"))
.tie()
.contramap(function(x) {
return x.slice(1);
})
)
.tie()
.parse("abcd"); // => {status: true, value: 'acd'}
Transforms the input and output of parser with the given functions. Example:
var pNum = Parsimmon.string("A").promap(
function(x) {
return x.toUpperCase();
},
function(x) {
return x.charCodeAt(0);
}
);
pNum.parse("a"); // => {status: true, value: 65}
pNum.parse("A"); // => {status: true, value: 65}
The same caveat for contramap above applies to promap.
Returns a new parser with the same behavior, but which yields value. Equivalent to parser.map(function(x) { return x; }.bind(null, value)).
Returns a new parser which tries parser and, if it fails, yields value without consuming any input. Equivalent to parser.or(Parsimmon.of(value)).
var digitOrZero = Parsimmon.digit.fallback("0");
digitOrZero.parse("4"); // => {status: true, value: '4'}
digitOrZero.parse(""); // => {status: true, value: '0'}
Expects otherParser after parser, but yields the value of parser.
var parserA = p1.skip(p2); // is equivalent to...
var parserB = Parsimmon.seqMap(p1, p2, function(x1, x2) {
return x1;
});
Expects anotherParser before and after parser, yielding the result of the parser. Useful for trimming comments/whitespace around other parsers.
Example:
Parsimmon.digits
.map(Number)
.trim(Parsimmon.optWhitespace)
.sepBy(Parsimmon.string(","))
.tryParse(" 1, 2,3 , 4 ");
// => [1, 2, 3, 4]
It is equivalent to:
anotherParser.then(parser).skip(anotherParser);
It is also equivalent to:
Parsimmon.seqMap(anotherParser, parser, anotherParser, function(
before,
middle
) {
return middle;
});
Expects the parser before to come before parser, and the parser after to come after parser.
Example:
Parsimmon.letters
.trim(Parsimmon.optWhitespace)
.wrap(Parsimmon.string("("), Parsimmon.string(")"))
.tryParse("( nice )");
// => 'nice'
It is equivalent to:
before.then(parser).skip(after);
It is also equivalent to:
Parsimmon.seqMap(before, parser, after, function(before, middle) {
return middle;
});
Returns a parser that looks for anything but whatever anotherParser wants to parse, and does not consume it. Yields the same result as parser. Equivalent to parser.skip(Parsimmon.notFollowedBy(anotherParser)).
Returns a parser that looks for whatever anotherParser wants to parse, but does not consume it. Yields the same result as parser. Equivalent to parser.skip(Parsimmon.lookahead(anotherParser)).
Returns a parser that looks for string but does not consume it. Yields the same result as parser. Equivalent to parser.skip(Parsimmon.lookahead(string)).
Returns a parser that wants the input to match regexp. Yields the same result as parser. Equivalent to parser.skip(Parsimmon.lookahead(regexp)).
Passes the result of parser to the function condition, which returns a boolean. If the condition is false, returns a failed parse with the given message. Otherwise, it returns the original result of parser.
var evenLengthNumber = Parsimmon.digits
.assert(function(s) {
return s.length % 2 === 0;
}, "even length number")
.map(Number);
evenLengthNumber.parse("34");
// { status: true, value: 34 }
evenLengthNumber.parse("1");
// { status: false,
// expected: ["even length number"],
// index: {...} }
When called on a parser yielding an array of strings, yields all their strings concatenated with the separator. Asserts that its input is actually an array of strings.
Example:
var number = Parsimmon.seq(
Parsimmon.digits,
Parsimmon.string("."),
Parsimmon.digits
)
.tieWith("")
.map(Number);
number.tryParse("1.23");
// => 1.23
parser.tieWith(separator) is similar to this:
parser.map(function(array) {
return array.join(separator);
});
Equivalent to parser.tieWith("").
Note: parser.tie() is usually used after Parsimmon.seq(...parsers) or parser.many().
Expects parser zero or more times, and yields an array of the results.
NOTE: If parser is capable of parsing an empty string (i.e. parser.parse("") succeeds) then parser.many() will throw an error. Otherwise parser.many() would get stuck in an infinite loop.
Expects parser exactly n times, and yields an array of the results.
Expects parser between min and max times, and yields an array of the results.
Expects parser at most n times. Yields an array of the results.
Expects parser at least n times. Yields an array of the results.
Yields an object with name, value, start, and end keys, where value is the original value yielded by the parser, name is the argument passed in, and start and end are objects with a 0-based offset and 1-based line and column properties that represent the position in the input that contained the parsed text.
Example:
var Identifier = Parsimmon.regexp(/[a-z]+/).node("Identifier");
Identifier.tryParse("hey");
// => { name: 'Identifier',
// value: 'hey',
// start: { offset: 0, line: 1, column: 1 },
// end: { offset: 3, line: 1, column: 4 } }
Yields an object with start, value, and end keys, where value is the original value yielded by the parser, and start and end are objects with a 0-based offset and 1-based line and column properties that represent the position in the input that contained the parsed text. Example:
var Identifier = Parsimmon.regexp(/[a-z]+/).mark();
Identifier.tryParse("hey");
// => { start: { offset: 0, line: 1, column: 1 },
// value: 'hey',
// end: { offset: 3, line: 1, column: 4 } }
Simply returns wrapper(this) from the parser. Useful for custom functions used to wrap your parsers, while keeping with Parsimmon chaining style.
Example:
function makeNode(name) {
return function(parser) {
return Parsimmon.seqMap(Parsimmon.index, parser, Parsimmon.index, function(
start,
value,
end
) {
return Object.freeze({
type: "myLanguage." + name,
value: value,
start: start,
end: end
});
});
};
}
var Identifier = Parsimmon.regexp(/[a-z]+/).thru(makeNode("Identifier"));
Identifier.tryParse("hey");
// => { type: 'myLanguage.Identifier',
// value: 'hey',
// start: { offset: 0, line: 1, column: 1 },
// end: { offset: 3, line: 1, column: 4 } }
Returns a new parser whose failure message is description. For example, string('x').desc('the letter x') will indicate that 'the letter x' was expected. Alternatively, an array of failure messages can be passed, if the parser represents multiple options. For example, oneOf('abc').desc(['a', 'b', 'c']) will indicate that any of 'a', 'b', or 'c' would be acceptable in this case.
It is important to only add descriptions to "low-level" parsers; things like numbers and strings. If you add a description to every parser you write then generated error messages will not be very helpful when simple syntax errors occur.
var P = require("parsimmon");
var pNumber = P.regexp(/[0-9]+/)
.map(Number)
.desc("a number");
var pPairNorm = P.seq(
P.string("(")
.then(pNumber)
.skip(P.string(",")),
pNumber.skip(P.string(")"))
);
var pPairDesc = pPairNorm.desc("a pair");
var badInputs = ["(1,2", "1,2)", "(1|2)"];
function report(name, parser, x) {
var expectations = parser.parse(x).expected.join(", ");
console.log(name + " | Expected", expectations);
}
badInputs.forEach(function(x) {
report("pPairNorm", pPairNorm, x);
report("pPairDesc", pPairDesc, x);
});
pPairNorm will output much more useful information than pPairDesc, as seen below:
pPairNorm | Expected ')'
pPairDesc | Expected a pair
pPairNorm | Expected '('
pPairDesc | Expected a pair
pPairNorm | Expected ','
pPairDesc | Expected a pair
Parsimmon parsers are Semigroups, Applicative Functors, and Monads. Both the old-style (concat) and new-style (fantasy-land/concat) method names are supported.
Returns Parsimmon.fail("fantasy-land/empty").
See Parsimmon.empty().
Equivalent to parser.or(otherParser).
Takes otherParser, which yields a function, and applies that function to the parsed value of parser. Example:
Parsimmon.digit
.ap(Parsimmon.digit.map(s => t => Number(s) + Number(t)))
.parse("23");
// => {status: true, value: 5}
Equivalent to Parsimmon.sepBy(parser, separator).
Expects zero or more matches for parser, separated by the parser separator, yielding an array. Example:
Parsimmon.oneOf('abc')
.sepBy(Parsimmon.string('|'))
.parse('a|b|c|c|c|a');
// => {status: true, value: ['a', 'b', 'c', 'c', 'c', 'a']}
Parsimmon.oneOf('XYZ')
.sepBy(Parsimmon.string('-'))
.parse('');
// => {status: true, value: []}
Equivalent to Parsimmon.sepBy1(parser, separator).
This is the same as Parsimmon.sepBy, but matches the content parser at least once.
See parser.chain(newParserFunc) defined earlier.
Equivalent to Parsimmon.of(result).
For the sake of readability in your own parsers, it's recommended to either create a shortcut for the Parsimmon library:
var P = Parsimmon;
var parser = P.digits.sepBy(P.whitespace);
Or to create shortcuts for the Parsimmon values you intend to use (when using Babel):
import { digits, whitespace } from "parsimmon";
var parser = digits.sepBy(whitespace);
Because it can become quite wordy to repeat Parsimmon everywhere:
var parser = Parsimmon.sepBy(Parsimmon.digits, Parsimmon.whitespace);
For clarity's sake, however, Parsimmon will refer to the Parsimmon library itself, and parser will refer to a parser being used as an object in a method, like P.string('9') in P.string('9').map(Number).
Do not perform side effects in parser actions. This is potentially unsafe, as Parsimmon will backtrack between parsers, but there's no way to undo your side effects.
Side effects include pushing to an array, modifying an object, console.log, reading data from outside sources (an array or object used to track things during parsing), or any random numbers.
Parsimmon expects that parsers and all .map statements do not perform side effects (i.e. they are pure).
In this example, the parser pVariable is called twice on the same text because of Parsimmon.alt backtracking, and has a side effect (pushing to an array) inside its .map method, so we get two items in the array instead of just one.
var variableNames = [];
var pVariable = Parsimmon.regexp(/[a-z]+/i).map(function(name) {
variableNames.push(name);
return name;
});
var pDeclaration = Parsimmon.alt(
Parsimmon.string("var ")
.then(pVariable)
.then(Parsimmon.string("\n")),
Parsimmon.string("var ")
.then(pVariable)
.then(Parsimmon.string(";"))
);
pDeclaration.parse("var gummyBear;");
console.log(variableNames);
// => ['gummyBear', 'gummyBear']