ABNF for standard stack notation #171
Replies: 5 comments 9 replies
-
A stack notation grammar that allows multiple stacks in one stack diagram. ALPHA-UPPER = %x41-5A ; A-Z
ALPHA-LOWER = %x61-7A ; a-z
ALPHA = ALPHA-UPPER / ALPHA-LOWER ; A-Z / a-z
DIGIT = %x30-39 ; 0-9
DQUOTE = %x22 ; `"`
quoted-string = DQUOTE *( %x20-21 / %x23-7E ) DQUOTE ; e.g. `"foo bar"`
whitespace = %x01-20 ; any allowed white space character
ws = 1*( whitespace ) ; one or more white space characters
stack-diagram = "(" ws function-type-union ")"
function-type-union = function-type [ "|" ws function-type-union ] ; e.g. ( +n -- true | n -- false )
function-type = [ stack ] function-type-single-stack *( stack function-type-single-stack )
stack = stack-id ":" ws
stack-id = "S" / "R" / "C" / "F" ; an identifier of a stack
; There are the data stack, return stack, control-flow stack, floating-point stack.
; If a stack-id is absent, the data stack is implied.
function-type-single-stack = input-type "--" ws output-type
input-type = product-type [ parsed-text-abbreviation ] ; a type of the input tuple and optional parsed text
output-type = union-type ; a type of the output tuple
parsed-text-abbreviation = quoted-string ws [ parsed-text-abbreviation ]
union-type = product-type [ "|" ws union-type ] ; the long notation for a union type, e.g.: `xt -1 | 0`
product-type = data-type-union ws [ product-type ] ; the long notation for a product type
; NB the long notation of a union type cannot be used for the input tuple,
; it can be only used for the output tuple.
; The variant ( foo -- bar | baz ) is equivalent to ( foo -- bar | foo -- baz ).
; The variant ( foo | bar -- baz ) is incorrect. Either use ( foo -- baz | bar -- baz ),
; or use a short notation for a union type ( foo|bar -- baz )
; or introduce and employ a new data type symbol as ( foo_or_bar -- baz )
; where foo_or_bar = ( foo | bar )
data-type-union = data-type-product [ "|" data-type-union ] ; the short notation for a union type, e.g.: `u|n`, `ud|d`
data-type-product = [ tuple-length "*" ] data-type-element / "" ; the short notation for a product type, e.g. `i*x`, `2*n`
; NB: if a data-type-symbol is absent, the data type `0*x` whose only member is the empty tuple is implied,
; for example: `x| `, ( x -- x | )
tuple-length = tuple-length-variable / 1*DIGIT ; a variable name or non-negative integer number
tuple-length-variable = "i" / "j" / "k" / "l" / "m" / "n" ; may be other?
data-type-element = data-type-singleton / data-type-symbol [ element-label ]
; e.g. `2`, `-1`, `n1`, `d.2`, `n.minuend`
data-type-singleton = [ "-" ] 1*DIGIT ; a signed integer number (a literal)
data-type-symbol = ( ALPHA-LOWER / "+" / "_" ) [ *( "-" ) data-type-symbol ] ; e.g. `+n`, `colon-sys`
; NB: no digits are allowed in a `data-type-symbol` to avoid confusing with an `element-label-numeric`
element-label = element-separator element-label-any / element-label-numeric
element-separator = "."
element-label-any = ( ALPHA-LOWER / DIGIT / "_" ) [ *( "-" / "." ) element-label-any ]
element-label-numeric = 1*DIGIT ; one or more digits Usage example: : ?exit
\ Compilation: ( colon-sys i*x -- colon-sys i*x )
\ Run-time: ( S: 0 -- | S: nz -- never R: nest-sys -- )
postpone if postpone exit postpone then
; immediate Rationale: the single stack diagram Update: terminology has been corrected as noted below. |
Beta Was this translation helpful? Give feedback.
-
Currently, in the stack notation, the precedence among the operators « Technically, a syntax for sub-expressions can be introduced, for example, using curly brackets. Then the expression |
Beta Was this translation helpful? Give feedback.
-
What would zero return? I’d suggest false based on your description and the stack diagram, since zero is not a member of the set of positive integers.
…__
Alex
From: ruv ***@***.***>
Sent: Thursday, September 26, 2024 9:10 AM
To: ForthHub/discussion ***@***.***>
Cc: Subscribed ***@***.***>
Subject: Re: [ForthHub/discussion] ABNF for standard stack notation (Discussion #171)
What can be useful in a stack diagram is a notation for the relative complement (which is similar to that operation on sets <https://en.wikipedia.org/wiki/Complement_(set_theory)> ). If we use the character «\» to denote the relative complement, then the expression A\B means the relative complement of B with respect to A, and it represents a data type that includes all those and only those members of a data type A which are not members of a data type B.
This allows us to specify data types in a stack diagram more narrowly. For example, instead of ( +n -- true | n -- false ) we can write ( +n -- true | n\+n -- false ) that means that false is returned only for a negative input parameter.
—
Reply to this email directly, view it on GitHub <#171 (reply in thread)> , or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAOB6OU76J3T5JVUVU6BWJLZYO6MJAVCNFSM6AAAAABN3PGF6KVHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTANZWGA2DKMY> .
You are receiving this because you are subscribed to this thread. <https://github.com/notifications/beacon/AAOB6ORGSBATHQ25JSATBODZYO6MJA5CNFSM6AAAAABN3PGF6KWGG33NNVSW45C7OR4XAZNRIRUXGY3VONZWS33OINXW23LFNZ2KUY3PNVWWK3TUL5UWJTQAUQYQK.gif> Message ID: ***@***.*** ***@***.***> >
|
Beta Was this translation helpful? Give feedback.
-
An excerpt from the first post (updated):
It is important to distinguish between the empty data object and the empty data type. Empty data objectForth definitions consume and produce actual data objects on the stack during execution. Several data objects on the stack constitute a tuple of data objects. The empty data object is a pseudo data object that is a tuple of zero bits. So, the empty data objects occupies zero stack cells. If a Forth definition consumes nothing from the stack, it formally consumes the empty data object. If a Forth definition produces nothing on the stack, it formally produces the empty data object. There is a data type whose only member is the empty data object. This data type can be denoted as Obviously, any tuple of empty data objects is equivalent to the empty data object. Hence, any product of Empty data typeThe empty data type is a data type that has no members (it is uninhabited). Thus, this data type is a sub-type of any data type (this follows from the definition of the subtype relation, and this is similar to how the empty set is a subset of any set). Hence, this empty type is a bottom type, which follows from the definition of "bottom type". If we specify the empty data type as a type of an output parameter in a stack diagram for a definition, it effectively means that the definition does not return control to the caller. Because if it returns control, it returns some data object (possibly the empty data object). But it is impossible to return a data object that is a member of the empty data type. Then, it does not return control at all. Similar, if we specify the empty data type as a type of an input parameter, it means that the definition never gets control. This is impossible for actual definitions, but it is possible for unreachable code fragments. It is therefore useful to introduce a data type symbol for the empty data type, as in some cases it allows us to specify narrower stack diagrams or data types of smaller size. Let
Output of run-time semantics shows the input for a word that is compiled after these semantics (if any). An example of a user-defined word: : ?exit \ Run-time: ( 0 -- | x\0 -- never )
postpone if postpone exit postpone then
; immediate |
Beta Was this translation helpful? Give feedback.
-
Changes in terminologyI looked deeper into the type theory and realized that what I called a "sum type" in this thread should be called a "union type". A union type implies an untagged union, but a sum type usually implies a tagged union (aka disjoint union). Sometimes an untagged union is called a sum type, but this is confusing. The difference is as follows. Let the sum type operator be denoted by the symbol If a function expects an argument of type The consequence of this is that If a function expects an argument of type The consequence of this is that Discrimination of membersWhen using a parameter of union type References |
Beta Was this translation helpful? Give feedback.
-
Stack diagrams
Stack diagrams are a kind of documentation. They allow us to shorten the text description and quickly understand the parameters of a word.
Stack diagrams describe input and output stack parameters for a Forth definition (or some semantics of a named Forth definition). We can think of the actual input stack parameters as a tuple of parameters, and the actual output stack parameters as a tuple of parameters corresponding to the input tuple. A tuple of parameters can be an empty tuple.
Standard stack diagrams are written in special stack notation. This stack notation is a formal language.
An example of a standard stack diagram:
search-wordlist ( c-addr u wid -- 0 | xt 1 | xt -1 )
.A standard stack diagram describes data types of input and output tuples. Formally, it describes a function type using union types and product types.
Stack notation
The stack notation describes elements of a stack diagram, what they mean, and how they can be combined.
A stack diagram consists of data type symbols (with optional labels), operators, and markup characters.
An example of a stack diagram:
( S: foo -- bar )
. It specifies that a tuple of cells that has the typefoo
(as a whole) is consumed from the data stack, and a tuple of cells that has the typebar
(as a whole) is produced on the data stack. Typically,foo
andbar
are some expressions over data type symbols.A data type symbol identifies a data type. Data type symbols are defined in Table 3.1: Data types and in the corresponding section of each word set specification (if any). Examples:
x
,xt
,wid
,n
,r
.A literal numeric value denotes a singleton, a unit data type that is a subtype of n or r (depending on the stack and/or literal form), and whose only member is this value. Examples:
0
,1
,-1
,0e
,1e
.A concatenation of two data type symbols with white space characters between them represents a product type. So, a nonempty sequence of white space character denotes the binary product operator over data types. Examples:
( xt n )
,( c-addr u wid )
.There is a short notation for a product type over the same data type — n-power type. When n is a fixed number, this type is represented by concatenation of the number of power, the character «
*
» (star), and a data type symbol. For example:( 3*x )
, which is equivalent to( x x x )
. Note that( 0*x )
is equivalent to( )
, and this denotes a unit data type whose only member is the empty data object, which occupies zero cells on a stack. It means, all the following expressions are equivalent:( x 0*x 0*x )
,( x 0*x )
,( 0*x x 0*x )
,( x )
.A concatenation of two data type symbols with a separator «
|
» (vertical bar) between them represents a union type. So, a vertical bar denotes the binary union operator over data types. Examples:( n|u )
(in a short notation),( 0 | xt 1 | xt -1 )
(in a long notation). Note that a missed data type symbol denotes the unit type0*x
. For example all the following expressions are equivalent:( x| )
,( x|0*x )
,( x | )
,( x | 0*x )
.There is a notation for an n-power type when the power is not fixed. In this case a lowercase Latin letter like
i
,j
,k
is used instead of a number. For example( i*x )
,( k*r )
. Such an expression represents a union-type of n-power types from n = 0 to an implementation-defined limit (say, 32767). So,( i*x )
is equivalent to( 0*x | 1*x | 2*x | ... | 32767*x )
.Note that
( 0*x 1*x )
is equivalent to( 1*x )
, that is equivalent to( x )
, but( 0*x | 1*x )
are not equivalent to( 1*x )
.Stack notation grammar
The stack notation grammar can be formally specified using ABNF (see RFC5234), as follows.
(Please note that there may be some mistakes in this specification)
A
data-type-symbol
is a standard data type symbol or a user-defined data type symbol.An absent
data-type-symbol
implies the data type0*x
whose only member is the empty tuple of cells. It is a unit data type.Note that an
element-label
can be rendered without aseparator
as a subscript.Labels in data type symbols
In a text description we need to refer one or another particular stack parameter. There are few methods to do it.
A stack parameter can be referred to in a text description as follows:
We need labels along with data type symbols in a stack diagram only to unambiguously refer to the corresponding stack parameters in the text description when the same data type symbol is used in different elements of the stack diagram.
For example:
my-minus ( n -- n n )
— is a correct stack diagram. But we cannot use «n» to refer to a particular stack parameter.If we add the labels in the example above:
my-minus ( n.1 n.2 -- n.3 )
— then each stack parameter can be uniquely referenced in the text description using the corresponding data type symbol and label from the stack diagram: n.2 refers to the first input parameter, n.1 refers to the second input parameter, n.3 refers to the first output parameter. NB: the value of a numeric label says nothing about the position in the stack of the corresponding stack parameter.If a data type symbol is used only once in a stack diagram, or if we don't need to refer to the corresponding stack parameter in the text description, then we don't need to add a label to the data type symbol in the stack diagram. However, we can add a descriptive label to give additional clues what the parameter means.
An example of descriptive labels:
my-minus ( n.minuend n.subtrahend -- n.difference )
.The same data type symbol in two or more elements of a stack diagram means that the corresponding stack parameters belong to this (the same) data type. And nothing more. It does not mean that the parameters in the corresponding positions are equal.
The same data type symbol and the same label in two or more elements of a stack diagram means that the corresponding stack parameters are equal.
For example, in this case:
my-func ( n1 n2 -- n2 n3 )
, we can use n1, n2, and n3 to refer to the corresponding stack parameters. Also, this diagram means that the stack parameters in the positions that correspond to the element n2 of the stack diagram are equal in both type and value. When n2 is used in a text description, it refers to one actual stack parameter value, that is the first input parameter and the second output parameter.Update: terminology has been corrected as noted below.
Beta Was this translation helpful? Give feedback.
All reactions