add signed and unsigned ranges
TUVIMEN committed Feb 20, 2025
1 parent 322a5cf commit 3c9668d
Showing 11 changed files with 289 additions and 124 deletions.
96 changes: 70 additions & 26 deletions reliq.1
@@ -92,12 +92,12 @@ If unquoted text is set to a single '*' character matching of text will be ommit
\fB>'TEXT'\fR
\fB>TEXT\fR - match TEXT, same as above
\fB[a-z]>TEXT\fR - match TEXT depending on options
\fB[a-z]>[RANGE]TEXT\fR
\fB[a-z]>[URANGE]TEXT\fR
\fB*\fR - match to everything
\fB>*\fR - same as above
\fB>[RANGE]"TEXT"\fR
\fB>[RANGE]\fR - match length of pattern
\fB>[RANGE]*\fR - same as above
\fB>[URANGE]"TEXT"\fR
\fB>[URANGE]\fR - match length of pattern
\fB>[URANGE]*\fR - same as above
.in
\&

Expand Down Expand Up @@ -158,30 +158,74 @@ Consists of \fBATTRIBUTE_NAME\fR followed by '=' and \fBPATTERN\fR of attribute'
.in
\&
.SS RANGE
Is always embedded in square brackets. Consists of groups of four numbers separated by ':', that can be practically endlessly separated by ','. Empty values will be complemented. Putting '-' before two first values (even if they are not specified) makes them subtracted from the maximal value. If '!' is found before the first value the matching will be inversed.

By default the minimal matching value is 0.
Is always embedded in square brackets. Consists of groups of four numbers separated by ':', that can be practically endlessly separated by ',' if any of the matching succeedes the matching will stop.

Specifying only one value equals to matching only to that value.

Specifying two values equals to matching range between and of them.
Specifying two values equals to matching range between, and of them.

If '!' is found before the first value the matching will be inversed.

Empty values will be treated as infinity.

Specifying three values additionally matches only values of which modulo of third value is equal to 0. Forth value is an offset to value from which modulo is calculated from.

.SS RRANGE

Relative range matches arrays. Putting '-' before two first values (even if they are not specified) makes them subtracted from the maximal value.

.nf
\&
.in +4m
\fB[x1,!x2,x3,x4]\fR - match to one of the values that is not x2
\fB[x1:y1,x2:y2,!x3:y4]\fR - match to one of the ranges that is not in x3:y4
\fB[-]\fR - match to last value subtracted by 0
\fB[-x]\fR - match to last value subtracted by x
\fB[x1,!x2,x3,x4]\fR - match to x1 or anything that isn't x2 or x3 or x4.
\fB[x1:y1,x2:y2,!x3:y4]\fR - match to values between, and x1 and y1 or ...
\fB[x:]\fR - match values that are x or higher
\fB[:y]\fR - match values that are y or lower
\fB[:]\fR - match everything
\fB[-]\fR - match to last index of the array
\fB[-x]\fR - match to last index of the array subtracted by x
\fB[:-y]\fR - match to range from 0 to y'th value from end
\fB[::w]\fR - match to values from which modulo of w is equal to 0
\fB[x:y:w]\fR - match to range from x to y from which modulo of w is equal to 0
\fB[x:y:w:z]\fR - match to range from x to y with value increased by z from which modulo of w is equal to 0
\fB[::2:1]\fR - match to uneven values
.in
\&

.SS URANGE

Unsigned range matches unsigned integers. Putting '-' before first value is the same as '0', before the second value is the same as matching to infinity.

.nf
\&
.in +4m

\fB[x1,x1:y]\fR - match to x1 or between, and x1 and y
\fB[-x]\fR - match to nothing
\fB[-x:-y]\fR - match to everything
\fB[-x:y]\fR - match from 0 to y
\fB[:y]\fR - match from 0 to y
\fB[x:-y]\fR - match from x to infinity
\fB[x:]\fR - match from x to infinity
\fB[x::2]\fR - match to even values starting from x
.in
\&

.SS SRANGE

Signed range matches signed integers.

.nf
\&
.in +4m
\fB[-x,-y]\fR - match from -x to -y
\fB[:-y]\fR - match from negative infinity to -y
\fB[:y]\fR - match from negative infinity to y
\fB[x:]\fR - match from x to infinity
\fB[:-1:2:1]\fR - match to uneven values until -1
.in
\&

.SS HOOK
Begins with a name of function followed by '@' and ended with argument which can be a \fBRANGE\fR, \fBEXPRESSION\fR, \fBPATTERN\fR or nothing.

@@ -208,32 +252,32 @@ Get tags with insides that match \fIPATTERN\fR, by default pattern flags set to
.BR M ", " tagmatch " " \fI"PATTERN"\fR
Get tags that match \fIPATTERN\fR, by default pattern flags set to "uWcas".
.TP
.BR a ", " attributes " " \fI[RANGE]\fR
Get tags with attributes that are within the \fIRANGE\fR.
.BR a ", " attributes " " \fI[URANGE]\fR
Get tags with attributes that are within the \fIURANGE\fR.
.TP
.BR L ", " level " " \fI[RANGE]\fR
Get tags that are on level within \fIRANGE\fR.
.BR L ", " level " " \fI[URANGE]\fR
Get tags that are on level within \fIURANGE\fR.
.TP
.BR l ", " levelrelative " " \fI[RANGE]\fR
Get tags that are on level relative to parent within the \fIRANGE\fR.
.BR l ", " levelrelative " " \fI[SRANGE]\fR
Get tags that are on level relative to parent within the \fISRANGE\fR.
.TP
.BR c ", " count " " \fI[RANGE]\fR
Get tags with descendant count that is within the \fIRANGE\fR.
.BR c ", " count " " \fI[URANGE]\fR
Get tags with descendant count that is within the \fIURANGE\fR.
.TP
.BR C ", " childmatch " " \fI"EXPRESSION"\fR
Get tags in which chained \fIEXPRESSION\fR (see \fB-F\fR option) matches at least one of its children.
.TP
.BR e ", " endmatch " " \fI"PATTERN"\fR
Get tags with insides of ending tag that match \fIPATTERN\fR, by default pattern flags set to 'tWcnfs'.
.TP
.BR P ", " position " " \fI[RANGE]\fR
Get tags with position within \fIRANGE\fR.
.BR P ", " position " " \fI[URANGE]\fR
Get tags with position within \fIURANGE\fR.
.TP
.BR p ", " positionrelative " " \fI[RANGE]\fR
Get tags with position relative to parent within \fIRANGE\fR.
.BR p ", " positionrelative " " \fI[SRANGE]\fR
Get tags with position relative to parent within \fISRANGE\fR.
.TP
.BR I ", " index " " \fI[RANGE]\fR
Get tags with starting index of tag in file that is within \fIRANGE\fR.
.BR I ", " index " " \fI[URANGE]\fR
Get tags with starting index of tag in file that is within \fIURANGE\fR.
.TP

Access hooks specify what nodes will be matched:
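Note on the RANGE syntax documented in reliq.1 above: the following is a minimal C sketch of how one `x:y:w:z` group could be evaluated, assuming the semantics stated in the man page text (inclusive bounds, `(value + z) % w == 0`, and groups joined by ',' matching as soon as any one of them matches). The `range_group` struct and both function names are hypothetical and are not taken from reliq's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one "x:y:w:z" group; not reliq's real struct. */
struct range_group {
    uint32_t from;   /* x, inclusive lower bound */
    uint32_t to;     /* y, inclusive upper bound; UINT32_MAX models "infinity" */
    uint32_t step;   /* w, 0 means "no modulo condition" */
    uint32_t offset; /* z, added to the value before taking the modulo */
    bool invert;     /* set when the group is prefixed with '!' */
};

/* Returns true if v satisfies a single group. */
static bool group_match(const struct range_group *g, uint32_t v)
{
    bool hit = v >= g->from && v <= g->to;
    if (hit && g->step != 0)
        hit = ((uint64_t)v + g->offset) % g->step == 0; /* widened to avoid overflow */
    return hit != g->invert;
}

/* Returns true as soon as any group matches, as the man page describes. */
bool urange_match(const struct range_group *groups, size_t n, uint32_t v)
{
    assert(groups != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (group_match(&groups[i], v))
            return true;
    return false;
}
```

Under this model `[::2:1]` would be written as `{0, UINT32_MAX, 2, 1, false}` and accepts odd values only, which agrees with the "uneven values" example in the man page.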
2 changes: 1 addition & 1 deletion src/edit.c
@@ -379,7 +379,7 @@ cut_edit(const char *src, const size_t size, SINK *output, const void *arg[4], c
}

start = dend;
if (range_match(dcount,range,-1)^complement) {
if (range_match(dcount,range,RANGE_UNSIGNED)^complement) {
if (dprevendlength)
sink_write(output,src+dprevend,1);
if (dlength)
26 changes: 5 additions & 21 deletions src/hnode_print.c
@@ -96,7 +96,7 @@ print_text(const reliq *rq, const reliq_chnode *hnode, uint8_t flags, SINK *outf

uint8_t type = chnode_type(n);
if (type == RELIQ_HNODE_TYPE_TEXT || type == RELIQ_HNODE_TYPE_TEXT_ERR || type == RELIQ_HNODE_TYPE_TEXT_EMPTY) {
print_chars(data+n->all,n->all_len,flags,outfile);
print_chars(data+n->all,n->all_len,flags,outfile);
} else if (recursive && type == RELIQ_HNODE_TYPE_TAG)
print_text(rq,n,flags,outfile,recursive);

@@ -156,16 +156,8 @@ chnode_printf(SINK *outfile, const char *format, const size_t formatl, const rel
case 'i': print_chars(hnode.insides.b,hnode.insides.s,printflags,outfile); break;
case 't': print_text(rq,chnode,printflags,outfile,0); break;
case 'T': print_text(rq,chnode,printflags,outfile,1); break;
case 'l': {
uint16_t lvl = hnode.lvl;
if (parent) {
if (lvl < parent->lvl) {
lvl = parent->lvl-lvl; //happens when passed from ancestor
} else
lvl -= parent->lvl;
}
print_uint(lvl,outfile);
}
case 'l':
print_int((parent) ? hnode.lvl-parent->lvl : hnode.lvl,outfile);
break;
case 'L': print_uint(hnode.lvl,outfile); break;
case 'a':
@@ -205,16 +197,8 @@ chnode_printf(SINK *outfile, const char *format, const size_t formatl, const rel
print_chars(src,srcl,printflags|(endinsides ? 0 : PC_UNTRIM),outfile);
break;
case 'I': print_uint(hnode.all.b-rq->data,outfile); break;
case 'p': {
uint32_t pos = chnode-rq->nodes;
if (parent) {
if (chnode < parent) {
pos = parent-chnode;
} else
pos = chnode-parent;
}
print_uint(pos,outfile);
}
case 'p':
print_int(parent ? chnode-parent : chnode-rq->nodes,outfile);
break;
case 'P': print_uint(chnode-rq->nodes,outfile); break;
case 'n': sink_write(outfile,hnode.tag.b,hnode.tag.s); break;
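Note on the `l` and `p` format changes above: the new one-liners appear to rely on C's integer promotions. Subtracting two `uint16_t` levels is performed in `int` (where `int` is wider than 16 bits), so the result can be negative and needs a signed print routine. Below is a minimal sketch of the old and new behaviour for levels; both helper names are hypothetical and do not exist in reliq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old behaviour: always a non-negative distance between the two levels. */
unsigned level_distance(uint16_t node_lvl, const uint16_t *parent_lvl)
{
    if (parent_lvl == NULL)
        return node_lvl;
    return node_lvl < *parent_lvl ? (unsigned)(*parent_lvl - node_lvl)
                                  : (unsigned)(node_lvl - *parent_lvl);
}

/* New behaviour: signed offset, negative when the node sits above the parent. */
int level_offset(uint16_t node_lvl, const uint16_t *parent_lvl)
{
    /* The sketch assumes int is wider than uint16_t, so both operands
       are promoted to int before the subtraction. */
    assert(sizeof(int) > sizeof(uint16_t));
    return parent_lvl != NULL ? node_lvl - *parent_lvl : node_lvl;
}
```

For a node at level 2 and a parent at level 5, the old form yields 3 while the new form yields -3, which is the case the removed comment described as "happens when passed from ancestor".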
96 changes: 47 additions & 49 deletions src/npattern.c
@@ -45,21 +45,22 @@
#define A_VAL_MATTERS 0x2

//match_hook flags
#define H_RANGE 0x1
#define H_PATTERN 0x2
#define H_EXPRS 0x4
#define H_NOARG 0x8

#define H_ACCESS 0x10
#define H_TYPE 0x20
#define H_GLOBAL 0x40
#define H_MATCH_NODE 0x80
#define H_MATCH_COMMENT 0x100
#define H_MATCH_TEXT 0x200

#define H_MATCH_NODE_MAIN 0x400
#define H_MATCH_COMMENT_MAIN 0x800
#define H_MATCH_TEXT_MAIN 0x1000
#define H_RANGE_SIGNED 0x1
#define H_RANGE_UNSIGNED 0x2
#define H_PATTERN 0x4
#define H_EXPRS 0x8
#define H_NOARG 0x10

#define H_ACCESS 0x20
#define H_TYPE 0x40
#define H_GLOBAL 0x80
#define H_MATCH_NODE 0x100
#define H_MATCH_COMMENT 0x200
#define H_MATCH_TEXT 0x400

#define H_MATCH_NODE_MAIN 0x800
#define H_MATCH_COMMENT_MAIN 0x1000
#define H_MATCH_TEXT_MAIN 0x2000

reliq_error *reliq_exec_r(reliq *rq, const reliq_chnode *parent, SINK *output, reliq_compressed **outnodes, size_t *outnodesl, const reliq_expr *expr);

@@ -119,10 +120,7 @@ X(global_index) {

X(global_level_relative) {
if (parent) {
if (hnode->lvl < parent->lvl) {
*srcl = parent->lvl-chnode->lvl;
} else
*srcl = chnode->lvl-parent->lvl;
*srcl = chnode->lvl-parent->lvl;
} else
*srcl = chnode->lvl;
}
@@ -149,10 +147,7 @@ X(global_all_count) {

X(global_position_relative) {
if (parent) {
if (chnode < parent) {
*srcl = parent-chnode;
} else
*srcl = chnode-parent;
*srcl = chnode-parent;
} else
*srcl = chnode-rq->nodes;
}
@@ -180,38 +175,38 @@ X(text_all) {

const struct match_hook_t match_hooks[] = {
//global matching
{{"l",1},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_level_relative)},
{{"L",1},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_level)},
{{"c",1},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_desc_count)},
{{"cc",2},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_comments_count)},
{{"ct",2},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_text_count)},
{{"cC",2},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_all_count)},
{{"p",1},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_position_relative)},
{{"P",1},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_position)},
{{"I",1},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_index)},

{{"levelrelative",13},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_level_relative)},
{{"level",5},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_level)},
{{"count",5},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_desc_count)},
{{"countcomments",13},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_comments_count)},
{{"counttext",9},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_text_count)},
{{"countall",8},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_all_count)},
{{"positionrelative",16},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_position_relative)},
{{"position",8},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_position)},
{{"index",5},H_GLOBAL|H_RANGE,(uintptr_t)XN(global_index)},
{{"l",1},H_GLOBAL|H_RANGE_SIGNED,(uintptr_t)XN(global_level_relative)},
{{"L",1},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_level)},
{{"c",1},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_desc_count)},
{{"cc",2},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_comments_count)},
{{"ct",2},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_text_count)},
{{"cC",2},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_all_count)},
{{"p",1},H_GLOBAL|H_RANGE_SIGNED,(uintptr_t)XN(global_position_relative)},
{{"P",1},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_position)},
{{"I",1},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_index)},

{{"levelrelative",13},H_GLOBAL|H_RANGE_SIGNED,(uintptr_t)XN(global_level_relative)},
{{"level",5},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_level)},
{{"count",5},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_desc_count)},
{{"countcomments",13},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_comments_count)},
{{"counttext",9},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_text_count)},
{{"countall",8},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_all_count)},
{{"positionrelative",16},H_GLOBAL|H_RANGE_SIGNED,(uintptr_t)XN(global_position_relative)},
{{"position",8},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_position)},
{{"index",5},H_GLOBAL|H_RANGE_UNSIGNED,(uintptr_t)XN(global_index)},

//node matching
{{"m",1},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_insides)},
{{"M",1},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_tag)},
{{"n",1},H_MATCH_NODE|H_PATTERN|H_MATCH_NODE_MAIN,(uintptr_t)XN(node_match_name)},
{{"a",1},H_MATCH_NODE|H_RANGE,(uintptr_t)XN(node_attributes)},
{{"a",1},H_MATCH_NODE|H_RANGE_UNSIGNED,(uintptr_t)XN(node_attributes)},
{{"C",1},H_MATCH_NODE|H_EXPRS,(uintptr_t)NULL},
{{"e",1},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_end)},

{{"match",5},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_insides)},
{{"tagmatch",8},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_tag)},
{{"namematch",8},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_name)},
{{"attributes",10},H_MATCH_NODE|H_RANGE,(uintptr_t)XN(node_attributes)},
{{"attributes",10},H_MATCH_NODE|H_RANGE_UNSIGNED,(uintptr_t)XN(node_attributes)},
{{"childmatch",10},H_MATCH_NODE|H_EXPRS,(uintptr_t)NULL},
{{"endmatch",8},H_MATCH_NODE|H_PATTERN,(uintptr_t)XN(node_match_end)},

@@ -285,7 +280,7 @@ static void
reliq_free_hook(reliq_hook *hook)
{
const uint16_t flags = hook->hook->flags;
if (flags&H_RANGE) {
if (flags&(H_RANGE_SIGNED|H_RANGE_UNSIGNED)) {
range_free(&hook->match.range);
} if (flags&H_EXPRS) {
reliq_efree(&hook->match.expr);
@@ -412,8 +407,11 @@ match_hook(const nmatch_state *st, const reliq_hook *hook)
if (arg)
((hook_func_t)arg)(rq,chnode,hnode,parent,&src,&srcl);

if (flags&H_RANGE) {
if ((!range_match(srcl,&hook->match.range,-1))^invert)
if (flags&H_RANGE_UNSIGNED) {
if ((!range_match(srcl,&hook->match.range,RANGE_UNSIGNED))^invert)
return 0;
} else if (flags&H_RANGE_SIGNED) {
if ((!range_match(srcl,&hook->match.range,RANGE_SIGNED))^invert)
return 0;
} else if (flags&H_PATTERN) {
if ((!reliq_regexec(&hook->match.pattern,src,srcl))^invert)
@@ -527,7 +525,7 @@ match_hook_unexpected_argument(const uint16_t flags)
return "hook \"%.*s\" expected pattern argument";
if (flags&H_EXPRS)
return "hook \"%.*s\" expected node argument";
if (flags&H_RANGE)
if (flags&(H_RANGE_SIGNED|H_RANGE_UNSIGNED))
return "hook \"%.*s\" expected list argument";
if (flags&H_NOARG)
return "hook \"%.*s\" unexpected argument";
@@ -692,7 +690,7 @@ match_hook_handle(const char *src, size_t *pos, const size_t size, reliq_hook *o
if (!firstchar || isspace(firstchar)) {
HOOK_EXPECT(H_NOARG);
} else if (src[p] == '[') {
HOOK_EXPECT(H_RANGE);
HOOK_EXPECT(H_RANGE_UNSIGNED|H_RANGE_SIGNED);
if ((err = range_comp(src,&p,size,&hook.match.range)))
goto ERR;
} else if (hflags&H_EXPRS) {
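Note on the signed hooks above: the relative hooks now store a plain difference such as `chnode->lvl-parent->lvl` into `*srcl`. If `srcl` is an unsigned `size_t` (its declaration is not shown in this diff, so this is an assumption), a negative difference wraps around and has to be reinterpreted as signed when the hook carries `H_RANGE_SIGNED`. How `range_match` handles this is not visible here; the sketch below shows one portable way to do it for a single inclusive range. The function names are hypothetical, and only the two flag values are copied from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define H_RANGE_SIGNED   0x1 /* values as defined in the diff above */
#define H_RANGE_UNSIGNED 0x2

/* Recover a wrapped-around negative value without relying on
   implementation-defined unsigned-to-signed conversion.
   Assumes size_t and ptrdiff_t have the same width. */
static intmax_t as_signed(size_t raw)
{
    if (raw <= (size_t)PTRDIFF_MAX)
        return (intmax_t)raw;
    return -(intmax_t)(SIZE_MAX - raw) - 1;
}

/* Hypothetical stand-in for one inclusive lo..hi range check. */
bool hook_range_match(size_t raw, uint16_t flags, intmax_t lo, intmax_t hi)
{
    assert(flags & (H_RANGE_SIGNED | H_RANGE_UNSIGNED));
    if (flags & H_RANGE_SIGNED) {
        intmax_t v = as_signed(raw);
        return v >= lo && v <= hi;
    }
    if (hi < 0)
        return false; /* no unsigned value lies below zero */
    return raw >= (uintmax_t)(lo < 0 ? 0 : lo) && raw <= (uintmax_t)hi;
}
```

With this sketch, a node three levels above its parent (raw value `(size_t)2 - (size_t)5`) falls inside a signed range of -3 to -1, whereas the same raw value read as unsigned is a very large number that falls outside any small range.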
2 changes: 1 addition & 1 deletion src/pattern.c
@@ -385,7 +385,7 @@ reliq_regexec(const reliq_pattern *pattern, const char *src, const size_t size)
{
uint16_t pass = pattern->flags&RELIQ_PATTERN_PASS;
uchar invert = (pattern->flags&RELIQ_PATTERN_INVERT) ? 1 : 0;
if ((!range_match(size,&pattern->range,-1)))
if ((!range_match(size,&pattern->range,RANGE_UNSIGNED)))
return invert;

if (pattern->flags&RELIQ_PATTERN_ALL)