Skip to content

Dynamic containers

IllidanS4 edited this page Aug 7, 2018 · 9 revisions

Pawn has limited support for compound types and pratically zero support for dynamically-sized container types. PawnPlus adds a couple of containers which are dynamically sized and able to store values of any type. Internally, these containers store variants.

Variants

Introduction

A variant is an object able to store values of multiple "types". When such an object is constructed from a value, not only is the numerical value stored in the object, but also the tag of the value as specified using Pawn:

new Variant:v = var_new(3.14);
assert var_tagof(v) == (tagof(Float:));

Using variants, functions can accept a value of any kind and decide how to process it based on the information stored within the object. Functions like var_sizeof and var_tagof are equivalent to the similarly-named operators in Pawn. As implied by the presence of var_sizeof, variants are also able to store single-dimensional arrays:

new Variant:v = var_new_arr(Float:{1.1, 2.2, 3.3});
assert var_tagof(v) == (tagof(Float:));
assert var_sizeof(v) == 3;
assert Float:var_get(v, 0) == 1.1;
assert Float:var_get(v, 1) == 2.2;
assert Float:var_get(v, 2) == 3.3;

var_new_arr automatically determines the size and tag of the array and copies its data into the variant.

The value inside the variant can be obtained using two methods. The first one, shown above, is type-unsafe. Calling var_get and retagging the value works if the tag of the variant is known beforehand, but assuming the tag may cause potential bugs in the code. Instead, the type-safe equivalent of this function is recommended:

new Variant:v1 = var_new(true);
new Variant:v2 = var_new(-1.0);
new c, bool:b, Float:f;
assert var_get_safe(v1, c) && c == 1;
assert var_get_safe(v1, b) && b == true;
assert var_get_safe(v2, f) && f == -1.0;
assert !var_get_safe(v2, c);

var_get_safe compares the tag of the variant with the tag of the variable, and obtains the value if it deems it compatible. The are two compatibility rules: weakly tagged value (with a tag not starting by an uppercase letter) can be stored in an untagged variable, and GlobalString and GlobalVariant can be stored in String and Variant, respectively.

In Pawn, variants are passed by reference. This means that a function accepting a variant may potentially cause side effects by modifying the value inside the variant. To prevent unintentional modifying of the value, non-array variants are immutable. However, it is possible to change the content of array variants, but not the tag or size:

new Variant:v = var_new_arr({1, 2, 3});
var_set_cell(v, 1, 55); //or type-safe var_set_cell_safe

Operations

For easier usage, variants support standard operators and treat them as if applied on their values. This includes arithmetical operators and ==:

new Variant:v = var_new(12);
v *= var_new(16);
assert v == var_new(192);

In the current version, variants must have compatible types in order to successfully use any operators on them. It is not possible to multiply a float variant by an integer variant and so on.

If the variants store a reference to a dynamic string or another variant, their value will be inspected when the variants are compared. To prevent this behaviour, you can tag the values Ref<String> or Ref<Variant> to signalize that only the references should be compared. However, this changes the tag of the value, and so if you want to obtain the value, you will have to obtain it as the correct tag.

new Variant:a = var_new(@("str"));
new Variant:b = var_new(@("str"));
assert a == b;
new Variant:c = var_new(Ref<String>:@("str"));
new Variant:d = var_new(Ref<String>:@("str"));
assert c != d;

Operators also work on array variants, if they have compatible tags and sizes. The method is the same regardless of the operator: elements in both arrays are paired and the operation is performed on all pairs, producing a new array of the same size.

new String:arr[3];
arr[0] = @("a"), arr[1] = @("b"), arr[2] = @("c");
new Variant:v = var_new_arr(arr);
v += var_new(@("x"));
assert str_val(v) == @("({ax, bx, cx})");

In this example, x is appended to all strings in the array.

Lifetime and garbage collection

Variants are pooled in the same way as dynamic strings. They can reside in one of two pools: the local pool or the global pool. Initially, every variant in Pawn is created in the local pool, and can be moved to the global pool by either calling var_to_global or by assigning to GlobalVariant:.

Variants in the local pool are garbage collected as soon as the top-level callback ends. If a variant's lifetime must extend the execution of a single callback (even if asynchronous functions like wait_ms are used), it must be first moved into the global pool, or it would be deleted. Variants in the global pool are never automatically deleted, only by calling var_delete or moving them back to the local pool by calling var_to_local.

Types of variants

A variant can be constructed from four different kinds of values: a single cell (with an arbitrary tag), a single-dimensional array (with an arbitrary tag), a string (untagged array), or another variant.

There is a difference between a variant constructed from an array and a variant constructed from a string. Internally, a string variant has a special char tag which identifies it and makes it incompatible with regular arrays. Since char is a keyword in Pawn (used for char arrays), it is impossible to tag any value with it. However, it is possible to take advantage of the tag compatibility rules and tag an array with any subtag of char, making it being treated as a string. If tagged this way, even single values are treated as characters:

assert var_new_str("test") != var_new_arr("test");
assert var_new_str("test") == var_new_arr(char@:"test");
assert str_val(var_new(char@:'A')) == @("(A)");

There is also a special type of a variant, the null variant, represented by VAR_NULL. It has zero size and cannot be deleted.

Lists

Introduction

Lists in PawnPlus use C++'s std::vector type to represent a linear mutable container with variable size. Lists are heterogenous, meaning they can store values of any kind (internally stored using by-value variants). Their size can be increased by inserting new elements, or decreased by removing existing elements. Lists are not stored in any pool and must be deleted explicitly with list_delete when no longer needed.

Operations

New values can be inserted into a list using the list_add family of functions, one for each type of a variant:

new List:l = list_new();
list_add(l, true);
list_add_arr(l, Float:{1.0, 2.0});
list_add_str(l, "test");
list_add_var(l, VAR_NULL);

Since the internal element type of a list is variant, a value of any kind may be inserted. The value is copied into the list.

Obtaining the values is akin to variants:

new bool:b, Float:f[2], s[5], c;
assert list_get_safe(l, 0, b);
assert list_get_arr_safe(l, 1, f);
assert list_get_arr_safe(l, 2, s);
assert !list_get_safe(l, 3, c); //null has no value

Unlike variants, you can modify any element in a list and overwrite it with a new value, using the list_set family of functions. It is not possible to add a new element this way; attempts to set elements past the end of the list will fail.

Iterating

Traversing over values stored in a list is possible in two ways. The first one uses a numerical index into the list, bounded by its size:

new List:l = list_new();
list_add(l, 3);
list_add(l, 4);
list_add(l, 5);
list_add(l, 6);
for(new i = 0; i < list_size(l); i++)
{
    printf("%d", list_get(l, i));
}

This way is fine, since accessing a list element by its index takes constant time. However, there is also a C++-like way, using iterator objects and a special piece of syntax:

for_list(i : l)
//for(new Iter:i = list_iter(l); iter_inside(i); iter_move_next(i))
{
    printf("%d", iter_get(i));
}

Here, a new iterator is created via list_iter, initially pointing to the first element in the list. An iterator can point to any element in the list, or to none, which can be checked using iter_inside.

Lifetime

Lists are not garbage collected, and will reside forever in memory if not deleted. Values stored inside lists are, even though variants, bound to the existence of the list and will not be deleted in the manner standard variants are. However, a list, variant, or string reference inside a list is not bound to the list instance and has its own lifetime.

new List:l = list_new();
new GlobalString:s = @("str");
list_add(l, s); //string is passed by reference
list_delete(l);
assert str_valid(s); //s still exists and must be deleted

Generic lists

In general, any value may be stored inside a list. However, it is possible to limit the usage of lists at compilation time and restrict it to a certain tag. First, a constructed List type must be defined, specifying its inner tag:

#if !defined List@Float
@define_List<Float>;
#endif

This defines all necessary natives and operators for this tag, it also defines a special symbol List@Float which can be used to check if the type is not already defined. After this, List<Float> can be used to refer to a list of Float values:

new List<Float>:l = list_new<Float>();
list_add<Float>(l, 1.0);
list_add<Float>(l, 1.1);
list_add<Float>(l, true); //tag mismatch
assert list_tagof(List:l, 2) == (tagof(Float:));
new Float:f = list_get<Float>(l, 2);

Even though an attempt to insert a bool value is made, the value will still be inserted tagged as Float. All sensible functions will be usable, and any attempt to call a generic function on a concrete list will fail. However, since the restriction is not present at run-time, retagging a list to its base tag will allow inserting any values to it:

new List<Float>:l = list_new<Float>();
list_add(List:l, false);
assert list_tagof(List:l, 0) != (tagof(Float:));

Notice the need to retag the list even when calling list_tagof. Since the tag of the value is implied by the explicitly specified tag, the function is not generic.

Generic lists allow storing both single values and arrays with the matching tag.

Maps

Introduction

Even though storing values in a linear fashion could be useful, sometimes there is need to pair a value with an arbitrary integer, or even a value with an arbitrary type. Maps (internally represented by std::unordered_map) contain a list of key-value pairs, with each all keys unique. They are optimised to allow for fast access using a key to obtain a value paired with that key. Similarly to lists, maps are created by map_new and must be destroyed by calling map_delete.

Operations

Inserting new values inside a map is done by using two families of functions: map_add and map_set. The former tries to insert a new key-value pair into the map, but fails is the key is already present. The latter either adds a new pair into the map, or sets the new value for an existing pair if the key is present.

Conversely, obtaining values from the map can be done using the map_get family of functions:

new Map:m = map_new();
map_add(m, false, 12);
map_add(m, 0, 13);
map_str_add(m, "key1", 14);
map_str_add_arr(m, "key2", {1, 2, 3});
assert map_get(m, false) == 12;
assert map_get(m, 0) == 13;
assert map_str_get(m, "key1") == 14;
new buffer[16];
assert map_str_get_arr(m, "key2", buffer) == 3;

The word before add, set or get specified the type of the key, and the word after it specifies the type of the value. Like for lists, applicable types are cells (no word), arrays (arr), strings (str), or variants (var). The tag of the key is taken into account for key comparisons and must be always correct. Since the key comparison has the same semantics as the == operator on variants, reference types have their value taken into account:

new Map:m = map_new();
map_add(m, @("str1"), 1);
map_add(m, @("str2"), 2);
assert map_get(m, @("str1")) == 1;
assert map_get(m, @("str2")) == 2;

If you want to use only the address of the strings for comparison, the same technique as for variants can be used:

new Map:m = map_new();
map_add(m, Ref<String>:@("str1"), 1);
map_add(m, Ref<String>:@("str2"), 2);
assert !map_get(m, Ref<String>:@("str1"));
assert !map_get(m, Ref<String>:@("str2"));

Since global strings and global variants are tagged using a subtag of String and Variant, the tags don't have to match exactly when comparing keys. Namely, you can index a map containing a global string key with a local string:

new Map:m = map_new();
map_add(m, str_to_global(@("str1")), 1);
assert map_get(m, @("str1")) == 1;

Iterating

Maps can be iterated in virtually the same fashion as lists:

new Map:m = map_new();
map_add(m, 1, 5.0);
map_add(m, 1.1, false);
map_add(m, true, 0);
for_map(i : m)
//for(new Iter:i = map_iter(m); iter_inside(i); iter_move_next(i))
{
    new c, Float:f;
    if(iter_get_key_safe(i, c)) printf("%d", c);
    else if(iter_get_key_safe(i, f)) printf("%f", f);
    print("=>");
    if(iter_get_value_safe(i, c)) printf("%d", c);
    else if(iter_get_value_safe(i, f)) printf("%f", f);
}

While a value in a pair can be modified, the key cannot, since that could cause inconsistencies in the map. It is possible to modify a key by changing the object it points to in case of a string or a variant, but doing so is not recommended. You should first remove the pair and then insert a new one.

Generic maps

Maps can be generic in the same fashion as lists. Doing so requires a similar importing, but this time there are two generic parameters:

#if !defined Map@_@Float
@define_Map<_,Float>;
#endif

Sensible functions that operate on cell or array values are then defined and usable in the code:

new Map<_,Float>:m = map_new<_,Float>();
map_add<_,Float>(m, 1, 1.0);
map_add<_,Float>(m, 2, false); //tag mismatch
map_add<_,Float>(m, 1.0, 6.0); //tag mismatch
new Float:f = map_get<_,Float>(m, 1);

As with lists, this will only insert an int-float pair in the map, but retagging it to Map could break it in a similar way.

Iterators

Introduction

Since 0.8, iterators are normal objects like variants, lists, and maps with their own data and lifetime, resulting in much safer behavior. All iterators share the same set of basic operations which don't depend on the source object, i.e. list iterators and map iterators use the same functions for manipulation.

An iterator points to a single item in a collection, and can be used to access its value, optionally also its key (map iterators). Iterators also support erasing the item they point to, and list iterators can also be used to insert an item in a place where the original item was.

new List:l = list_new();
list_add(l, 1);
list_add(l, 2);
list_add(l, 3);
new Iter:i = list_iter(l); // points to the first element
assert iter_get(i) == 1;
iter_move_next(i);
assert iter_get(i) == 2;
iter_erase(i);
assert iter_get(i) == 3 && list_size(l) == 2;
iter_insert(i, 4);
assert list_get(l, 1) == 4 && list_size(l) == 3;

Operations

Iterators support both relative and absolute movement (iter_move_next, iter_move_previous, iter_to_first, iter_to_last). In contrast with C++, pointing to the "end" of a collection is usually represented by an iterator being "inside" or "outside" it (iter_inside). Inserting an item via an "outside" iterator adds it to the end of the collection. iter_reset moves the iterator "outside" the collection.

Iterators also support the equality operation and cloning. Two iterators are equal if the point to the same position in the same collection.

Every iterator has access to its collection, and is notified when the collection is deleted. In this case, the iterator object still exists, but cannot be used to access the collection anymore.

new List:l = list_new();
new Iter:i = list_iter(l);
assert iter_valid(i) && iter_linked(i);
list_delete(l);
assert iter_valid(i) && !iter_linked(i);

Since modifying a collection in any way may break existing iterators, all iterators are automatically invalidated when a collection is altered (like iter_reset, so absolute movement is still possible).

new List:l = list_new();
list_add(l, 1);
new Iter:i = list_iter(l);
assert iter_inside(i);
list_add(l, 2);
assert !iter_inside(i);

Lifetime and garbage collection

The lifetime of iterators is handled similarly to strings and variants, since usually they are only used temporarily. When created, an iterator is in the "local object pool" which is cleared automatically and aggressively. If you use iter_to_global or assign the iterator to GlobalIter:, the iterator is moved to the global object pool where it stays forever, unless deleted (iter_delete) or moved back (iter_to_local).

If the collection where an iterator points to is deleted, the iterator's lifetime will stay unaffected.

Generic iterators

Generic lists and maps produce generic iterators which employ compile-time safety measures preventing assigning or retrieving incorrectly tagged values:

new List<Float>:l = list_new<Float>();
list_add<Float>(l, 1.0);
list_add<Float>(l, 2.0);
list_add<Float>(l, 3.0);
for_list_of<Float>(i : l)
{
    printf("%f", iter_get<Float>(i));
}
Clone this wiki locally