user-defined aggregate types #609

YashasSamaga · 2020-11-11T14:36:09Z

Issue description:

Most developers (ab)use enums to create aggregate user-defined types. There are a few disadvantages to this method:

it's a hack! enum structs are handled in an unconventional way and have absurd rules and special cases to make them work
identifiers are spilled into global namespace
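
For illustration, here is a minimal sketch of the enum idiom described above. The names E_PLAYER and gPlayer are hypothetical, and the include assumes the SA-MP headers (which declare printf); it is not taken from any real script.

```pawn
// Assumption: built against the SA-MP includes, which declare printf.
#include <a_samp>

// Hypothetical "struct" built from an enum. Every field name becomes a
// global constant, hence the long prefixes to avoid collisions.
enum E_PLAYER
{
    E_PLAYER_ID,
    Float:E_PLAYER_HEALTH,
    bool:E_PLAYER_SPAWNED
}

// The enum name doubles as the array size (one cell per field).
new gPlayer[E_PLAYER];

main()
{
    gPlayer[E_PLAYER_ID] = 1;
    gPlayer[E_PLAYER_HEALTH] = 100.0;
    gPlayer[E_PLAYER_SPAWNED] = true;

    // The field name is an ordinary global constant, visible everywhere,
    // not something scoped to the "struct".
    printf("%d", _:E_PLAYER_HEALTH);
}
```

The prefixes exist only to keep the field constants from colliding with other global identifiers, which is the second disadvantage listed above.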

We can implement a new system to support user-defined aggregate types once and for all. To proceed, we need to draft a spec for the syntax and semantics and also reach consensus on the feature.

Here is my proposal (ad hoc and not well thought out) to kick off discussions:

Tags are currently used for user-defined types. We can extend them to allow user-defined aggregate types.

/* static */ struct Player {
    id,
    playerid,
    Float:health,
    bool:spawned,
    
    bool:isValid() { return id != 0; }
    spawn(Float:pX, Float:pY, Float:pZ) { /* do something */ }

    Player:setHealth(Float:hp) { SetPlayerHealth(playerid, hp); return this; }
    Player:setArmour(Float:ar) { SetPlayerArmor(playerid, ar); return this; }

    static Player:create() { /* no 'this' here */ } // just syntactic sugar: Player.create()? do we even need this in the first draft?
    static const Float:MAX_HEALTH = 100.0; // Player.MAX_HEALTH?
};

bool:operator==(Player:a, Player:b) { return a.id == b.id; } // operator overloads as member functions aren't required

sizeof(Player); // returns the size of the struct (number of cells)
tagof(Player:); // returns the tag id

new Player:x; // allocates a block of memory of sizeof(Player) -- practically equivalent to x[sizeof(Player)]
new Player:pInfo[MAX_PLAYERS]; // equivalent to pInfo[MAX_PLAYERS][sizeof(Player)];

x.setHealth(100.0).setArmour(100.0);

doPlayerStuff(&Player:p) { } // passes the address
doPlayerStuff(Player:p) { } // passes a copy of the object in stack

We can split tags into two classes: aggregate-type tags and primitive-type tags. We can use the last-but-one bit to distinguish them (the MSB is currently used to mark STRONG and WEAK tags). Tag overrides between aggregate types might trigger a warning.

Questions:

what should be printed by the following code:

new Player:x;
main () { printf("%d", _:x); }

Does pawn allow references to be returned?
Does pawn allow operator overloads to accept references?
What is this? What should be the output of printf("%d", _:this);?
Is call-by-value as default newbie friendly?
What does new Player:a = Player:0; mean?
How to embed debug information?
What happens to an empty struct?

We could in principle implement the whole thing as an enum based struct internally and add special handling to prevent identifiers from being spilled out. This would mean passing an object by default would pass by reference.

Bonus: simulate namespaces with static members

struct Vector3f {
    Float:x,
    Float:y,
    Float:z,

    Float:norm() { return VectorSize(x, y, z); }
};

struct Vehicle {
    static const INVALID_ID = 65535;

    static Create(modelid, Vector3f:pos) { return CreateVehicle(modelid, pos.x, pos.y, pos.z); }
    static Create(modelid, Float:x, Float:y, Float:z) { /* ... */ } // in case we support function overloading for non-publics some day
    static Destroy(id) { return DestroyVehicle(id); }

    static bool:IsValid(id) { }
};

new Vector3f:position;
position.x = 123.0;
position.y = 555.0;
position.z = 1234.0;

new id = Vehicle.Create(420, position);

While this does simulate a namespace, it uses a . instead of the _ convention for naming modules. I don't like this syntax but this is a possibility.

Bonus: unions with some tag overrides

struct TypeA {
    Float:x
};

struct TypeB {
    x
};

new TypeA:something;

new x = (TypeB:something).x;

We can similarly make variants by having a type member variable.

Bonus: abuse to support polymorphism (needs more work but it seems to be possible)

struct BaseClass {
    type,
    something() {
        switch(type) {
            case 0: { /* something for base class -- maybe crash if you want base to be abstract? very bad but ... */ }
            case 1: (DerivedClass:this).something();
        }
    }
};

struct DerivedClass {
    type,
    something() { }
};

Related issues:

Tag-based OO: Tag-based OO-like code #234
Inheritance: Feature proposal: enum expansion/inheritance #608


Daniel-Cortez · 2020-11-11T17:14:57Z

While I like this idea (and would love to try implementing it when #234 is done), the draft itself definitely needs a few corrections.

/* static */ struct Player {

Obviously, we can't use struct as a keyword, as it's a valid identifier and thus can impact compatibility with existing code.
Maybe we could use __struct instead? (Or *struct, but the only keywords using the "*" prefix - *begin, *end and *then - may be removed in the future.)

doPlayerStuff(Player:p) { } // passes a copy of the object in stack

Not sure if this is a good idea. If I understand this correctly, internally p would be an array (as adding a new symbol type instead of reusing iARRAY/iREFARRAY would be a pain), and currently Pawn doesn't allow passing arrays by value.

bool:operator==(Player:a, Player:b) { return a.id == b.id; }

Not sure if this would be possible either, as operators can't have array arguments (except for destructors, but this is so multiple variables could be destroyed with a single destructor call).

what should be printed by the following code:
new Player:x;
main () { printf("%d", _:x); }

The address of x, probably? As internally x would be an array.

Does pawn allow references to be returned?
Does pawn allow operator overloads to accept references?

No for both.

What is this?

I think it must be the first (hidden) argument of a method, as described in #234.
Since we'll want to be able to map methods to existing functions, this would need to be of type iVARIABLE for single values, but for aggregate-tagged values we'll have to use iREFARRAY, as they are essentially arrays.

What should be the output of printf("%d", _:this);?

Same as for question 1.

Is call-by-value as default newbie friendly?

Can you please elaborate on what exactly you mean by "call-by-value"?

What does new Player:a = Player:0; mean?

I think we should error when a primitive-tagged single value is overridden with an aggregate-type tag, as we can't convert a single value into an array.

What happens to an empty struct?

They probably shouldn't be allowed, as arrays can't have a zero length.

Y-Less · 2020-11-11T21:14:32Z

OK, I've not read too carefully yet - I'll get to that. But one thing is that I really don't think that whatever the feature is should hide the size of the data. For example:

bool:operator==(Player:a, Player:b) { return a.id == b.id; }

There's nothing in the function declaration there to expose the fact that Player is a structure, not a simple cell. Maybe enums have other problems, but at least they kept that clear and explicit.

Y-Less · 2020-11-11T21:16:22Z

what should be printed by the following code:

new Player:x;
main () { printf("%d", _:x); }

Without major reworking, that would print the value of the first field. The same thing that happens with an enum in that case. Anything else would be almost impossible and require printf to be aware of the parameter type.

Y-Less · 2020-11-11T21:18:44Z

Something to consider, as a middle-ground between abandoning enums entirely for totally new syntax, and keeping the good parts without issues like name scope:

https://www.nextptr.com/tutorial/ta1423015134/scoped-class-enums-fundamentals-and-examples

Y-Less · 2020-11-11T21:22:16Z

If we did have enum struct instead, we could have enum const as well for enums explicitly only for use as values, with the same scoping rules. I.e. these would be different:

enum const Tag1
{
    A, // 0
    B,  // 1
}

enum const Tag2
{
    _NONE, // 0
    A, // 1
    B,  // 2
}

main()
{
    new Tag1:a = A; // 0
    new Tag2:a = A; // 1
    new Tag3:a = A; // Error
}

Y-Less · 2020-11-11T21:23:02Z

Sort of inspired by a hybrid of C++ enum struct and Pawn 4 const structs.

Y-Less · 2020-11-11T21:31:31Z

Because C++ enum struct is more like what I just suggested for enum const, I'll make my idea for pawn enum struct more explicit.

Old code:

enum E_HOOK_NAME_REPLACEMENT_DATA
{
	E_HOOK_NAME_REPLACEMENT_SHORT[16],
	E_HOOK_NAME_REPLACEMENT_LONG[16],
	E_HOOK_NAME_REPLACEMENT_MIN,
	E_HOOK_NAME_REPLACEMENT_MAX
}

static stock
	YSI_g_sReplacements[MAX_HOOK_REPLACEMENTS][E_HOOK_NAME_REPLACEMENT_DATA];

		if ((pos = strfind(name, YSI_g_sReplacements[idx][E_HOOK_NAME_REPLACEMENT_SHORT], false, pos + 1)) == -1)
		{
			++i,
			idx = YSI_g_sReplacementsShortOrder[i];
		}

New code:

enum struct E_HOOK_NAME_REPLACEMENT_DATA
{
	E_HOOK_NAME_REPLACEMENT_SHORT[16],
	E_HOOK_NAME_REPLACEMENT_LONG[16],
	E_HOOK_NAME_REPLACEMENT_MIN,
	E_HOOK_NAME_REPLACEMENT_MAX
}

static stock
	YSI_g_sReplacements[MAX_HOOK_REPLACEMENTS][E_HOOK_NAME_REPLACEMENT_DATA];

		if ((pos = strfind(name, YSI_g_sReplacements[idx][E_HOOK_NAME_REPLACEMENT_SHORT], false, pos + 1)) == -1)
		{
			++i,
			idx = YSI_g_sReplacementsShortOrder[i];
		}

Idiomatic code:

enum struct E_HOOK_NAME_REPLACEMENT_DATA
{
	SHORT[16],
	LONG[16],
	MIN,
	MAX
}

static stock
	YSI_g_sReplacements[MAX_HOOK_REPLACEMENTS][E_HOOK_NAME_REPLACEMENT_DATA];

		if ((pos = strfind(name, YSI_g_sReplacements[idx][SHORT], false, pos + 1)) == -1)
		{
			++i,
			idx = YSI_g_sReplacementsShortOrder[i];
		}

99% of existing tutorials are kept exactly the same. What people know is the same. You can literally upgrade just by adding the word struct. But, if you want to go a step further you can remove all the decollision name mangling as in the final example with no worries.

As for the .field syntax, I'd say that's independent of this. You could have it or not, it doesn't really matter (.SHORT is only one character less than [SHORT]).

BitFros7y · 2020-11-20T03:36:24Z

OK, I've not read too carefully yet - I'll get to that. But one thing is that I really don't think that whatever the feature is should hide the size of the data. For example:
bool:operator==(Player:a, Player:b) { return a.id == b.id; }
There's nothing in the function declaration there to expose the fact that Player is a structure, not a simple cell. Maybe enums have other problems, but at least they kept that clear and explicit.

Why would we need to declare in the function header whether it's a structure or something else? It's tagged; that tag is used for a structure, and the compiler should figure out that it's that exact structure, along with its members and size. If anything, we could use some other and/or additional special character:
bool:operator==(Player::a, Player::b) { return a.id == b.id; } should be enough to tell everyone and everything that it's actually a structure.
Or we could even go with something like this (just a stupid example, but still...):
bool:operator==(Player@a, Player@b) { return a.id == b.id; }
And to be honest, I personally would not mind if it were a double tag:
bool:operator==(Struct:Player:a, Struct:Player:b) { return a.id == b.id; }

Y-Less · 2020-11-20T10:25:30Z

Pawn is based on cells. One variable is one cell. Anything more than one cell is an array, using []. This syntax is confusing the situation, because now just looking at a variable you can no longer tell if it is a single cell or more than one, nor can you tell how the data is passed (by-value vs by-reference). Fortunately, we already have syntax to show this - []; no need for any new syntax like Struct: or ::.

BitFros7y · 2020-11-22T00:05:09Z

You yourself said that in Pawn, anything more than one cell is an array, and arrays are passed by reference. That is stated in the Pawn docs, and that's how we know it. So why not just expand that documentation to define what a structure is and how it's passed? For example: "A structure is a custom decorated, named array that can have one or more cells called members. Members should be accessed with the . operator. A structure can have other structures as members, and its size directly depends on its members."

That is a vague and probably wrong description, but I hope you get my point. All that is needed is to define the syntax and its meaning. If we choose :: for structures, then every time the user sees Player::a, they will know that it is a structure called a, of type Player. They check (or the IDE tells them) what members the Player structure contains, and they can access them with a.membername.

By the way, you mentioned confusing syntax, but Pawn already has confusing syntax by default. For example, without looking at what this function does, how can you tell that it returns an array?

stock pName(playerid)
{
	new PlayerName[MAX_PLAYER_NAME];
	GetPlayerName(playerid, PlayerName, MAX_PLAYER_NAME);
	return PlayerName;
}

Y-Less · 2020-11-22T09:22:39Z

"There's already confusing syntax" is not a good argument for adding more confusing syntax.

Y-Less · 2020-11-22T09:27:32Z

There are already three different ways to pass a variable by reference. You're proposing a fourth, which is no shorter than any of the others; I just don't think it is worth the overhead at all.

Y-Less · 2020-11-30T01:51:41Z

Another common convention that already exists is using <> for things that are almost, but not quite, array-like.

YashasSamaga added area: syntax state: discuss type: feature labels Nov 11, 2020

Daniel-Cortez mentioned this issue Nov 14, 2020: Almost out of free error IDs #615 (Open)

Daniel-Cortez mentioned this issue May 30, 2021: Tag-based OO-like code #234 (Open)
