This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Library design recommendations. #241

Merged 3 commits on May 4, 2020

warpfork
Contributor

Starting a new folder for this. The content I'm imagining gathering
here isn't exactly specs, per se... but it's definitely content
that's worth gathering somewhere more centrally than in the comments
in the source code of any single implementing library and language.

First up: some remarks on Node and Kind, and how and why regarding Node
as an interface becomes systemically important.

I felt this was particularly important to write about because it's
quite non-obvious from the first couple of things a new library
author is likely to encounter; it doesn't show up until you
start trying to implement some of the more advanced features...
but by then it may be too late to address without a painful refactor.
So knowing about it earlier is likely to save a great deal of work.

There's probably quite a lot of other content we can gather that
amounts to useful recommendations for implementers, but not exactly "spec",
and that can flesh out this folder later;
this nodes-and-kinds doc is just the first thing I came up with.

`Kind` does not include the Schema layer's concept of "struct", etc.

`Kind` must be an enum, **and not a sum type**. Attempting to implement
Member

Based on our previous conversation, I assume that by "enum" you mean something that can be extended and by "sum type" something fixed. If so, this doesn't make sense to me. I can see that Node should be open, but Kind is a fixed thing.

To me Node is an interface, Kind is a fixed list of items that describe the Data Model.

Contributor Author

@warpfork warpfork Feb 19, 2020

No, other way around. Enum is fixed. Sumtypes aren't.

Enum has a cardinality of the count of its members.

Sumtype has a cardinality of the sum of the cardinality of whatever its members are... so it can easily become "countable infinity" if any one of the members is.

So,

> To me Node is an interface, Kind is a fixed list of items that describe the Data Model.

^ this statement you conclude with is indeed correct.
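
The cardinality distinction can be made concrete with a small sketch. Python is used here purely for illustration, and `Kind`, `IntValue`, `StringValue`, `Scalar`, and `kind_of` are hypothetical names rather than anything from an existing IPLD library. The nine enum members are the Data Model kinds.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Kind(Enum):
    """An enum: its cardinality is simply the number of members (9)."""
    NULL = "null"
    BOOL = "bool"
    INT = "int"
    FLOAT = "float"
    STRING = "string"
    BYTES = "bytes"
    LIST = "list"
    MAP = "map"
    LINK = "link"


@dataclass(frozen=True)
class IntValue:
    value: int


@dataclass(frozen=True)
class StringValue:
    value: str


# A sum type: only two member types, but its cardinality is the sum of
# the cardinalities of int and str, which is unbounded.
Scalar = Union[IntValue, StringValue]


def kind_of(s: Scalar) -> Kind:
    """Map any of the unboundedly many Scalar values onto one Kind."""
    if isinstance(s, IntValue):
        return Kind.INT
    return Kind.STRING
```

The number of member types is fixed in both cases; what differs is how many distinct values can inhabit each type.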

Contributor Author

(It seriously drives me nuts that the Rust syntax conflates these two things into one name. They have such very, very, categorically different properties... 👿 😠 🔥 )

Member

I really need to brush up on my programming language/type theory vocab and write it down for 5 year olds. Or perhaps here it doesn't really matter. Kind is a fixed set of things (whatever we call it).

Contributor Author

... Reading again in the light of the morning, I'm thinking "fixed" is probably not quite the word to disambiguate those two either. Enum and sum types both have fixed membership in terms of types; it's just that the cardinality of member values can easily be infinite in a sum type (it happens as soon as a single thing like 'int' or 'string' is part of one of the sum's member types).

Whereas what the Node type should be is an interface, because that can have a non-fixed and not-known-in-advance number of member types.
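
As a rough sketch of why an interface fits here: the set of implementations stays open, while `Kind` stays closed. Everything below is an assumption for illustration (the `Node` protocol, its two methods, and both implementing classes are hypothetical), and only two of the Data Model kinds are spelled out to keep it short.

```python
from enum import Enum
from typing import Callable, Protocol


class Kind(Enum):
    STRING = "string"
    INT = "int"
    # The remaining Data Model kinds are omitted for brevity.


class Node(Protocol):
    """Hypothetical minimal Node interface; a real one would have more."""

    def kind(self) -> Kind: ...

    def as_string(self) -> str: ...


class PlainString:
    """A Node backed by an in-memory str."""

    def __init__(self, s: str) -> None:
        self._s = s

    def kind(self) -> Kind:
        return Kind.STRING

    def as_string(self) -> str:
        return self._s


class LazyString:
    """A Node written later by someone else: computes its value on demand."""

    def __init__(self, thunk: Callable[[], str]) -> None:
        self._thunk = thunk

    def kind(self) -> Kind:
        return Kind.STRING

    def as_string(self) -> str:
        return self._thunk()


def shout(n: Node) -> str:
    """Works with any Node implementation, whether known in advance or not."""
    if n.kind() is not Kind.STRING:
        raise TypeError("expected a string node")
    return n.as_string().upper()
```

`shout` never needs to change when a new implementation shows up, which is the property a closed sum type of concrete node types would not give you.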

Contributor Author

@warpfork warpfork Feb 20, 2020

Maybe we actually should hammer out a quick type systems vocab primer/glossary for ourselves, here in this repo, just for clarity of reference and ease of linking? Good idea @vmx !

Perhaps something in the 'concepts' directory. The 'cardinality' doc over there already kind of gets close to flirting with this. Expanding on it to talk about sum types versus enums versus product types versus countably-infinite scalars etc could be really useful.

Member

Having some proper vocab/glossary would be great. I think I'm slowly grokking what you mean by cardinality with regard to enums and sum types. Thinking of "enums" as what the name suggests: enumerable items, hence finite. A sum type can have recursion (that's probably not the right word)/wrap "infinite" (as in all ints) values.

Contributor Author

Yeah, you've... got me on a roll with this kick in the shorts, actually, so thanks for that :D I'll probably hoist another PR with some content about this in a day or two.

Member

> I really need to brush up on my programming language/type theory vocab and write it down for 5 year olds.

Don't feel bad, I need to engage a compsci translation layer when trying to absorb some of @warpfork's thoughts. This just argues for being clearer when things get overly academic, so we can talk to a broad enough audience. It's probably on us, when we see things like this, to highlight them as needing further clarification.

Contributor

mikeal commented Feb 19, 2020

Should I add a section on what a good Block interface should be?

@warpfork
Contributor Author

> Should I add a section on what a good Block interface should be?

I'd put it in another file. I kinda have one theme that I wanna hammer the crap out of in this file, which is that Node really needs to be an interface or there will be regrets.

@warpfork
Contributor Author

There's a good idea for further followup in @vmx's comment thread -- but I don't wanna do it today / in this PR 😅

Any objections on moving this one towards a merge?

warpfork added a commit that referenced this pull request Feb 23, 2020
The goal here is to provide just enough discussion about type theory
and the basics of applied state counting that we can establish some
language-independent terminology clearly.

A lot of this exists in literature and theory already... but gathering
it in one place, written in one style, in a single page that can be
read top to bottom in one sitting... seems to provide value.
(Sending someone off on a quest to "read wikipedia and all the related
content around concept $X" is great and all, but, ehm.  It's a little
high latency, a little unreliable in outcome, etc.)

This subsumes and replaces the cardinality doc, so that file be yeet.

This was kicked off in large part by discussion over in
#241 (comment),
so thanks to @vmx for some of the kick in the shorts to start writing.

Also thanks to @bsunsrud and @BatmanAoD for some polishing and
feedback on early drafts, and to @Reasonable-Solutions for some review
of the categorical bits, all of which was very helpful.

Some of the information expressed here comes down to opinions more so than specification;
what is good ergonomics may vary wildly per language, so take these as
recommendations rather than strictures.
Member

I think a preface here about the perspective this is written from would be appropriate. Some of the language below applies very specifically to a narrow set of languages (the para on Node as an interface for example). But that's OK if we make it clear up-front where this is coming from.

Something like:

> These design guidelines are primarily based on the experience of building and rebuilding IPLD libraries in Go, and on reflecting on the limitations of implementations that have existed in both Go and JavaScript and the implications of those limitations for the potential and feature set of IPLD. The language used in these guidelines reflects a Go programming perspective, but the guidelines apply broadly to most strongly typed languages. Loosely typed and untyped languages will need to interpret these guidelines appropriately while extracting the key concepts.

Contributor Author

Added a large block of caveats to the top of the doc.

I almost wonder if we won't end up making a habit of repeating chunks of these at the top of most of our documents. Benedictions on one page or another don't really seem to carry over to even their most proximate siblings when readers jump into one doc or other without following any path to get there that we anticipated.


Transformations can be implemented in this way.

Codecs themselves can be implemented this way.
Member

Can you expand on this one? I can't even figure out what it would mean for a codec to take a Node and return a Node.

Contributor Author

Added some pseudocode to explore this. I'm worried it might be distracting line noise if not read tolerantly, but I guess it's probably better than nothing.
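
For readers of this thread, here is one possible reading of "a codec takes a Node and returns a Node", sketched in Python. It is not the pseudocode that went into the doc. `BasicNode`, `decode_json`, and `uppercase_keys` are hypothetical names, kinds are plain strings to keep it short, and JSON from the standard library stands in for a real IPLD codec.

```python
import json
from typing import Any


class BasicNode:
    """Hypothetical catch-all Node wrapping a native Python value."""

    def __init__(self, value: Any) -> None:
        self._value = value

    def kind(self) -> str:
        v = self._value
        if isinstance(v, bytes):
            return "bytes"
        if isinstance(v, str):
            return "string"
        if isinstance(v, dict):
            return "map"
        raise TypeError(f"unsupported value: {type(v).__name__}")

    def unwrap(self) -> Any:
        return self._value


def decode_json(n: BasicNode) -> BasicNode:
    """Codec as Node -> Node: a bytes node in, a decoded node out."""
    if n.kind() != "bytes":
        raise TypeError("decode expects a bytes node")
    return BasicNode(json.loads(n.unwrap()))


def uppercase_keys(n: BasicNode) -> BasicNode:
    """Transformation as Node -> Node: a map node in, a new map node out."""
    if n.kind() != "map":
        raise TypeError("expected a map node")
    return BasicNode({k.upper(): v for k, v in n.unwrap().items()})
```

Because both functions share the same shape, they compose directly, e.g. `uppercase_keys(decode_json(BasicNode(b'{"a": 1}')))`.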

Member

rvagg commented Mar 6, 2020

Some small suggestions in my comments. I'm OK with merging this after addressing those.

@warpfork, the most valuable thing from the discussion you and I had about this, from the JavaScript perspective, was the set of ideas around porcelain vs plumbing. We've totally messed that up in JS, which is why the design of ipld-prime is so radically different from the APIs we currently have available in JS.

This is a higher-level concern than what you've addressed here, so maybe it doesn't belong in this file; or it could go into an intro section. It'd be something like "why we need a Node abstraction around IPLD data" and would discuss metadata and state maintenance: sometimes you just want a native instantiation of some blob of IPLD data (AsObject()), but in doing so you lose valuable data that you may need for certain workflows, and you miss out on efficiency and performance opportunities.
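
A minimal sketch of that trade-off, under loud assumptions: `LoadedNode`, its `raw` and `source` fields, and the Python spelling `as_object` are all made up for illustration, and which metadata a real Node would carry is left open.

```python
from typing import Any, Optional


class LoadedNode:
    """Hypothetical 'plumbing' node: keeps state alongside the data."""

    def __init__(self, value: Any, raw: bytes, source: Optional[str]) -> None:
        self._value = value
        # Original encoded bytes, reusable without re-encoding (assumed field).
        self.raw = raw
        # Where the data was loaded from, as an opaque string (assumed field).
        self.source = source

    def as_object(self) -> Any:
        """'Porcelain' view: a plain native value with no metadata attached."""
        return self._value
```

Once a caller holds only the result of `as_object()`, the `raw` bytes and `source` are no longer reachable from it, so anything that needs them has to recompute or refetch.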

Member

rvagg commented Mar 6, 2020

this also deserves an entry in the main README.md I reckon

warpfork added 2 commits May 4, 2020 14:03
More caveats and purpose preface at the top.

Remark on the possibility of Node/NodeBuilder split in a new section.

Expand upon some of the higher-level functions content with short
examples of possible code.  (I'm a little worried that putting any
syntax at all here might reduce the broadness of audience, but
apparently _something_ is necessary, per review comments in
#241 (comment) ...
so maybe this will do it.  I've chosen a syntax that simply doesn't
exist in any language at all (to my knowledge) to make sure everyone
in the world is _equally_ affronted.  Gesundheit.)
Contributor Author

warpfork commented May 4, 2020

I addressed a bunch of comments, and this has been out for quite a while, so I'm gonna roll forward in accordance with that last ~"lgtm after suggestions" comment and merge this.

I didn't attempt to address that porcelain-vs-plumbing thing @rvagg brought up, but I think it'd be really really excellent to do that in the future.

@warpfork warpfork merged commit 819fcb0 into master May 4, 2020
@warpfork warpfork deleted the library-design-recommendations branch May 4, 2020 12:42
warpfork added a commit that referenced this pull request May 4, 2020
Member

rvagg commented May 5, 2020

some of my recent work even switched to using the "porcelain" terminology .. it wasn't really part of my CS language (aside from git's use of it) but you've colonised my head on this one

prataprc pushed a commit to iprs-dev/ipld-specs that referenced this pull request Oct 13, 2020
prataprc pushed a commit to iprs-dev/ipld-specs that referenced this pull request Oct 13, 2020
prataprc pushed a commit to iprs-dev/ipld-specs that referenced this pull request Oct 13, 2020