Library design recommendations. #241

warpfork · 2020-02-19T14:40:07Z

Starting a new folder for this. The content I'm imagining gathering
here isn't exactly specs, per se... but it's definitely content
that's worth gathering somewhere more centrally than in the comments
in the source code of any single implementing library and language.

First up: some remarks on Node and Kind, and how and why regarding Node
as an interface becomes systemically important.

I felt this was particularly important to write about because it's
quite non-obvious from the first few things a new library author
is likely to encounter; it doesn't show up until you
start trying to implement some of the more advanced features...
but by then it may be too late to address without a painful refactor.
So knowing about it earlier is likely to save a great deal of work.

There's probably quite a lot of other content we can gather that would
be useful as recommendations for implementers, but is not exactly "spec",
and that can flesh out this folder later;
this nodes-and-kinds doc is just the first thing I came up with.

vmx · 2020-02-19T16:03:31Z

design/libraries/nodes-and-kinds.md

+
+`Kind` does not include the Schema layer's concept of "struct", etc.
+
+`Kind` must be an enum, **and not a sum type**.  Attempting to implement


Based on previous conversation, I assume that by "enum" you mean something that can be extended and by "sum type" something fixed. Then it doesn't make sense to me. I can see that Node should be open, but Kind is a fixed thing.

To me Node is an interface, Kind is a fixed list of items that describe the Data Model.

No, other way around. Enum is fixed. Sumtypes aren't.

Enum has a cardinality of the count of its members.

Sumtype has a cardinality of the sum of the cardinality of whatever its members are... so it can easily become "countable infinity" if any one of the members is.

So,

> To me Node is an interface, Kind is a fixed list of items that describe the Data Model.

^ this statement you conclude with is indeed correct.
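To make the cardinality point concrete, here's a minimal sketch in Rust (the one language where both things share a keyword). The `Scalar` type and the `ALL` constant are hypothetical names made up for illustration, not anything from an existing IPLD library; `Kind` lists the nine Data Model kinds.

```rust
// Hypothetical types for illustration only; not taken from any IPLD library.

/// A plain enum: members are bare labels, so the type has exactly as many
/// values as it has members (nine here, one per Data Model kind).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Null,
    Bool,
    Int,
    Float,
    String,
    Bytes,
    List,
    Map,
    Link,
}

impl Kind {
    /// Every value can be listed, because the cardinality is finite and fixed.
    const ALL: [Kind; 9] = [
        Kind::Null,
        Kind::Bool,
        Kind::Int,
        Kind::Float,
        Kind::String,
        Kind::Bytes,
        Kind::List,
        Kind::Map,
        Kind::Link,
    ];
}

/// A sum type: each member carries a payload, so the number of values is the
/// sum of the payload cardinalities.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Null,           // contributes 1 value
    Bool(bool),     // contributes 2 values
    Int(i64),       // contributes 2^64 values
    String(String), // contributes an effectively unbounded number of values
}

impl Scalar {
    /// Collapsing a value down to its kind throws the payload away.
    fn kind(&self) -> Kind {
        match self {
            Scalar::Null => Kind::Null,
            Scalar::Bool(_) => Kind::Bool,
            Scalar::Int(_) => Kind::Int,
            Scalar::String(_) => Kind::String,
        }
    }
}

fn main() {
    assert_eq!(Kind::ALL.len(), 9);

    let values = [
        Scalar::Null,
        Scalar::Bool(true),
        Scalar::Int(7),
        Scalar::Int(8),
        Scalar::String("hi".to_string()),
    ];

    // Two distinct values of the sum type can share a single kind.
    assert_ne!(values[2], values[3]);
    assert_eq!(values[2].kind(), values[3].kind());

    for value in &values {
        println!("{:?} has kind {:?}", value, value.kind());
    }
}
```

Both declarations use the `enum` keyword, but only the first can have all of its values written out in a constant.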

(It seriously drives me nuts that the Rust syntax conflates these two things into one name. They have such very, very, categorically different properties... 👿 😠 🔥 )

I really need to brush up on my programming language/type theory vocab and write it down for 5 year olds. Or perhaps here it doesn't really matter. Kind is a fixed set of things (whatever we call it).

... Reading again in the light of the morning, I'm thinking "fixed" is probably not quite the word to disambiguate those two either. Enum and sum types both have fixed membership in terms of types; it's just that the cardinality of member values can easily be infinite in a sum type (it happens as soon as a single thing like 'int' or 'string' is part of one of the sum's member types).

Whereas what the Node type should be is an interface, because that can have a non-fixed and not-known-in-advance number of member types.
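A rough sketch of that split, again in Rust and again with made-up names (`PlainInt`, `PlainString`, `Port`, `describe`): `Kind` stays a closed enum (abbreviated to two members here for brevity), while `Node` is a trait that any number of types can implement, including ones written long after the trait itself.

```rust
// Hypothetical sketch; names and method set are assumptions for illustration.

/// Closed set: abbreviated to two kinds to keep the example short.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Int,
    String,
}

/// Open set: any type may implement this, now or later.
trait Node {
    fn kind(&self) -> Kind;

    // Accessors default to "not this kind" so implementers only
    // override the ones that apply to them.
    fn as_int(&self) -> Option<i64> {
        None
    }
    fn as_string(&self) -> Option<String> {
        None
    }
}

struct PlainInt(i64);

impl Node for PlainInt {
    fn kind(&self) -> Kind {
        Kind::Int
    }
    fn as_int(&self) -> Option<i64> {
        Some(self.0)
    }
}

struct PlainString(String);

impl Node for PlainString {
    fn kind(&self) -> Kind {
        Kind::String
    }
    fn as_string(&self) -> Option<String> {
        Some(self.0.clone())
    }
}

/// A type added later, possibly in another crate. It has its own storage
/// but presents itself as an Int, and neither `Node` nor `Kind` had to change.
struct Port(u16);

impl Node for Port {
    fn kind(&self) -> Kind {
        Kind::Int
    }
    fn as_int(&self) -> Option<i64> {
        Some(i64::from(self.0))
    }
}

/// Generic code is written once against the interface and switches on
/// the fixed set of kinds.
fn describe(node: &dyn Node) -> String {
    match node.kind() {
        Kind::Int => format!("int {}", node.as_int().unwrap_or_default()),
        Kind::String => format!("string {:?}", node.as_string().unwrap_or_default()),
    }
}

fn main() {
    assert_eq!(describe(&PlainInt(3)), "int 3");
    assert_eq!(describe(&PlainString("hi".to_string())), "string \"hi\"");
    assert_eq!(describe(&Port(8080)), "int 8080");
}
```

The `match` in `describe` can be exhaustive precisely because `Kind` is closed; the list of `Node` implementers never needs to be.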

Maybe we actually should hammer out a quick type systems vocab primer/glossary for ourselves, here in this repo, just for clarity of reference and ease of linking? Good idea @vmx !

Perhaps something in the 'concepts' directory. The 'cardinality' doc over there already kind of gets close to flirting with this. Expanding on it to talk about sum types versus enums versus product types versus countably-infinite scalars etc could be really useful.

Having some proper vocab/glossary would be great. I think I slowly grok what you mean by cardinality with regard to enums and sum types. Thinking of "enums" as what the name suggests: enumerable items, hence finite. A sum type can have recursion (that's probably not the right word) or wrap "infinite" (as in all ints) values.

Yeah, you've... got me on a roll with this kick in the shorts, actually, so thanks for that :D I'll probably hoist another PR with some content about this in a day or two.

> I really need to brush up on my programming language/type theory vocab and write it down for 5 year olds.

Don't feel bad, I need to engage a compsci translation layer when trying to absorb some of @warpfork's thoughts. This just argues for needing to be clearer when things get overly academic so we can talk to a broad enough audience. It's probably on us when we see things like this to highlight them as needing further clarification.

mikeal · 2020-02-19T19:45:13Z

Should I add a section on what a good Block interface should be?

warpfork · 2020-02-20T10:50:12Z

> Should I add a section on what a good Block interface should be?

I'd put it in another file. I kinda have one theme that I wanna hammer the crap out of in this file, which is that Node really needs to be an interface or there will be regrets.

warpfork · 2020-02-20T11:00:19Z

There's a good idea for further followup in @vmx's comment thread -- but I don't wanna do it today / in this PR 😅

Any objections on moving this one towards a merge?

@vmx

The goal here is to provide just enough discussion about type theory and the basics of applied state counting that we can establish some language-independent terminology clearly. A lot of this exists in literature and theory already... but gathering it in one place, written in one style, in a single page that can be read top to bottom in one sitting... seems to provide value. (Sending someone off on a quest to "read wikipedia and all the related content around concept $X" is great and all, but, ehm. It's a little high latency, a little unreliable in outcome, etc.) This subsumes and replaces the cardinality doc, so that file be yeet. This was kicked off in large part by discussion over in #241 (comment), so thanks to @vmx for some of the kick in the shorts to start writing. Also thanks to @bsunsrud and @BatmanAoD for some polishing and feedback on early drafts, and to @Reasonable-Solutions for some review of the categorical bits, all of which was very helpful.

design/libraries/nodes-and-kinds.md

rvagg · 2020-03-05T06:31:11Z

design/libraries/README.md

+
+Some of the information expressed here comes down to opinions moreso than specification;
+what is good ergonomics may vary wildly per language, so take these as
+recommendations rather than strictures.


I think a preface here about the perspective this is written from would be appropriate. Some of the language below applies very specifically to a narrow set of languages (the para on Node as an interface for example). But that's OK if we make it clear up-front where this is coming from.

Something like:

These design guidelines are primarily based on the experience of building and rebuilding IPLD libraries in Go, and on reflecting on the limitations of implementations that have existed in both Go and JavaScript and the implications of those limitations for the potential and feature set of IPLD. The language used in these guidelines is reflective of a Go programming perspective but applies broadly to most strongly typed languages. Loosely typed and untyped languages will need to interpret these guidelines appropriately while extracting the key concepts.

Added a large block of caveats to the top of the doc.

I almost wonder if we won't end up making a habit of repeating chunks of these at the top of most of our documents. Benedictions on one page or another don't really seem to carry over to even their most proximate siblings when readers jump into one doc or other without following any path to get there that we anticipated.

rvagg · 2020-03-06T02:23:05Z

design/libraries/nodes-and-kinds.md

+
+Transformations can be implemented in this way.
+
+Codecs themselves can be implemented this way.


Can you expand on this one? I can't even figure out what it would mean for a codec to take a Node and return a Node.

Added some pseudocode to explore this. I'm worried it might be distracting line noise if not read tolerantly, but I guess it's probably better than nothing.
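For anyone reading this thread without the doc open: the following is not the pseudocode that landed in the doc, just a minimal Rust sketch of one possible reading. It assumes a decoder can be seen as a function from a Bytes-kind Node to a Node holding the decoded data, and an encoder as the reverse. The toy "codec" (ASCII decimal integers) and all names here are hypothetical.

```rust
// Hypothetical sketch: one way a codec could be "Node in, Node out".
// The decimal-text "codec" is a stand-in, not a real IPLD codec.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Bytes,
    Int,
}

trait Node {
    fn kind(&self) -> Kind;
    fn as_bytes(&self) -> Option<&[u8]> {
        None
    }
    fn as_int(&self) -> Option<i64> {
        None
    }
}

/// Serialized data is itself just a node of kind Bytes.
struct BytesNode(Vec<u8>);

impl Node for BytesNode {
    fn kind(&self) -> Kind {
        Kind::Bytes
    }
    fn as_bytes(&self) -> Option<&[u8]> {
        Some(self.0.as_slice())
    }
}

struct IntNode(i64);

impl Node for IntNode {
    fn kind(&self) -> Kind {
        Kind::Int
    }
    fn as_int(&self) -> Option<i64> {
        Some(self.0)
    }
}

/// Decode: Bytes node in, Int node out. Returns None if the input is not
/// a Bytes node or does not hold a valid decimal integer.
fn decode_decimal(input: &dyn Node) -> Option<Box<dyn Node>> {
    let raw = input.as_bytes()?;
    let text = std::str::from_utf8(raw).ok()?;
    let value = text.parse::<i64>().ok()?;
    Some(Box::new(IntNode(value)))
}

/// Encode: Int node in, Bytes node out.
fn encode_decimal(input: &dyn Node) -> Option<Box<dyn Node>> {
    let value = input.as_int()?;
    Some(Box::new(BytesNode(value.to_string().into_bytes())))
}

fn main() {
    let wire = BytesNode(b"42".to_vec());

    let decoded = decode_decimal(&wire).expect("valid decimal bytes");
    assert_eq!(decoded.kind(), Kind::Int);
    assert_eq!(decoded.as_int(), Some(42));

    // Round trip: the encoder has the same Node-to-Node shape.
    let encoded = encode_decimal(&*decoded).expect("an Int node");
    assert_eq!(encoded.as_bytes(), Some(&b"42"[..]));
}
```

Under this reading, a codec has the same shape as any other transformation, which is presumably why the two lines sit next to each other in the doc.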

rvagg · 2020-03-06T02:29:24Z

Some small suggestions in my comments. I'm OK with merging this after addressing those.

The most valuable thing from the discussion you and I had about this, @warpfork, from the JavaScript perspective was the set of ideas around porcelain vs plumbing, which we've totally messed up in JS, and which leads to a radically different design in ipld-prime versus the APIs we currently have available in JS.

This is a higher-level concern than what you've addressed here, so maybe it doesn't belong in this file, or it could go into an intro section. It'd be something like "why we need a Node abstraction around IPLD data" and would discuss metadata and state maintenance: sometimes you just want a native instantiation of some blob of IPLD data (AsObject()), but in doing so you lose valuable data that you may need for certain workflows, and you miss out on efficiency and performance opportunities.

rvagg · 2020-03-06T02:31:05Z

this also deserves an entry in the main README.md I reckon

More caveats and purpose preface at the top. Remark on the possibility of Node/NodeBuilder split in a new section. Expand upon some of the higher-level functions content with short examples of possible code. (I'm a little worried that putting any syntax at all here might reduce the broadness of audience, but apparently _something_ is necessary, per review comments in #241 (comment) ... so maybe this will do it. I've chosen a syntax that simply doesn't exist in any language at all (to my knowledge) to make sure everyone in the world is _equally_ affronted. Gesundheit.)

warpfork · 2020-05-04T12:39:53Z

I addressed a bunch of comments, and this has been out for quite a while, so I'm gonna roll forward in accordance with that last ~"lgtm after suggestions" comment and merge this.

I didn't attempt to address that porcelain-vs-plumbing thing @rvagg brought up, but I think it'd be really really excellent to do that in the future.

rvagg · 2020-05-05T04:05:53Z

some of my recent work even switched to using the "porcelain" terminology .. it wasn't really part of my CS language (aside from git's use of it) but you've colonised my head on this one

vmx reviewed Feb 19, 2020

warpfork mentioned this pull request Feb 23, 2020

Introduce a type theory glossary. #242

Merged

rvagg reviewed Mar 5, 2020

rvagg reviewed Mar 5, 2020

rvagg reviewed Mar 6, 2020

warpfork added 2 commits May 4, 2020 14:03

Add links to new content from root readme.

965b927

warpfork merged commit 819fcb0 into master May 4, 2020

warpfork deleted the library-design-recommendations branch May 4, 2020 12:42

prataprc pushed a commit to iprs-dev/ipld-specs that referenced this pull request Oct 13, 2020

Merge pull request ipld#241 from ipld/library-design-recommendations

c8e0880

Library design recommendations.

warpfork mentioned this pull request Oct 21, 2020

CBOR: Data model serialization in Rust #323

Open
