Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: aes encryption #59

Closed
wants to merge 5 commits into from
Closed

feat: aes encryption #59

wants to merge 5 commits into from

Conversation

mikeal
Copy link
Contributor

@mikeal mikeal commented Jan 11, 2021

Followup from the IPLD weekly call.

This adds basic block layer encryption codecs to IPLD and implementations of the standard AES ciphers along with them.

@Gozala you’re going to need to make the type checker happy cause I have no idea what it wants.

@mikeal mikeal requested a review from Gozala January 11, 2021 23:53
@Gozala
Copy link
Contributor

Gozala commented Jan 12, 2021

@Gozala you’re going to need to make the type checker happy cause I have no idea what it wants.

From a brief glance it seems to be unhappy due to the fact that some parameters had been added without corresponding jsdoc annotations. I'll try to propose some changes when I'll get a chance.

return { cid, bytes }
}
const encrypt = async ({ key, cid, bytes }) => {
const len = enc32(cid.bytes.byteLength)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoding this len is sort of redundant, isn't it, because the CID is self-delimiting?

Granted, it's more work to figure out the length of a CID, because it involves parsing multiple varints. But it's possible. Do we want to make the tradeoff of larger encoded/encrypted size in exchange for simpler implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but it defers the parsing of those varints and keeps it a lot cleaner. I’ve implemented the algorithm for parsing a CID without knowing its length and, because of CIDv0, it’s not as simple as I’d like so I figured it’s better to just eat 4 bytes and simplify the parsing rules.

return { cid, bytes }
}
const encrypt = async ({ key, cid, bytes }) => {
const len = enc32(cid.bytes.byteLength)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was a varint rather than uint32 then [len, cid.bytes, bytes] would be the same as the CAR format. I don't know if that will actually afford us anything but we might find interesting things if we pursue a consistent binary format for cid+bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I take that back, this is the length of the CID which @warpfork is pointing out is redundant, the length in CAR is for the whole thing, cid+bytes. We could copypasta this to remove the length: https://github.com/ipld/js-car/blob/16d11873a33ae1935beb17847de6fbb90faba1d3/lib/decoder.js#L75-L95

@mikeal
Copy link
Contributor Author

mikeal commented Jan 13, 2021

I’m open to dropping the length of the CID if we think it’s worth shaving 4 bytes. We will need to finally write up the parsing rules for this somewhere, which we probably should have already done anyway. Moved to spec ipld/specs#349

@mikeal
Copy link
Contributor Author

mikeal commented Jan 20, 2021

@Gozala i cleaned up most of the type errors but there are some left that I can’t figure out how to deal with because they involve nested values in the function declaration and the type builder says i can’t use the obvious syntax for that

@Gozala
Copy link
Contributor

Gozala commented Jan 26, 2021

@Gozala i cleaned up most of the type errors but there are some left that I can’t figure out how to deal with because they involve nested values in the function declaration and the type builder says i can’t use the obvious syntax for that

Sorry for the not getting around to this sooner, had a lot on my plate. Just pushed a commit that resolves remaining typings.

Looks like the problem was that TS wanted you an object param to have no other param definition before it's nested children.

@mikeal
Copy link
Contributor Author

mikeal commented Jan 26, 2021

awesome, thanks!

@rvagg any final review comments before merging?

Copy link
Contributor

@Gozala Gozala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provides some feedback from the the perspective of someone less up to date on what it's about. Broadly speaking I'd love if things were given names so it's easier to talk about them and conceptualize them.

* @param {Object} options
* @param {Uint8Array} options.bytes
* @param {Uint8Array} options.iv
* @param {Code} options.code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What' the code here ? I'm bit confused because as far as I understand 0x1400 code for encrypted codec, but then there is this other code which I'm not quite sure what that one represents.

Some comments would be really helpful here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it should be cipher instead of code, that's what it's for, a generic encrypted block that uses an iv with the cipher specified by looking up this integer in the multicodec table

const iv = bytes.subarray(offset, offset + ivsize)
offset += ivsize
bytes = bytes.slice(offset)
return { iv, code, bytes }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to give {iv, code, bytes} struct some name (and type def) that way type annotations for encode / decode are more straight forwards and it's also easier to conceptualize.

* @param {Uint8Array} options.value.bytes
* @param {Uint8Array} options.value.iv
*/
const decrypt = async ({ key, value: { iv, bytes } }) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears to me that it would make a lot of sense to give {iv, bytes, code} struct a name (and a type) so it's easier to conceptualize and talk about it.

I would also expect decrypt to expect value.code and assert it against the code instead of just ignoring it all together.

That way decrypt turns key+namedStruct into cid+bytes and encrypt turns key+cid-bytes into namedStruct.

Odd obsession with encrypt/decrypt symmetry also makes me wonder keys should be included in return types as well which would make output of encrypt input of decrypt and vice versa.

Copy link
Contributor Author

@mikeal mikeal Jan 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there’s more utility in having the implementation just use this destructing because then you don’t need to load the type in order to call the method.

If i know that I’m going to call the “decrypt AES function” I can call this with:

const iv = Buffer.from(something1)
const bytes = Buffer.from(something2)
const key = Buffer.from(something3)
decrypt({ key, value: { iv, bytes }})

But also, if I just have a block and I know that it’s encrypted I can spread it into the call.

const key = Buffer.from(something)
descrypt({ key, ...block })

This makes the decrypt function’s interface work with any block based decryption function. It may or may not have an iv, it may have other information in the value, cid, or bytes of the block it wants, and all of that will “just work” if I’m using block spreading.

@rvagg
Copy link
Member

rvagg commented Jan 27, 2021

@Gozala what's the best way to "give them names"? Should we put an interface.ts in the directory to define interfaces for these objects to name them, or can we do that inline here?

@mikeal this needs to have the length removed and the new CID.decodeFirst() used before it's merged.

@Gozala
Copy link
Contributor

Gozala commented Jan 27, 2021

@Gozala what's the best way to "give them names"? Should we put an interface.ts in the directory to define interfaces for these objects to name them, or can we do that inline here?

Both are available options and mostly a matter of preference. I prefer former e.g.:

src/crypto/interface.ts

export interface Secret {
  iv: Uint8Array
  bytes: Uint8Array
}

....

But you could also do inline JSDoc style, which I like less because it more verbose and reads not as clearly IMO:

/**
 * @typedef {Object} Secret
 * @property {Uint8Array} iv
 * @property {Uint8Array} bytes
 */

* @param {Uint8Array} options.bytes
* @param {CID} options.cid
*/
const encrypt = async ({ key, cid, bytes }) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is yet another time where I wish CID+block bytes had a proper name to go by.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually find this to be a much better pattern because there are all kinds of ways that you may arrive at a CID + Bytes (car files reads, blocks, encrypted block decodes, etc) and destructuring them into only the parts these functions need is a great practice that gives the function maximal utility. Once you “name” it you end up with a type, and with blocks that tends to mean that you also have that codec and a decoded state, which is only useful when it is and is a burden when it’s not needed.

@mikeal
Copy link
Contributor Author

mikeal commented Jan 27, 2021

@rvagg ah, yes, thanks for reminding me about that.

@rvagg
Copy link
Member

rvagg commented Jan 29, 2021

OK, I've pushed a version that removes the length prefix and uses CID.decodeFirst() instead.

But here's my proposal for moving forward:

  • This should be its own codec, @ipld/encrypted @ https://github.com/ipld/js-encrypted, since it's adding dependency weight here and is now really just a single codec. Once we do that we could also add ChaCha20 (this seems to be a decent pure JS implementation we could use for that).
  • The AES ciphers in the multicodec table should be renamed to include the key length, aes-128-gcm, aes-192-gcm, aes-256-gcm, etc.
  • We might want to ask someone more qualified whether we should just remove CBC entirely. I think it might be possible to construct a system using encrypted IPLD blocks whereby a communicating with such blocks could expose a padding oracle attack which CBC would be susceptible to. I'm also not sure CCM adds much here either other than proving extensibility.

Thoughts? I do some of this legwork if others agree this is the right path.

/cc @vmx

@carsonfarmer
Copy link
Contributor

While slightly off topic for this thread, I'd recommend checking out stablelib for pure js/ts implementations of just about everything crypto: https://github.com/StableLib/stablelib

@lidel
Copy link
Member

lidel commented Apr 12, 2021

Closing until protocol/web3-dev-team#49 is picked up again.

@lidel lidel closed this Apr 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants