Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove serde_jcs dependency #4641

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

ernoc
Copy link
Contributor

@ernoc ernoc commented Jan 10, 2024

And make Rekor signature verification more robust.

#4616

WIP WIP WIP - feedback wanted:

For the purpose of verifying the Rekor LogEntry signedEntryTimestap field, we don't really care about the contests of the rest of the JSON. That is, we extract root[0].verification.signedEntryTimestamp, but the rest of root[0] (having removed verification, what we really want is its verbatim text.

At the moment we parse and re-render the entire thing: parse JSON bytes into Rust structs --> process --> render to JSON bytes. Parsing will break if Rekor changes the structure in any way, including changes to parts we don't care about .

We render RFC 8785 compliant text as suggested by Rekor here. If there were any differences (e.g. Rekor's payload didn't really comform to RFC 8785, or our output doesn't, or somehow there's a yet undiscovered ambiguity in the standard such that 2 different byte arrays are compliant), then this part breaks. The only solution tough would be doing complicated string manipulation which I'm not suggesting.

I'm proposing:

  1. Don't fully parse the structure as we don't need most of it. Thus our code becomes robust to changes in the bits we don't care about.

  2. Try rendering the rest of JSON using generic JSON structures and serde_json::to_string. Given the structure only contains integers and strings, the only bit that would be likely to cause problems is field ordering, which may be solvable with serde_json. So I'd write a bunch of tests and try to disprove this. If it fails, then I'd add serde_canonical_json as a dependency.

Thoughts?

@ernoc ernoc requested a review from thmsbinder January 10, 2024 15:33
@@ -1 +1,35 @@
{"24296fb24b8ad77a51d549703a3a1c2dd2639ba49617fc563854031cb93e6d354e7b005065c334a8":{"body":"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoicmVrb3JkIiwic3BlYyI6eyJkYXRhIjp7Imhhc2giOnsiYWxnb3JpdGhtIjoic2hhMjU2IiwidmFsdWUiOiI4ZTA2Nzk1ODA5NjM3ZTI3MjNmNjQzODE5MTQ3NzU4NGRhOTI2MjQ2MTZmMTI2MDViODIwZjg1NjUzMDcyYzA5In19LCJzaWduYXR1cmUiOnsiY29udGVudCI6Ik1FVUNJUUQvaHJ3OWVjVlpHRE0zMUIycXE1dEdaNFZtSGNuRytDVml0NW93VURjK2RRSWdUQVI2V2FuY0ZaZUtuNzgwRmRTNkIxZ0cxakNlejFsTXZsTHFtdFBVc28wPSIsImZvcm1hdCI6Ing1MDkiLCJwdWJsaWNLZXkiOnsiY29udGVudCI6IkxTMHRMUzFDUlVkSlRpQlFWVUpNU1VNZ1MwVlpMUzB0TFMwS1RVWnJkMFYzV1VoTGIxcEplbW93UTBGUldVbExiMXBKZW1vd1JFRlJZMFJSWjBGRlVqUmlOUzlNWlZsNE9WZHpOMm93TVVZelNEUlJRVk5rYm1sVVF3cHhaakpJV1cxNEsyOVNLeklyVms1SmRtRllWRTVtVEU1WldHUTVTMVo0YW5OcllqRlVhMHMzU0VjeGVFVTVSMXA0ZW1waWQwWkRkWGxCUFQwS0xTMHRMUzFGVGtRZ1VGVkNURWxESUV0RldTMHRMUzB0Q2c9PSJ9fX19","integratedTime":1691754247,"logID":"c0d23d6ad406973f9559f3ba2d1ca01f84147d8ffc5b8445c224f98b9591801d","logIndex":30891523,"verification":{"inclusionProof":{"checkpoint":"rekor.sigstore.dev - 2605736670972794746\n26728094\nUMgdlhClBSzBbqefL5wyKzDrSb/Wf1yic2lWuqc490o=\nTimestamp: 1691754248454344594\n\n— rekor.sigstore.dev wNI9ajBFAiEAuhZGCMHAXSu1zARzPjKeUQF4i1jY+45EmKodcfQofE4CICxsQ/OxAP0r+9wLT+1PzDuF/kZ5LJ44k6f0oueozZBf\n","hashes":["030b9e9fb6a20219790c620da0677ae9a9d551300d5d53677d2e889b18f93408","665e92da8bb3ecfec55d7084c2e680da9627140ceebd5b90443194974478aaae","d493bd198b0273aaadd90b15daae59f8c437ad7669b1ef0f35ee3bdfcccb0c1c","1c4f5d27f667cf8fdfab11719cb2700c43b6ec0699c0e906582f81cc5bbe627f","6f77ba99b9061179f7cf7d94ad3fafe88137ec4939f7a2855caf7a25b6b4c3eb","f72f831d5e9f5c86157f56bc850d0c505d0baa8af389c91689b5c002b83e47c3","85bbefe750579844c4ef01ba7e50ee147867768adf376df59a7d46d9061a0529","e9f1cc1f52ef6fafa3c87d2c2031f14ef16da2ac47c267601a97c1671307c313","91e4eaeb84796946c5ad1570ea06f4fd07ee9261526b696c251119d888e641e2","e4a5b55c06b38419780a1a1b34b7e1ab1329b55948a105df191ec58325ae6220","1127441051032e9e2b9a7ff43ac6a8d4133438354f8b295c37b23f6292569ff5","fda678e668dd9896f1cdbf160943a690da123917de48afe85edd6c494d08e1b9","5cf299a407ce2c41b16dc87bd3bc7396ba9426d1b5e43ba70bbd979e417c45e6","ba60819f9a3f9ddabeb6ec73d1ba79d04fd2ad69ce8d95c777ab485a9fadb36b","8d152ae03f0ef85238ed66f0f7ab3bc870aee2acd6531a4855fc5011ea6b0e67","ad712c98424de0f1284d4f144b8a95b5d22c181d4c0a246518e7a9a220bdf643"],"logIndex":26728092,"rootHash":"50c81d9610a5052cc16ea79f2f9c322b30eb49bfd67f5ca2736956baa738f74a","treeSize":26728094},"signedEntryTimestamp":"MEYCIQCCN9ip/cW7QfS4EbLyigCs4OKz4wcWUQThuQY00i3PZAIhAKsTz7epe3Gh/9XGLzh4L1yPqcGUCETPPckPvMIZbL/7"}}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pretty printing so it's easier to refer to this while reading test code.

@ernoc
Copy link
Contributor Author

ernoc commented Jan 10, 2024

On second thoughts, even if with the current structure, serde_json::to_vec produces an output that is canonical, if Rekor adds/changes fields in the future (e.g. to include a date), then our verifier will stop working. So I'll try using serde_canonical_json now.

@@ -159,7 +161,7 @@ impl TryFrom<&LogEntry> for RekorSignatureBundle {
.context("couldn't decode Base64 signature")?;

Ok(Self {
canonicalized,
signed_data: signed_data,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just signed_data should be enough?

@ernoc ernoc marked this pull request as draft January 10, 2024 17:19
@ernoc
Copy link
Contributor Author

ernoc commented Jan 10, 2024

Ok so I tried with serde_canonical_json and it seems to work. Also ryu-js dep is gone.

2 things I'd like your opinion on are:

  • I believe it's good to keep structures like LogEntry for the cases where we use specific fields from it. But we only use body and verification. I'd remove all the others to avoid being fragile to changes in fields we don't care about, but leave a note that there exist more fields and a link to Rekor docs.

  • How much testing around rekor.rs do you think we need, and how much variation of samples of log entry JSON payloads can we get? It seems to me that the one example we have is quite representative, although I'd add a couple of "bad" examples to test that we are resilient to them / get the expected error messages.

Copy link
Collaborator

@tiziano88 tiziano88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, this seems much more resilient to future changes

@@ -21,6 +21,10 @@ use alloc::{collections::BTreeMap, string::String, vec::Vec};
use anyhow::Context;
use base64::{prelude::BASE64_STANDARD, Engine as _};
use serde::{Deserialize, Serialize};
use serde_json::Value;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) remove blank line

#[serde(skip_serializing_if = "Option::is_none")]
#[serde(rename = "verification")]
pub verification: Option<LogEntryVerification>,
// #[serde(rename = "integratedTime")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, I guess these fields were just here for the canonicalization but we didn't actually need them for anything? In that case, do we think we may need any of them in the future (I cannot think of a reason, apart from perhaps logging the decoded log entry somewhere as part of the verification, which may actually be nice)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that a little more convenience is better, be it only for logging. The trouble with the canonicalisation shouldn't lead to removing this fields.

.as_object_mut()
.context("Artifact metadata expected to be a JSON object")?;

let verification: Value = log_entry_artifact_object
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you did the hard work of figuring out the actual logic upstream, it may be useful to document that more clearly here.

"Expected exactly 1 entry in log entry root JSON object"
);

let log_entry_artifact_object = log_entry_root_object
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method chain is quite long here, could you add a type annotation to help readers understand what type this is?

.context("'verification' key not found in artifact JSON")?;

// TODO does this always equal canonicalized?
// Note on preserving order: https://users.rust-lang.org/t/how-to-keep-order-after-using-serde-json-from-str-to-deserialize-a-string-to-struct/97727
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question -- if the fields are reordered, does this still work? Either way, it may be useful to add a few more test cases to show that (some valid , some not)


let signed_entry_timestamp_base64_encoded = verification
.as_object()
.context("Expected 'verification' entry to contain a JSON object")?["signedEntryTimestamp"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will panic if it does not contain that field, which is probably undesirable

// signature,
// })
// }
// }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented code (assuming it's just the original version kept around for reference).

@@ -118,52 +120,53 @@ pub struct LogEntryVerification {
/// be obtained from the `/api/v1/log/publicKey` Rest API. For `sigstore.dev`, it is a PEM-encoded
/// x509/PKIX public key.
pub struct RekorSignatureBundle {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type RekorSignatureBundle is dispensable, it is just a pair of two relatively unrelated things. Also not used outside this file. NB: This is just my personal taste, keep if you think differently.

#[serde(skip_serializing_if = "Option::is_none")]
#[serde(rename = "verification")]
pub verification: Option<LogEntryVerification>,
// #[serde(rename = "integratedTime")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that a little more convenience is better, be it only for logging. The trouble with the canonicalisation shouldn't lead to removing this fields.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use the pretty printed JSON I think (so that's good). Saving a few bytes of whitespace is of no importance.


let signed_entry_timestamp_base64_encoded = verification
.as_object()
.context("Expected 'verification' entry to contain a JSON object")?["signedEntryTimestamp"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stumbled over the naming signedEntryTimestamp just now. It is not a timestamp (as the name suggests), but a signature. Maybe at least reflect in the variable names/comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, SG! It's signed entry timestamp as in signed (entry timestamp), not (signed entry) timestamp 😂 . I'll add a comment.

.serialize(&mut serializer)
.expect("Failed to serialize Rekor signed payload to JSON");

let signed_json_bytes = serializer.into_inner();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a private function which encapsulates Serializer and CanonicalFormatter?

let mut serializer = Serializer::with_formatter(Vec::new(), CanonicalFormatter::new());
log_entry_artifact_object
.serialize(&mut serializer)
.expect("Failed to serialize Rekor signed payload to JSON");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expect panics and does not return an error. Here the message should be returned as Result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants