-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scvals to print in json string #305
base: master
Are you sure you want to change the base?
Conversation
// printScValJSON is used to print json parsable ScVal output instead of | ||
// the ScVal.String() output which prints out %v values that aren't parsable | ||
// Example: {"type":"Map","value":"[{collateral [{1 2386457777}]} {liabilities []} {supply []}]"} | ||
// | ||
// This function traverses through the ScVal structure by calling removeNulls which recursively removes | ||
// the null pointers from the given ScVal. | ||
// | ||
// The output is then: {"type":"Map","value":"[{\"collateral\": \"[{\"1\": \"2386457777\"}]\"} {\"liabilities\": \"[]\"} {\"supply\": \"[]\"}]\"} | ||
// where "value" is a valid JSON string | ||
func printScValJSON(scVal xdr.ScVal) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts or opinions on the way this is done?
In theory json.Marshal
would have been enough but it would output a JSON string with a null value for each null pointer in ScVal (which is a lot of bloat). The options I thought of were to regex away the nulls (bad) or traverse and remove the nulls recursively (thanks chatGPT).
The only quirk I see is that the Types
embedded in the JSON string are no longer transformed into their string representation. For example it won't say type: Byte
but will be type: 0
in reference to the xdr.ScVal.Type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that this also belongs in the xdr/xdrill instead of here in stellar-etl but I think that can be redone when the changes/contract events xdrill is written and stellar-etl is refactored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
➕ agree, this ultimately should be abstracted outside of stellar-etl. I think it's fine in here for now.
Regex will be fairly hard to maintain, and I don't think we should materialize a null value for each null pointer. I think that will make it equally hard to traverse the JSON.
What are the implications for Types
no longer represented as the xdr.ScVal.Type
? Will this make it harder for developers to infer anything about the value
? I suspect no since we are parsing the data as string but want to confirm that this won't impact flexibility for end users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it will impact downstream users. I think it is pretty obvious from the value string JSON what the type will be. In any case the user can always take a look at the enum for the ScVal.Type
const (
ScValTypeScvBool ScValType = 0
ScValTypeScvVoid ScValType = 1
ScValTypeScvError ScValType = 2
ScValTypeScvU32 ScValType = 3
...
Like all the standard types (bool, int, string, map, vec) will be instantly obvious
The weirder ones are
ScValTypeScvContractInstance ScValType = 19
ScValTypeScvLedgerKeyContractInstance ScValType = 20
ScValTypeScvLedgerKeyNonce ScValType = 21
But I would argue these are already special cases and only power users (us) will be doing stuff with them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhhhhhh if it's the enum that's fine! if needed, we can always create a lookup in dbt to translate back to the string val representation.
@@ -57,7 +57,7 @@ func makeContractEventTestOutput() (output [][]ContractEventOutput, err error) { | |||
topicsDecoded["topics_decoded"] = []map[string]string{ | |||
{ | |||
"type": "B", | |||
"value": "true", | |||
"value": "{\"B\":true,\"Type\":0}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example of the change output. The value is the full ScVal conversion instead of just the relevant ScVal type. I did it this way because I think it's nicer to have full ScVal context when parsing in BigQuery with parse_json instead of looking at both "type" and "value" fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this negate the need for type
then? Would the key of type always be included in value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically yes. It is duplicative
I think the "type": "B"
(or any other type) is useful for filtering specific types of values. Like I only want maps and vecs. Filter out ScValContractInstances or vice versa
I think then having the "Type" embedded in the value will then also be helpful if you filtered for both vecs and maps and after reading the string JSON want to apply different logic for vecs vs maps (because they would be parsed differently).
I can see it both ways though. Maybe it is easier with the explicit type but I personally feel this is easier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amishas157 thoughts on this? i'd like a second opinion. I think you're right, we maintain type
as a standalone key for filtering purposes. i'm fine with the schema as defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am imagining to get the actual value, one may need to perform
topicsDecoded["topics_decoded"]["value"][topicsDecoded["topics_decoded"]["type"]]
Basically use Type
outside value as key. In here Type
inside value is not much helpful. But think it does not hurt to keep it either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass looks good, just a couple comments
@@ -57,7 +57,7 @@ func makeContractEventTestOutput() (output [][]ContractEventOutput, err error) { | |||
topicsDecoded["topics_decoded"] = []map[string]string{ | |||
{ | |||
"type": "B", | |||
"value": "true", | |||
"value": "{\"B\":true,\"Type\":0}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this negate the need for type
then? Would the key of type always be included in value?
// printScValJSON is used to print json parsable ScVal output instead of | ||
// the ScVal.String() output which prints out %v values that aren't parsable | ||
// Example: {"type":"Map","value":"[{collateral [{1 2386457777}]} {liabilities []} {supply []}]"} | ||
// | ||
// This function traverses through the ScVal structure by calling removeNulls which recursively removes | ||
// the null pointers from the given ScVal. | ||
// | ||
// The output is then: {"type":"Map","value":"[{\"collateral\": \"[{\"1\": \"2386457777\"}]\"} {\"liabilities\": \"[]\"} {\"supply\": \"[]\"}]\"} | ||
// where "value" is a valid JSON string | ||
func printScValJSON(scVal xdr.ScVal) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
➕ agree, this ultimately should be abstracted outside of stellar-etl. I think it's fine in here for now.
Regex will be fairly hard to maintain, and I don't think we should materialize a null value for each null pointer. I think that will make it equally hard to traverse the JSON.
What are the implications for Types
no longer represented as the xdr.ScVal.Type
? Will this make it harder for developers to infer anything about the value
? I suspect no since we are parsing the data as string but want to confirm that this won't impact flexibility for end users.
There is a canonical XDR-JSON format that more tooling is using, the Lab uses it, the CLI uses it, the Soroban Rust SDK uses it, and this could use it too. It's implemented in the Rust XDR lib, and that lib is also published as a Wasm powered NPM package. You could take the Wasm and experiment with powering the JSON generation with it too so that you're using the same JSON format. If the JSON format is largely internal, then this doesn't matter, but has much more value if users see it. Rust XDR Lib: https://github.com/stellar/rs-stellar-xdr If this is of interest we can build a Wasm-powered Go lib that does same. |
Ooo it would be nice if there was the Go lib equivalent. Not sure how much interest there would be for it though. As it pertains to this PR and change it is purely for stellar-etl (basically internal). I don't believe it is worth porting (or calling rust or js) for this specific case.
Is there an existing example of this JSON format? I assume it is largely the same because it is just ScVals but doesn't hurt to check. |
Install the The CLI manual is at: https://developers.stellar.org/docs/tools/developer-tools/cli/stellar-cli#stellar-xdr-decode The following commands example encodes and decodes, they work in reverse with the
If you're not sure what a type is, use If there are any ideas for improvement to the Some things worth calling out: The JSON output will include 64-bit integers, so you need to use a custom JSON parser, which is easy, in JavaScript runtimes which only support integers up to 53-bits. This won't impact Go usage, or most non-JS runtimes, that supports 64-bit integers in JSON. Strings are escaped ascii UTF-8 encoded, not simply UTF-8 encoded. These two attributes aren't well documented in the dev docs, but added them now: |
The JSON can also be explored at: |
PR Checklist
PR Structure
otherwise).
Thoroughness
Release planning
semver, and I've changed the name of the BRANCH to major/_ , minor/_ or patch/* .
What
github issue
Fix printing of ScVals for contract_data and history_contract_events to print in a JSON parsable format
Why
The ScVal.String() function uses the %v to print out the ScVal struct which causes some oddities in how nested maps, arrays, and contract instances were converted to where they did not conform to a normal map/json structure. By fixing ScVals to print in a JSON parsable format we can more easily dive into decoded keys/vals using parse_json() in BigQuery
Known limitations
N/A