This repository has been archived by the owner on Jun 14, 2018. It is now read-only.

Formalize schema, standardize on normalization process, and improve solc support #3

Merged
merged 20 commits into develop from json-schema
May 19, 2017

Conversation

gnidan
Contributor

@gnidan gnidan commented May 16, 2017

"Don't avert your eyes; normalize!"

Ref: #1

Herein:

N.B. this work involves several breaking changes and will require a major release.

  • Define json-schema specification of contract objects
  • Match bytecode/deployedBytecode convention from newer solc, replacing runtimeBinary, srcmapRuntime, etc.
  • Rename snake_case properties to their camelCase equivalents:
    • contract_name -> contractName
    • updated_at -> updatedAt
    • schema_version -> schemaVersion
  • Update Schema module to expose the following behaviors:
    • validate(obj) returns bool
    • normalize(objDirty) returns obj
  • (Remove generateObject)
  • Ensure contract objects are safely and correctly generated from all kinds of different sources ("dirty" objects), namely:
    • Previous versions' contract object JSON files (with different keys, etc.)
    • truffle-contract abstraction classes or instances
    • solc output either from .compile or .compileStandard
    • Contract objects that already match the standard
  • Use ISO8601 timestamp for updatedAt
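The proposed module surface might be sketched as follows. This is hypothetical: the key renames and the `validate(obj)`/`normalize(objDirty)` signatures come from the list above, but the implementation is illustrative only.

```javascript
// Hypothetical sketch of the Schema module surface; the renames and the
// validate/normalize signatures come from this PR, the implementation does not.
const renames = {
  contract_name: "contractName",
  updated_at: "updatedAt",
  schema_version: "schemaVersion"
};

// normalize(objDirty) returns a cleaned-up contract object.
function normalize(objDirty) {
  const normalized = {};
  for (const [key, value] of Object.entries(objDirty)) {
    normalized[renames[key] || key] = value;
  }
  // updatedAt is an ISO8601 timestamp, per the list above.
  normalized.updatedAt = normalized.updatedAt || new Date().toISOString();
  return normalized;
}

// validate(obj) returns a bool (stand-in check; the real module uses json-schema).
function validate(obj) {
  return typeof obj.contractName === "string" && typeof obj.updatedAt === "string";
}
```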

Normalization approach:

  • Reimplement contract object normalization with declarative description of translation:
    • Define properties as single canonical output key, sourced from one of multiple alternative input keys.
    • Ensure that normalization happens as part of a single-pass operation over this translation description.
  • Eliminate "flat-object"-style input (no more top-level address, network_id, links, events!)
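A minimal sketch of the declarative translation described above. The key lists are examples drawn from this PR ("runtimeBytecode", "evm.bytecode.object"); the actual module's property definitions may differ.

```javascript
// Illustrative translation table: each canonical output key lists the
// alternative input keys it may be sourced from, in priority order.
const properties = {
  contractName: { sources: ["contractName", "contract_name"] },
  bytecode: { sources: ["bytecode", "evm.bytecode.object"] },
  deployedBytecode: {
    sources: ["deployedBytecode", "runtimeBytecode", "evm.deployedBytecode.object"]
  }
};

// A dotted source string denotes a chained lookup into nested solc output.
function getter(path) {
  return (obj) => path.split(".").reduce((o, key) => (o == null ? o : o[key]), obj);
}

function normalize(objDirty) {
  const normalized = {};
  // Single pass over the translation description: for each canonical output
  // key, take the first alternative input key that yields a value.
  for (const [key, { sources }] of Object.entries(properties)) {
    for (const source of sources) {
      const value = getter(source)(objDirty);
      if (value !== undefined) {
        normalized[key] = value;
        break;
      }
    }
  }
  return normalized;
}
```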

gnidan added 3 commits May 15, 2017 14:36
- Add runtimeBytecode -> deployedBytecode normalization
- Ensure deployedBytecode starts with 0x in generated output
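The 0x-prefix guarantee from the commits above could be sketched like this (function name hypothetical, not the actual implementation):

```javascript
// Hypothetical sketch of the 0x-prefix normalization described above.
function ensureHexPrefix(bytecode) {
  if (bytecode == null) {
    return bytecode;
  }
  // solc emits bare hex ("6060..."); generated output stores it 0x-prefixed.
  return bytecode.startsWith("0x") ? bytecode : "0x" + bytecode;
}
```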
Contributor Author

@gnidan gnidan left a comment

(some questions / thoughts)

"definitions": {
"ContractName": {
"type": "string",
"pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$"
Contributor Author

Not sure if contract names should be so restricted?

Contributor

This restriction seems fine. Might be worth seeing what Solidity restricts them to.

},
"Address": {
"type": "string",
"pattern": "^0x[a-fA-F0-9]{40}$"
Contributor Author

There's bound to be lots of these specific-number-of-bytes-in-hex regexes... not sure if there's any way around that.

Contributor

Not sure if there needs to be. This looks good to me.
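If deduplication ever does become worthwhile, the repeated byte-length patterns could be generated by a tiny helper along these lines (hypothetical, not part of this PR):

```javascript
// Hypothetical helper: builds the "0x plus N bytes of hex" pattern
// used throughout the schema (addresses, hashes, etc.).
function bytesPattern(n) {
  return `^0x[a-fA-F0-9]{${2 * n}}$`;
}
// bytesPattern(20) matches a 20-byte address; bytesPattern(32) a 32-byte hash.
```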

"type": "string"
},
"AST": {
"type": "object"
Contributor Author

is the AST representation in scope for something to be part of this spec?

Contributor

Definitely not in scope. At least, don't write your own validator for it; I'd be surprised if Solidity doesn't already have one. If it does, treat it like the ABI above; if it doesn't, leave it alone for now.

"pattern": "^0x[a-fA-F0-9]{40}$"
},
"ABI": {
"type": "array"
Contributor Author

Is the ABI part of the spec? ethpm/ethpm-spec#20 indicates that the recommendation is "opaque JSON"

Contributor

"doing what ethpm does" is probably a good strategy. Our validator within the truffle-contract-schema can include the abi schema and validate that as well.

}
},
"Event": {
"type": "object"
Contributor Author

Each event is a subset of the ABI; we probably want to include that in the spec here. If so, would we then want to include the ABI in the spec as well?

Contributor

I wouldn't; see above. Although you could copy the event portion of the ABI json-schema, since we're using that within our own. I think either way (either copying the ABI's schema for events, or doing nothing) is fine for now.

Contributor Author

Wait, I'm confused - I don't believe there is a json-schema for the ABI, so there's no event portion to copy. I'll double-check in EthPM's Gitter to see what Piper concluded.

Contributor Author

Oh, I guess @tcoulter the question I would have is "how much does truffle treat the ABI/events as a black box, and how much does truffle need to know the internals?"

I would guess "it very much needs to know the internals", and so I think it's probably worth speccing those out fully, even if there's some risk they'll change upstream.

Contributor

It looks like Piper linked to a Python version of the json schema. That's what I was referring to; honestly, I didn't click through. If there's no JavaScript json schema, then let's ignore it.

Contributor

I would guess "it very much needs to know the internals", and so I think it's probably worth speccing those out fully, even if there's some risk they'll change upstream.

If you think you can spec those out without taking too much time, go for it. If you think it'll take a while, leave it as a black box.

Contributor Author

If you think you can spec those out without taking too much time, go for it. If you think it'll take a while, leave it as a black box.

I'll look into that after doing code stuff. I don't think it will take a lot of time, but I'm treating this pass at the schema as a first version anyway, so certainly can leave things out of scope.
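For context (not taken from this PR), a single event entry in solc's generated ABI looks like the following standard Transfer-style event; this is the shape an Event definition in the schema would need to describe:

```json
{
  "type": "event",
  "name": "Transfer",
  "anonymous": false,
  "inputs": [
    { "indexed": true, "name": "from", "type": "address" },
    { "indexed": true, "name": "to", "type": "address" },
    { "indexed": false, "name": "value", "type": "uint256" }
  ]
}
```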

@gnidan gnidan requested a review from tcoulter May 16, 2017 01:07
@gnidan
Contributor Author

gnidan commented May 16, 2017

@tcoulter – wanted to follow up and mention one other thing with this, namely, code-generating the json-schema.

Not pushing for it right now, naturally, but I think there is potentially some value in making the canonical json-schema the generated output of a build process (like what @pipermerriam is doing in Python).

One of the main reasons I would suggest this is because of all the byte regexes. Unfortunately, there does not seem to be a way to deduplicate object keys the way `"$ref": ...` deduplicates object values. Some basic preprocessing would make these regexes a lot less error-prone, and would probably remove a lot of the cruft required for allOf inheritance (see bytecode/deployedBytecode).

YAML does the kind of preprocessing I'm imagining, through its notion of anchors. For example:

definitions:
  Bytecode: &bytecode
    type: string
    pattern: "^...$"
properties:
  bytecode:
    <<: *bytecode
    description: Contract creation bytecode
  deployedBytecode:
    <<: *bytecode
    description: Contract bytecode on chain

YAML also does nice multiline strings with various kinds of escaping, which I like, and even supports comments. But I don't know, is YAML a bad word? Obviously we'd generate JSON from the YAML as the actual schema artifact.
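For reference, the anchor/merge example above would expand to JSON along these lines (YAML merge keys splice the shared mapping into each property; the `"^...$"` pattern is left elided as in the example):

```json
{
  "definitions": {
    "Bytecode": { "type": "string", "pattern": "^...$" }
  },
  "properties": {
    "bytecode": {
      "type": "string",
      "pattern": "^...$",
      "description": "Contract creation bytecode"
    },
    "deployedBytecode": {
      "type": "string",
      "pattern": "^...$",
      "description": "Contract bytecode on chain"
    }
  }
}
```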

Can skip YAML and just do JSON manipulation also, probably a million different pre-built ways, too, but I include YAML specifically because of the features I listed.

Really, I don't feel much urgency in doing this kind of thing; just wanted to include these thoughts somewhere transparent, and curious what your initial thoughts might be.

@gnidan
Contributor Author

gnidan commented May 16, 2017

(Oh actually it seems that json-schema has some notion of custom formats. That might be an easier thing to do before introducing preprocessing)
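A custom format could replace each inline byte regex with a named format. The fragment below is a sketch; the format name "address" is hypothetical and would need to be registered with the validator (e.g. via Ajv's addFormat) before validation:

```json
{
  "definitions": {
    "Address": {
      "type": "string",
      "format": "address"
    }
  }
}
```

The validator would then map `"address"` to the pattern `^0x[a-fA-F0-9]{40}$` in one place, instead of repeating the regex throughout the schema.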

@gnidan
Contributor Author

gnidan commented May 16, 2017

@tcoulter Does it make sense to call [an object represented by a contract JSON file] a "Contract Object"?

(Distinguished from the "[Contract] Abstraction", because it doesn't have methods or behavior?)

@gnidan gnidan mentioned this pull request May 16, 2017
@gnidan gnidan changed the base branch from develop to fix-develop-tests May 16, 2017 18:27
gnidan added 2 commits May 16, 2017 14:32
- Use separate methods for different sources (solc, abstraction, regular opts)

- Make `networks`, `address`, etc. (properties related to instance or
  truffle specifically) explicitly not a part of normalization process

- Move code around in `generateObject` to ensure truffle-option copying
  happens (now that it's gone from normalize)

- Switch to camelCase for fields in schema

- Add initial test for solc standard output

- Remove `x-` check for `normalizeInput`, as that's no longer part of that.
@tcoulter
Contributor

I'm into code generation of the schema if the schema becomes unwieldy or hard to manage, or if we encounter bugs that code generation solves. I'd say do it by hand as a first pass, unless code generation would save you time initially rather than cost more.

(Oh actually it seems that json-schema has some notion of custom formats. That might be an easier thing to do before introducing preprocessing)

👍

Does it make sense to call [an object represented by a contract JSON file] a "Contract Object"?

I like that.

index.js Outdated
reject(validate.errors);
}
});
},
Contributor

I almost wonder if this is too explicit (and I would counter my own argument by asking if it matters one way or another). Still, I like validate() over validateContractObject(), especially in the case where there's only one type of thing that ever gets validated. If there are two types of things, I would definitely call for the explicit function name. What's your take on simply naming it validate()?

Additionally, why is this wrapped inside a promise? It looks as though Ajv's validate function is synchronous, meaning you can have it throw instead of promisified. I recommend only wrapping it in a promise if it's doing asynchronous work or if there's a high likelihood we'll add asynchronous code to the function in the future.

Contributor Author

Still, I like validate() over validateContractObject()

Agreed, I'll switch it.

Additionally, why is this wrapped inside a promise? It looks as though Ajv's validate function is synchronous, meaning you can have it throw instead of promisified. I recommend only wrapping it in a promise if it's doing asynchronous work or if there's a high likelihood we'll add asynchronous code to the function in the future.

I figured that I didn't want to make any assumptions about the underlying schema validation library. If Ajv turns out not to be the best choice and we have to switch to something that uses promises, we'd already be covered.

But it might be heavy-handed; I'm kind of just playing around to see what looks good / makes sense. Your call!
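A synchronous version along the lines @tcoulter suggests might look like this. It's a sketch with a stand-in check; the real module would run the compiled json-schema validator (e.g. Ajv, whose compiled validate function is synchronous) and throw its errors.

```javascript
// Sketch of a synchronous validate(). The contractName check is a stand-in
// for full json-schema validation; throwing replaces the promise rejection.
const contractNamePattern = /^[a-zA-Z_][a-zA-Z0-9_]*$/;

function validate(contractObject) {
  if (
    typeof contractObject.contractName !== "string" ||
    !contractNamePattern.test(contractObject.contractName)
  ) {
    // Throw synchronously instead of rejecting a promise.
    throw new Error("Invalid contractName: " + contractObject.contractName);
  }
  return true;
}
```

A caller that really wants a promise can still wrap this: `Promise.resolve().then(() => validate(obj))`.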

gnidan added 7 commits May 18, 2017 15:13
- Use property compatibility / data transformation mapping to identify
  where data should come from in a uniform manner.
- Drop notion of "additional sources" - define properties to list
  sources explicitly including their canonical name, or default to
  source only canonical name when `"sources":` is omitted

- Allow string values for sources in addition to functions. For
  string values, assume possible chain operation (e.g. "evm.bytecode.object")
  and split/chain accordingly
- Change `objectable` (whatever that means) to `objDirty`
- Move `properties` to the top and give it a docstring
- Move getter/chain combinators lower, also document
@gnidan gnidan mentioned this pull request May 19, 2017
@gnidan gnidan changed the title Define json-schema for contract abstractions Formalize schema, standardize on normalization process, and improve solc support May 19, 2017
@gnidan gnidan changed the base branch from fix-develop-tests to develop May 19, 2017 20:42
gnidan added 4 commits May 19, 2017 17:32
- Rename schema_version import to pkgVersion
- Inline copyCustomOptions (not needed separately)
- Eliminate contractName default
@tcoulter
Contributor

#pushit

Great work @gnidan

@gnidan gnidan merged commit ee6baba into develop May 19, 2017
@gnidan
Contributor Author

gnidan commented May 19, 2017

wheeeee!!!!

@gnidan gnidan deleted the json-schema branch May 19, 2017 23:13