Big rewrite

Lots of stuff here. The main goal is to support the newer JSON Schema drafts (2020-12 and 2019-09) including output formats and annotations.

The biggest change is pulling individual keywords into separate classes which contain parsing and validation logic. All drafts now use the same `Schema` class with the new "vocabularies" concept handling behavior differences. Each draft has its own meta schema (meta.rb), vocabularies (vocab.rb), and, if necessary, keyword classes (vocab/*.rb). Most keywords are defined in the latest draft with previous drafts removing/adding things from there. Old drafts (4, 6, and 7) only have a single vocabulary because they predate the concept.

`Schema` contains some logic but I tried to keep as much as possible in keyword classes. `Schema` and `Keyword` have a similar interface (`value`, `keyword`, `parent`, etc) and share some code using the `Output` module because it didn't feel quite right to have `Schema` be a subclass of `Keyword`.

There are two basic methods for schemas and keywords:

- `#parse`: parses the provided definition (generates relevant subschemas, side effects, etc). Basically anything that can be done before data validation.
- `#validate`: iterates through the parsed schema/keywords, validates data, and returns a `Result` object (possibly with nested results).

One exception is `Ref`, which doesn't resolve refs at parse time because of a circular dependency when generating meta schemas.

Output formats (introduced in 2019-09) are supported via `Result`. I think the only tricky thing there is that nested results are returned as enumerators instead of arrays for performance reasons. This matches the "classic" behavior as well.

2019-09 also introduced "annotations" which are used for some validations (`unevaluatedProperties`, `unevaluatedItems`, etc) and are returned with successful results in a similar format to errors. The "classic" output format drops them to match existing behavior.

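The parse/validate split described above can be sketched roughly as follows. This is a simplified illustration, not the code in this commit: `SketchResult`, `SketchKeyword`, `SketchMinimum`, and `SketchAllOf` are hypothetical stand-ins, and the "allOf" here only accepts a list of minimums to keep the example short.

```ruby
# Hypothetical, simplified stand-ins for illustration only; these are not
# the json_schemer classes.
SketchResult = Struct.new(:keyword, :valid, :nested)

class SketchKeyword
  attr_reader :value, :keyword, :parent, :parsed

  def initialize(value, keyword, parent = nil)
    @value = value
    @keyword = keyword
    @parent = parent
    @parsed = parse # everything that can be done before data validation
  end

  # Default parse step: keep the raw value.
  def parse
    value
  end
end

class SketchMinimum < SketchKeyword
  # Non-numeric instances pass, numeric ones must be >= the parsed minimum.
  def validate(instance)
    SketchResult.new(keyword, !instance.is_a?(Numeric) || instance >= parsed, nil)
  end
end

class SketchAllOf < SketchKeyword
  # Parse phase: build the nested keywords once, up front.
  def parse
    value.map { |minimum| SketchMinimum.new(minimum, 'minimum', self) }
  end

  # Validate phase: nested results are a lazy enumerator rather than an array.
  # In this sketch, enumerating `nested` again re-runs the nested validations.
  def validate(instance)
    nested = parsed.lazy.map { |subschema| subschema.validate(instance) }
    SketchResult.new(keyword, nested.all?(&:valid), nested)
  end
end

all_of = SketchAllOf.new([1, 5], 'allOf')
result = all_of.validate(3)
```

Here `result.valid` is false because 3 is below the second minimum, and `result.nested` can be enumerated on demand to see which nested keyword failed.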
Notes:

- `Location` is used for performance reasons so that JSON pointer resolution can be cached and deferred until output time.
- `instance_location` isn't cached between validations because it's possibly unbounded.
- `ref_resolver` and `regexp_resolver` are lazily created for performance reasons.

Known breaking changes (so far):

- Custom keyword output
- `not` and `dependencies` output
- Property validation hooks (`before_property_validation` and `after_property_validation`) are now called immediately surrounding `properties` validation. Previously, `before_property_validation` was called before all "object" validations (`dependencies`, `patternProperties`, `additionalProperties`, etc) and `after_property_validation` was called after.

Related:

- #27
- #44
- #116
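To illustrate the `Location` note above, here is a minimal sketch of deferring and caching JSON pointer resolution. `SketchLocation` and its methods are hypothetical and are not claimed to match the `Location` module added in this commit. The idea is that joining a path segment is cheap and cached per parent, while the pointer string is only built (and memoized) when output is requested.

```ruby
# Hypothetical sketch; not the json_schemer Location module.
module SketchLocation
  module_function

  def root
    { name: nil, parent: nil, children: {} }
  end

  # Joining is cheap: child nodes are created once per name and reused.
  def join(location, name)
    location[:children][name] ||= { name: name, parent: location, children: {} }
  end

  # Resolution walks up to the root and memoizes the resulting pointer string.
  def resolve(location)
    location[:pointer] ||= begin
      tokens = []
      node = location
      while node[:parent]
        tokens.unshift(escape(node[:name]))
        node = node[:parent]
      end
      tokens.empty? ? '' : "/#{tokens.join('/')}"
    end
  end

  # JSON pointer escaping (RFC 6901): "~" becomes "~0", then "/" becomes "~1".
  def escape(token)
    token.to_s.gsub('~', '~0').gsub('/', '~1')
  end
end

root = SketchLocation.root
properties = SketchLocation.join(root, 'properties')
location = SketchLocation.join(properties, 'a/b')
pointer = SketchLocation.resolve(location)
```

Because children are cached on their parent, a tree like this grows with every distinct path segment. That is fine for keyword locations, which are bounded by the schema, but it suggests why an instance location cache would not be kept between validations.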
davishmcclurg · Aug 19, 2023 · 2c09eb9
1 parent be345af
commit 2c09eb9
Showing 51 changed files with 4,024 additions and 1,578 deletions.
diff --git a/README.md b/README.md
@@ -2,6 +2,26 @@
 
 JSON Schema validator. Supports drafts 4, 6, and 7.
 
+## Next
+
+- [ ] fixme
+- [ ] readme
+- [ ] coverage
+- [ ] argument passing (instance, instance_location, keyword_location, dynamic_scope, etc)
+- [ ] readOnly, writeOnly: https://github.com/davishmcclurg/json_schemer/issues/55
+- [ ] insert_property_defaults: https://github.com/davishmcclurg/json_schemer/issues/94
+- [ ] short-circuit
+- [ ] openapi
+- [ ] api
+- [ ] breaking changes
+  - [ ] format
+  - [ ] formats
+  - [ ] keywords
+  - [ ] before_property_validation, after_property_validation
+  - [ ] insert_property_defaults
+  - [ ] ref_resolver
+  - [ ] output formats
+
 ## Installation
 
 Add this line to your application's Gemfile:

diff --git a/lib/json_schemer.rb b/lib/json_schemer.rb
@@ -14,35 +14,119 @@
 require 'simpleidn'
 
 require 'json_schemer/version'
+require 'json_schemer/format/duration'
 require 'json_schemer/format/hostname'
+require 'json_schemer/format/json_pointer'
 require 'json_schemer/format/uri_template'
 require 'json_schemer/format/email'
 require 'json_schemer/format'
 require 'json_schemer/errors'
 require 'json_schemer/cached_resolver'
 require 'json_schemer/ecma_regexp'
-require 'json_schemer/schema/base'
-require 'json_schemer/schema/draft4'
-require 'json_schemer/schema/draft6'
-require 'json_schemer/schema/draft7'
+require 'json_schemer/location'
+require 'json_schemer/result'
+require 'json_schemer/output'
+require 'json_schemer/keyword'
+require 'json_schemer/draft202012/meta'
+require 'json_schemer/draft202012/vocab/core'
+require 'json_schemer/draft202012/vocab/applicator'
+require 'json_schemer/draft202012/vocab/unevaluated'
+require 'json_schemer/draft202012/vocab/validation'
+require 'json_schemer/draft202012/vocab/format_annotation'
+require 'json_schemer/draft202012/vocab/format_assertion'
+require 'json_schemer/draft202012/vocab/content'
+require 'json_schemer/draft202012/vocab'
+require 'json_schemer/draft201909/meta'
+require 'json_schemer/draft201909/vocab/core'
+require 'json_schemer/draft201909/vocab/applicator'
+require 'json_schemer/draft201909/vocab'
+require 'json_schemer/draft7/meta'
+require 'json_schemer/draft7/vocab/validation'
+require 'json_schemer/draft7/vocab'
+require 'json_schemer/draft6/meta'
+require 'json_schemer/draft6/vocab'
+require 'json_schemer/draft4/meta'
+require 'json_schemer/draft4/vocab/validation'
+require 'json_schemer/draft4/vocab'
+require 'json_schemer/schema'
 
 module JSONSchemer
   class UnsupportedMetaSchema < StandardError; end
   class UnknownRef < StandardError; end
   class UnknownFormat < StandardError; end
+  class UnknownVocabulary < StandardError; end
+  class UnknownContentEncoding < StandardError; end
+  class UnknownContentMediaType < StandardError; end
+  class UnknownOutputFormat < StandardError; end
   class InvalidRefResolution < StandardError; end
   class InvalidRegexpResolution < StandardError; end
   class InvalidFileURI < StandardError; end
   class InvalidSymbolKey < StandardError; end
   class InvalidEcmaRegexp < StandardError; end
 
-  DEFAULT_SCHEMA_CLASS = Schema::Draft7
-  SCHEMA_CLASS_BY_META_SCHEMA = {
-    'http://json-schema.org/schema#' => Schema::Draft4, # Version-less $schema deprecated after Draft 4
-    'http://json-schema.org/draft-04/schema#' => Schema::Draft4,
-    'http://json-schema.org/draft-06/schema#' => Schema::Draft6,
-    'http://json-schema.org/draft-07/schema#' => Schema::Draft7
-  }.freeze
+  VOCABULARIES = {
+    'https://json-schema.org/draft/2020-12/vocab/core' => Draft202012::Vocab::CORE,
+    'https://json-schema.org/draft/2020-12/vocab/applicator' => Draft202012::Vocab::APPLICATOR,
+    'https://json-schema.org/draft/2020-12/vocab/unevaluated' => Draft202012::Vocab::UNEVALUATED,
+    'https://json-schema.org/draft/2020-12/vocab/validation' => Draft202012::Vocab::VALIDATION,
+    'https://json-schema.org/draft/2020-12/vocab/format-annotation' => Draft202012::Vocab::FORMAT_ANNOTATION,
+    'https://json-schema.org/draft/2020-12/vocab/format-assertion' => Draft202012::Vocab::FORMAT_ASSERTION,
+    'https://json-schema.org/draft/2020-12/vocab/content' => Draft202012::Vocab::CONTENT,
+    'https://json-schema.org/draft/2020-12/vocab/meta-data' => Draft202012::Vocab::META_DATA,
+
+    'https://json-schema.org/draft/2019-09/vocab/core' => Draft201909::Vocab::CORE,
+    'https://json-schema.org/draft/2019-09/vocab/applicator' => Draft201909::Vocab::APPLICATOR,
+    'https://json-schema.org/draft/2019-09/vocab/validation' => Draft201909::Vocab::VALIDATION,
+    'https://json-schema.org/draft/2019-09/vocab/format' => Draft201909::Vocab::FORMAT,
+    'https://json-schema.org/draft/2019-09/vocab/content' => Draft201909::Vocab::CONTENT,
+    'https://json-schema.org/draft/2019-09/vocab/meta-data' => Draft201909::Vocab::META_DATA,
+
+    'json-schemer://draft7' => Draft7::Vocab::ALL,
+    'json-schemer://draft6' => Draft6::Vocab::ALL,
+    'json-schemer://draft4' => Draft4::Vocab::ALL
+  }
+  VOCABULARY_ORDER = VOCABULARIES.transform_values.with_index { |_vocabulary, index| index }
+
+  DRAFT202012 = Schema.new(
+    Draft202012::SCHEMA,
+    :base_uri => Draft202012::BASE_URI,
+    :ref_resolver => Draft202012::Meta::SCHEMAS.to_proc,
+    :regexp_resolver => 'ecma'
+  )
+
+  DRAFT201909 = Schema.new(
+    Draft201909::SCHEMA,
+    :base_uri => Draft201909::BASE_URI,
+    :ref_resolver => Draft201909::Meta::SCHEMAS.to_proc,
+    :regexp_resolver => 'ecma'
+  )
+
+  DRAFT7 = Schema.new(
+    Draft7::SCHEMA,
+    :vocabulary => { 'json-schemer://draft7' => true },
+    :base_uri => Draft7::BASE_URI,
+    :regexp_resolver => 'ecma'
+  )
+
+  DRAFT6 = Schema.new(
+    Draft6::SCHEMA,
+    :vocabulary => { 'json-schemer://draft6' => true },
+    :base_uri => Draft6::BASE_URI,
+    :regexp_resolver => 'ecma'
+  )
+
+  DRAFT4 = Schema.new(
+    Draft4::SCHEMA,
+    :vocabulary => { 'json-schemer://draft4' => true },
+    :base_uri => Draft4::BASE_URI,
+    :regexp_resolver => 'ecma'
+  )
+
+  META_SCHEMAS_BY_BASE_URI_STR = [DRAFT202012, DRAFT201909, DRAFT7, DRAFT6, DRAFT4].each_with_object({}) do |meta_schema, out|
+    out[meta_schema.base_uri.to_s] = meta_schema
+  end
+  META_SCHEMAS_BY_BASE_URI_STR['http://json-schema.org/schema#'] = DRAFT4 # version-less $schema deprecated after Draft 4
+  META_SCHEMAS_BY_BASE_URI_STR.freeze
 
   WINDOWS_URI_PATH_REGEX = /\A\/[a-z]:/i
 
@@ -55,7 +139,7 @@ class InvalidEcmaRegexp < StandardError; end
   end
 
   class << self
-    def schema(schema, default_schema_class: DEFAULT_SCHEMA_CLASS, **options)
+    def schema(schema, meta_schema: DRAFT202012, **options)
       case schema
       when String
         schema = JSON.parse(schema)
@@ -70,23 +154,18 @@ def schema(schema, default_schema_class: DEFAULT_SCHEMA_CLASS, **options)
           ref_resolver.call(base_uri)
         end
       end
-
-      schema_class = if schema.is_a?(Hash) && schema.key?('$schema')
-        meta_schema = schema.fetch('$schema')
-        SCHEMA_CLASS_BY_META_SCHEMA[meta_schema] || raise(UnsupportedMetaSchema, meta_schema)
-      else
-        default_schema_class
+      unless meta_schema.is_a?(Schema)
+        meta_schema = META_SCHEMAS_BY_BASE_URI_STR[meta_schema] || raise(UnsupportedMetaSchema, meta_schema)
       end
-
-      schema_class.new(schema, **options)
+      Schema.new(schema, :meta_schema => meta_schema, **options)
     end
 
-    def valid_schema?(schema, default_schema_class: DEFAULT_SCHEMA_CLASS)
-      schema(schema, default_schema_class: default_schema_class).valid_schema?
+    def valid_schema?(schema, **options)
+      schema(schema, **options).valid_schema?
     end
 
-    def validate_schema(schema, default_schema_class: DEFAULT_SCHEMA_CLASS)
-      schema(schema, default_schema_class: default_schema_class).validate_schema
+    def validate_schema(schema, **options)
+      schema(schema, **options).validate_schema
     end
   end
 end
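The meta schema lookup in `JSONSchemer.schema` above can be sketched in isolation. This is an illustration only: `SketchSchema`, `SketchUnsupportedMetaSchema`, the `SKETCH_*` constants, and `sketch_resolve_meta_schema` are hypothetical names, and the base URIs are assumed from the keys of the removed `SCHEMA_CLASS_BY_META_SCHEMA` hash. It shows the two accepted forms of the `meta_schema` option (a schema object or a base URI string) and the version-less alias for draft 4.

```ruby
# Stand-ins for illustration; in the diff the values are Schema instances.
SketchSchema = Struct.new(:base_uri)
class SketchUnsupportedMetaSchema < StandardError; end

SKETCH_DRAFT7 = SketchSchema.new('http://json-schema.org/draft-07/schema#')
SKETCH_DRAFT4 = SketchSchema.new('http://json-schema.org/draft-04/schema#')

# Index meta schemas by base URI string, mirroring META_SCHEMAS_BY_BASE_URI_STR.
SKETCH_META_SCHEMAS = [SKETCH_DRAFT7, SKETCH_DRAFT4].each_with_object({}) do |meta_schema, out|
  out[meta_schema.base_uri.to_s] = meta_schema
end
# The version-less $schema URI maps to draft 4.
SKETCH_META_SCHEMAS['http://json-schema.org/schema#'] = SKETCH_DRAFT4
SKETCH_META_SCHEMAS.freeze

# Accepts either a schema object or a base URI string.
def sketch_resolve_meta_schema(meta_schema)
  return meta_schema if meta_schema.is_a?(SketchSchema)
  SKETCH_META_SCHEMAS[meta_schema] || raise(SketchUnsupportedMetaSchema, meta_schema)
end

by_object = sketch_resolve_meta_schema(SKETCH_DRAFT7)
by_string = sketch_resolve_meta_schema('http://json-schema.org/draft-07/schema#')
by_alias = sketch_resolve_meta_schema('http://json-schema.org/schema#')
```

An unknown URI string raises the error with the URI as its message, which matches the shape of the `raise(UnsupportedMetaSchema, meta_schema)` call in the diff.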